R - How to put aggregated row results as columns

Question

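Suppose there is a data frame df:

age category
12  A
15  B
12  A
13  C
14  B
14  D

I want to use aggregate to count the occurrences of each category {A, B, C, D} for each age. The counts for A, B, C, and D should each become a column, so the output data frame should look like this:

age A B C D
12  2 0 0 0
13  0 0 1 0
14  0 1 0 1
15  0 1 0 0

My attempt:

agdf <- aggregate(df, by=list(df$age, df$category), FUN=length)

But this only gives me:

age category x
12  A        2
15  B        1
14  B        1
13  C        1
14  D        1

Another problem is that the original df has other columns, which I omitted for simplicity. With this aggregate approach, FUN=length turns every one of those columns into the same count value as x. How can I keep their values?

For example:

age category x  y  z
12  A        2  2  2
15  B        1  1  1
14  B        1  1  1
13  C        1  1  1
14  D        1  1  1

I want y and z to keep their original values and only need a single count column x. How can I reshape the data frame into the desired structure?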

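Solution:

You can use the dcast function from the reshape2 package to get the result you want. First make sure the reshape2 package is installed and loaded, then proceed as follows:

library(reshape2)

# Count the rows for each age/category pair and spread the categories into columns
result <- dcast(df, age ~ category, value.var = "category", fun.aggregate = length, fill = 0)

# To keep the other columns, merge the counts back onto the original data
# (assuming df has additional columns y and z)
df <- merge(df, result, by = "age", all = TRUE)

Here result is the data frame you asked for, with the counts of "A", "B", "C", and "D" as columns. After the merge, df still has one row per original observation, so the other columns (e.g. y and z) keep their values and each row also carries the counts for its age.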
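The same count-then-merge idea can be sketched in base R, without reshape2. This is only an illustration: the y and z values below are made up (the question omits them), and the names counts and merged are hypothetical. Here table() stands in for dcast, which is a substitution, not the method given above.

```r
# Sample data from the question, plus hypothetical extra columns y and z
df <- data.frame(
  age      = c(12, 15, 12, 13, 14, 14),
  category = c("A", "B", "A", "C", "B", "D"),
  y        = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5),  # made-up values
  z        = c("u", "v", "w", "x", "y", "z"),  # made-up values
  stringsAsFactors = FALSE
)

# Cross-tabulate age against category, then turn the table into a
# regular data frame with one column per category
counts <- as.data.frame.matrix(table(df$age, df$category))

# The ages are stored as row names; move them into an ordinary column
counts$age <- as.numeric(rownames(counts))
rownames(counts) <- NULL

# Attach the per-age counts to every original row; y and z are untouched
merged <- merge(df, counts, by = "age", all.x = TRUE)
```

In this sketch merged has six rows, one per original observation, with columns age, category, y, z, A, B, C, D. Keeping only the ages present in df (all.x = TRUE) is a design choice; with this data it gives the same rows as all = TRUE.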

Answer 1

Score: 1

# df1 is assumed to be the question's data frame (columns age and category only)
xtabs(~., df1)

    category
age  A B C D
  12 2 0 0 0
  13 0 0 1 0
  14 0 1 0 1
  15 0 1 0 0

table(df1)
    category
age  A B C D
  12 2 0 0 0
  13 0 0 1 0
  14 0 1 0 1
  15 0 1 0 0

reshape2::dcast(df1, age~category)
  age A B C D
1  12 2 0 0 0
2  13 0 0 1 0
3  14 0 1 0 1
4  15 0 1 0 0

tidyr::pivot_wider(df1, id_cols = age, names_from = category,
                   values_from = category, values_fn = length, values_fill = 0)
# A tibble: 4 × 5
    age     A     B     C     D
  <int> <int> <int> <int> <int>
1    12     2     0     0     0
2    15     0     1     0     0
3    13     0     0     1     0
4    14     0     1     0     1
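Note that the outputs above differ in row order: xtabs, table, and dcast list the ages sorted (12, 13, 14, 15), while pivot_wider lists them in order of first appearance (12, 15, 13, 14). A minimal base R sketch of how either order could be obtained with table(), assuming df1 holds just the six rows from the question (the names sorted_tab and appearance_tab are hypothetical):

```r
# Rebuild the question's data (assumed contents of df1)
df1 <- data.frame(
  age      = c(12L, 15L, 12L, 13L, 14L, 14L),
  category = c("A", "B", "A", "C", "B", "D")
)

# Default behaviour: rows come out sorted by age
sorted_tab <- table(df1$age, df1$category)

# Fixing the factor levels to the order of first appearance
# makes the rows follow the data instead
appearance_tab <- table(factor(df1$age, levels = unique(df1$age)),
                        df1$category)

rownames(sorted_tab)      # "12" "13" "14" "15"
rownames(appearance_tab)  # "12" "15" "13" "14"
```

The counts are identical in both tables; only the row order changes.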

huangapple
  • Posted on February 24, 2023, 15:37:20
  • When reposting, please keep the link to this post: https://go.coder-hub.com/75553737.html