How to calculate sum with dplyr and purrr?

Question

With {dplyr} and {purrr} I would like to calculate the sum of each numerical column that begins with "eff".

library(dplyr)
library(purrr)

mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a",  "b",   1,   5,
  "b",  "b",   2,   6,
  "c",  "c",   3,   7,
  "c",  "a",   4,   8
)

What I want:

result <- tribble(
  ~categ, ~eff_21, ~eff_22,
  "a",  1,   8,
  "b",  2,   11,
  "c",  7,   7
)

I tried the following, but it creates several separate data frames and gets long. That is why I want to use {purrr}: my real data frame has more column suffixes than just "21" and "22":

mydf %>% 
  group_by(categ_21) %>% 
  summarise(total_21 = sum(eff_21))

mydf %>% 
  group_by(categ_22) %>% 
  summarise(total_22 = sum(eff_22))

Thanks!

Answer 1

Score: 5

In this particular case, you may find it convenient to pivot long, and then back to wide:

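library(dplyr)
library(tidyr)

mydf %>% 
  # Split each name at "_": the first part becomes a column (categ, eff), the second goes to `cat`
  pivot_longer(everything(), names_to = c(".value", "cat"), names_pattern = "(.*)_(.*)") %>% 
  # id_cols is named explicitly; recent tidyr versions deprecate passing it by position
  pivot_wider(id_cols = categ, names_from = cat, values_from = eff, values_fn = sum, names_prefix = "eff_")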

Output:

  categ eff_21 eff_22
  <chr>  <dbl>  <dbl>
1 a          1      8
2 b          2     11
3 c          7      7
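
Since the question asks for {purrr} specifically, here is a minimal sketch of the same result built with map() and reduce(). It assumes every eff_<suffix> column has a matching categ_<suffix> column; the helper name sum_by_suffix and the variable names are made up for illustration and are not from the original answer.

```r
library(dplyr)
library(purrr)

mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a",  "b",   1,   5,
  "b",  "b",   2,   6,
  "c",  "c",   3,   7,
  "c",  "a",   4,   8
)

# Derive the suffixes ("21", "22", ...) from the column names,
# so additional suffixes need no code changes.
suffixes <- sub("^eff_", "", grep("^eff_", names(mydf), value = TRUE))

# Hypothetical helper: sum eff_<s> within each level of categ_<s>.
sum_by_suffix <- function(s) {
  categ_col <- paste0("categ_", s)
  eff_col   <- paste0("eff_", s)
  mydf %>%
    group_by(categ = .data[[categ_col]]) %>%
    summarise(!!eff_col := sum(.data[[eff_col]]), .groups = "drop")
}

# One summary per suffix, then join them all on `categ`.
result <- suffixes %>%
  map(sum_by_suffix) %>%
  reduce(full_join, by = "categ")

result
```

full_join() is used so that a category present under only one suffix is kept, with NA in the other columns; in the example data every category appears under both suffixes.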

Answer 2

Score: 1

For fun and completeness, a base R approach using stack, aggregate and reshape:

reshape(
  aggregate(. ~ categ + ind, 
    data.frame(categ = stack(mydf[,grep("cate", colnames(mydf))])[[1]], 
                       stack(mydf[,grep("eff", colnames(mydf))])), sum), 
  timevar="ind", idvar="categ", direction="wide")

  categ values.eff_21 values.eff_22
1     a             1             8
2     b             2            11
3     c             7             7
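
The nested call above may be easier to follow when unrolled into named steps. The sketch below is one possible breakdown, not the answerer's own code: the intermediate names (categ_long, eff_long, long, totals, wide) are illustrative, and mydf is rebuilt as a plain data.frame so the block runs without any packages.

```r
mydf <- data.frame(
  categ_21 = c("a", "b", "c", "c"),
  categ_22 = c("b", "b", "c", "a"),
  eff_21   = c(1, 2, 3, 4),
  eff_22   = c(5, 6, 7, 8)
)

# 1. stack() concatenates the selected columns into `values` and
#    records the source column name in the factor `ind`.
categ_long <- stack(mydf[, grep("cate", colnames(mydf))])
eff_long   <- stack(mydf[, grep("eff", colnames(mydf))])

# 2. Pair each stacked value with its category. The rows line up because
#    both stacks walk the columns in the same order (21 first, then 22).
long <- data.frame(categ = categ_long$values, eff_long)

# 3. Sum `values` within each categ/ind combination.
totals <- aggregate(values ~ categ + ind, data = long, FUN = sum)

# 4. Spread `ind` back out into one column per eff_* variable.
wide <- reshape(totals, timevar = "ind", idvar = "categ", direction = "wide")

wide
```

Step 2 relies on the categ_* and eff_* columns appearing in the same suffix order in mydf; if the column order differed, the categories and values would be paired incorrectly.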

Posted by huangapple on May 10, 2023, 23:12:30
Link to this post: https://go.coder-hub.com/76220085.html