How to calculate sum with dplyr and purrr?

Question

With {dplyr} and {purrr} I would like to calculate the sum of each numerical column that begins with "eff".

    library(dplyr)
    library(purrr)

    mydf <- tribble(
      ~categ_21, ~categ_22, ~eff_21, ~eff_22,
      "a", "b", 1, 5,
      "b", "b", 2, 6,
      "c", "c", 3, 7,
      "c", "a", 4, 8
    )

What I want:

    result <- tribble(
      ~categ, ~eff_21, ~eff_22,
      "a", 1, 8,
      "b", 2, 11,
      "c", 7, 7
    )

I tried the following, but it creates several separate data frames and is verbose. That is why I would like to use {purrr}: in my real data frame there are more column suffixes than just "21" and "22":

    mydf %>%
      group_by(categ_21) %>%
      summarise(total_21 = sum(eff_21))

    mydf %>%
      group_by(categ_22) %>%
      summarise(total_22 = sum(eff_22))

Thanks!

Answer 1

Score: 5

In this particular case, you may find it convenient to pivot long, and then back to wide:

    library(dplyr)
    library(tidyr)

    mydf %>%
      # Split each name at "_": the prefix becomes a column (categ, eff), the suffix goes into `cat`
      pivot_longer(everything(), names_to = c(".value", "cat"), names_pattern = "(.*)_(.*)") %>%
      # Spread back to one eff_<suffix> column per `cat`, summing eff within each categ
      pivot_wider(id_cols = categ, names_from = cat, values_from = eff,
                  values_fn = sum, names_prefix = "eff_")

Output:

      categ eff_21 eff_22
      <chr>  <dbl>  <dbl>
    1 a          1      8
    2 b          2     11
    3 c          7      7
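
Since the question asked about {purrr} specifically, here is a minimal sketch of how the two group_by()/summarise() calls from the question might be generalised over any number of suffixes. It is not part of the answer above. It assumes every eff_<suffix> column has a matching categ_<suffix> column, and the names `suffixes`, `per_suffix` and `totals` are purely illustrative.

```r
library(dplyr)
library(purrr)

mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a", "b", 1, 5,
  "b", "b", 2, 6,
  "c", "c", 3, 7,
  "c", "a", 4, 8
)

# Derive the suffixes ("21", "22", ...) from the eff_ column names.
# Assumption: each eff_<suffix> has a matching categ_<suffix>.
suffixes <- sub("^eff_", "", grep("^eff_", names(mydf), value = TRUE))

# One summary per suffix, each with the columns `categ` and `eff_<suffix>`.
per_suffix <- map(suffixes, function(s) {
  mydf %>%
    group_by(categ = .data[[paste0("categ_", s)]]) %>%
    summarise(across(all_of(paste0("eff_", s)), sum))
})

# Join the per-suffix summaries on `categ`.
totals <- reduce(per_suffix, full_join, by = "categ")
totals
```

With the example data, `totals` should contain the same values as the desired `result` from the question. A category that appears under one suffix but not another would get NA in the missing column, because full_join() keeps unmatched rows from both sides.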

Answer 2

Score: 1

For fun and completeness, a base R approach using stack, aggregate and reshape:

    reshape(
      aggregate(. ~ categ + ind,
                data.frame(categ = stack(mydf[, grep("cate", colnames(mydf))])[[1]],
                           stack(mydf[, grep("eff", colnames(mydf))])),
                sum),
      timevar = "ind", idvar = "categ", direction = "wide")

Output:

      categ values.eff_21 values.eff_22
    1     a             1             8
    2     b             2            11
    3     c             7             7
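
The nested call can be easier to follow when unpacked into steps. The sketch below does that under a couple of assumptions: `mydf` is rebuilt as a plain data.frame so that only base R is needed, and the intermediate names (`categ_long`, `eff_long`, `long`, `sums`, `wide`) are illustrative and do not appear in the answer.

```r
# Same data as in the question, as a plain data.frame (assumption: base R only).
mydf <- data.frame(
  categ_21 = c("a", "b", "c", "c"),
  categ_22 = c("b", "b", "c", "a"),
  eff_21 = c(1, 2, 3, 4),
  eff_22 = c(5, 6, 7, 8)
)

# 1. Stack the category columns and the eff columns into long form.
#    stack() returns `values` (the data) and `ind` (the source column name).
categ_long <- stack(mydf[, grep("cate", colnames(mydf))])
eff_long <- stack(mydf[, grep("eff", colnames(mydf))])

# 2. Pair each eff value with its category; the rows line up because both
#    stacks go through the suffixes in the same column order.
long <- data.frame(categ = categ_long$values, eff_long)

# 3. Sum `values` within each category / source-column combination.
sums <- aggregate(values ~ categ + ind, long, sum)

# 4. Reshape to one column per source column.
wide <- reshape(sums, timevar = "ind", idvar = "categ", direction = "wide")
wide
```

Step 2 relies on the categ_ and eff_ columns appearing in the same suffix order in `mydf`. If they might not, sorting the column names first would be a safer starting point.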

huangapple
  • Posted on May 10, 2023, 23:12:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76220085.html