May 10, 2023, 23:12:30

How to calculate sum with dplyr and purrr?

Question

With {dplyr} and {purrr}, I would like to calculate the sum of each numeric column whose name begins with "eff", grouped by the matching "categ" column.

library(dplyr)
library(purrr)
mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a",  "b",   1,   5,
  "b",  "b",   2,   6,
  "c",  "c",   3,   7,
  "c",  "a",   4,   8
)

What I want:

result <- tribble(
  ~categ, ~eff_21, ~eff_22,
  "a",  1,   8,
  "b",  2,   11,
  "c",  7,   7
)

I tried the following, but it creates several data frames and is verbose. That is why I want to use {purrr}: my real data frame has more column pairs than just "21" and "22".

mydf %>% 
  group_by(categ_21) %>% 
  summarise(total_21 = sum(eff_21))
mydf %>% 
  group_by(categ_22) %>% 
  summarise(total_22 = sum(eff_22))

Thanks!

Answer 1

Score: 5

In this particular case, you may find it convenient to pivot to long format and then back to wide:

library(dplyr)
library(tidyr)
mydf %>% 
  # ".value" splits names such as "eff_21" into a column ("eff") and a suffix ("21")
  pivot_longer(everything(), names_to = c(".value", "cat"), names_pattern = "(.*)_(.*)") %>% 
  # id_cols is named explicitly rather than passed by position
  pivot_wider(id_cols = categ, names_from = cat, values_from = eff,
              values_fn = sum, names_prefix = "eff_")

Output:

  categ eff_21 eff_22
  <chr>  <dbl>  <dbl>
1 a          1      8
2 b          2     11
3 c          7      7
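
The question asks specifically for {purrr}, which the pivot approach does not use. As a rough sketch only (not the answer author's method), the two group_by/summarise calls from the question could be generalized by mapping over the column suffixes and joining the pieces. The helper name sum_one_suffix and the variable suffixes are hypothetical, and the sketch assumes every "eff_<suffix>" column has a matching "categ_<suffix>" column.

```r
library(dplyr)
library(purrr)

mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a",  "b",   1,   5,
  "b",  "b",   2,   6,
  "c",  "c",   3,   7,
  "c",  "a",   4,   8
)

# Derive the suffixes ("21", "22", ...) from the eff_* column names,
# so additional column pairs need no code changes.
suffixes <- sub("^eff_", "", grep("^eff_", names(mydf), value = TRUE))

# Hypothetical helper: sum eff_<s> within each level of categ_<s>.
sum_one_suffix <- function(s) {
  eff_col <- paste0("eff_", s)
  mydf %>%
    group_by(categ = .data[[paste0("categ_", s)]]) %>%
    summarise(!!eff_col := sum(.data[[eff_col]]), .groups = "drop")
}

# One summary per suffix, then join them all on categ.
result <- suffixes %>%
  map(sum_one_suffix) %>%
  reduce(function(x, y) full_join(x, y, by = "categ"))

result
```

A full join is used so that a category appearing under only one suffix is kept, with NA in the other columns.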

Answer 2

Score: 1

For fun and completeness, here is a base R approach using stack, aggregate, and reshape:

reshape(
  aggregate(. ~ categ + ind, 
    data.frame(categ = stack(mydf[,grep("cate", colnames(mydf))])[[1]], 
                       stack(mydf[,grep("eff", colnames(mydf))])), sum), 
  timevar="ind", idvar="categ", direction="wide")

Output:

  categ values.eff_21 values.eff_22
1     a             1             8
2     b             2            11
3     c             7             7
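
The nested call above can be hard to follow. The sketch below unpacks the same stack, aggregate, and reshape steps one at a time. It assumes mydf is a plain data.frame with the same values as in the question. The intermediate names (categ_long, eff_long, long, sums, wide) and the final column renaming are illustrative additions, not part of the original answer.

```r
mydf <- data.frame(
  categ_21 = c("a", "b", "c", "c"),
  categ_22 = c("b", "b", "c", "a"),
  eff_21 = c(1, 2, 3, 4),
  eff_22 = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# Step 1: stack the categ_* columns and the eff_* columns into long form.
# stack() returns two columns: values (the data) and ind (the source column name).
categ_long <- stack(mydf[, grep("^categ", colnames(mydf))])
eff_long   <- stack(mydf[, grep("^eff", colnames(mydf))])
long <- data.frame(categ = categ_long$values, eff_long)

# Step 2: sum values for every (categ, ind) combination.
sums <- aggregate(values ~ categ + ind, data = long, FUN = sum)

# Step 3: spread ind back out, one column per eff_* name.
wide <- reshape(sums, timevar = "ind", idvar = "categ", direction = "wide")

# Illustrative extra step: drop the "values." prefix that reshape() adds.
names(wide) <- sub("^values\\.", "", names(wide))

wide
```

This relies on the categ_* and eff_* columns appearing in the same suffix order, so that row i of categ_long lines up with row i of eff_long.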
