How to calculate sum with dplyr and purrr?
Question
With {dplyr} and {purrr}, I would like to sum each numeric column whose name begins with "eff", grouped by the values in the matching "categ" column.
library(dplyr)
library(purrr)
mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a", "b", 1, 5,
  "b", "b", 2, 6,
  "c", "c", 3, 7,
  "c", "a", 4, 8
)
What I want:
result <- tribble(
  ~categ, ~eff_21, ~eff_22,
  "a", 1, 8,
  "b", 2, 11,
  "c", 7, 7
)
I tried the following, but it creates several data frames and is verbose. That is why I want to use {purrr}: my real data frame has more column suffixes than just "21" and "22".
mydf %>%
  group_by(categ_21) %>%
  summarise(total_21 = sum(eff_21))

mydf %>%
  group_by(categ_22) %>%
  summarise(total_22 = sum(eff_22))
Thanks!
Answer 1
Score: 5
In this particular case, you may find it convenient to pivot long, and then back to wide:
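library(dplyr)
library(tidyr)

mydf %>%
  # Split names like "categ_21" into a value column ("categ") and a suffix ("21")
  pivot_longer(everything(), names_to = c(".value", "cat"), names_pattern = "(.*)_(.*)") %>%
  # Sum eff per categ and suffix, then spread the suffixes back into eff_* columns
  pivot_wider(id_cols = categ, names_from = cat, values_from = eff, values_fn = sum, names_prefix = "eff_")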
Output:
categ eff_21 eff_22
<chr> <dbl> <dbl>
1 a 1 8
2 b 2 11
3 c 7 7
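Since the question asks specifically for {purrr}, here is a minimal sketch of that route as an alternative to pivoting. It assumes that every eff_XX column has a matching categ_XX column; the names suffixes, per_suffix and result are only illustrative. Each suffix gets its own grouped sum, and the pieces are then joined on categ.

```r
library(dplyr)
library(purrr)

mydf <- tribble(
  ~categ_21, ~categ_22, ~eff_21, ~eff_22,
  "a", "b", 1, 5,
  "b", "b", 2, 6,
  "c", "c", 3, 7,
  "c", "a", 4, 8
)

# Suffixes ("21", "22", ...) taken from the eff_* column names.
# Assumption: a categ_<suffix> column exists for each of them.
suffixes <- sub("^eff_", "", grep("^eff_", names(mydf), value = TRUE))

# One summary per suffix: group by categ_<suffix>, sum eff_<suffix>.
per_suffix <- map(suffixes, function(s) {
  mydf %>%
    group_by(categ = .data[[paste0("categ_", s)]]) %>%
    summarise("eff_{s}" := sum(.data[[paste0("eff_", s)]]))
})

# Join the per-suffix summaries into one data frame keyed by categ.
result <- reduce(per_suffix, full_join, by = "categ")
result
```

A full join is used here so that a category appearing under only one suffix is kept, with NA in the other columns, rather than dropped.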
Answer 2
Score: 1
For fun and completeness, a base R approach using stack, aggregate and reshape:
reshape(
  aggregate(. ~ categ + ind,
            data.frame(categ = stack(mydf[, grep("cate", colnames(mydf))])[[1]],
                       stack(mydf[, grep("eff", colnames(mydf))])),
            sum),
  timevar = "ind", idvar = "categ", direction = "wide")
categ values.eff_21 values.eff_22
1 a 1 8
2 b 2 11
3 c 7 7
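The column names above come out as values.eff_21 and values.eff_22 rather than the eff_21 and eff_22 asked for. A possible way to tidy that up, sketched here with a plain data.frame version of mydf and illustrative intermediate names (long, sums, wide), is to strip the "values." prefix with sub:

```r
# Same data as in the question, built as a base R data.frame.
mydf <- data.frame(
  categ_21 = c("a", "b", "c", "c"),
  categ_22 = c("b", "b", "c", "a"),
  eff_21 = c(1, 2, 3, 4),
  eff_22 = c(5, 6, 7, 8)
)

# Stack the categ_* and eff_* columns side by side in long form.
long <- data.frame(
  categ = stack(mydf[, grep("cate", colnames(mydf))])[[1]],
  stack(mydf[, grep("eff", colnames(mydf))])
)

# Sum per category and original eff column, then reshape to wide.
sums <- aggregate(values ~ categ + ind, long, sum)
wide <- reshape(sums, timevar = "ind", idvar = "categ", direction = "wide")

# Drop the "values." prefix that reshape() adds to the wide columns.
names(wide) <- sub("^values\\.", "", names(wide))
rownames(wide) <- NULL
wide
```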