Calculate sum column filtering identical values on multiple columns
Question
I have data in multiple columns (S1, S2, S3) and I'm trying to create a sum column (result). For each row, I want the sum of value over all rows that share the same combination of values in S1, S2 and S3. Here is some sample data and the result I'm looking for.
S1 <- c(1,1,1,0,1,0)
S2 <- c(1,1,1,0,1,0)
S3 <- c(1,0,0,0,0,0)
value <- c(9,5,3,2,4,1)
result <- c(9,12,12,3,12,3)
df <- data.frame(S1,S2,S3,value,result)
df
  S1 S2 S3 value result
1  1  1  1     9      9
2  1  1  0     5     12
3  1  1  0     3     12
4  0  0  0     2      3
5  1  1  0     4     12
6  0  0  0     1      3
I tried using rowwise() and sapply(), but I couldn't get the filtering to work or get a value for each row. I'm looking for a solution that also works with a larger number of rows and columns.
Answer 1
Score: 1
You can group by S1 to S3 with dplyr and sum value within each group:
library(dplyr)
df %>%
  group_by(across(S1:S3)) %>%
  mutate(result = sum(value)) %>%
  ungroup()
  S1 S2 S3 value result
1  1  1  1     9      9
2  1  1  0     5     12
3  1  1  0     3     12
4  0  0  0     2      3
5  1  1  0     4     12
6  0  0  0     1      3
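If there are more grouping columns than are convenient to type out, their names can be kept in a character vector and passed through all_of(). The following is only a sketch under that assumption: group_cols and out are names introduced here for illustration, and the data frame is rebuilt without the precomputed result column.

```R
library(dplyr)

# Sample data from the question, without the precomputed result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

# Hypothetical vector holding the names of the grouping columns
group_cols <- c("S1", "S2", "S3")

out <- df %>%
  group_by(across(all_of(group_cols))) %>%  # group on every listed column
  mutate(result = sum(value)) %>%           # group total repeated on each row
  ungroup()

out$result
# 9 12 12 3 12 3
```

Because mutate() is used rather than summarise(), every original row is kept and simply receives its group's total.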
Or in base R:
df$result <- with(df, ave(value, paste0(S1, S2, S3), FUN = sum))
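One caveat with the paste0() key: without a separator, different combinations can produce the same string once values have more than one digit (paste0(1, 11) and paste0(11, 1) both give "111"). ave() accepts several grouping vectors directly, which avoids building a string key. A minimal sketch, assuming the same sample data; group_cols and result2 are illustrative names, not part of the original answer.

```R
# Sample data from the question, without the precomputed result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

# Pass each grouping column to ave() separately instead of pasting them
df$result <- with(df, ave(value, S1, S2, S3, FUN = sum))

# For many columns: build one grouping factor from a vector of column names
group_cols <- c("S1", "S2", "S3")  # hypothetical name
df$result2 <- ave(df$value, interaction(df[group_cols], drop = TRUE), FUN = sum)

df$result
# 9 12 12 3 12 3
```

If a pasted key is preferred anyway, paste(S1, S2, S3, sep = "_") with a separator that cannot occur in the data avoids the collision.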
Answer 2
Score: 1
Using data.table:
library(data.table)
setDT(df)[, result := sum(value), by = .(S1, S2, S3)]
Output:
> df
   S1 S2 S3 value result
1:  1  1  1     9      9
2:  1  1  0     5     12
3:  1  1  0     3     12
4:  0  0  0     2      3
5:  1  1  0     4     12
6:  0  0  0     1      3
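For a larger set of grouping columns, by also accepts a character vector of column names, so the columns do not have to be listed inside .(). A sketch under that assumption: group_cols and dt are names introduced here, and as.data.table() is used instead of setDT() so that the original df is not modified by reference.

```R
library(data.table)

# Sample data from the question, without the precomputed result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

dt <- as.data.table(df)            # copy; df itself stays a plain data.frame
group_cols <- c("S1", "S2", "S3")  # hypothetical vector of grouping columns

# := adds the group total to every row of dt in place
dt[, result := sum(value), by = group_cols]

dt$result
# 9 12 12 3 12 3
```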