Calculate a sum column over rows with identical values in multiple columns

Question

I have data in multiple columns (S1, S2, S3) and I'm trying to create a sum column (result). For each row, I want to sum value over all rows that have identical values in the S1, S2 and S3 columns. Here is some sample data and the result I'm looking for.

    S1 <- c(1,1,1,0,1,0)
    S2 <- c(1,1,1,0,1,0)
    S3 <- c(1,0,0,0,0,0)
    value <- c(9,5,3,2,4,1)
    # desired output, written by hand
    result <- c(9,12,12,3,12,3)
    df <- data.frame(S1,S2,S3,value,result)
    df

      S1 S2 S3 value result
    1  1  1  1     9      9
    2  1  1  0     5     12
    3  1  1  0     3     12
    4  0  0  0     2      3
    5  1  1  0     4     12
    6  0  0  0     1      3

I tried using the rowwise() and sapply() functions, but I couldn't get the filtering to work or couldn't get values for each row. I'm looking for a solution that also works for a larger number of rows and columns.
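
To make the target concrete, the row-by-row logic can be sketched in base R as follows. This is only an illustration of the definition (each row's result is the sum of value over all rows with the same S1, S2, S3 combination), assuming the sample data above. The names key_cols and same_group are introduced here for illustration, and because every row is compared against every other row, this approach would be slow on large data.

```r
# Sample data from the question, without the hand-written result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

# Hypothetical helper name: the columns that must match between rows
key_cols <- c("S1", "S2", "S3")

df$result <- vapply(seq_len(nrow(df)), function(i) {
  # TRUE for every row whose key columns all equal those of row i
  same_group <- Reduce(`&`, lapply(key_cols, function(k) df[[k]] == df[[k]][i]))
  sum(df$value[same_group])
}, numeric(1))

df$result
# 9 12 12 3 12 3
```

Listing the key columns in a character vector means adding a fourth or fifth column only requires changing key_cols.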

Answer 1

Score: 1

You can group by S1 to S3 and sum value within each group. With dplyr:

    library(dplyr)

    df %>%
      # one group per distinct (S1, S2, S3) combination
      group_by(across(S1:S3)) %>%
      # group total, repeated on every row of the group
      mutate(result = sum(value)) %>%
      ungroup()

Output:

      S1 S2 S3 value result
    1  1  1  1     9      9
    2  1  1  0     5     12
    3  1  1  0     3     12
    4  0  0  0     2      3
    5  1  1  0     4     12
    6  0  0  0     1      3

Or in base R:

    df$result <- with(df, ave(value, paste0(S1, S2, S3), FUN = sum))
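
The paste0() key works for the 0/1 sample data, but pasted keys can become ambiguous with multi-digit values (for example, 1 and 11 versus 11 and 1 both paste to "111"). One possible variation, sketched here under the assumption that the grouping columns are listed in a character vector (key_cols and grp are names introduced for illustration), builds the grouping factor with interaction() instead:

```r
# Sample data from the question, without the hand-written result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

# Hypothetical name: the columns that define a group
key_cols <- c("S1", "S2", "S3")

# One factor level per observed combination of the key columns
grp <- interaction(df[key_cols], drop = TRUE)

# ave() applies sum() within each group and returns a vector aligned with df
df$result <- ave(df$value, grp, FUN = sum)

df$result
# 9 12 12 3 12 3
```

Because the columns are selected by name, the same lines should carry over to more grouping columns by extending key_cols.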

Answer 2

Score: 1

Using data.table:

    library(data.table)

    # convert df to a data.table by reference, then add the per-group sum
    setDT(df)[, result := sum(value), by = .(S1, S2, S3)]

Output:

    > df
       S1 S2 S3 value result
    1:  1  1  1     9      9
    2:  1  1  0     5     12
    3:  1  1  0     3     12
    4:  0  0  0     2      3
    5:  1  1  0     4     12
    6:  0  0  0     1      3
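
Since the question asks about more columns, here is a hedged sketch of the same data.table idea with the grouping columns held in a character vector, which by also accepts. The name key_cols is an assumption introduced here, and the sample data is rebuilt so the snippet runs on its own.

```r
library(data.table)

# Sample data from the question, without the hand-written result column
df <- data.frame(
  S1 = c(1, 1, 1, 0, 1, 0),
  S2 = c(1, 1, 1, 0, 1, 0),
  S3 = c(1, 0, 0, 0, 0, 0),
  value = c(9, 5, 3, 2, 4, 1)
)

# Hypothetical name: grouping columns as strings, so the list is easy to extend
key_cols <- c("S1", "S2", "S3")

# setDT() converts df in place; := adds the per-group sum as a new column
setDT(df)[, result := sum(value), by = key_cols]

df$result
# 9 12 12 3 12 3
```

Note that setDT() modifies df by reference, so df is a data.table afterwards.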

huangapple
  • Published on February 24, 2023, 17:32:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75554823.html