Sum values of two different datasets but following the same date in R

Question

I have two datasets, 'ro' and 'rt', of different lengths. Each has a column called 'price' and a column called 'date' (named 'A' and 'Date' in the example data). I want to sum the prices that share the same date.

So I would like to create a new dataset. For example, if the date 6/1/22 appears in both datasets, with $20 in 'ro' and $40 in 'rt', the new dataset will have one row with the date (6/1/22) and the sum ($60).

Of course, if a date has no match, nothing is summed: if 'ro' has the date 3/5/22 with $90 and 'rt' does not have that date, the new dataset simply contains the same row, unchanged.

    dataset 'ro'
            Date A
    1 2015-01-17 2
    2 2015-01-18 7
    3 2015-01-19 1
    4 2015-01-11 8

    dataset 'rt'
            Date A
    1 2015-01-17 1
    2 2015-01-10 2
    3 2015-01-19 1
    4 2015-01-11 1
    5 2015-02-12 5
    6  2015-04-9 2

    new dataset
            Date A
    1 2015-01-17 3
    2 2015-01-10 2
    3 2015-01-19 2
    4 2015-01-11 9
    5 2015-01-18 7
    6 2015-02-12 5
    7  2015-04-9 2

This is the result I would like.
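For anyone who wants to try the answers, the example data can be rebuilt roughly as follows. This is only a sketch: the question does not show how `Date` is stored, so keeping it as character (which also preserves the unpadded value `2015-04-9`) is an assumption.

```r
# Example data copied from the question.
# Assumption: Date is a character column; the original type is not shown.
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)

rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

str(ro)
str(rt)
```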

Answer 1

Score: 4

Base R:

    # The `_` placeholder for the native pipe requires R >= 4.2.0
    rbind(ro, rt) |>
      aggregate(A ~ Date, data = _, FUN = sum)
    #         Date A
    # 1 2015-01-10 2
    # 2 2015-01-11 9
    # 3 2015-01-17 3
    # 4 2015-01-18 7
    # 5 2015-01-19 2
    # 6 2015-02-12 5
    # 7  2015-04-9 2
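The question is phrased as matching dates across the two datasets, so a join-based variant may read more naturally to some. The sketch below uses a full outer join with `merge()` and then adds the two price columns. It assumes each date appears at most once per dataset (otherwise `merge()` would multiply rows); the `_ro`/`_rt` suffixes and the `both`/`result` names are chosen here for illustration.

```r
# Example data from the question (Date kept as character: an assumption).
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)
rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

# Full outer join on Date: a date missing from one side gets NA there.
both <- merge(ro, rt, by = "Date", all = TRUE, suffixes = c("_ro", "_rt"))

# Add the two price columns, ignoring the NA of an unmatched date.
both$A <- rowSums(both[, c("A_ro", "A_rt")], na.rm = TRUE)

result <- both[, c("Date", "A")]
result
```

Binding and aggregating, as in the answer above, is shorter and also copes with repeated dates within one dataset; the join is mainly useful if you want to keep the two original prices side by side.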

Answer 2

Score: 3

We could bind the datasets and then compute a sum by group:

    library(dplyr) # version >= 1.1.0
    bind_rows(ro, rt) %>%
      reframe(A = sum(A), .by = Date)

Output:

            Date A
    1 2015-01-17 3
    2 2015-01-18 7
    3 2015-01-19 2
    4 2015-01-11 9
    5 2015-01-10 2
    6 2015-02-12 5
    7  2015-04-9 2
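Since `reframe()` and `.by` need dplyr 1.1.0 or later, an older installation could fall back on `group_by()` plus `summarise()`. This is a sketch of that equivalent, not part of the original answer; note that it returns the groups sorted by `Date` rather than in order of first appearance.

```r
library(dplyr)

# Example data from the question (Date kept as character: an assumption).
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)
rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

# Stack both datasets, then sum A within each Date.
result <- bind_rows(ro, rt) %>%
  group_by(Date) %>%
  summarise(A = sum(A))

result
```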

Answer 3

Score: 3

With data.table, we can use rbindlist():

    > library(data.table)
    > rbindlist(list(ro, rt))[, .(A = sum(A)), Date]
             Date A
    1: 2015-01-17 3
    2: 2015-01-18 7
    3: 2015-01-19 2
    4: 2015-01-11 9
    5: 2015-01-10 2
    6: 2015-02-12 5
    7:  2015-04-9 2
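The three answers give the same sums but in different row orders: `aggregate()` sorts by `Date`, while the dplyr and data.table versions keep the order of first appearance. If chronological order matters, one option is to convert `Date` to a real `Date` object before aggregating. The sketch below assumes the column is stored as character in year-month-day form; it also normalizes the unpadded `2015-04-9` to `2015-04-09`.

```r
# Example data from the question (Date kept as character: an assumption).
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)
rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

combined <- rbind(ro, rt)

# Parse the strings as dates so grouping and sorting are chronological.
combined$Date <- as.Date(combined$Date, format = "%Y-%m-%d")

# aggregate() returns one row per date, sorted by the grouping column.
result <- aggregate(A ~ Date, data = combined, FUN = sum)
result
```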

huangapple
  • Posted on March 31, 2023, 23:29:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75900264.html