Sum values of two different datasets but following the same date in R

Question

I have two datasets called 'ro' and 'rt' with different lengths. Each dataset has a column called 'price' and a column called 'date'. I want to sum the prices that share the same date.

So I would like to create a new dataset in which, for example, for the date 6/1/22 (which is in both datasets), where 'ro' has $20 and 'rt' has $40, the new dataset has one column with the date (6/1/22) and another column with $60 (the sum).

Of course, if a date has no match, nothing is summed. For example, 'ro' has the date 3/5/22 with $90, but 'rt' has no such date, so the new dataset simply keeps that row unchanged.

dataset 'ro' 

   Date           A                           

1  2015-01-17     2  
2  2015-01-18     7   
3  2015-01-19     1       
4  2015-01-11     8      


dataset 'rt'

   Date           A       

1  2015-01-17     1      
2  2015-01-10     2   
3  2015-01-19     1       
4  2015-01-11     1   
5  2015-02-12     5
6  2015-04-9      2


new dataset

   Date           A

1  2015-01-17     3      
2  2015-01-10     2   
3  2015-01-19     2       
4  2015-01-11     9   
5  2015-01-18     7
6  2015-02-12     5
7  2015-04-9      2

This is the result I would like.
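To try the answers below, the sample data can be recreated roughly as follows. This is only a sketch transcribed from the tables above. It assumes Date is stored as character (the unpadded "2015-04-9" suggests the column is not a real Date) and that A is numeric.

```r
# Sample data transcribed from the tables in the question.
# Assumption: Date is a character column, A is numeric.
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)

rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)
```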

Answer 1

Score: 4

Base R (the _ placeholder after the native pipe |> needs R 4.2.0 or later):

rbind(ro, rt) |>
  aggregate(A ~ Date, data = _, FUN = sum)
#         Date A
# 1 2015-01-10 2
# 2 2015-01-11 9
# 3 2015-01-17 3
# 4 2015-01-18 7
# 5 2015-01-19 2
# 6 2015-02-12 5
# 7  2015-04-9 2
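A different route, closer to the wording of the question (match rows by date, then add), is a full outer join followed by a row-wise sum. The sketch below is not part of the original answer. The sample data is re-typed from the question, the names joined, result, A_ro and A_rt are chosen here for illustration, and it assumes each date appears at most once per dataset (repeated dates would be multiplied by the join).

```r
# Sample data re-typed from the question (assumed character dates).
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)
rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

# Full outer join on Date: all = TRUE keeps dates found in only one dataset.
# The suffixes mark which dataset each A column came from.
joined <- merge(ro, rt, by = "Date", all = TRUE, suffixes = c("_ro", "_rt"))

# A date missing from one side yields NA there; na.rm = TRUE ignores it,
# so unmatched rows keep their original value.
joined$A <- rowSums(joined[, c("A_ro", "A_rt")], na.rm = TRUE)

result <- joined[, c("Date", "A")]
```

The rbind plus aggregate approach is usually simpler and also copes with repeated dates within one dataset. The join is mainly useful when the two original values should stay visible side by side.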

Answer 2

Score: 3

We could bind the datasets and do a grouped sum:

library(dplyr) #version >= 1.1.0
bind_rows(ro, rt) %>% 
   reframe(A = sum(A), .by = Date)

Output:

        Date A
1 2015-01-17 3
2 2015-01-18 7
3 2015-01-19 2
4 2015-01-11 9
5 2015-01-10 2
6 2015-02-12 5
7  2015-04-9 2

Answer 3

Score: 3

With data.table, we can use rbindlist:

library(data.table)

# Stack both tables, then sum A within each Date
rbindlist(list(ro, rt))[, .(A = sum(A)), by = Date]
         Date A
1: 2015-01-17 3
2: 2015-01-18 7
3: 2015-01-19 2
4: 2015-01-11 9
5: 2015-01-10 2
6: 2015-02-12 5
7:  2015-04-9 2
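Note that the outputs above come back in different row orders: aggregate sorts by Date, while the dplyr and data.table results keep the order of first appearance. Because Date appears to be text here, any sorting is alphabetical rather than chronological. A minimal base R sketch, assuming the dates are character strings in year-month-day form, converts them first (combined and result are names introduced here for illustration):

```r
# Sample data re-typed from the question (assumed character dates).
ro <- data.frame(
  Date = c("2015-01-17", "2015-01-18", "2015-01-19", "2015-01-11"),
  A    = c(2, 7, 1, 8)
)
rt <- data.frame(
  Date = c("2015-01-17", "2015-01-10", "2015-01-19", "2015-01-11",
           "2015-02-12", "2015-04-9"),
  A    = c(1, 2, 1, 1, 5, 2)
)

# Stack the two datasets, then convert the text dates to the Date class.
combined <- rbind(ro, rt)
combined$Date <- as.Date(combined$Date, format = "%Y-%m-%d")

# aggregate() orders its result by the grouping column,
# which is now a true date, so the rows come out chronologically.
result <- aggregate(A ~ Date, data = combined, FUN = sum)
```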

huangapple
  • Published on March 31, 2023 at 23:29:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75900264.html