Merging two dataframes based on last date

Question

I have two dataframes which I would like to merge based on their IDs and dates.

The first dataframe looks like this:

   ID      Date       EoM_Val
---------------------------------------
  AAA   2021-06-30    1417744
  BBB   2021-06-30    3946750
  AAA   2021-07-31    2792182
  BBB   2021-07-31    81073822

While the second dataframe looks similar to this:

   ID      Date       Day_Val
---------------------------------------
  AAA   2021-06-05    14
  AAA   2021-06-12    11
  AAA   2021-06-21    15
  BBB   2021-06-06    33
  BBB   2021-06-18    35
  BBB   2021-06-27    55
  AAA   2021-07-08    6
  AAA   2021-07-12    8
  BBB   2021-07-15    9
  BBB   2021-07-31    10

(Note also that the Date columns are of string type.)

What I would like to do is merge the two dataframes so that, for each ID, the last date in each month receives the EoM_Val. The final merge should look like this:

   ID      Date       Day_Val    EoM_Val
----------------------------------------------
  AAA   2021-06-05    14
  AAA   2021-06-12    11
  AAA   2021-06-21    15         1417744
  BBB   2021-06-06    33
  BBB   2021-06-18    35
  BBB   2021-06-27    55         3946750
  AAA   2021-07-08    6
  AAA   2021-07-12    8          2792182
  BBB   2021-07-15    9
  BBB   2021-07-31    10         81073822

Unfortunately, I'm having quite a bit of difficulty with it, so if anyone could help I would greatly appreciate it. Thanks!

Answer 1

Score: 1

Let us assume that your first dataframe is named df1 and your second one df2.

Create a month column for each dataframe:

df1['Month'] = pd.to_datetime(df1['Date']).dt.month
df2['Month'] = pd.to_datetime(df2['Date']).dt.month
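A minimal sketch of this step on a few sample rows. One caveat: the month number alone cannot tell June 2021 from June 2022, so if the data spans more than one year a year-month key may be safer. The YearMonth column below is a hypothetical addition for illustration, not part of the original answer.

```python
import pandas as pd

# A few rows in the shape of the second dataframe (Date stored as strings).
df2 = pd.DataFrame({
    "ID": ["AAA", "AAA", "BBB"],
    "Date": ["2021-06-05", "2021-06-21", "2021-07-31"],
    "Day_Val": [14, 15, 10],
})

# Parse the strings once, then derive the grouping keys.
dates = pd.to_datetime(df2["Date"])
df2["Month"] = dates.dt.month

# Hypothetical alternative key: a monthly period keeps different years apart.
df2["YearMonth"] = dates.dt.to_period("M")

print(df2)
```

If YearMonth is used, it would replace Month in the groupby and merge keys of the later steps.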

Then group both dataframes by ID and Month, taking the last occurrence in each group, e.g. for df2:

df2_grouped = df2.groupby(['ID', 'Month']).last()

This yields:

                 Date  Day_Val
ID  Month
AAA 6      2021-06-21       15
    7      2021-07-12        8
BBB 6      2021-06-27       55
    7      2021-07-31       10
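Note that last() returns the last row of each group in the order the rows appear in the dataframe, so it only finds the latest date if the rows are already in date order. A sketch that sorts first, assuming the Date strings are in ISO format (YYYY-MM-DD) so that they sort chronologically:

```python
import pandas as pd

df2 = pd.DataFrame({
    "ID": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "AAA", "AAA", "BBB", "BBB"],
    "Date": ["2021-06-05", "2021-06-12", "2021-06-21", "2021-06-06", "2021-06-18",
             "2021-06-27", "2021-07-08", "2021-07-12", "2021-07-15", "2021-07-31"],
    "Day_Val": [14, 11, 15, 33, 35, 55, 6, 8, 9, 10],
})
df2["Month"] = pd.to_datetime(df2["Date"]).dt.month

# Sort by date so that "last row in the group" means "latest date in the group".
df2_grouped = df2.sort_values("Date").groupby(["ID", "Month"]).last()

print(df2_grouped)
```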

This identifies the rows in which EoM_Val should be displayed.

You can then merge df1_grouped and df2_grouped on ID and Month into df_merged, which will contain ID, Day_Val and EoM_Val.
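One way this merge could look, as a sketch rather than the answer's exact code. It assumes both grouped frames are indexed by (ID, Month) and keeps only EoM_Val from df1 so that the two Date columns do not collide.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["AAA", "BBB", "AAA", "BBB"],
    "Date": ["2021-06-30", "2021-06-30", "2021-07-31", "2021-07-31"],
    "EoM_Val": [1417744, 3946750, 2792182, 81073822],
})
df2 = pd.DataFrame({
    "ID": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "AAA", "AAA", "BBB", "BBB"],
    "Date": ["2021-06-05", "2021-06-12", "2021-06-21", "2021-06-06", "2021-06-18",
             "2021-06-27", "2021-07-08", "2021-07-12", "2021-07-15", "2021-07-31"],
    "Day_Val": [14, 11, 15, 33, 35, 55, 6, 8, 9, 10],
})
for df in (df1, df2):
    df["Month"] = pd.to_datetime(df["Date"]).dt.month

# Keep only EoM_Val from df1; df2_grouped keeps the Date of the last daily row.
df1_grouped = df1.groupby(["ID", "Month"])[["EoM_Val"]].last()
df2_grouped = df2.sort_values("Date").groupby(["ID", "Month"]).last()

# Both frames share the (ID, Month) index, so merge on the index.
df_merged = df2_grouped.merge(df1_grouped, left_index=True, right_index=True)
df_merged = df_merged.reset_index()

print(df_merged)
```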

Finally, merge df2 with df_merged on ID and Date, using a left (or outer) merge to retain all rows of df2 in the final dataframe. Rows without a match get NaN in EoM_Val, so there is no need to pre-fill the column.
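Putting the steps together, the whole approach might be sketched as follows. This version uses a left merge on ID and Date instead of pre-filling a NaN column, and it assumes each (ID, Date) pair occurs only once in df2. The names last_rows and result are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["AAA", "BBB", "AAA", "BBB"],
    "Date": ["2021-06-30", "2021-06-30", "2021-07-31", "2021-07-31"],
    "EoM_Val": [1417744, 3946750, 2792182, 81073822],
})
df2 = pd.DataFrame({
    "ID": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "AAA", "AAA", "BBB", "BBB"],
    "Date": ["2021-06-05", "2021-06-12", "2021-06-21", "2021-06-06", "2021-06-18",
             "2021-06-27", "2021-07-08", "2021-07-12", "2021-07-15", "2021-07-31"],
    "Day_Val": [14, 11, 15, 33, 35, 55, 6, 8, 9, 10],
})
for df in (df1, df2):
    df["Month"] = pd.to_datetime(df["Date"]).dt.month

# Last daily row per (ID, Month): its Date marks where EoM_Val belongs.
last_rows = df2.sort_values("Date").groupby(["ID", "Month"], as_index=False).last()
df_merged = last_rows.merge(df1[["ID", "Month", "EoM_Val"]], on=["ID", "Month"])

# A left merge keeps every row of df2; unmatched rows get NaN in EoM_Val.
result = df2.merge(df_merged[["ID", "Date", "EoM_Val"]], on=["ID", "Date"], how="left")
result = result.drop(columns="Month")

print(result)
```

Because of the NaN entries, EoM_Val comes out as a float column; casting it to the nullable Int64 dtype is one option if integer display matters.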

huangapple
  • Posted on 2023-02-08 10:26:47
  • When reposting, please retain the link to this article: https://go.coder-hub.com/75380847.html