How to set the column of the last row of each group to the previous row's column

Question

I have a DataFrame df with columns id, time, start_quantity, and end_quantity. Within each id, the start_quantity at a given time equals the end_quantity at the previous time.

Here's an example:

id   time       start_quantity      end_quantity
0    1          0                   10
0    2          10                  15
.....
23   1          55                  87       
23   2          87                  90
.....

There's a degenerate case in df: the last row of each id has an incorrect start_quantity. (The DataFrame is already grouped by id, and within each id group it is sorted by time in ascending order.) The last row for each id is always time = 10.

For each of these rows, I would like to correct start_quantity, but when I tried

df.loc[df[df.time == 10], "start_quantity"] = df.loc[df[df.time == 9], "end_quantity"]

the start_quantity for those rows became NaN.

Answer 1

Score: 2

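Shift end_quantity down one row within each id, then use boolean indexing with loc to update start_quantity where time == 10. Because the assigned Series is aligned on the index, each time == 10 row receives the end_quantity of the preceding row for the same id: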

s = df.groupby('id')['end_quantity'].shift()
df.loc[df['time'] == 10, 'start_quantity'] = s
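
To make the shift approach concrete, here is a minimal sketch on made-up data. The sample values, the -1 placeholder for the incorrect start_quantity, and the restriction to times 8 through 10 are all assumptions for illustration; only the two lines from the answer are taken from the original.

```python
import pandas as pd

# Hypothetical sample: two ids, times 8-10. The -1 values stand in for
# the incorrect start_quantity on each time == 10 row.
df = pd.DataFrame({
    "id": [0, 0, 0, 23, 23, 23],
    "time": [8, 9, 10, 8, 9, 10],
    "start_quantity": [0, 10, -1, 55, 87, -1],
    "end_quantity": [10, 15, 20, 87, 90, 95],
})

# Previous row's end_quantity within each id (NaN on each id's first row).
s = df.groupby("id")["end_quantity"].shift()

# The Series on the right is aligned on the index, so each time == 10 row
# receives its own shifted value and all other rows are left untouched.
df.loc[df["time"] == 10, "start_quantity"] = s

print(df)
```

Grouping before shifting keeps a value from one id from leaking into the first row of the next id, which matters if the same idea is applied to rows other than the last one.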

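Alternative approach: convert the time == 9 values to a list so they are assigned by position rather than aligned on the index. This relies on every id having exactly one time == 9 row and one time == 10 row, in the same id order: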

df.loc[df['time'] == 10, 'start_quantity'] = df.loc[df['time'] == 9, 'end_quantity'].tolist()
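
The NaN values reported in the question are consistent with index alignment, and the sketch below contrasts the two behaviours. It assumes the original attempt was equivalent to assigning a Series selected with time == 9 to the rows selected with time == 10; the sample data and the names is_last, is_prev, aligned, and positional are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: one time == 9 row and one time == 10 row per id.
df = pd.DataFrame({
    "id": [0, 0, 23, 23],
    "time": [9, 10, 9, 10],
    "start_quantity": [10.0, -1.0, 87.0, -1.0],
    "end_quantity": [15.0, 20.0, 90.0, 95.0],
})

is_last = df["time"] == 10
is_prev = df["time"] == 9

# Series on the right: its labels (0 and 2) do not match the target
# labels (1 and 3), so alignment finds no values and writes NaN.
aligned = df.copy()
aligned.loc[is_last, "start_quantity"] = aligned.loc[is_prev, "end_quantity"]

# Plain list on the right: there are no labels, so values go in by position.
positional = df.copy()
positional.loc[is_last, "start_quantity"] = positional.loc[is_prev, "end_quantity"].tolist()

print(aligned["start_quantity"].tolist())
print(positional["start_quantity"].tolist())
```

The positional version only works when both selections have the same length and order, which is why the shift-based approach is the safer choice if some ids might be missing a row.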

huangapple
  • Published on June 6, 2023, 09:19:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76410874.html