How to set the column of the last row of each group to the previous row's column
Question
I have a dataframe df that has columns id, time, start_quantity, and end_quantity. For each id and time, the start_quantity is equal to the end_quantity at the previous time.
Here's an example:
id  time  start_quantity  end_quantity
0   1     0               10
0   2     10              15
.....
23  1     55              87
23  2     87              90
.....
There's a degenerate case in df where the last row of each id has an incorrect start_quantity (note that the dataframe is pre-grouped by id, and within each id group it is sorted in ascending order by time). The last row for each id is always time = 10.
For each of these rows, I would like to make the correction, but when I tried
df.loc[df[df.time == 10], "start_quantity"] = df.loc[df[df.time == 9], "end_quantity"]
it makes the start_quantity for those rows NaN.
Answer 1
Score: 2
Shift the end_quantity values per id, then use boolean indexing with loc to update the rows where time == 10:
s = df.groupby('id')['end_quantity'].shift()
df.loc[df['time'] == 10, 'start_quantity'] = s
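To see what the shift does, here is a minimal sketch on a made-up frame. The data is hypothetical (two ids with times 8 to 10, and a placeholder value of -1 standing in for the incorrect start_quantity). Because groupby('id').shift() keeps the original index, loc can align the shifted Series with the rows selected by the mask.

```python
import pandas as pd

# Hypothetical sample data; -1.0 marks the incorrect start_quantity values.
df = pd.DataFrame({
    "id": [0, 0, 0, 23, 23, 23],
    "time": [8, 9, 10, 8, 9, 10],
    "start_quantity": [40.0, 45.0, -1.0, 70.0, 87.0, -1.0],
    "end_quantity": [45.0, 50.0, 52.0, 87.0, 90.0, 95.0],
})

# Move end_quantity down by one row within each id. The first row of each
# id becomes NaN, and the result keeps the same index as df.
s = df.groupby("id")["end_quantity"].shift()

# Only rows where time == 10 are overwritten; s is aligned by index label.
mask = df["time"] == 10
df.loc[mask, "start_quantity"] = s
```

Rows outside the mask are left untouched, so the NaN values that shift produces for the first row of each id never reach the frame.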
Alternative approach:
df.loc[df['time'] == 10, 'start_quantity'] = df.loc[df['time'] == 9, 'end_quantity'].tolist()
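The NaN in the original attempt most likely comes from index alignment: when the right-hand side is a Series, loc matches values by index label, and the time == 9 rows carry different labels than the time == 10 rows. Converting to a list drops the index, so the values are assigned by position. The sketch below uses hypothetical data and assumes the intended mask was df.time == 10 (the question's df[df.time == 10] passes a whole DataFrame as the indexer). The positional version also assumes every id has both a time 9 and a time 10 row, in the same order.

```python
import pandas as pd

# Hypothetical sample data; -1.0 marks the incorrect start_quantity values.
df = pd.DataFrame({
    "id": [0, 0, 23, 23],
    "time": [9, 10, 9, 10],
    "start_quantity": [45.0, -1.0, 87.0, -1.0],
    "end_quantity": [50.0, 52.0, 90.0, 95.0],
})

is_last = df["time"] == 10
is_prev = df["time"] == 9

# Series on the right: aligned by index label. Labels 0 and 2 (time 9)
# never match labels 1 and 3 (time 10), so the target rows become NaN.
aligned = df.copy()
aligned.loc[is_last, "start_quantity"] = aligned.loc[is_prev, "end_quantity"]

# List on the right: no index to align on, so values go in by position.
positional = df.copy()
positional.loc[is_last, "start_quantity"] = positional.loc[is_prev, "end_quantity"].tolist()
```

If some id might be missing its time 9 row, the groupby and shift version is the safer choice, since it pairs rows within each id rather than by position.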