Groupby and transform across a group, not within it
Question
I'm trying to compute the third column of this dataframe from the first two columns.
I can't work out what to search for. It's like groupby('group')['weeklyTotal'].cumsum(), except that the total should accumulate across groups rather than within each group.
I know I could pull out those two columns, make them distinct, then do the groupby cumsum, but I would much prefer to do this within the same dataframe.
To save a tiny bit of pain, here's an example dataframe:
import pandas as pd
df = pd.DataFrame({'group':['A','A','A','B','B','B','C','C','C'], 'weeklyTotal':[1,1,1,3,3,3,2,2,2]})
Group | WeeklyTotal | CumulativeTotal |
---|---|---|
A | 1 | 1 |
A | 1 | 1 |
A | 1 | 1 |
B | 3 | 4 |
B | 3 | 4 |
B | 3 | 4 |
C | 2 | 6 |
C | 2 | 6 |
C | 2 | 6 |
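The workaround mentioned in the question (pull out the two columns, make them distinct, cumulate, then bring the result back) might look roughly like the sketch below. The `totals` and `result` names and the merge step are assumptions made for illustration, and the sketch assumes each group carries a single `weeklyTotal` value.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'weeklyTotal': [1, 1, 1, 3, 3, 3, 2, 2, 2],
})

# One row per (group, weeklyTotal) pair, in order of first appearance,
# with a running total taken across those rows.
totals = (
    df[['group', 'weeklyTotal']]
    .drop_duplicates()
    .assign(CumulativeTotal=lambda t: t['weeklyTotal'].cumsum())
)

# Attach the per-group running total to every row of the original frame.
result = df.merge(totals[['group', 'CumulativeTotal']], on='group', how='left')
print(result)
```

If a group had more than one distinct `weeklyTotal`, the distinct step would keep several rows for it and the merge would multiply rows, so this only fits data shaped like the example.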
Answer 1
Score: 1
Keep only one row per group with drop_duplicates, compute the cumsum, and map the values back onto the group column:
df['CumulativeTotal'] = df['group'].map(df.drop_duplicates(subset='group')
                                          .set_index('group')['weeklyTotal']
                                          .cumsum()
                                        )
Or, using mask and duplicated:
df['CumulativeTotal'] = (df['weeklyTotal']
                         .mask(df['group'].duplicated(), 0)
                         .cumsum()
                        )
Output:
  group  weeklyTotal  CumulativeTotal
0     A            1                1
1     A            1                1
2     A            1                1
3     B            3                4
4     B            3                4
5     B            3                4
6     C            2                6
7     C            2                6
8     C            2                6
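The two variants give the same result on the example because each group's rows sit together. As a rough sketch of how they can diverge otherwise, the snippet below wraps each in a helper and runs both on a frame with interleaved groups. The helper names and the `interleaved` data are made up for illustration and are not part of the answer.

```python
import pandas as pd

def cumulative_by_map(frame):
    # Hypothetical helper: one row per group, cumulate, then look up by group label.
    per_group = (frame.drop_duplicates(subset='group')
                      .set_index('group')['weeklyTotal']
                      .cumsum())
    return frame['group'].map(per_group)

def cumulative_by_mask(frame):
    # Hypothetical helper: zero out repeated rows of a group, then cumulate row by row.
    return frame['weeklyTotal'].mask(frame['group'].duplicated(), 0).cumsum()

contiguous = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'weeklyTotal': [1, 1, 1, 3, 3, 3, 2, 2, 2],
})
# Illustrative data in which groups A and B reappear after other groups.
interleaved = pd.DataFrame({
    'group': ['A', 'B', 'A', 'C', 'B'],
    'weeklyTotal': [1, 3, 1, 2, 3],
})

map_contiguous = cumulative_by_map(contiguous).tolist()     # [1, 1, 1, 4, 4, 4, 6, 6, 6]
mask_contiguous = cumulative_by_mask(contiguous).tolist()   # [1, 1, 1, 4, 4, 4, 6, 6, 6]
map_interleaved = cumulative_by_map(interleaved).tolist()   # [1, 4, 1, 6, 4]
mask_interleaved = cumulative_by_mask(interleaved).tolist() # [1, 4, 4, 6, 6]
print(map_interleaved, mask_interleaved)
```

The map variant always gives a row its own group's total, while the mask variant gives the running total reached at that row's position, so the choice matters only when a group's rows are scattered.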
Answer 2
Score: 0
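Here is another way, which relies on the rows of each group being contiguous:
# True on the first row of each run of identical group labels
m = df['group'].ne(df['group'].shift())
# keep weeklyTotal only on those rows, then take the running total
m.mul(df['weeklyTotal']).cumsum()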
Output:
0 1
1 1
2 1
3 4
4 4
5 4
6 6
7 6
8 6
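Because this approach compares each row with the one before it, a group that reappears later is counted again. The sketch below shows that effect on made-up interleaved data, along with one possible workaround: a stable sort by group before the computation. The `cumulative_by_shift` helper and the `interleaved` frame are assumptions for illustration. Note too that sorting accumulates groups in sorted order, which matches first-appearance order here only by coincidence.

```python
import pandas as pd

def cumulative_by_shift(frame):
    # Hypothetical helper wrapping the two lines from the answer.
    starts = frame['group'].ne(frame['group'].shift())
    return starts.mul(frame['weeklyTotal']).cumsum()

df = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'weeklyTotal': [1, 1, 1, 3, 3, 3, 2, 2, 2],
})
df['CumulativeTotal'] = cumulative_by_shift(df)
print(df['CumulativeTotal'].tolist())  # [1, 1, 1, 4, 4, 4, 6, 6, 6]

# Illustrative data in which groups A and B reappear after other groups.
interleaved = pd.DataFrame({
    'group': ['A', 'B', 'A', 'C', 'B'],
    'weeklyTotal': [1, 3, 1, 2, 3],
})
overcounted = cumulative_by_shift(interleaved).tolist()
print(overcounted)  # [1, 4, 5, 7, 10]

# A stable sort makes each group a single run; assignment aligns on the index,
# so the values land back on their original rows.
ordered = interleaved.sort_values('group', kind='stable')
interleaved['CumulativeTotal'] = cumulative_by_shift(ordered)
print(interleaved['CumulativeTotal'].tolist())  # [1, 4, 1, 6, 4]
```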