Rolling average based on two columns in Pandas
Question
I want to get a rolling sum for each month based on weeks.
Here is what my df looks like:
import pandas as pd

# Week numbers become column 0; number, color and Mon come from the index levels
df = pd.DataFrame(
    [1, 2, 1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 1, 1, 2],
    index=pd.MultiIndex.from_arrays(
        [
            [30, 20, 15, 10, 20, 20, 5, 15, 20, 10, 10, 30, 20, 15, 10],
            ['red', 'red', 'red', 'red', 'red', 'red', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'Green', 'Green', 'Green'],
            ['Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar', 'Feb', 'Mar', 'Mar'],
        ],
        names=['number', 'color', 'Mon'],
    ),
).reset_index()
df
number  color  Mon  0
    30    red  Feb  1
    20    red  Feb  2
    15    red  Mar  1
    10    red  Mar  2
    20    red  Mar  3
    20    red  Mar  4
     5   blue  Feb  1
    15   blue  Feb  2
    20   blue  Mar  1
    10   blue  Mar  2
    10   blue  Mar  3
    30   blue  Mar  4
    20  Green  Feb  1
    15  Green  Mar  1
    10  Green  Mar  2
I want the rolling sum for each color within each month, accumulated over that month's weeks.
For red in Feb, week 1 is 30 and week 2 is 50 (30+20). For red in Mar, it should be 15, 25, 45, 65:
number  color  Mon  0  rolling
    30    red  Feb  1       30
    20    red  Feb  2       50
    15    red  Mar  1       15
    10    red  Mar  2       25
    20    red  Mar  3       45
    20    red  Mar  4       65
     5   blue  Feb  1        5
    15   blue  Feb  2       20
    20   blue  Mar  1       20
    10   blue  Mar  2       30
    10   blue  Mar  3       40
    30   blue  Mar  4       70
    20  Green  Feb  1       20
    15  Green  Mar  1       15
    10  Green  Mar  2       25
I tried groupby with a rolling window, both with and without a lambda, but it didn't work out:
df.groupby(by=['color','Mon']).rolling(window=2).sum()
# also below command
df.groupby(by=['color','Mon']).apply(lambda x: x.rolling(2).sum())
Any help would be greatly appreciated.
Thanks in advance!!
Answer 1
Score: 1
I think you want .cumsum(), which gives a running total within each group rather than a sum over a fixed-size window:
df['rolling'] = df.groupby(['color', 'Mon'])['number'].cumsum()
print(df)
Prints:
    number  color  Mon  0  rolling
0       30    red  Feb  1       30
1       20    red  Feb  2       50
2       15    red  Mar  1       15
3       10    red  Mar  2       25
4       20    red  Mar  3       45
5       20    red  Mar  4       65
6        5   blue  Feb  1        5
7       15   blue  Feb  2       20
8       20   blue  Mar  1       20
9       10   blue  Mar  2       30
10      10   blue  Mar  3       40
11      30   blue  Mar  4       70
12      20  Green  Feb  1       20
13      15  Green  Mar  1       15
14      10  Green  Mar  2       25
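To see why the rolling(2) attempts in the question give something different, here is a small self-contained sketch that puts both side by side. It rebuilds the same data column by column and names the week column "week" instead of 0 (that name, and the "window2" column, are my own choices for illustration, not from the question). Sorting by week before the grouped cumsum is an extra precaution in case the rows are not already in week order; for this data it changes nothing.

```python
import pandas as pd

# Same values as in the question; "week" is a hypothetical name for column 0.
df = pd.DataFrame({
    "number": [30, 20, 15, 10, 20, 20, 5, 15, 20, 10, 10, 30, 20, 15, 10],
    "color": ["red"] * 6 + ["blue"] * 6 + ["Green"] * 3,
    "Mon": ["Feb", "Feb", "Mar", "Mar", "Mar", "Mar"] * 2 + ["Feb", "Mar", "Mar"],
    "week": [1, 2, 1, 2, 3, 4] * 2 + [1, 1, 2],
})

# Running total per (color, Mon). cumsum follows row order, so sort by week
# first; the result keeps the original index labels, so the assignment
# aligns it back to df row by row.
df["rolling"] = (
    df.sort_values("week")
      .groupby(["color", "Mon"])["number"]
      .cumsum()
)

# For comparison: a fixed window of 2 only ever adds the current and the
# previous row, and yields NaN for the first row of every group.
df["window2"] = (
    df.groupby(["color", "Mon"])["number"]
      .rolling(window=2)
      .sum()
      .reset_index(level=["color", "Mon"], drop=True)  # back to the original index
)

print(df)
```

For red in Mar, the window of 2 adds only weeks 2 and 3 at week 3 (10 + 20), while the cumulative sum adds weeks 1 through 3 (15 + 10 + 20), which is the result the question asks for.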