apply function in pandas hierarchical index
# Question
I have a pandas DataFrame as below.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'B', 'A', 'B', 'A', 'B'],
                   'tiger': [87, 159, 351, 140, 72, 119],
                   'lion': [1843, 3721, 6905, 1667, 2865, 1599],
                   'bear': [1.9, 3.3, 6.3, 2.3, 1.2, 4.1],
                   'points': [425, 425, 441, 441, 1048, 1048]})

grouped = df.groupby(['points', 'team'])[['tiger', 'lion', 'bear']].median()
print(grouped)
```

```
                 tiger        lion     bear
points team
425    A      87.00000  1843.00000  1.90000
       B     159.00000  3721.00000  3.30000
441    A     351.00000  6905.00000  6.30000
       B     140.00000  1667.00000  2.30000
1048   A      72.00000  2865.00000  1.20000
       B     119.00000  1599.00000  4.10000
```

I would like to take the difference between teams A and B for each of the animals (tiger, lion, bear) at each points level. For example, the difference between team A (87) and team B (159) within points 425 for tiger. I'm not sure how to do this with a hierarchical index. It would look something like below. Thanks.

```
   points  tiger   lion      bear
0     425     72   1878   1.40000
1     441   -211  -5238  -4.00000
2    1048     47  -1266   2.90000
```
</details>
# Answer 1

**Score**: 1

You can use [`swaplevel`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.swaplevel.html) and slice:

```python
grouped = (df.groupby(['points', 'team'])[['tiger', 'lion', 'bear']].median()
             .swaplevel()
          )
grouped.loc['A'] - grouped.loc['B']
```

Or use `xs`:

```python
grouped = df.groupby(['points', 'team'])[['tiger', 'lion', 'bear']].median()
grouped.xs('A', level='team') - grouped.xs('B', level='team')
```

Output (team A minus team B):

```
        tiger    lion  bear
points
425     -72.0 -1878.0  -1.4
441     211.0  5238.0   4.0
1048    -47.0  1266.0  -2.9
```
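Note that both expressions in this answer compute team A minus team B, whereas the expected table in the question has the opposite signs (B minus A) and shows `points` as an ordinary column. The following is a minimal sketch, assuming the `df` from the question, that swaps the operands and calls `reset_index()` to reproduce that layout. The names `diff` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'B', 'A', 'B', 'A', 'B'],
                   'tiger': [87, 159, 351, 140, 72, 119],
                   'lion': [1843, 3721, 6905, 1667, 2865, 1599],
                   'bear': [1.9, 3.3, 6.3, 2.3, 1.2, 4.1],
                   'points': [425, 425, 441, 441, 1048, 1048]})

grouped = df.groupby(['points', 'team'])[['tiger', 'lion', 'bear']].median()

# B minus A gives the signs shown in the question's expected table;
# xs() drops the 'team' level, so both operands align on 'points'
diff = grouped.xs('B', level='team') - grouped.xs('A', level='team')

# Move 'points' from the index back into a regular column
result = diff.reset_index()
print(result)
```

Because `xs` removes the selected level, the subtraction aligns purely on `points`, so the approach does not depend on the row order inside each group.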
# Answer 2

**Score**: 0

```python
grouped.groupby(level=0).apply(lambda dd: dd.diff().tail(1)).droplevel([1, 2])
```

Output:

```
        tiger    lion  bear
points
425      72.0  1878.0   1.4
441    -211.0 -5238.0  -4.0
1048     47.0 -1266.0   2.9
```
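The one-liner in this answer refers to index levels by position, which can be hard to read. As a sketch of the same idea with named levels, assuming `grouped` is built as in the question and that team A sorts before team B inside every `points` group, `DataFrameGroupBy.diff` subtracts each row from the one after it within its group, so the B rows end up holding B minus A. The names `diffs` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'B', 'A', 'B', 'A', 'B'],
                   'tiger': [87, 159, 351, 140, 72, 119],
                   'lion': [1843, 3721, 6905, 1667, 2865, 1599],
                   'bear': [1.9, 3.3, 6.3, 2.3, 1.2, 4.1],
                   'points': [425, 425, 441, 441, 1048, 1048]})

grouped = df.groupby(['points', 'team'])[['tiger', 'lion', 'bear']].median()

# Row-wise difference within each 'points' group:
# A rows become NaN (no previous row), B rows become B minus A
diffs = grouped.groupby(level='points').diff()

# Keep only the B rows and drop the 'team' level
result = diffs.xs('B', level='team')
print(result)
```

Unlike the `xs`-based subtraction, this variant depends on the order of rows within each group, so it only fits when every `points` value has exactly the two teams in a known order.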