Adding a total to the bottom of MultiIndex groups
Question
I am trying to add a sum row to each group of my MultiIndex DataFrame:

                              Count
    state    car   status
    texas    civic New           11
                   undamaged     11
                   damaged       10
                   totalled       5
    virginia civic New           10
                   undamaged     20
                   damaged       10
                   totalled       5

I want it to look like this:

                              Count
    state    car   status
    texas    civic New           11
                   damaged       10
                   totalled       5
                   undamaged     11
                   total         37
    virginia civic New           10
                   damaged       10
                   totalled       5
                   undamaged     20
                   total         45

I have tried:

    s = test.groupby(level=[0,1]).sum()
    s.index = pd.MultiIndex.from_product([…])
    df_out = df_full.append(s).sort_index()

but it throws:

> NotImplementedError: isna is not defined for MultiIndex
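To experiment with the answers below, the frame from the question can be rebuilt roughly as follows. This is only a sketch: the variable name `df`, the integer dtype of `Count`, and the level names `state`, `car`, `status` are assumptions read off the printed table, not taken from the asker's actual code.

```python
import pandas as pd

# Assumed reconstruction of the question's data, in the row order it was printed.
index = pd.MultiIndex.from_tuples(
    [
        ("texas", "civic", "New"),
        ("texas", "civic", "undamaged"),
        ("texas", "civic", "damaged"),
        ("texas", "civic", "totalled"),
        ("virginia", "civic", "New"),
        ("virginia", "civic", "undamaged"),
        ("virginia", "civic", "damaged"),
        ("virginia", "civic", "totalled"),
    ],
    names=["state", "car", "status"],
)
df = pd.DataFrame({"Count": [11, 11, 10, 5, 10, 20, 10, 5]}, index=index)

# Per-(state, car) subtotals; the result is indexed by the first two levels only.
s = df.groupby(level=[0, 1]).sum()
print(s)
```

The subtotal frame `s` has only two index levels, which is why it needs a third `status` label before it can be combined with the original three-level frame.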
Answer 1
Score: 0
Your problem is that pd.MultiIndex.from_product doesn't support taking the product of a MultiIndex and a list. You can build the new index with pd.MultiIndex.from_frame instead:

    s = df.groupby(level=[0,1]).sum()
    # label each (state, car) subtotal with status='total'
    s.index = pd.MultiIndex.from_frame(s.index.to_frame().assign(status='total'))
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    out = pd.concat([df, s]).sort_index()
    print(out)

                              Count
    state    car   status
    texas    civic New           11
                   damaged       10
                   total         37
                   totalled       5
                   undamaged     11
    virginia civic New           10
                   damaged       10
                   total         45
                   totalled       5
                   undamaged     20
However, .sort_index() sorts the labels, so 'total' ends up between 'damaged' and 'totalled' instead of at the bottom of each group. You can try the following instead:

    df_ = df['Count'].unstack()
    df_['total'] = df_.sum(axis=1)
    df_ = df_.stack().to_frame('Count')

    # or as a single chained expression
    df_ = (df['Count'].unstack()
           .pipe(lambda d: d.assign(total=d.sum(axis=1)))
           .stack().to_frame('Count'))
    print(df_)

                              Count
    state    car   status
    texas    civic New           11
                   damaged       10
                   totalled       5
                   undamaged     11
                   total         37
    virginia civic New           10
                   damaged       10
                   totalled       5
                   undamaged     20
                   total         45
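Note that unstack() also sorts the status labels, which happens to match the desired output here. If the original row order within each group has to be kept, one possible alternative is to append a total row group by group. The sketch below is not from the answer above; the sample frame and the names `pieces` and `total_row` are assumptions made for illustration.

```python
import pandas as pd

# Assumed sample data, in the row order printed in the question.
statuses = ["New", "undamaged", "damaged", "totalled"]
index = pd.MultiIndex.from_tuples(
    [(state, "civic", status) for state in ("texas", "virginia") for status in statuses],
    names=["state", "car", "status"],
)
df = pd.DataFrame({"Count": [11, 11, 10, 5, 10, 20, 10, 5]}, index=index)

pieces = []
# sort=False keeps the groups in their order of first appearance.
for key, group in df.groupby(level=[0, 1], sort=False):
    # key is a (state, car) tuple; extend it with the 'total' label.
    total_row = pd.DataFrame(
        {"Count": [group["Count"].sum()]},
        index=pd.MultiIndex.from_tuples([(*key, "total")], names=df.index.names),
    )
    pieces.append(pd.concat([group, total_row]))

out = pd.concat(pieces)
print(out)
```

Because nothing is sorted, each group keeps its rows as they were and the total row simply follows them. The explicit loop is slower than the unstack/stack version on frames with many groups, so it is a trade-off rather than a replacement.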
Answer 2
Score: 0
An easy way I've implemented this in my workflow is to use the Sidetables package.
For example, test.groupby(level=[0,1]).sum().stb.subtotal(sub_level=2) will accomplish what you're looking for.