Add columns to a new level in multiindex dataframe
Question
My dataframe looks like this:
```python
import pandas as pd

data = {
    'WholesalerID': {0: 121, 1: 121, 2: 42, 3: 42, 4: 54, 5: 43, 6: 432, 7: 4245, 8: 4245, 9: 4245, 10: 457},
    'Brand': {0: 'Vans', 1: 'Nike', 2: 'Nike', 3: 'Vans', 4: 'Vans', 5: 'Nike', 6: 'Puma', 7: 'Vans', 8: 'Nike', 9: 'Puma', 10: 'Converse'},
    'Shop 1': {0: 'Yes', 1: 'No', 2: 'Yes', 3: 'Maybe', 4: 'Yes', 5: 'No', 6: 'Yes', 7: 'Yes', 8: 'Maybe', 9: 'Maybe', 10: 'No'}
}
df = pd.DataFrame.from_dict(data)
df = df.assign(count=1)  # helper column to count rows in the pivot
pivoted_df = pd.pivot_table(
    df,
    index=["Brand"],
    columns=["Shop 1"],
    values=["count"],
    aggfunc={"count": "count"},
    fill_value=0,
    margins=True,
    margins_name="N",
)
```
I need to add the columns `N`, `count`, and `prop` as a new level of the column MultiIndex. I tried the following, but it does not work:
```python
pivoted_df.columns = pd.MultiIndex.from_product(
    [pivoted_df.columns, ["N", "count", "prop"]]
)
```
Answer 1
Score: 2
Here is one way to do it with Pandas indexing and concat:
```python
dfs = []
for col in ("Yes", "Maybe", "No"):
    # Keep the brand rows (drop the "N" margin row) and the two columns needed
    tmp = (
        pivoted_df.loc[:"Vans", [("count", col), ("count", "N")]]
        .rename(columns={col: "count"})           # second level: e.g. "Yes" -> "count"
        .rename(columns={"count": col}, level=0)  # first level: "count" -> e.g. "Yes"
    )
    tmp[(col, "prop")] = tmp[(col, "count")] / tmp[(col, "N")]
    dfs.append(tmp.reindex(columns=[(col, "N"), (col, "count"), (col, "prop")]))
new_df = pd.concat(dfs, axis=1)
```
Then:
```python
print(new_df)
```
Output:
```
           Yes              Maybe             No
Shop 1       N count  prop      N count  prop  N count prop
Brand
Converse     1     0  0.00      1     0  0.00  1     1  1.0
Nike         4     1  0.25      4     1  0.25  4     2  0.5
Puma         2     1  0.50      2     1  0.50  2     0  0.0
Vans         4     3  0.75      4     1  0.25  4     0  0.0
```
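For comparison, here is a minimal alternative sketch that does not go through `pivoted_df` at all. It is not the approach used above: it assumes the same `Brand` and `Shop 1` values as in the question, counts them with `pd.crosstab`, and builds the top column level with the `keys` argument of `pd.concat`. The names `counts`, `totals`, `answers`, and `blocks` are introduced here for illustration only.

```python
import pandas as pd

# Same Brand / Shop 1 values as in the question (WholesalerID is not needed here).
df = pd.DataFrame({
    "Brand": ["Vans", "Nike", "Nike", "Vans", "Vans", "Nike",
              "Puma", "Vans", "Nike", "Puma", "Converse"],
    "Shop 1": ["Yes", "No", "Yes", "Maybe", "Yes", "No",
               "Yes", "Yes", "Maybe", "Maybe", "No"],
})

# Rows per brand and answer; combinations that never occur are counted as 0.
counts = pd.crosstab(df["Brand"], df["Shop 1"])
# N: total number of rows per brand.
totals = counts.sum(axis=1)

answers = ["Yes", "Maybe", "No"]
# One (N, count, prop) block per answer, all indexed by Brand.
blocks = [
    pd.DataFrame({
        "N": totals,
        "count": counts[col],
        "prop": counts[col] / totals,
    })
    for col in answers
]

# keys= places each answer above its block as the first column level.
new_df = pd.concat(blocks, axis=1, keys=answers)
print(new_df)
```

Because every block is built with the same three column names, the column order within each answer is fixed up front and no `rename` or `reindex` step is needed. Unlike the frame produced from the pivot table, the second column level here carries no `Shop 1` name.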