英文:
How to separate the groupedby data in diferent columns of a dataframe?
问题
如何将这个表格转换成这个样子?
object | Name1 | Name2 | Name3 | Color1 | Color2 | Color3 |
---|---|---|---|---|---|---|
Fruit | Banana | Apple | Melon | Yellow | Red | Green |
Car | Fiat | BMW | NaN | White | Black | NaN |
我查阅了pandas文档,但未找到解决方案。
阅读pandas文档,尝试了一些不同的groupby方法。
英文:
How do i turn this
object | Name | Color |
---|---|---|
Fruit | Banana | Yellow |
Fruit | Apple | Red |
Fruit | Melon | Green |
Car | Fiat | White |
Car | BMW | Black |
Car | NaN | NaN |
In to this?
object | Name1 | Name2 | Name3 | Color1 | Color2 | Color3 |
---|---|---|---|---|---|---|
Fruit | Banana | Apple | Melon | Yellow | Red | Green |
Car | Fiat | BMW | NaN | White | Black | NaN |
I've searched the pandas documentation, but couldn't find a solution to this
Read the pandas documentation, tried some diferent methods of groupby
答案1
得分: 2
print(final_df)
输出:
object Color1 Color2 Color3 Name1 Name2 Name3
0 Car White Black NaN Fiat BMW NaN
1 Fruit Yellow Red Green Banana Apple Melon
英文:
Feels inefficient, but you can first create a new column to keep track of the number of times each item is listed before melt
ing, creating the new column names, then pivot
ing back.
import pandas as pd
import numpy as np
#original df
df = pd.DataFrame({
'object': ['Fruit', 'Fruit', 'Fruit', 'Car', 'Car', 'Car'],
'Name': ['Banana', 'Apple', 'Melon', 'Fiat', 'BMW', np.nan],
'Color': ['Yellow', 'Red', 'Green', 'White', 'Black', np.nan],
})
#add an 'object_count' column to df
df['object_count'] = df.groupby('object').cumcount().add(1)
#melt df to long form
long_df = df.melt(id_vars=['object','object_count'])
#append 'object_count' to the variable column
long_df['variable'] += long_df['object_count'].astype(str)
#pivot the table back to wide form
final_df = long_df.pivot(
index='object',
columns='variable',
values='value',
).reset_index()
final_df.columns.name = None #get rid of the 'variable' text at the top right of the table
#note, the output table isn't sorted by row or col the same as your expected output
#(it's sorted alphabetically for both)
#but you can do this or find help if it's important
print(final_df)
Output
object Color1 Color2 Color3 Name1 Name2 Name3
0 Car White Black NaN Fiat BMW NaN
1 Fruit Yellow Red Green Banana Apple Melon
答案2
得分: 1
以下是翻译好的部分:
df["N"] = df.assign(N=1).groupby("object")["N"].cumsum().map("Name{}".format)
df["C"] = df.assign(C=1).groupby("object")["C"].cumsum().map("Color{}".format)
out = df.pivot(index=["object"], columns=["N", "C"], values=["Name", "Color"])
out.columns = [t[1] if t[0] == "Name" else t[2] for t in out.columns]
print(out)
Name1 Name2 Name3 Color1 Color2 Color3
object
Car Fiat BMW NaN White Black NaN
Fruit Banana Apple Melon Yellow Red Green
希望这对你有帮助。
英文:
With inspiration from comment by @mitoRibo, here is an answer:
df["N"] = df.assign(N=1).groupby("object")["N"].cumsum().map("Name{}".format)
df["C"] = df.assign(C=1).groupby("object")["C"].cumsum().map("Color{}".format)
out = df.pivot(index=["object"], columns=["N", "C"], values=["Name", "Color"])
out.columns = [t[1] if t[0] == "Name" else t[2] for t in out.columns]
print(out)
Name1 Name2 Name3 Color1 Color2 Color3
object
Car Fiat BMW NaN White Black NaN
Fruit Banana Apple Melon Yellow Red Green
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论