如何将数据框中的分组数据分开放入不同列?

huangapple go评论73阅读模式
英文:

How to separate the groupedby data in diferent columns of a dataframe?

问题

如何将这个表格转换成这个样子?

object Name1 Name2 Name3 Color1 Color2 Color3
Fruit Banana Apple Melon Yellow Red Green
Car Fiat BMW NaN White Black NaN

我查阅了pandas文档,但未找到解决方案。

阅读pandas文档,尝试了一些不同的groupby方法。

英文:

How do i turn this

object Name Color
Fruit Banana Yellow
Fruit Apple Red
Fruit Melon Green
Car Fiat White
Car BMW Black
Car NaN NaN

In to this?

object Name1 Name2 Name3 Color1 Color2 Color3
Fruit Banana Apple Melon Yellow Red Green
Car Fiat BMW NaN White Black NaN

I've searched the pandas documentation, but couldn't find a solution to this

Read the pandas documentation, tried some diferent methods of groupby

答案1

得分: 2

print(final_df)

输出:

      object  Color1 Color2 Color3   Name1  Name2  Name3
    0    Car   White  Black    NaN    Fiat    BMW    NaN
    1  Fruit  Yellow    Red  Green  Banana  Apple  Melon
英文:

Feels inefficient, but you can first create a new column to keep track of the number of times each item is listed before melting, creating the new column names, then pivoting back.

import pandas as pd
import numpy as np

#original df
df = pd.DataFrame({
    'object': ['Fruit', 'Fruit', 'Fruit', 'Car', 'Car', 'Car'],
    'Name': ['Banana', 'Apple', 'Melon', 'Fiat', 'BMW', np.nan],
    'Color': ['Yellow', 'Red', 'Green', 'White', 'Black', np.nan],
})

#add an 'object_count' column to df
df['object_count'] = df.groupby('object').cumcount().add(1)

#melt df to long form
long_df = df.melt(id_vars=['object','object_count'])

#append 'object_count' to the variable column
long_df['variable'] += long_df['object_count'].astype(str)

#pivot the table back to wide form
final_df = long_df.pivot(
    index='object',
    columns='variable',
    values='value',
).reset_index()

final_df.columns.name = None #get rid of the 'variable' text at the top right of the table

#note, the output table isn't sorted by row or col the same as your expected output
#(it's sorted alphabetically for both)
#but you can do this or find help if it's important

print(final_df)

Output

  object  Color1 Color2 Color3   Name1  Name2  Name3
0    Car   White  Black    NaN    Fiat    BMW    NaN
1  Fruit  Yellow    Red  Green  Banana  Apple  Melon

答案2

得分: 1

以下是翻译好的部分:

    df["N"] = df.assign(N=1).groupby("object")["N"].cumsum().map("Name{}".format)
    df["C"] = df.assign(C=1).groupby("object")["C"].cumsum().map("Color{}".format)
    out = df.pivot(index=["object"], columns=["N", "C"], values=["Name", "Color"])
    out.columns = [t[1] if t[0] == "Name" else t[2] for t in out.columns]
    print(out)

             Name1  Name2  Name3  Color1 Color2 Color3
    object                                            
    Car       Fiat    BMW    NaN   White  Black    NaN
    Fruit   Banana  Apple  Melon  Yellow    Red  Green

希望这对你有帮助。

英文:

With inspiration from comment by @mitoRibo, here is an answer:

df["N"] = df.assign(N=1).groupby("object")["N"].cumsum().map("Name{}".format)
df["C"] = df.assign(C=1).groupby("object")["C"].cumsum().map("Color{}".format)
out = df.pivot(index=["object"], columns=["N", "C"], values=["Name", "Color"])
out.columns = [t[1] if t[0] == "Name" else t[2] for t in out.columns]
print(out)

         Name1  Name2  Name3  Color1 Color2 Color3
object                                            
Car       Fiat    BMW    NaN   White  Black    NaN
Fruit   Banana  Apple  Melon  Yellow    Red  Green

huangapple
  • 本文由 发表于 2023年2月24日 01:08:46
  • 转载请务必保留本文链接:https://go.coder-hub.com/75548102.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定