February 24, 2023, 01:08:46

How to separate grouped data into different columns of a DataFrame?

Question

How do I turn this

object	Name	Color
Fruit	Banana	Yellow
Fruit	Apple	Red
Fruit	Melon	Green
Car	Fiat	White
Car	BMW	Black
Car	NaN	NaN

into this?

object	Name1	Name2	Name3	Color1	Color2	Color3
Fruit	Banana	Apple	Melon	Yellow	Red	Green
Car	Fiat	BMW	NaN	White	Black	NaN

I've searched the pandas documentation and tried several different groupby methods, but couldn't find a solution.

Answer 1

Score: 2

This feels inefficient, but it works: first add a column that numbers each row within its object group, then melt to long form, append that number to the variable names to build the new column names, and pivot back to wide form.

import pandas as pd
import numpy as np

# original df
df = pd.DataFrame({
    'object': ['Fruit', 'Fruit', 'Fruit', 'Car', 'Car', 'Car'],
    'Name': ['Banana', 'Apple', 'Melon', 'Fiat', 'BMW', np.nan],
    'Color': ['Yellow', 'Red', 'Green', 'White', 'Black', np.nan],
})

# add an 'object_count' column to df
df['object_count'] = df.groupby('object').cumcount().add(1)

# melt df to long form
long_df = df.melt(id_vars=['object', 'object_count'])

# append 'object_count' to the variable column
long_df['variable'] += long_df['object_count'].astype(str)

# pivot the table back to wide form
final_df = long_df.pivot(
    index='object',
    columns='variable',
    values='value',
).reset_index()

final_df.columns.name = None  # get rid of the 'variable' text at the top right of the table

# note, the output table isn't sorted by row or col the same as your expected output
# (it's sorted alphabetically for both)
# but you can do this or find help if it's important

print(final_df)

Output

  object  Color1 Color2 Color3   Name1  Name2  Name3
0    Car   White  Black    NaN    Fiat    BMW    NaN
1  Fruit  Yellow    Red  Green  Banana  Apple  Melon
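
If the row and column order matters, one possible way to restore it is sketched below. This is not part of the original answer: the names max_count, column_order, row_order and ordered_df are made up for illustration, and the sketch assumes the column prefixes are exactly Name and Color.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "object": ["Fruit", "Fruit", "Fruit", "Car", "Car", "Car"],
    "Name": ["Banana", "Apple", "Melon", "Fiat", "BMW", np.nan],
    "Color": ["Yellow", "Red", "Green", "White", "Black", np.nan],
})

# Rebuild the wide table as in the answer above, keeping 'object' as the index.
df["object_count"] = df.groupby("object").cumcount().add(1)
long_df = df.melt(id_vars=["object", "object_count"])
long_df["variable"] += long_df["object_count"].astype(str)
final_df = long_df.pivot(index="object", columns="variable", values="value")

# Assumed column order: Name1..NameK first, then Color1..ColorK.
max_count = int(df["object_count"].max())
column_order = [
    f"{prefix}{i}" for prefix in ("Name", "Color") for i in range(1, max_count + 1)
]

# Assumed row order: first appearance of each object in the original frame.
row_order = df["object"].drop_duplicates().tolist()

# Select rows and columns by label in the desired order.
ordered_df = final_df.loc[row_order, column_order].reset_index()
ordered_df.columns.name = None
print(ordered_df)
```

Selecting with .loc and explicit label lists keeps the reordering independent of how pivot happens to sort its result.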

Answer 2

Score: 1

Inspired by a comment from @mitoRibo, here is another approach:

df["N"] = df.assign(N=1).groupby("object")["N"].cumsum().map("Name{}".format)
df["C"] = df.assign(C=1).groupby("object")["C"].cumsum().map("Color{}".format)
out = df.pivot(index=["object"], columns=["N", "C"], values=["Name", "Color"])
out.columns = [t[1] if t[0] == "Name" else t[2] for t in out.columns]
print(out)

         Name1  Name2  Name3  Color1 Color2 Color3
object                                            
Car       Fiat    BMW    NaN   White  Black    NaN
Fruit   Banana  Apple  Melon  Yellow    Red  Green
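
For comparison, the same reshaping can probably be written more compactly with cumcount, set_index and unstack. This is a different technique from either answer, offered only as a sketch; the names position, n and wide are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "object": ["Fruit", "Fruit", "Fruit", "Car", "Car", "Car"],
    "Name": ["Banana", "Apple", "Melon", "Fiat", "BMW", np.nan],
    "Color": ["Yellow", "Red", "Green", "White", "Black", np.nan],
})

# Number the rows within each object group: 1, 2, 3, ...
position = df.groupby("object").cumcount().add(1).rename("n")

# Put (object, n) in the index, then spread n across the columns.
wide = df.set_index(["object", position]).unstack("n")

# Columns are now (field, n) pairs such as ("Name", 1); join them into "Name1".
wide.columns = [f"{field}{n}" for field, n in wide.columns]
print(wide)
```

As in the answer above, the result keeps object as the index; call reset_index() if it should be a regular column.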
