2023-02-08 22:22:35

Python: Table where identical IDs/numbers with different values are brought onto one line, with the different values appended to the right

Question

I have a pandas DataFrame in which some IDs appear on several rows, each time with a different value. How can I get a result where each ID appears only once, on a single row, with its values appended in additional columns?

Starting point:

ID	Column 1
1	blue
1	red
2	gray
3	yellow
4	orange
1	pink
2	white

Desired solution:

ID	Column 1	Column 2	Column 3
1	blue	red	pink
2	gray	white
3	yellow
4	orange
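
For reference, the starting table can be rebuilt as a small DataFrame. This is only a sketch for reproducing the problem: the variable name df and the integer dtype of ID are assumptions, since the question shows the data only as text.

```python
import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4, 1, 2],
    "Column 1": ["blue", "red", "gray", "yellow", "orange", "pink", "white"],
})

print(df)
```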

Answer 1

Score: 0

Group by the ID, then collect the unique values of each group:

df.groupby("ID")["Column 1"].apply(lambda x: pd.Series(x.unique())).unstack()
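
A self-contained sketch of this approach follows, using the sample data from the question. The column renaming at the end is an addition not present in the answer: unstack() on its own labels the new columns 0, 1, 2. Also note that unique() drops repeated values within an ID, which may or may not be what you want.

```python
import pandas as pd

# Sample data from the question (the name `df` is an assumption).
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4, 1, 2],
    "Column 1": ["blue", "red", "gray", "yellow", "orange", "pink", "white"],
})

# One Series of unique values per ID, then pivot the inner index level
# (0, 1, 2, ...) into columns.
wide = (
    df.groupby("ID")["Column 1"]
      .apply(lambda x: pd.Series(x.unique()))
      .unstack()
)

# Added for illustration: rename 0, 1, 2 to "Column 1", "Column 2", "Column 3"
# and turn the ID index back into a regular column.
wide.columns = [f"Column {i + 1}" for i in wide.columns]
wide = wide.reset_index()

print(wide)
```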

Answer 2

Score: 0

You can reshape your DataFrame in a vectorized way. cumcount() numbers the rows within each ID, and that number becomes the new column label:

>>> (df.assign(col=df.groupby('ID').cumcount().add(1))
       .set_index(['ID', 'col'])['Column 1']
       .unstack('col').add_prefix('Column ')
       .reset_index().rename_axis(columns=None))
   ID Column 1 Column 2 Column 3
0   1     blue      red     pink
1   2     gray    white      NaN
2   3   yellow      NaN      NaN
3   4   orange      NaN      NaN

With pivot_table, using fill_value='' to show empty strings instead of NaN:

>>> (df.pivot_table(index='ID', values='Column 1', aggfunc='first', fill_value='',
                   columns='Column ' + df.groupby('ID').cumcount().add(1).astype(str))
      .reset_index())
   ID Column 1 Column 2 Column 3
0   1     blue      red     pink
1   2     gray    white         
2   3   yellow                  
3   4   orange
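
The same idea can be wrapped in a small reusable function. This is a sketch under a few assumptions: the function name spread_values, its parameters, and the temporary column _pos are hypothetical, and it uses DataFrame.pivot in place of the set_index/unstack chain shown above.

```python
import pandas as pd


def spread_values(df, id_col="ID", value_col="Column 1", prefix="Column "):
    """Return one row per ID with its values spread across numbered columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Position of each row within its ID group, starting at 1.
    position = df.groupby(id_col).cumcount().add(1)
    return (
        df.assign(_pos=position)
          # Each (ID, position) pair is unique, so a plain pivot is enough.
          .pivot(index=id_col, columns="_pos", values=value_col)
          .add_prefix(prefix)
          .reset_index()
          .rename_axis(columns=None)
    )


df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 4, 1, 2],
    "Column 1": ["blue", "red", "gray", "yellow", "orange", "pink", "white"],
})

result = spread_values(df)
print(result)
```

Unlike the unique() approach in Answer 1, numbering rows with cumcount() keeps repeated values within an ID, because every row gets its own position.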
