2023-07-12 20:40:42

Group by and aggregate the columns in pandas dataframe

Question

I have the following DataFrame, which I would like to group by one column, joining the unique values of the other columns in each group with a separator such as ' | '. Below are the sample rows:

col1                                             col2            col3              col4
THREE M SYNDROME 1 {3-M syndrome 1, 273750 (3)}  3-m syndrome 1  {3-M syndrome 1}  273750
THREE M SYNDROME 1 {3-M syndrome 1, 273750 (3)}  3-m syndrome 2  {3-M syndrome 2}  273750

I would like to group by 'col1' and join the unique values of each of the other columns. The expected DataFrame is:

col1                                             col2                             col3                                 col4
THREE M SYNDROME 1 {3-M syndrome 1, 273750 (3)}  3-m syndrome 1 | 3-m syndrome 2  {3-M syndrome 1} | {3-M syndrome 2}  273750

I am using the following lines of code:

join_unique = lambda x: ' | '.join(x.unique())
df2 = df.groupby(['col1'], as_index=False).agg(join_unique)

I get output, but col4 is not included in it:

col1                                             col2                             col3
THREE M SYNDROME 1 {3-M syndrome 1, 273750 (3)}  3-m syndrome 1 | 3-m syndrome 2  {3-M syndrome 1} | {3-M syndrome 2}

Any help is highly appreciated.

Answer 1

Score: 0

It could be because col4 contains integers: str.join only accepts strings, so the join fails on that column. You could try an if/else like this:

import pandas as pd

data = {'col1': {0: 'THREE M SYNDROME 1  {3-M syndrome 1, 273750 (3)}',
                 1: 'THREE M SYNDROME 1  {3-M syndrome 1, 273750 (3)}'},
        'col2': {0: '3-m syndrome 1', 1: '3-m syndrome 2'},
        'col3': {0: '{3-M syndrome 1}', 1: '{3-M syndrome 2}'},
        'col4': {0: 273750, 1: 273750}}
df = pd.DataFrame(data)
# Join only when a group holds more than one distinct value; otherwise keep the single value unchanged
>>> df.groupby('col1').agg(lambda x: ' | '.join(x.unique()) if x.nunique() > 1 else x.unique()[0])
Out:
                                                                             col2                                 col3    col4
col1
THREE M SYNDROME 1  {3-M syndrome 1, 273750 (3)}  3-m syndrome 1 | 3-m syndrome 2  {3-M syndrome 1} | {3-M syndrome 2}  273750
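
The if/else above still calls ' | '.join on the raw values whenever a group has more than one distinct value, so a numeric column whose values differ within a group would run into the same problem. A minimal sketch of a variant that casts to str before joining is below. The helper name join_unique is borrowed from the question, and the third row (with the placeholder value 999999) is a made-up addition for illustration, not part of the original data. One trade-off of this approach: every aggregated column, including col4, comes back as strings.

```python
import pandas as pd

# Sample data modeled on the question. The third row is a made-up addition
# so that col4 holds two distinct values within the same group.
df = pd.DataFrame({
    "col1": ["THREE M SYNDROME 1"] * 3,
    "col2": ["3-m syndrome 1", "3-m syndrome 2", "3-m syndrome 2"],
    "col3": ["{3-M syndrome 1}", "{3-M syndrome 2}", "{3-M syndrome 2}"],
    "col4": [273750, 273750, 999999],
})


def join_unique(values: pd.Series) -> str:
    """Join the distinct non-null values of one column with ' | '.

    Values are cast to str first, because str.join accepts only strings.
    """
    return " | ".join(values.dropna().astype(str).unique())


# Apply the helper to every non-key column of each col1 group.
result = df.groupby("col1", as_index=False).agg(join_unique)
print(result.to_string(index=False))
```

If col4 should stay numeric when a group has a single value, the if/else from the answer can be kept and only the join branch changed to cast to str.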
