How can I remove all duplicated elements from a list, but update the first entry's value in another column if it is not the same?

Question

I have a made-up DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
    'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese']
})
df
     brand style  flavour
0  Yum Yum   cup    chili
1   ByRice   cup  chicken
2  LuxSoba   cup    chili
3  Indomie  pack     beef
4  Indomie  pack   cheese

My goal is to remove all duplicate brand entries, but if a brand has several flavours, they should all be joined into the flavour column of the first entry. The DataFrame should end up looking like this:

     brand style       flavour
0  Yum Yum   cup         chili
1   ByRice   cup       chicken
2  LuxSoba   cup         chili
3  Indomie  pack  beef, cheese

I'm not sure how to approach this problem.

Answer 1

Score: 1

You can group by brand and style, then join the flavours of each group:

df = pd.DataFrame({
    'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
    'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese']
})
df2 = df.groupby(['brand', 'style'])['flavour'].agg(lambda x: ', '.join(x)).reset_index()

Result (rows are sorted by brand, because groupby sorts its keys by default):

     brand style       flavour
0   ByRice   cup       chicken
1  Indomie  pack  beef, cheese
2  LuxSoba   cup         chili
3  Yum Yum   cup         chili
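
To try this end to end, here is a minimal self-contained sketch. It rebuilds the question's DataFrame and runs the same groupby twice: once with the default key sorting, and once with sort=False, which is an addition not in this answer as posted. The variable names sorted_result and ordered_result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
    'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese'],
})

# Default behaviour: group keys are sorted, so brands come back alphabetically.
sorted_result = (
    df.groupby(['brand', 'style'])['flavour']
      .agg(', '.join)
      .reset_index()
)

# sort=False keeps the groups in order of first appearance instead.
ordered_result = (
    df.groupby(['brand', 'style'], sort=False)['flavour']
      .agg(', '.join)
      .reset_index()
)

print(sorted_result)
print(ordered_result)
```

If the original row order matters, as it does in the question's expected output, the sort=False variant is probably the one to use.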

Answer 2

Score: 0


You can use groupby followed by agg; sort=False keeps the brands in their original order:

>>> df.groupby(['brand', 'style'], sort=False, as_index=False)['flavour'].agg(', '.join)
     brand style       flavour
0  Yum Yum   cup         chili
1   ByRice   cup       chicken
2  LuxSoba   cup         chili
3  Indomie  pack  beef, cheese
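
Both answers group on brand and style together, so a brand that appears with two different styles would still produce two rows, and a flavour that is repeated within a brand would be listed twice. If that matters, one possible variation is sketched below: it groups on brand alone, keeps the first style, and joins only the distinct flavours. The sample data and the helper join_unique are hypothetical and were made up for this illustration; they are not part of the question.

```python
import pandas as pd

# Hypothetical data: Indomie appears with two styles and a repeated flavour.
df = pd.DataFrame({
    'brand': ['Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'style': ['cup', 'pack', 'cup', 'pack'],
    'flavour': ['chili', 'beef', 'beef', 'cheese'],
})

def join_unique(values):
    # dict.fromkeys drops repeats while preserving first-seen order.
    return ', '.join(dict.fromkeys(values))

result = (
    df.groupby('brand', sort=False, as_index=False)
      .agg({'style': 'first', 'flavour': join_unique})
)

print(result)
```

Here each brand ends up on a single row, with the style taken from its first occurrence.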

huangapple
  • Posted on April 6, 2023, 21:18:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75950002.html