How can I remove all duplicated elements from a list, but update the first entry's value in another column if it is not the same?

Question

I have a made-up DataFrame like this:

    import pandas as pd

    df = pd.DataFrame({
        'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
        'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
        'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese']
    })
    df
         brand style  flavour
    0  Yum Yum   cup    chili
    1   ByRice   cup  chicken
    2  LuxSoba   cup    chili
    3  Indomie  pack     beef
    4  Indomie  pack   cheese

My goal is to change the DataFrame so that all duplicate entries of a brand are deleted, but if a brand has several flavours, they are all combined into the flavour column of the first entry. The DataFrame should then look like this:

         brand style       flavour
    0  Yum Yum   cup         chili
    1   ByRice   cup       chicken
    2  LuxSoba   cup         chili
    3  Indomie  pack  beef, cheese

I'm not sure how to approach this problem.

Answer 1

Score: 1

You can do this:

    import pandas as pd

    df = pd.DataFrame({
        'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
        'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
        'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese']
    })
    # Group rows by brand and style, then join each group's flavours into one string
    df2 = df.groupby(['brand', 'style'])['flavour'].agg(lambda x: ', '.join(x)).reset_index()

Result (rows come out sorted by brand, because groupby sorts the group keys by default):

         brand style       flavour
    0   ByRice   cup       chicken
    1  Indomie  pack  beef, cheese
    2  LuxSoba   cup         chili
    3  Yum Yum   cup         chili
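
The title asks to update the first entry only when the value is "not the same", so a flavour that repeats within a brand may need to be skipped instead of joined twice. A minimal sketch of that variation, assuming one extra hypothetical row (Indomie, pack, beef) and a helper named join_unique; neither is part of the original question or answer:

```python
import pandas as pd

# Same columns as the question, plus one hypothetical repeated row
# (Indomie / pack / beef) to show the effect of de-duplication.
df = pd.DataFrame({
    'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'pack', 'pack', 'pack'],
    'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese', 'beef'],
})

def join_unique(values):
    # dict.fromkeys drops repeated values while keeping first-seen order
    return ', '.join(dict.fromkeys(values))

df2 = (
    df.groupby(['brand', 'style'], sort=False)['flavour']
      .agg(join_unique)
      .reset_index()
)
print(df2)
```

Here sort=False keeps the brands in the order they first appear, and the repeated "beef" is joined only once.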

Answer 2

Score: 0

You can use groupby with agg:

    >>> df.groupby(['brand', 'style'], sort=False, as_index=False)['flavour'].agg(', '.join)
         brand style       flavour
    0  Yum Yum   cup         chili
    1   ByRice   cup       chicken
    2  LuxSoba   cup         chili
    3  Indomie  pack  beef, cheese
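
The question talks about duplicate brands specifically, so if style is not meant to be part of the key, one option might be to group on brand alone and keep the first entry's style. A rough sketch using pandas named aggregation; the variable name merged is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'ByRice', 'LuxSoba', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
    'flavour': ['chili', 'chicken', 'chili', 'beef', 'cheese'],
})

# Group on 'brand' alone; sort=False keeps brands in first-appearance order.
merged = df.groupby('brand', sort=False, as_index=False).agg(
    style=('style', 'first'),        # keep the first entry's style
    flavour=('flavour', ', '.join),  # combine all flavours of the brand
)
print(merged)
```

Unlike grouping on both brand and style, this collapses a brand into a single row even when its rows have different styles, which may or may not be what is wanted.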

huangapple
  • Posted on April 6, 2023, 21:18:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75950002.html