Displaying duplicates in pandas

Question

I would like to display the duplicate rows of a dataframe to understand them better. Specifically, I want to group the duplicated rows together.

This example hopefully clarifies what I want to do. Assume we are given the dataframe below:

CC BF FA WC Strength
1  2  3  4   1
2  3  4  5   6
1  2  3  4   8
1  2  3  4   4
2  3  4  5   7

Here rows 1, 3, 4 and rows 2, 5 are duplicates once the Strength column is ignored. I would like to get a new dataframe that displays:

CC BF FA WC Strength_min Strength_max Count
1  2  3  4  1            8             3
2  3  4  5  6            7             2

Answer 1

Score: 4

You need a custom groupby.agg with named aggregations, using the output of Index.difference (every column except Strength) as the grouper:

(df.groupby(list(df.columns.difference(['Strength'], sort=False)))['Strength']
   .agg(**{'Strength_min': 'min', 'Strength_max': 'max', 'Count': 'count'})
   .reset_index()
)

Output:

   CC  BF  FA  WC  Strength_min  Strength_max  Count
0   1   2   3   4             1             8      3
1   2   3   4   5             6             7      2
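
For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. The dataframe is rebuilt from the table in the question; the names key_cols, summary, and duplicates_only are illustrative and not part of the original answer, and the final filtering step is an optional extra that assumes you only want keys occurring more than once.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "CC": [1, 2, 1, 1, 2],
    "BF": [2, 3, 2, 2, 3],
    "FA": [3, 4, 3, 3, 4],
    "WC": [4, 5, 4, 4, 5],
    "Strength": [1, 6, 8, 4, 7],
})

# Every column except Strength defines what counts as a duplicate.
# sort=False keeps the columns in their original order.
key_cols = list(df.columns.difference(["Strength"], sort=False))

# Named aggregation: each keyword becomes an output column.
summary = (
    df.groupby(key_cols)["Strength"]
      .agg(Strength_min="min", Strength_max="max", Count="count")
      .reset_index()
)

# Optional extra (an assumption, not in the answer): keep only the
# key combinations that actually occur more than once.
duplicates_only = summary[summary["Count"] > 1]

print(summary)
```

Note that "count" only counts non-missing Strength values; if Strength may contain NaN and you want the number of rows per group, "size" can be used in its place.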

huangapple
  • Published 2023-03-03 18:10:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75625708.html