Python / remove duplicates from each group if a condition is met in the group

Question

I want to delete duplicate rows within each group, but only when the duplicate is one particular value of a column.

import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'channel': ['X', 'Y', 'Y', 'X', 'X', 'A', 'X'],
    'value': [1, 2, 3, 3, 4, 5, 5]
})

In each group, I want to keep only the first occurrence of X in the channel column and delete the later X rows.

I tried the code below, but it deletes every row with a duplicate channel value, irrespective of whether channel is X or not:

df = df.groupby('group').apply(lambda x: x.drop_duplicates(subset='channel', keep='first') if 'X' in x['channel'].values else x)

Answer 1

Score: 1

Let's create a boolean mask that marks the rows to drop: rows where channel is X and the same (group, channel) pair has already appeared earlier

mask = df['channel'].eq('X') & df.duplicated(subset=['group', 'channel'])

Result

df[~mask]

  group channel  value
0     A       X      1
1     A       Y      2
2     B       Y      3
3     B       X      3
5     C       A      5
6     C       X      5
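
The mask approach above can be wrapped in a small helper so the target value and column names are parameters. This is only a sketch under assumptions: the function name drop_repeats_of and its arguments are made up for illustration and are not part of pandas. It relies on DataFrame.duplicated, which with keep='first' flags the second and later occurrences of each (group, channel) pair in row order.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'channel': ['X', 'Y', 'Y', 'X', 'X', 'A', 'X'],
    'value': [1, 2, 3, 3, 4, 5, 5],
})


def drop_repeats_of(frame, value, group_col='group', target_col='channel'):
    """Hypothetical helper: within each group, keep only the first row whose
    target_col equals value; rows with any other value are left untouched."""
    # True where the row carries the value we want to de-duplicate
    is_target = frame[target_col].eq(value)
    # True for the 2nd, 3rd, ... occurrence of each (group, target) pair
    is_repeat = frame.duplicated(subset=[group_col, target_col], keep='first')
    # Drop only rows that are both the target value and a repeat
    return frame[~(is_target & is_repeat)]


result = drop_repeats_of(df, 'X')
print(result)
```

Because the mask requires both conditions, repeated values other than X inside a group would survive, which is the behaviour the question asks for. The original index is preserved; call reset_index(drop=True) on the result if a fresh index is wanted.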

huangapple
  • Posted on July 24, 2023 at 16:25:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76752630.html