从DataFrame中根据特定条件提取特定组。

huangapple go评论95阅读模式
英文:

Extracting Specific Groups from a DataFrame under Specific Conditions

问题

I want to extract groups that contain a row with (Name: J and Age: 33). Here's the translated code:

  1. 我想提取包含(Name: J Age: 33)的行的组以下是翻译好的代码
  2. df = pd.DataFrame({
  3. 'X': [1,1,1,1,2,2,2,2],
  4. 'Y': [3,3,4,4,3,3,4,4],
  5. 'Name': ['J', 'A', 'B', 'X', 'V', 'P', 'J', 'V'],
  6. 'Age': [33,47,53,22,33,80,33,93]
  7. })

I hope this helps!

英文:

I have a DataFrame as below where I want to extract groups that contain a row with (Name: J and Age: 33)

X Y Name Age
1 3 J 33
1 3 A 47
1 4 B 53
1 4 X 22
2 3 J 33
2 3 P 80
2 4 V 90
2 4 V 93

Overall it would produce the table below, because the 1, 3 X/Y group contains J, 33 and the 2,3 X/Y group also contains a J, 33 row.

X Y Name Age
1 3 J 33
1 3 A 47
2 3 J 33
2 3 P 80

I've been approaching this by iterating over the rows which has been far too slow and was wondering if there was a much faster way by using the groupby and apply/pipe methods in Pandas. Any help is appreciated

Example DF below:

  1. df = pd.DataFrame({
  2. 'X': [1,1,1,1,2,2,2,2],
  3. 'Y': [3,3,4,4,3,3,4,4],
  4. 'Name': ['J', 'A', 'B', 'X', 'V', 'P', 'J', 'V'],
  5. 'Age': [33,47,53,22,33,80,33,93]
  6. })

答案1

得分: 1

以下是翻译好的内容:

使用groupby的一个选项:

  1. # 获取等于('J',33)的行
  2. check = df.loc(axis=1)[['Name','Age']].eq(('J', 33)).all(axis=1)
  3. # 运行groupby并获取在该组中的任何行存在True的组
  4. check = check.groupby([df.X, df.Y]).transform('any')
  5. # 过滤原始数据框
  6. df.loc[check]
  7. X Y Name Age
  8. 0 1 3 J 33
  9. 1 1 3 A 47
  10. 4 2 3 J 33
  11. 5 2 3 P 80

另一个选项,仍然使用groupby:

  1. group = df.groupby(['X','Y'])
  2. cond1 = group.Name.transform(lambda x: any(x == 'J'))
  3. cond2 = group.Age.transform(lambda x: any(x == 33))
  4. df.loc[cond1 & cond2]
  5. X Y Name Age
  6. 0 1 3 J 33
  7. 1 1 3 A 47
  8. 4 2 3 J 33
  9. 5 2 3 P 80
英文:

One option is with a groupby:

  1. # Get rows equal to ('J', 33)
  2. check = df.loc(axis=1)[['Name','Age']].eq(('J', 33)).all(axis=1)
  3. # run a groupby and get groups where True exists for any row in that group
  4. check = check.groupby([df.X, df.Y]).transform('any')
  5. #filter original dataframe
  6. df.loc[check]
  7. X Y Name Age
  8. 0 1 3 J 33
  9. 1 1 3 A 47
  10. 4 2 3 J 33
  11. 5 2 3 P 80

Another option, still with a groupby:

  1. group = df.groupby(['X','Y'])
  2. cond1 = group.Name.transform(lambda x: any(x == 'J'))
  3. cond2 = group.Age.transform(lambda x: any(x == 33))
  4. df.loc[cond1 & cond2]
  5. X Y Name Age
  6. 0 1 3 J 33
  7. 1 1 3 A 47
  8. 4 2 3 J 33
  9. 5 2 3 P 80

答案2

得分: 0

以下是您要翻译的内容:

  1. mask = (df['Name'] == 'J') & (df['Age'] == 33)
  2. unique_x = df.loc[mask, 'X'].unique()
  3. unique_y = df.loc[mask, 'Y'].unique()
  4. print(df[df['X'].isin(unique_x) & df['Y'].isin(unique_y)])

打印结果:

  1. X Y Name Age
  2. 0 1 3 J 33
  3. 1 1 3 A 47
  4. 4 2 3 J 33
  5. 5 2 3 P 80

编辑:根据更新的问题,以下是没有使用 .groupby 的解决方案:

  1. mask = (df['Name'] == 'J') & (df['Age'] == 33)
  2. t = set(df.loc[mask, ['X', 'Y']].drop_duplicates().apply(tuple, 1))
  3. out = df[df.loc[:, ['X', 'Y']].apply(lambda x: tuple(x) in t, axis=1)]
  4. print(out)

打印结果:

  1. X Y Name Age
  2. 0 1 3 J 33
  3. 1 1 3 A 47
  4. 4 2 3 J 33
  5. 5 2 3 P 80
英文:

You can search for all unique X and Y values that contain Name == J and Age == 33 and filter the dataframe afterwards:

  1. mask = (df['Name'] == 'J') & (df['Age'] == 33)
  2. unique_x = df.loc[mask, 'X'].unique()
  3. unique_y = df.loc[mask, 'Y'].unique()
  4. print(df[df['X'].isin(unique_x) & df['Y'].isin(unique_y)])

Prints:

  1. X Y Name Age
  2. 0 1 3 J 33
  3. 1 1 3 A 47
  4. 4 2 3 J 33
  5. 5 2 3 P 80

EDIT: With updated question, a solution without .groupby:

  1. mask = (df['Name'] == 'J') & (df['Age'] == 33)
  2. t = set(df.loc[mask, ['X', 'Y']].drop_duplicates().apply(tuple, 1))
  3. out = df[df.loc[:, ['X', 'Y']].apply(lambda x: tuple(x) in t, axis=1)]
  4. print(out)

Prints:

  1. X Y Name Age
  2. 0 1 3 J 33
  3. 1 1 3 A 47
  4. 4 2 3 J 33
  5. 5 2 3 P 80

huangapple
  • 本文由 发表于 2023年6月5日 06:52:35
  • 转载请务必保留本文链接:https://go.coder-hub.com/76402712.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定