Find combinations of two lists in a Pandas dataframe
Question
I have two lists:
List1:
123
456
789
List2:
321
654
987
I want every combination of an element from one list with an element from the other list, in either order, but no combinations of elements from within the same list:
123-321
123-654
123-987
456-321
456-654
456-987
789-321
789-654
789-987
321-123
321-456
321-789
654-123
654-456
654-789
987-123
987-456
987-789
I then want to check whether any of these combinations appears in two columns (A and B) of a data frame and, if so, extract the matching rows:
A B Value
123 321 0.5
456 111 0.4
987 654 0.3
Expected output:
A B Value
123 321 0.5
How can I extract the rows?
Answer 1
Score: 1
You can do a cross merge between two single-column data frames built from the two lists, then use NumPy broadcasting to check whether the (A, B) pair in each row of df appears in the merged data frame.
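import pandas as pd
a = [123, 456, 789]
b = [321, 654, 987]
# df is the data frame from the question. The cross merge yields all 9 (a, b) pairs.
pairs = pd.DataFrame({'A': a}).merge(pd.DataFrame({'B': b}), how='cross').to_numpy()
# Broadcasting compares every pair with every (A, B) row of df
m = pairs[:, None] == df[['A', 'B']].to_numpy()
# Keep the rows where both columns match at least one pair
out = df[m.all(axis=-1).any(axis=0)]
print(out)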
A B Value
0 123 321 0.5
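Note that the cross merge above only builds pairs with A taken from the first list and B from the second, while the question also lists the reversed pairs (for example 321-123). Below is a minimal sketch that covers both orders. It swaps the broadcasting step for `itertools.product` plus an ordinary inner merge, and it assumes the data frame from the question; the variable names `pairs` and `out` are only illustrative.

```python
from itertools import product

import pandas as pd

a = [123, 456, 789]
b = [321, 654, 987]

# Data frame from the question
df = pd.DataFrame({'A': [123, 456, 987], 'B': [321, 111, 654], 'Value': [0.5, 0.4, 0.3]})

# All cross-list pairs in both orders: (a, b) and (b, a)
pairs = pd.DataFrame(list(product(a, b)) + list(product(b, a)), columns=['A', 'B'])

# An inner merge on A and B keeps only the rows of df whose (A, B) pair is allowed
out = df.merge(pairs, on=['A', 'B'], how='inner')
print(out)
```

On the sample data this keeps only the row with A = 123 and B = 321. The row 987/654 is dropped because both values come from the second list. Be aware that `merge` returns a fresh index, so if the original row labels matter, a boolean mask may be the better fit.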
Answer 2
Score: 0
import pandas as pd
a = [123, 456, 789]
b = [321, 654, 987]
df = pd.DataFrame({'A': [123, 456, 987], 'B': [321, 111, 654], 'value': [0.5, 0.4, 0.3]})
print(df[
    (df.A.isin(a) & df.B.isin(b) & ~df.A.isin(b) & ~df.B.isin(a))
    | (df.A.isin(b) & df.B.isin(a) & ~df.A.isin(a) & ~df.B.isin(b))
])
Returns:
A B value
0 123 321 0.5
It works by using a boolean mask that checks that either:
- column A is in list a but not in list b, and column B is in list b but not in list a
- or the other way around
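To see how the two conditions behave on each row, here is a small sketch that factors the check into a helper and evaluates each direction separately. The helper name `cross_list_mask` and the variables `forward` and `backward` are hypothetical, introduced only for illustration; the logic is the same as in the one-line mask above.

```python
import pandas as pd

a = [123, 456, 789]
b = [321, 654, 987]
df = pd.DataFrame({'A': [123, 456, 987], 'B': [321, 111, 654], 'value': [0.5, 0.4, 0.3]})


def cross_list_mask(frame, first, second):
    """Hypothetical helper: True where A is only in `first` and B is only in `second`."""
    return (frame.A.isin(first) & ~frame.A.isin(second)
            & frame.B.isin(second) & ~frame.B.isin(first))


forward = cross_list_mask(df, a, b)   # A from list a, B from list b
backward = cross_list_mask(df, b, a)  # A from list b, B from list a
mask = forward | backward

print(df[mask])
```

On the sample data only the first row passes, via the forward check. The third row (987, 654) fails both checks because both of its values belong to list b.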