Left Outer Join with two single columned dataframes
Question
I don't see the case below mentioned in Pandas Merging 101.

I'm having trouble understanding the Pandas documentation for doing a left outer join.
import pandas as pd

left_df = pd.DataFrame({'user_id': ['Peter', 'John', 'Robert', 'Anna']})
right_df = pd.DataFrame({'user_id': ['Paul', 'Mary', 'John', 'Anna']})

pd.merge(left_df, right_df, on='user_id', how='left')
Output is:
  user_id
0   Peter
1    John
2  Robert
3    Anna
Expected output:
  user_id
0   Peter
1  Robert
What am I missing? Is the `indicator=True` parameter required (to create a `_merge` column to filter on) for left outer joins?
Answer 1
Score: 1
You can use `merge` with `indicator=True` and keep only the rows whose indicator value is `left_only`, but it's not the best way. You can instead use `isin` to get a boolean mask, then invert it:
>>> left_df[~left_df['user_id'].isin(right_df['user_id'])]
  user_id
0   Peter
2  Robert
With `merge`:
>>> (left_df.merge(right_df, on='user_id', how='left', indicator='present')
...         .loc[lambda x: x.pop('present') == 'left_only'])
  user_id
0   Peter
2  Robert
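To see why the plain left join returns all four names, it may help to look at the indicator column directly: a left outer join keeps every row of the left frame whether or not it matched, and what the question expects is usually called an anti-join. The following is a minimal sketch reusing the frames from the question; the variable names `joined`, `anti`, and `anti_isin` are illustrative and not from the original posts, and `reset_index(drop=True)` is added only on the assumption that the 0, 1 index shown in the expected output is wanted.

```python
import pandas as pd

left_df = pd.DataFrame({'user_id': ['Peter', 'John', 'Robert', 'Anna']})
right_df = pd.DataFrame({'user_id': ['Paul', 'Mary', 'John', 'Anna']})

# A left outer join keeps every row of left_df, matched or not.
# indicator=True adds a '_merge' column recording where each row came from:
# 'left_only' for Peter and Robert, 'both' for John and Anna.
joined = left_df.merge(right_df, on='user_id', how='left', indicator=True)

# Anti-join: keep only the left rows that found no match on the right,
# then renumber the index from 0 (assumption: the 0, 1 index is desired).
anti = (joined.loc[joined['_merge'] == 'left_only', ['user_id']]
              .reset_index(drop=True))

# The same rows without a merge, using an inverted boolean mask.
anti_isin = (left_df[~left_df['user_id'].isin(right_df['user_id'])]
             .reset_index(drop=True))

print(joined)
print(anti)
```

So `indicator=True` is not needed for the left outer join itself; it only becomes useful when the join result is filtered down to the unmatched rows.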