Conditional merging of two dataframes in Python 3.7
Question
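I have the following dataframe:

```
col1  term1  term2
ab|a  ab     a
cd    cd     None
```

I would like to merge this dataframe with another dataframe (`df2`) using both the `term1` and `term2` columns, but skip a term when it is None (as in row 2). I am trying to do this with an if/else condition inside a for loop. Please see the pseudocode below (it is not working code; it also raises an error).

Is this the right approach, or is there a nicer way to do it?

```python
df1 = pd.concat([df["col1"], df["col1"].str.split("|", expand=True)], axis=1)
df1.rename(columns={0: 'term1', 1: 'term2'}, inplace=True)
for index, row in df1.iterrows():
    if row['term1'] is None:
        break
    else:
        row = row.to_frame()
        print(row)
        row.merge(df2, how='inner', left_on='term1', right_on='STR')
```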
# Answer 1
**Score**: 1
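A loop over a pandas dataframe is a code smell. To exclude rows with null values, just drop them before merging. You could first use pandas' `dropna` (doc), something like this:

```python
df1 = df1.dropna(subset=["term1", "term2"])
```

Then apply pandas' `merge` (doc):

```python
df = df1.merge(df2, on=["term1", "term2"])
```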
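As a rough end-to-end illustration of these two steps, here is a minimal sketch on toy data. The contents of `df2` are hypothetical: the question does not show it, so this assumes it carries `term1` and `term2` columns plus an illustrative `value` column.

```python
import pandas as pd

# Toy version of the question's df1; row 2 has no second term.
df1 = pd.DataFrame({
    "col1": ["ab|a", "cd"],
    "term1": ["ab", "cd"],
    "term2": ["a", None],
})

# Hypothetical df2 (assumption): same two key columns plus a payload column.
df2 = pd.DataFrame({
    "term1": ["ab", "cd"],
    "term2": ["a", "x"],
    "value": [1, 2],
})

# Step 1: drop rows where either key is missing.
complete = df1.dropna(subset=["term1", "term2"])

# Step 2: inner merge on both keys.
merged = complete.merge(df2, on=["term1", "term2"])
print(merged)
```

Note that the `cd` row is dropped entirely because its `term2` is missing, so only the `ab|a` row reaches the merge. If rows with a single term should still be matched on that term, this is not the behavior wanted.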
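To make the code shorter, you could first define `merge_columns = ["term1", "term2"]` and use it in both the `dropna` and `merge` calls. You could also do the filtering right inside the merge; I just did it step by step to be clear.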
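The question's own attempt merges on a single `STR` column in `df2` and wants to skip only the missing term, not the whole row. One possible way to do that without a loop is to reshape the term columns into long form first. This is a sketch under assumptions, not necessarily what the asker needs: the `df2` contents, the `value` column, and the names `long_df` and `term_col` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["ab|a", "cd"]})

# Hypothetical df2 (assumption): one key column named STR, as in the question.
df2 = pd.DataFrame({
    "STR": ["ab", "a", "cd", "zz"],
    "value": [10, 20, 30, 40],
})

# Split col1 into term1/term2, as the question does.
terms = df["col1"].str.split("|", expand=True).rename(
    columns={0: "term1", 1: "term2"}
)
df1 = pd.concat([df["col1"], terms], axis=1)

# Reshape to one row per (col1, term) and drop the missing terms only.
long_df = df1.melt(
    id_vars="col1",
    value_vars=["term1", "term2"],
    var_name="term_col",
    value_name="term",
).dropna(subset=["term"])

# Each remaining term is matched against df2["STR"].
merged = long_df.merge(df2, how="inner", left_on="term", right_on="STR")
print(merged)
```

Here the `cd` row keeps its `term1` match and simply contributes no row for the missing `term2`.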
Hope it helps.