Find rows that have changed in the latest dataframe
Question
I have two pandas dataframes as follows:
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [2, 2, 5]})
The following code returns every row that differs between the two dataframes, taken from both df1 and df2. How can I get only the changed rows from df2?
changed_rows = pd.concat([df1, df2]).drop_duplicates(keep=False)
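To make the problem concrete, here is a small sketch of what the snippet above returns on the sample data. It only reuses the names from the question; the final `print` is added for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [2, 2, 5]})

# keep=False drops every row that occurs more than once in the stacked frame,
# so the shared row (1, 2) disappears and all non-shared rows remain,
# regardless of which dataframe they came from.
changed_rows = pd.concat([df1, df2]).drop_duplicates(keep=False)
print(changed_rows)
```

The result has four rows: two from df1 (the old values) and two from df2 (the new values), which is why a further step is needed to keep only the df2 side.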
Answer 1
Score: 1
Do a left merge of df2 against df1 on all columns and use the merge indicator to identify the modified rows:
df2.merge(df1, how='left', indicator=True).query("_merge != 'both'")
   A  B     _merge
1  2  2  left_only
2  4  5  left_only
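A self-contained sketch of the same idea, in case it helps. The helper name `rows_only_in_new` is made up for illustration, and dropping duplicates from the old frame before merging is an added precaution, not part of the answer above. Note that this approach matches rows by value rather than by position: a df2 row counts as unchanged if the same values appear anywhere in df1.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [2, 2, 5]})


def rows_only_in_new(old, new):
    """Hypothetical helper: return rows of `new` with no identical row in `old`."""
    # With no `on` argument, merge joins on all columns the frames share.
    # indicator=True adds a `_merge` column: 'both' or 'left_only' for a left merge.
    merged = new.merge(old.drop_duplicates(), how='left', indicator=True)
    # Keep rows found only in `new`, then drop the helper column.
    return merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')


changed = rows_only_in_new(df1, df2)
print(changed)
```

On the sample data this leaves the two df2 rows (2, 2) and (4, 5), without the `_merge` column.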
Answer 2
Score: 1
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.compare.html#pandas.DataFrame.compare
I think something like this is what you're after. In df1.compare(df2), 'self' is df1 and 'other' is df2.
d = df1.compare(df2, align_axis=0, keep_shape=True, keep_equal=False)
print(d)
           A    B
0 self   NaN  NaN
  other  NaN  NaN
1 self   NaN  3.0
  other  NaN  2.0
2 self   3.0  4.0
  other  4.0  5.0
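If only the df2 side is wanted, one possible follow-up is to call `compare` with its default arguments and pick out the 'other' level. This is a sketch rather than part of the answer above; the names `diff`, `other_values` and `changed_rows` are chosen for illustration, and `compare` expects both frames to have identical row and column labels.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [2, 2, 5]})

# With the defaults (align_axis=1, keep_shape=False) only rows that contain
# at least one difference are kept, and the columns form a MultiIndex of
# (column name, 'self' / 'other').
diff = df1.compare(df2)

# The changed cells as they appear in df2; unchanged cells are NaN.
other_values = diff.xs('other', axis=1, level=1)
print(other_values)

# The complete changed rows, taken from df2 by index label.
changed_rows = df2.loc[diff.index]
print(changed_rows)
```

`other_values` shows which cells changed, while `changed_rows` gives the full rows of df2 at the same index labels.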
Answer 3
Score: 1
Another possible solution:
df2[~df1.eq(df2).all(axis=1)]
Output:
   A  B
1  2  2
2  4  5
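A slightly expanded sketch of this element-wise comparison, assuming both frames share the same index and columns so that rows are compared position by position. The extra NaN handling is an addition, not part of the answer above: in pandas, NaN never compares equal to NaN, so without it a row with a missing value in the same cell of both frames would be reported as changed.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1, 2, 4], 'B': [2, 2, 5]})

# True where a cell differs between the frames; a cell that is NaN in both
# frames is treated as unchanged.
differs = df1.ne(df2) & ~(df1.isna() & df2.isna())

# Keep the df2 rows that have at least one differing cell.
changed = df2[differs.any(axis=1)]
print(changed)
```

On the sample data, which has no missing values, this selects the same two rows as the one-liner above.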