Question

I have two DataFrames, `df1` and `df2`. In my code I use the pandas `concat` method to find the differences between them.

    # Read the first and second sheets of the spreadsheet.
    df1 = pd.read_excel(latest_file, 0)
    df2 = pd.read_excel(latest_file, 1)

    new_dataframe = pd.concat([df1, df2]).drop_duplicates(keep=False)

This works perfectly, but I want to know which rows come from `df1` and which come from `df2`. To show this, I want to add a column to `new_dataframe` that says "Removed" if the row is from `df1` and "Added" if it is from `df2`. I can't seem to find any documentation on how to do this. Thanks in advance for any help.

Edit: My current code removes all rows that are identical in both DataFrames. The solution still has to remove the common rows.
# Answer 1
**Score**: 1

Consider using `pd.merge` with `indicator=True` instead. This creates a new column named `_merge` that indicates whether each row was found only in the left DataFrame (`left_only`), only in the right one (`right_only`), or in both (`both`). You can then relabel those values as "Removed" and "Added".
```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1': [3, 4, 5, 6, 7]})

# Map the indicator values to the labels the question asks for.
m = {'left_only': 'Removed', 'right_only': 'Added'}
new_dataframe = (
    pd.merge(df1, df2, how='outer', indicator=True)  # tag each row's origin
      .query('_merge != "both"')                     # drop rows common to both
      .replace({'_merge': m})                        # relabel the indicator
)
```

Output:

```
   col1   _merge
0     1  Removed
1     2  Removed
5     6    Added
6     7    Added
```
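
The same idea carries over to frames with several columns, which is closer to the spreadsheet case in the question. The following is only a sketch under assumptions: the sample data, the helper name `diff_rows`, and the column name `Status` are made up for illustration, and the merge is assumed to compare all columns the two frames share. It converts the `_merge` indicator to plain strings before mapping, so it does not rely on how `replace` treats categorical columns.

```python
import pandas as pd

# Hypothetical sample data standing in for the two worksheets.
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "name": ["b", "x", "d"]})

# Labels for the two indicator values that survive the filter.
labels = {"left_only": "Removed", "right_only": "Added"}


def diff_rows(left, right):
    """Return rows present in only one frame, tagged in a 'Status' column."""
    # With no `on=` argument, merge joins on every column the frames share.
    merged = pd.merge(left, right, how="outer", indicator=True)
    only_one_side = merged[merged["_merge"] != "both"]
    # Turn the categorical indicator into strings, then map to the labels.
    status = only_one_side["_merge"].astype(str).map(labels)
    return (
        only_one_side.drop(columns="_merge")
        .assign(Status=status)
        .reset_index(drop=True)
    )


new_dataframe = diff_rows(df1, df2)
print(new_dataframe)
```

One difference from the `concat` plus `drop_duplicates(keep=False)` approach is worth checking against your data: a row that is repeated inside `df1` but absent from `df2` is dropped entirely by `drop_duplicates(keep=False)`, whereas the merge keeps each copy and labels it "Removed".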


