May 7, 2023, 17:46:06

Update one Pandas dataframe from another and append rows if needed

Question

I have the following dataframes in Pandas:

df1:
index  column
1         A1
2         A2

df2:
index  column
2         A2_new
3         A3

I want to get the result:

index  column
1         A1
2         A2_new
3         A3

How can I achieve this?

df1.update(df2) is not enough, because I also want the row with index 3 to appear in the result.
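
To make the limitation concrete, here is a minimal sketch using the frames from the question. It assumes the "index" values are the actual DataFrame index, which the question does not state explicitly. DataFrame.update modifies the calling frame in place and aligns on its existing index, so the row with index 3 is never added:

```python
import pandas as pd

# Assumption: "index" in the question is the DataFrame index, not a regular column
df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])
df2 = pd.DataFrame({"column": ["A2_new", "A3"]}, index=[2, 3])

updated = df1.copy()   # work on a copy because update() changes the frame in place
updated.update(df2)    # only labels already present in `updated` are touched

print(updated)
# Index 2 now holds "A2_new", but there is still no row for index 3
```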

Answer 1

Score: 1

Example

import pandas as pd

df1 = pd.DataFrame(['A1', 'A2'], columns=['column'], index=[1, 2])
df2 = pd.DataFrame(['A2_new', 'A3'], columns=['column'], index=[2, 3])

df1

  column
1     A1
2     A2

df2

   column
2  A2_new
3      A3

Code

df2.combine_first(df1)

Output

   column
1      A1
2  A2_new
3      A3
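
The call is made on df2 because combine_first keeps the caller's values wherever they are present and only fills the gaps from the other frame. The sketch below wraps this in a helper (the name upsert is made up for illustration) and shows one consequence worth knowing: a null value in the updating frame counts as missing, so it does not overwrite the old value.

```python
import pandas as pd


def upsert(base: pd.DataFrame, updates: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: values in `updates` win; rows only in `base` are kept."""
    return updates.combine_first(base)


df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])
df2 = pd.DataFrame({"column": ["A2_new", "A3"]}, index=[2, 3])

result = upsert(df1, df2)
print(result)

# A null in the updating frame is treated as "missing", so the old value survives
df2_with_gap = pd.DataFrame({"column": [None, "A3"]}, index=[2, 3])
kept = upsert(df1, df2_with_gap)
print(kept.loc[2, "column"])  # A2
```

If nulls in df2 are meant to overwrite existing values, combine_first is not the right tool and one of the concat-based answers is a better fit.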

Answer 2

Score: 0

@Ars ML
You can concatenate the two DataFrames vertically and drop duplicates on the 'index' column, keeping only the last occurrence of each index value:

import pandas as pd

df1 = pd.DataFrame({'index': [1, 2], 'column': ['A1', 'A2']})
df2 = pd.DataFrame({'index': [2, 3], 'column': ['A2_new', 'A3']})

# Stack df1 then df2, and keep the last row for each 'index' value so df2 wins
merged_df = pd.concat([df1, df2]).drop_duplicates(subset=['index'], keep='last')
merged_df = merged_df.set_index('index')

This produces the desired output:

1          A1
2      A2_new
3          A3

You can also use merge. It is more involved, but it produces the same result:

merge_chain = (
    pd.merge(df1, df2, on='index', how='outer')
    # prefer df2's value (column_y) and fall back to df1's (column_x)
    .assign(column=lambda x: x['column_y'].fillna(x['column_x']))
    .drop(['column_x', 'column_y'], axis=1)
    .set_index('index')
)
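
For readers who find the default _x/_y suffixes hard to follow, the same merge can be written step by step with explicit suffixes. This is only a sketch of the idea above; the suffix names _old and _new are chosen here for illustration and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({"index": [1, 2], "column": ["A1", "A2"]})
df2 = pd.DataFrame({"index": [2, 3], "column": ["A2_new", "A3"]})

# The outer join keeps every key from both frames; unmatched sides become NaN
merged = pd.merge(df1, df2, on="index", how="outer", suffixes=("_old", "_new"))

# Take the new value where it exists, otherwise fall back to the old one
merged["column"] = merged["column_new"].fillna(merged["column_old"])

result = merged.drop(columns=["column_old", "column_new"]).set_index("index")
print(result)
```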

Answer 3

Score: 0

Another possible solution:

out = pd.concat([df1, df2])
out[~out.index.duplicated(keep='last')]

Output:

   column
1      A1
2  A2_new
3      A3
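
A sketch of the same idea as a reusable function follows. The helper name upsert_concat and the final sort_index call are additions for illustration: without sorting, rows that exist only in df2 stay at the bottom in the order they were appended. Note that with this approach a null in df2 does replace the old value, because the whole row from df2 is kept.

```python
import pandas as pd


def upsert_concat(base: pd.DataFrame, updates: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: stack both frames and keep the last row per index label."""
    out = pd.concat([base, updates])
    return out[~out.index.duplicated(keep="last")].sort_index()


df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])
df2 = pd.DataFrame({"column": ["A2_new", "A3"]}, index=[2, 3])
result = upsert_concat(df1, df2)
print(result)

# A new label smaller than the existing ones, plus a null for an existing label
df2_edge = pd.DataFrame({"column": [None, "A0"]}, index=[2, 0])
edge = upsert_concat(df1, df2_edge)
print(edge)
```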
