Update one Pandas dataframe from another and append rows if needed
Question
I have the following dataframes in Pandas:
df1:
index column
1 A1
2 A2
df2:
index column
2 A2_new
3 A3
I want to get the result:
index column
1 A1
2 A2_new
3 A3
How can I achieve this?
df1.update(df2) does not help, because it only overwrites rows that already exist in df1, and I want to see the row with index 3 in the result.
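To make the limitation concrete, here is a minimal sketch of what update does with the two frames from the question. The variable name `updated` is only illustrative; the frames are built the same way as in the answers below.

```python
import pandas as pd

df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])
df2 = pd.DataFrame({"column": ["A2_new", "A3"]}, index=[2, 3])

# update() works in place and returns None, so operate on a copy.
updated = df1.copy()
updated.update(df2)

# Only labels already present in df1 are touched:
# index 2 is overwritten, index 3 is never added.
print(updated)
```

The row with index 2 picks up A2_new, but the result still has only the labels 1 and 2.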
Answer 1
Score: 1
Example
import pandas as pd
df1 = pd.DataFrame(['A1', 'A2'], columns=['column'], index=[1, 2])
df2 = pd.DataFrame(['A2_new', 'A3'], columns=['column'], index=[2, 3])
df1
column
1 A1
2 A2
df2
column
2 A2_new
3 A3
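Code
combine_first keeps the values of df2 where they exist and fills in the rest from df1, over the union of both indexes.
df2.combine_first(df1)
Output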
column
1 A1
2 A2_new
3 A3
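One thing worth knowing about combine_first is that it only treats non-null values as updates. The sketch below uses a hypothetical variant of df2 (named `df2_with_gap` here, not part of the question) in which the value at index 2 is missing, to show that the old value from df1 then survives.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])

# Hypothetical update frame: index 2 carries a missing value.
df2_with_gap = pd.DataFrame({"column": [np.nan, "A3"]}, index=[2, 3])

# Missing values in the caller are filled from df1,
# so index 2 keeps the old value "A2".
result = df2_with_gap.combine_first(df1)
print(result)
```

If a missing value in df2 is meant to overwrite the old value, one of the concat-based answers below may be a better fit, since they replace whole rows.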
Answer 2
Score: 0
@Ars ML
You can concatenate the two DataFrames vertically and drop duplicates on the 'index' column, keeping only the last occurrence of each index value:
import pandas as pd
df1 = pd.DataFrame({'index': [1, 2], 'column': ['A1', 'A2']})
df2 = pd.DataFrame({'index': [2, 3], 'column': ['A2_new', 'A3']})
merged_df = pd.concat([df1, df2]).drop_duplicates(subset=['index'], keep='last')
merged_df.set_index('index', inplace=True)
Printing merged_df gives the desired output:
1 A1
2 A2_new
3 A3
You can also use merge. It is more involved but produces the same result:
merge_chain = pd.merge(df1, df2, on='index', how='outer') \
.assign(column=lambda x: x['column_y'].fillna(x['column_x'])) \
.drop(['column_x', 'column_y'], axis=1) \
.set_index('index')
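The concat approach above can be wrapped in a small reusable helper. This is only a sketch: the function name `upsert` and its parameters are hypothetical, and it assumes the key is an ordinary column, as in this answer, and that the update frame should win on conflicts.

```python
import pandas as pd


def upsert(base: pd.DataFrame, updates: pd.DataFrame, key: str) -> pd.DataFrame:
    """Return base with rows from updates replacing or extending it, matched on key."""
    # updates comes last in the concat, so keep="last" prefers its rows.
    combined = pd.concat([base, updates], ignore_index=True)
    combined = combined.drop_duplicates(subset=[key], keep="last")
    return combined.set_index(key).sort_index()


df1 = pd.DataFrame({"index": [1, 2], "column": ["A1", "A2"]})
df2 = pd.DataFrame({"index": [2, 3], "column": ["A2_new", "A3"]})

merged = upsert(df1, df2, key="index")
print(merged)
```

Note that the order of the frames passed to concat matters: swapping them would keep the old rows from df1 instead.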
Answer 3
Score: 0
Another possible solution:
out = pd.concat([df1, df2])
out[~out.index.duplicated(keep='last')]
Output:
column
1 A1
2 A2_new
3 A3
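With this approach the rows stay in concatenation order, which happens to be sorted for the frames in the question. The sketch below uses a hypothetical df2 that adds a row with index 0 to show the difference, and sorts the index afterwards, assuming sorted order is wanted.

```python
import pandas as pd

df1 = pd.DataFrame({"column": ["A1", "A2"]}, index=[1, 2])

# Hypothetical update frame: adds index 0, which is smaller than any label in df1.
df2 = pd.DataFrame({"column": ["A2_new", "A0"]}, index=[2, 0])

out = pd.concat([df1, df2])
out = out[~out.index.duplicated(keep="last")]  # labels are now ordered 1, 2, 0

out_sorted = out.sort_index()  # labels are now ordered 0, 1, 2
print(out_sorted)
```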