Compare two dataframes using Python
Question
I have two dataframes, each with three columns that share the same column names. I would like to compare their values by putting them together, matched on the first column.
I tried an outer join, a left join and even an inner join. The outputs are all the same, and they are technically correct. However, I only want the values of the second and third columns listed side by side when there is a match on the first column. Using a join produces duplicated rows. The attached picture shows the output I get and my target output.
import pandas as pd

# Merge the two tables on the first column ('Fruit')
merged_table = pd.merge(dfA, dfB, on='Fruit', how='outer', suffixes=('_from dfA', '_from dfB'))
# Sort the merged table by the second column as it came from dfA
merged_table = merged_table.sort_values('Date_from dfA')
# Display the output
print(merged_table)
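The duplication described above can be reproduced with a small sketch. The sample data here is made up because the attached picture is not available: only `Fruit` and `Date` appear in the question, and `Qty` is an assumed name for the third column. When a key value occurs more than once in both dataframes, a plain merge on that key pairs every matching row on the left with every matching row on the right.

```python
import pandas as pd

# Hypothetical sample data: 'Qty' is an assumed third column name.
dfA = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Banana"],
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Qty": [1, 2, 3],
})
dfB = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Cherry"],
    "Date": ["2023-02-01", "2023-02-02", "2023-02-03"],
    "Qty": [10, 20, 30],
})

merged_table = pd.merge(dfA, dfB, on="Fruit", how="outer",
                        suffixes=("_from dfA", "_from dfB"))

# Each of the 2 Apple rows in dfA pairs with each of the 2 Apple rows in dfB,
# so Apple shows up 2 * 2 = 4 times in the result.
apple_rows = int((merged_table["Fruit"] == "Apple").sum())
print(merged_table)
print("Apple rows:", apple_rows)
```

With this data the merged table has six rows: four for Apple, one for Banana and one for Cherry.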
Answer 1
Score: 1
One way of doing this is to add a helper column to each dataframe that numbers the rows within each fruit group, and then merge on both the fruit and the new column. The code would look like this:
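# Number the rows within each Fruit group (1, 2, 3, ...)
dfA = dfA.assign(num=dfA.groupby('Fruit').cumcount() + 1)
dfB = dfB.assign(num=dfB.groupby('Fruit').cumcount() + 1)
# Merge on both Fruit and the occurrence number so rows pair up one-to-one
merged_table = pd.merge(dfA, dfB, on=['Fruit', 'num'], how='outer', suffixes=('_from dfA', '_from dfB'))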
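As a self-contained illustration of this approach, here is a minimal sketch on made-up data. The sample values, the third column name `Qty` and the helper `add_occurrence_number` are assumptions for the example, not part of the original post.

```python
import pandas as pd

# Hypothetical sample data: 'Qty' is an assumed third column name.
dfA = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Banana"],
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Qty": [1, 2, 3],
})
dfB = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Cherry"],
    "Date": ["2023-02-01", "2023-02-02", "2023-02-03"],
    "Qty": [10, 20, 30],
})


def add_occurrence_number(df, key="Fruit"):
    """Return a copy of df with a 'num' column counting rows per key (1-based)."""
    return df.assign(num=df.groupby(key).cumcount() + 1)


# Merging on (Fruit, num) pairs the 1st Apple with the 1st Apple,
# the 2nd with the 2nd, and so on, instead of every combination.
merged_table = pd.merge(
    add_occurrence_number(dfA),
    add_occurrence_number(dfB),
    on=["Fruit", "num"],
    how="outer",
    suffixes=("_from dfA", "_from dfB"),
)
merged_table = merged_table.sort_values(["Fruit", "num"]).reset_index(drop=True)
print(merged_table)
```

With this data, Apple appears twice rather than four times, while Banana and Cherry each keep one row with missing values on the side where they have no counterpart. If only fruits present in both dataframes should be listed, passing `how='inner'` instead of `how='outer'` would drop those unmatched rows.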