Is there any way to use 'left_index' and 'left_on' at once in pandas?
Question
I'm using pandas to merge two DataFrames.
import pandas as pd
df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})
df3_sp = df3.set_index('name')
pd.merge(df3_sp, df4, left_index=True, right_on='title')
I wanted to merge not only on the 'name' column but also on 'math',
so I tried to use left_index and left_on together.
pd.merge(df3_sp, df4, left_index=True, left_on='math', right_on=['math','name'])
But it didn't work, because 'left_on' and 'left_index' cannot be used together.
I worked around the problem this way.
df3_sp2 = df3.set_index(['name','math'])
pd.merge(df3_sp2, df4, left_index=True, right_on=['title','math'])
It worked, but I would like to know whether there is another way to handle this
using only the merge() function, without the set_index() function.
Answer 1
Score: 1
Yes, you can, because the index of df3_sp is named and no column has the same name, so left_on can refer to the index level by its name:
pd.merge(df3_sp, df4, left_on=['math', 'name'], right_on=['math', 'title'])
Output:
   math birth   title  tardiness
0    50    01  Nimitz          0
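For anyone who wants to reproduce this, here is a minimal self-contained sketch using the frames from the question. It assumes a reasonably recent pandas, where passing left_index=True together with left_on raises pandas.errors.MergeError. The variable names raised and result are only illustrative.

```python
import pandas as pd
from pandas.errors import MergeError

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})
df3_sp = df3.set_index('name')

# Mixing left_index=True with left_on is rejected by merge().
try:
    pd.merge(df3_sp, df4, left_index=True, left_on='math',
             right_on=['math', 'title'])
    raised = False
except MergeError:
    raised = True

# Referring to the index level by its name ('name') inside left_on
# lets it be combined with the ordinary 'math' column.
result = pd.merge(df3_sp, df4,
                  left_on=['math', 'name'], right_on=['math', 'title'])

print(raised)
print(result)
```

Only the (50, 'Nimitz') pair exists in both frames, so the inner join yields a single row.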
Note, however, that this merge drops the index, so the 'name' values are lost (which might not be an issue, since the same values remain in the "title" column).
To keep them, you can call reset_index before the merge:
pd.merge(df3_sp.reset_index(), df4,
left_on=['math', 'name'], right_on=['math', 'title'])
Output:
     name  math birth   title  tardiness
0  Nimitz    50    01  Nimitz          0
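If you want 'name' back as the index after the merge, one possible sketch is to chain set_index onto the result. This is an optional extra step rather than part of the merge itself, and the variable name merged is only illustrative.

```python
import pandas as pd

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})
df3_sp = df3.set_index('name')

merged = (
    df3_sp.reset_index()                      # turn 'name' into a regular column
          .merge(df4,
                 left_on=['math', 'name'],
                 right_on=['math', 'title'])  # plain column-on-column merge
          .set_index('name')                  # restore 'name' as the index
)

print(merged)
```

This keeps the key handling inside merge() and only moves 'name' in and out of the index around it.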
If the key names are identical in both DataFrames, you can pass them with on, and both keys are kept as columns in the result:
pd.merge(df3_sp, df4.rename(columns={'title': 'name'}), on=['math', 'name'])
Output:
   math    name birth  tardiness
0    50  Nimitz    01          0