Is there any way to use 'left_index' and 'left_on' at once in pandas?

Question


I'm using pandas to merge two DataFrames.

import pandas as pd

df3 = pd.DataFrame({'name':['Nimitz','McArthur','George','Eisenhower'],
                    'math':[50,100,60,70],
                    'birth':['01','01','02','03']})
df4 = pd.DataFrame({'title':['Nimitz','George','McArthur'],
                    'math':[50,100,60],
                    'tardiness':[0,1,2]})

df3_sp = df3.set_index('name')

pd.merge(df3_sp, df4, left_index=True, right_on='title')

I want to merge on not only the 'name' column but also 'math',
so I tried to use left_index and left_on together.

pd.merge(df3_sp, df4, left_index=True, left_on='math', right_on=['math','name'])

But it didn't work, because 'left_on' and 'left_index' cannot be used together.
I worked around the problem like this:

df3_sp2 = df3.set_index(['name','math'])
pd.merge(df3_sp2, df4, left_index=True, right_on=['title','math'])

It worked, but I'd like to know whether there is another way to handle this
using only merge(), without calling set_index().
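
For reference, the failure described above can be reproduced with a short sketch. It assumes the same df3 and df4 as in the question, and it uses 'title' on the right side because df4 has no 'name' column. pandas rejects the call before any joining happens.

```python
import pandas as pd

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})
df3_sp = df3.set_index('name')

# Passing left_index=True together with left_on is rejected by pandas.
try:
    pd.merge(df3_sp, df4, left_index=True, left_on='math',
             right_on=['title', 'math'])
    error_message = None
except pd.errors.MergeError as exc:
    error_message = str(exc)

print(error_message)
```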

Answer 1

Score: 1


Yes. Because the index of df3_sp is named ('name') and no column shares that name, you can refer to the index level by name in left_on:

pd.merge(df3_sp, df4, left_on=['math', 'name'], right_on=['math', 'title'])

Output:

   math birth   title  tardiness
0    50    01  Nimitz          0
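
The condition that no column shares the index name matters. As a minimal sketch, assuming the same df3 and df4 as in the question, the frame below keeps 'name' both as the index and as a column (via drop=False). pandas then cannot tell which one the key refers to and raises an error. The variable names `ambiguous` and `ambiguity_error` are only illustrative.

```python
import pandas as pd

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})

# drop=False keeps 'name' as a column while also using it as the index,
# so the label 'name' now matches both an index level and a column.
ambiguous = df3.set_index('name', drop=False)

try:
    pd.merge(ambiguous, df4,
             left_on=['math', 'name'], right_on=['math', 'title'])
    ambiguity_error = None
except ValueError as exc:
    ambiguity_error = str(exc)

print(ambiguity_error)
```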

Note, however, that this merge drops the index (which might not be an issue, since the same values are still available in the "title" column).

To keep the names as a regular column instead, call reset_index before the merge:

pd.merge(df3_sp.reset_index(), df4,
         left_on=['math', 'name'], right_on=['math', 'title'])

Output:

     name  math birth   title  tardiness
0  Nimitz    50    01  Nimitz          0
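
If 'name' should end up as the index again, one possible follow-up is to call set_index on the merged result. This is only a sketch and assumes the same df3 and df4 as in the question. Whether the index needs restoring depends on what you do with the result.

```python
import pandas as pd

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})
df3_sp = df3.set_index('name')

merged = (
    pd.merge(df3_sp.reset_index(), df4,
             left_on=['math', 'name'], right_on=['math', 'title'])
    .set_index('name')  # assumption: 'name' should be the index again
)

print(merged)
```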

If the key names are identical in both DataFrames (here, after renaming 'title' to 'name'), you can use on instead, and the keys are kept as columns in the result:

pd.merge(df3_sp, df4.rename(columns={'title': 'name'}), on=['math', 'name'])

Output:

   math    name birth  tardiness
0    50  Nimitz    01          0
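
The same idea also works on the original df3, with no set_index call at all. This is a sketch that assumes the data from the question. Renaming 'title' to 'name' is one option, and passing left_on/right_on with the original column names is the other.

```python
import pandas as pd

df3 = pd.DataFrame({'name': ['Nimitz', 'McArthur', 'George', 'Eisenhower'],
                    'math': [50, 100, 60, 70],
                    'birth': ['01', '01', '02', '03']})
df4 = pd.DataFrame({'title': ['Nimitz', 'George', 'McArthur'],
                    'math': [50, 100, 60],
                    'tardiness': [0, 1, 2]})

# Merge on two ordinary columns; only rows where both name and math agree match.
result = pd.merge(df3, df4.rename(columns={'title': 'name'}),
                  on=['name', 'math'])

print(result)
```

Only Nimitz matches here, because the 'math' values for McArthur and George differ between the two frames.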

huangapple
  • Posted on July 20, 2023 at 15:44:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76727687.html