How to align rows in a DataFrame after pd.concat() in Python pandas
Question
I have two single-column DataFrames. After attempting a LEFT JOIN with pd.concat, the values in the first column don't align with those in the second.
import pandas as pd
df1 = pd.DataFrame({'city': ['ABC', 'NEW', 'TWIN', 'KING']})
df2 = pd.DataFrame({'city': ['NEW', 'ABC']})
pd.concat([df1, df2], axis=1)
My expected result has the matching city values aligned on the same row.
I know I can sort it, but I want to know how to do something similar to a SQL LEFT JOIN.
Answer 1
Score: 2
You can do it this way:
pd.concat([e.set_index(e['city'])
for e in [df1, df2]], axis=1).reset_index(drop=True)
Output:
   city city
0   ABC  ABC
1   NEW  NEW
2  TWIN  NaN
3  KING  NaN
This sets the city column as the index of each DataFrame, lets pd.concat align the rows on that index, and finally drops the index.
Note that set_index(dataframe[column]) copies the column into the index, whereas set_index(column) moves the column into the index.
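To make the difference between the two set_index forms concrete, and to show why the original pd.concat call paired rows by position, here is a minimal sketch using the frames from the question. The variable names copied, moved, by_position and by_label are only illustrative. It assumes the city values are unique within each frame; with duplicated index labels, pd.concat along axis=1 can raise an error.

```python
import pandas as pd

df1 = pd.DataFrame({'city': ['ABC', 'NEW', 'TWIN', 'KING']})
df2 = pd.DataFrame({'city': ['NEW', 'ABC']})

# Passing the Series copies its values into the index; the column stays.
copied = df1.set_index(df1['city'])
# Passing the column name moves the column into the index; no columns remain.
moved = df1.set_index('city')

print(list(copied.columns))  # ['city']
print(list(moved.columns))   # []

# With the default RangeIndex, concat pairs rows by position (0, 1, ...).
by_position = pd.concat([df1, df2], axis=1)
# With city as the index, concat pairs rows that share the same city label.
by_label = pd.concat([e.set_index(e['city']) for e in [df1, df2]], axis=1)

print(by_position)
print(by_label.reset_index(drop=True))
```

Because the copy form keeps the city column, the values are still there after reset_index(drop=True) discards the index.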
Answer 2
Score: 1
Another possible solution, which uses pd.merge instead of pd.concat:
df1.merge(df2.assign(city2=df2['city']), on='city', how='left')
Output:
   city city2
0   ABC   ABC
1   NEW   NEW
2  TWIN   NaN
3  KING   NaN
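One difference between the two approaches may be worth checking: how='left' keeps only the keys from df1, while the index-aligned pd.concat performs an outer join by default. The sketch below assumes a hypothetical right-hand frame df3 containing a city ('ZED') that is absent from df1. That frame is not part of the question and is used only for illustration. The indicator=True argument adds a _merge column showing where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'city': ['ABC', 'NEW', 'TWIN', 'KING']})
# Hypothetical right-hand frame: 'ZED' does not appear in df1.
df3 = pd.DataFrame({'city': ['NEW', 'ABC', 'ZED']})

# Left join: every row of df1 is kept, 'ZED' is dropped.
left = df1.merge(df3.assign(city2=df3['city']), on='city', how='left',
                 indicator=True)
print(left)

# Index-aligned concat is an outer join by default, so 'ZED' gets its own row.
outer = pd.concat([e.set_index(e['city']) for e in [df1, df3]], axis=1)
print(outer.reset_index(drop=True))
```

If strict SQL LEFT JOIN semantics matter, the merge form is probably the safer choice when the right-hand frame may hold keys that the left one lacks.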