英文:
Merge two DataFrames based on a string aggregated column
问题
我有以下包含的数据框(DF):
Lookup | Definition |
---|---|
A | Apple |
B | Banana |
C | Carrot |
我还有另一个数据框(DF2):
SNo | Lookup Values |
---|---|
1 | ['A', 'B'] |
2 | ['A', 'C'] |
3 | ['B', 'C'] |
注意:"Lookup Values"列是一个字符串列表
执行JOIN操作以获取以下结果的最简单方法是什么:
SNo | Lookup Values | Lookup Definitions |
---|---|---|
1 | A, B | Apple, Banana |
2 | A, C | Apple, Carrot |
3 | B, C | Banana, Carrot |
英文:
I have the below DF that contains
Lookup | Definition |
---|---|
A | Apple |
B | Banana |
C | Carrot |
I have another DF2:
SNo | Lookup Values |
---|---|
1 | ['A', 'B'] |
2 | ['A', 'C'] |
3 | ['B', 'C'] |
Note: "Lookup Values" column is a list of strings
What is the most simple way of performing the JOIN to get the below:
SNo | Lookup Values | Lookup Definitions |
---|---|---|
1 | A, B | Apple, Banana |
2 | A, C | Apple, Carrot |
3 | B, C | Banana, Carrot |
答案1
得分: 4
这是使用map()
函数的一种方法:
(df2.explode('Lookup Values')
.dropna()
.assign(**{'Lookup Definitions': lambda x: x['Lookup Values'].map(dict(df.to_records(index=False)))})
.groupby('SNo')
.agg(', '.join)
.reset_index())
输出结果:
SNo Lookup Values Lookup Definitions
0 1 A, B Apple, Banana
1 2 A, C Apple, Carrot
2 3 B, C Banana, Carrot
英文:
Here is a way with using map()
(df2.explode('Lookup Values')
.dropna()
.assign(**{'Lookup Definitions':lambda x: x['Lookup Values'].map(dict(df.to_records(index = False)))})
.groupby('SNo')
.agg(', '.join)
.reset_index())
Output:
SNo Lookup Values Lookup Definitions
0 1 A, B Apple, Banana
1 2 A, C Apple, Carrot
2 3 B, C Banana, Carrot
答案2
得分: 3
尝试使用explode、merge和groupby函数:
(df2.explode('Lookup Values')
.merge(df1, left_on='Lookup Values', right_on='Lookup', how='left')
.groupby('SNo', as_index=False)
.agg({'Lookup':','.join, 'Definition':','.join})
)
输出结果:
SNo Lookup Definition
0 1 A,B Apple,Banana
1 2 B,C Banana,Carrot
2 3 C,A Carrot,Apple
英文:
Try explode, merge, then groupby:
(df2.explode('Lookup Values')
.merge(df1, left_on='Lookup Values', right_on='Lookup', how='left')
.groupby('SNo', as_index=False)
.agg({'Lookup':','.join, 'Definition':','.join})
)
Output:
SNo Lookup Definition
0 1 A,B Apple,Banana
1 2 B,C Banana,Carrot
2 3 C,A Carrot,Apple
答案3
得分: 3
explode
列表列merge
获取定义- 根据需要进行
groupby
和agg
>>> df2.explode("Lookup Values").merge(df, left_on="Lookup Values", right_on="Lookup").groupby("SNo")[["Lookup Values", "Definition"]].agg(", ".join)
Lookup Values Definition
SNo
1 A, B Apple, Banana
2 A, C Apple, Carrot
3 B, C Banana, Carrot
英文:
explode
the list columnsmerge
to get the definitionsgroupby
andagg
as needed
>>> df2.explode("Lookup Values").merge(df, left_on="Lookup Values", right_on="Lookup").groupby("SNo")[["Lookup Values", "Definition"]].agg(", ".join)
Lookup Values Definition
SNo
1 A, B Apple, Banana
2 A, C Apple, Carrot
3 B, C Banana, Carrot
答案4
得分: 2
另一种解决方案:
df2["查找定义"] = df2["查找值"].apply(
lambda x, m=df1.set_index("查找")["定义"].to_dict(): ", ".join(
m[v] for v in x if v in m
)
)
print(df2)
输出结果:
序号 查找值 查找定义
0 1 [A, B] 苹果, 香蕉
1 2 [A, C] 苹果, 胡萝卜
2 3 [B, C] 香蕉, 胡萝卜
英文:
Another solution:
df2["Lookup Definitions"] = df2["Lookup Values"].apply(
lambda x, m=df1.set_index("Lookup")["Definition"].to_dict(): ", ".join(
m[v] for v in x if v in m
)
)
print(df2)
Prints:
SNo Lookup Values Lookup Definitions
0 1 [A, B] Apple, Banana
1 2 [A, C] Apple, Carrot
2 3 [B, C] Banana, Carrot
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论