英文:
Merge dataframes having array
问题
isActive,trackedSearchId,Primary Keyword Group(s)
True,53436615,SEO - Directory-Deployment&SEO - Other-Glossary
True,53434228,SEO - Other
True,53434229,SEO - Glossary
英文:
I have two data frames.
DF1
isActive,trackedSearchId
True,53436615
True,53434228
True,53434229
DF2
trackedSearchIds,Primary Keyword Group(s)
"[53436613, 53436615, 53437436, 53436506]",SEO - Directory-Deployment
"[53435887, 53437509, 53437441, 53436615, 53438685, 53437636]",SEO - Other-Glossary
"[53437504, 53435090, 53435887, 53434228]",SEO - Other
"[53437504, 53435090, 53434229]",SEO - Glossary
I want to check for each row of DF1 for column trackedSearchId and check with each row in DF2 with each value of trackedSearchIds array and if the value of DF1 column is present in DF2 append it with DF1.
The output should look like:
isActive,trackedSearchId,Primary Keyword Group(s)
True,53436615,SEO - Directory-Deployment&SEO - Other-Glossary
True,53434228,SEO - Other
True,53434229,SEO - Glossary
答案1
得分: 1
Sure, here is the translated code part:
import ast
df1['trackedSearchId'] = df1['trackedSearchId'].astype(str)
out = df1.merge((df2.assign(trackedSearchIds=df2['trackedSearchIds'].str.findall(r'(\d+)')).explode('trackedSearchIds')
.groupby('trackedSearchIds').agg('&'.join)
),
left_on='trackedSearchId', right_on='trackedSearchIds')
And here is the translated output:
isActive trackedSearchId Primary Keyword Group(s)
0 True 53436615 SEO - Directory-Deployment&SEO - Other-Glossary
1 True 53434228 SEO - Other
2 True 53434229 SEO - Glossary
Please note that I have retained the code and output formatting as it is.
英文:
Assuming strings representations of lists in df2, use ast.literal_eval
, then groupby.agg
, explode
, and merge
:
import ast
df1['trackedSearchId'] = df1['trackedSearchId'].astype(str)
out = df1.merge((df2.assign(trackedSearchIds=df2['trackedSearchIds'].str.findall(r'(\d+)')).explode('trackedSearchIds')
.groupby('trackedSearchIds').agg('&'.join)
),
left_on='trackedSearchId', right_on='trackedSearchIds')
Output:
isActive trackedSearchId Primary Keyword Group(s)
0 True 53436615 SEO - Directory-Deployment&SEO - Other-Glossary
1 True 53434228 SEO - Other
2 True 53434229 SEO - Glossary
答案2
得分: 1
You can use DataFrame.explode
来展开已转换为列表的值,如果必要的话,使用 ast.literal_eval
进行转换,使用 GroupBy.agg
中的 join
进行聚合,在 DataFrame.join
中添加新列:
import ast
df = (df1.join(df2.assign(trackedSearchIds=df2['trackedSearchIds'].apply(ast.literal_eval))
.explode('trackedSearchIds')
.groupby('trackedSearchIds')['Primary Keyword Group(s)']
.agg('&'.join), on='trackedSearchId'))
print(df)
isActive trackedSearchId Primary Keyword Group(s)
0 True 53436615 SEO - Directory-Deployment&SEO - Other-Glossary
1 True 53434228 SEO - Other
2 True 53434229 SEO - Glossary
如果值是列表:
df = (df1.join(df2
.explode('trackedSearchIds')
.groupby('trackedSearchIds')['Primary Keyword Group(s)']
.agg('&'.join), on='trackedSearchId'))
英文:
You can use DataFrame.explode
of converted values to list by ast.literal_eval
if necessary, aggregate join
in GroupBy.agg
and append new column in DataFrame.join
:
import ast
df = (df1.join(df2.assign(trackedSearchIds=df2['trackedSearchIds'].apply(ast.literal_eval))
.explode('trackedSearchIds')
.groupby('trackedSearchIds')['Primary Keyword Group(s)']
.agg('&'.join), on='trackedSearchId'))
print (df)
isActive trackedSearchId Primary Keyword Group(s)
0 True 53436615 SEO - Directory-Deployment&SEO - Other-Glossary
1 True 53434228 SEO - Other
2 True 53434229 SEO - Glossary
If values are lists:
df = (df1.join(df2
.explode('trackedSearchIds')
.groupby('trackedSearchIds')['Primary Keyword Group(s)']
.agg('&'.join), on='trackedSearchId'))
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论