Python Pandas Merging two columns but drop duplicated value
Question
I have run into a problem with the following example DataFrame in Python pandas.
Column A | Column B |
---|---|
a | b |
c | c |
d | e |
g | g |
I would love to have something like this, where Column C joins the two values but keeps only one copy when they are equal:
Column A | Column B | Column C |
---|---|---|
a | b | ab |
c | c | c |
d | e | de |
g | g | g |
Could someone please help? Much appreciated.
Answer 1
Score: 2
Use a custom aggregation with agg, using dict.fromkeys to remove the duplicates while keeping order, and str.join to concatenate:
df['Column C'] = df.agg(lambda r: ''.join(dict.fromkeys(r)), axis=1)
# or limiting to specific columns:
cols = ['Column A', 'Column B']
df['Column C'] = df[cols].agg(lambda r: ''.join(dict.fromkeys(r)), axis=1)
Or, if there are only two columns:
df['Column C'] = (df['Column A'].add(df['Column B'])
.where(df['Column A'].ne(df['Column B']), df['Column A'])
)
Output:
Column A Column B Column C
0 a b ab
1 c c c
2 d e de
3 g g g
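As a minimal, self-contained sketch of the two variants above, the snippet below rebuilds the question's example DataFrame and checks that both approaches give the same Column C. The variable names via_agg and via_where are illustrative only and are not part of the original answer. It relies on dict preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "Column A": ["a", "c", "d", "g"],
    "Column B": ["b", "c", "e", "g"],
})
cols = ["Column A", "Column B"]

# Row-wise variant: dict.fromkeys drops repeated values in a row
# while keeping the order in which they first appear.
via_agg = df[cols].agg(lambda r: "".join(dict.fromkeys(r)), axis=1)

# Two-column variant: concatenate A and B, but where A equals B
# fall back to the value of A alone.
via_where = (
    df["Column A"].add(df["Column B"])
    .where(df["Column A"].ne(df["Column B"]), df["Column A"])
)

df["Column C"] = via_agg
print(df)
print(via_agg.equals(via_where))
```

The row-wise variant extends to any number of columns, while the where-based variant avoids a Python-level function call per row but only handles the two-column case.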
Answer 2
Score: 1
You can use apply with axis=1 to iterate over each row of the DataFrame, then use pandas.Series.unique and ''.join to build the result.
df['Column C'] = df[['Column A', 'Column B']].apply(lambda x: ''.join(x.unique()), axis=1)
Column A Column B Column C
0 a b ab
1 c c c
2 d e de
3 g g g
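For reference, here is a runnable sketch of the same idea wrapped in a small helper. The function name merge_unique is hypothetical and is introduced here only for illustration; the DataFrame is the example from the question.

```python
import pandas as pd


def merge_unique(frame, cols):
    """Join each row's distinct values across `cols`, in first-seen order.

    Hypothetical helper for illustration; not part of pandas.
    """
    # With axis=1 each row is passed in as a Series, so Series.unique
    # returns that row's values with repeats removed, order preserved.
    return frame[cols].apply(lambda row: "".join(row.unique()), axis=1)


df = pd.DataFrame({
    "Column A": ["a", "c", "d", "g"],
    "Column B": ["b", "c", "e", "g"],
})
df["Column C"] = merge_unique(df, ["Column A", "Column B"])
print(df)
```

Because the column list is a parameter, the same helper should work unchanged if more columns need to be merged.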