April 11, 2023 14:51:19

Match values in pandas dataframe and replace with matched values from master table

Question

I would like to replace each value in the Path columns of the Main Table with the matching detail from the Mapping Table, without using a for loop.

Main Table:

Case	Path1	Path2	Path3
1	a	c	d
2	b	c	a
3	c	a	e
4	b	d	e
5	d	b	a

Mapping Table:

factor	detail
a	sample A
b	sample B
c	sample C
d	sample D
e	sample E
f	sample F

I would like the output to be like this.

Result:

Case	Path1	Path2	Path3
1	sample A	sample C	sample D
2	sample B	sample C	sample A
3	sample C	sample A	sample E
4	sample B	sample D	sample E
5	sample D	sample B	sample A

Answer 1

Score: 2

You can use DataFrame.replace:

# df -> main table
# dmap -> map table
cols = df.filter(like='Path').columns
df[cols] = df[cols].replace(dmap.set_index('factor')['detail'])
print(df)
# Output
   Case     Path1     Path2     Path3
0     1  sample A  sample C  sample D
1     2  sample B  sample C  sample A
2     3  sample C  sample A  sample E
3     4  sample B  sample D  sample E
4     5  sample D  sample B  sample A
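
For a self-contained check, here is a minimal sketch that rebuilds both tables from the question and applies the same replace call. The name lookup and the explicit .to_dict() conversion are additions for illustration, not part of the answer above; passing the Series directly, as the answer does, should behave the same way.

```python
import pandas as pd

# Main table from the question
df = pd.DataFrame({
    "Case": [1, 2, 3, 4, 5],
    "Path1": ["a", "b", "c", "b", "d"],
    "Path2": ["c", "c", "a", "d", "b"],
    "Path3": ["d", "a", "e", "e", "a"],
})

# Mapping table from the question
dmap = pd.DataFrame({
    "factor": list("abcdef"),
    "detail": [f"sample {c}" for c in "ABCDEF"],
})

# Illustrative helper name: a plain dict of factor -> detail
lookup = dmap.set_index("factor")["detail"].to_dict()

# Replace values only in the columns whose name contains "Path"
cols = df.filter(like="Path").columns
df[cols] = df[cols].replace(lookup)
print(df)
```

The Case column is left untouched because it is not selected by filter(like="Path").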

Answer 2

Score: 1

Use Series.map on every column except the first. Values with no match become NaN:

df1.iloc[:, 1:] = df1.iloc[:, 1:].apply(lambda x: x.map(df2.set_index('factor')['detail']))

Or use DataFrame.replace. Values with no match keep their original value:

df1.iloc[:, 1:] = df1.iloc[:, 1:].replace(df2.set_index('factor')['detail'])
print(df1)
   Case     Path1     Path2     Path3
0     1  sample A  sample C  sample D
1     2  sample B  sample C  sample A
2     3  sample C  sample A  sample E
3     4  sample B  sample D  sample E
4     5  sample D  sample B  sample A

If you want to update only the columns whose names start with Path, use DataFrame.update with DataFrame.filter and DataFrame.replace:

df1.update(df1.filter(regex=r'^Path').replace(df2.set_index('factor')['detail']))
print(df1)
   Case     Path1     Path2     Path3
0     1  sample A  sample C  sample D
1     2  sample B  sample C  sample A
2     3  sample C  sample A  sample E
3     4  sample B  sample D  sample E
4     5  sample D  sample B  sample A
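
The difference between map and replace only shows up when a value has no entry in the mapping table. A small sketch, assuming a hypothetical value "z" that is not in the mapping (the names lookup and paths are also illustrative and not from the question):

```python
import pandas as pd

# Illustrative mapping and data; "z" is deliberately missing from the mapping
lookup = {"a": "sample A", "b": "sample B"}
paths = pd.DataFrame({"Path1": ["a", "z"], "Path2": ["b", "a"]})

# Series.map applied per column: unmatched values become NaN
mapped = paths.apply(lambda col: col.map(lookup))

# DataFrame.replace: unmatched values are left as they were
replaced = paths.replace(lookup)

print(mapped)
print(replaced)
```

So map fits cases where unmatched values should be flagged as missing, while replace fits cases where they should pass through unchanged.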
