Question

I have been trying to clean a data column by removing part of the text, but I cannot get my head around it.

I tried the .str.replace method on the pandas Series, but it did not seem to work:

df['Salary Estimate'].str.replace(' (Glassdoor est.)', '', regex=True)


0       $53K-$91K (Glassdoor est.)
1      $63K-$112K (Glassdoor est.)
2       $80K-$90K (Glassdoor est.)
3       $56K-$97K (Glassdoor est.)
4      $86K-$143K (Glassdoor est.)
                  ...             
922                             -1
925                             -1
928    $59K-$125K (Glassdoor est.)
945    $80K-$142K (Glassdoor est.)
948    $62K-$113K (Glassdoor est.)
Name: Salary Estimate, Length: 600, dtype: object

What I expected was:

0       $53K-$91K
1      $63K-$112K
2       $80K-$90K
3       $56K-$97K
4      $86K-$143K
          ...
922            -1
925            -1
928    $59K-$125K
945    $80K-$142K
948    $62K-$113K
Name: Salary Estimate, Length: 600, dtype: object

Answer 1

Score: 3

If you enable regex, you have to escape regex metacharacters such as (, ) and .:

import re

>>> df['Salary Estimate'].str.replace(re.escape(r' (Glassdoor est.)'), '', regex=True)
0     $53K-$91K
1    $63K-$112K
2     $80K-$90K
3     $56K-$97K
4    $86K-$143K
Name: Salary Estimate, dtype: object

# Or without importing the re module
>>> df['Salary Estimate'].str.replace(r' \(Glassdoor est\.\)', '', regex=True)
0     $53K-$91K
1    $63K-$112K
2     $80K-$90K
3     $56K-$97K
4    $86K-$143K
Name: Salary Estimate, dtype: object
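
To illustrate why the original call changed nothing, here is a minimal sketch on a small sample Series (the variable names are made up for this example; the values are copied from the question). It also shows regex=False, which treats the pattern as a plain string and may be the simplest option when no regex features are needed.

```python
import re

import pandas as pd

# Small sample mirroring the question's column (values copied from the post).
s = pd.Series(["$53K-$91K (Glassdoor est.)", "$63K-$112K (Glassdoor est.)", "-1"])

# Unescaped, "(" and ")" form a capture group and "." matches any character,
# so the pattern looks for " Glassdoor est" followed by one character, with
# no literal parentheses. That text never occurs, so nothing is replaced.
unchanged = s.str.replace(" (Glassdoor est.)", "", regex=True)

# regex=False treats the pattern as a plain string, so no escaping is needed.
literal = s.str.replace(" (Glassdoor est.)", "", regex=False)

# re.escape builds the escaped pattern for the regex=True path.
escaped = s.str.replace(re.escape(" (Glassdoor est.)"), "", regex=True)

print(unchanged.tolist())
print(literal.tolist())
print(escaped.tolist())
```

Note that str.replace returns a new Series, so the result has to be assigned back (for example to df['Salary Estimate']) for the change to persist.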

You can also extract the numbers (note that $ must be escaped too, otherwise it is treated as an end-of-string anchor):

>>> df['Salary Estimate'].str.extract(r'\$(?P<min>\d+)K-\$(?P<max>\d+)K')
  min  max
0  53   91
1  63  112
2  80   90
3  56   97
4  86  143
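
As a possible follow-up, the extracted columns are strings, so they need converting before any arithmetic. The sketch below assumes the same sample values as the question; the "mid" column is a hypothetical addition for illustration only. Rows such as "-1" do not match the pattern and end up as missing values.

```python
import pandas as pd

# Small sample mirroring the question's column (values copied from the post).
s = pd.Series(["$53K-$91K (Glassdoor est.)", "$63K-$112K (Glassdoor est.)", "-1"])

# "\$" matches a literal dollar sign; a bare "$" would anchor at end of string.
bounds = s.str.extract(r"\$(?P<min>\d+)K-\$(?P<max>\d+)K")

# extract returns strings (missing where the pattern does not match, e.g. "-1"),
# so convert each column to numbers before doing arithmetic.
bounds = bounds.apply(pd.to_numeric)

# Hypothetical derived column: midpoint of the range, in thousands of dollars.
bounds["mid"] = (bounds["min"] + bounds["max"]) / 2

print(bounds)
```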
