Question

I have a column of name variations that I'd like to clean up. I'm having trouble writing a regex that removes everything after the first word following a comma.

import re
import pandas as pd

d = {'names': ['smith,john s', 'smith, john', 'brown, bob s', 'brown, bob']}
x = pd.DataFrame(d)

Tried:

x['names'] = [re.sub(r'/.\s+[^\s,]+/', '', str(x)) for x in x['names']]

Desired output:

['smith,john', 'smith, john', 'brown, bob', 'brown, bob']

I'm not sure why my regex isn't working; any help would be appreciated.

Answer 1

Score: 1

You could try a regex that captures everything up to the comma, any whitespace after it, and the first word that follows, then replaces the whole string with just that captured group:

x["names"].str.replace(r"^([^,]*,\s*[^\s]*).*", r"\1", regex=True)

0     smith,john
1    smith, john
2     brown, bob
3     brown, bob
Name: names, dtype: object
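
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes pandas is installed and that the cleaned values should be written back to the `names` column; `pattern` is just an illustrative variable name.

```python
import pandas as pd

# Sample data from the question.
x = pd.DataFrame(
    {"names": ["smith,john s", "smith, john", "brown, bob s", "brown, bob"]}
)

# Group 1 captures the text before the comma, the comma itself, any
# whitespace after it, and the first run of non-whitespace characters.
# The trailing .* consumes the rest of the string, so replacing the whole
# match with \1 keeps only the captured part.
pattern = r"^([^,]*,\s*[^\s]*).*"
x["names"] = x["names"].str.replace(pattern, r"\1", regex=True)

print(x["names"].tolist())
```

Passing `regex=True` explicitly keeps the call independent of the pandas version's default for that parameter.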

Answer 2

Score: 0

Try re.sub(r'(,\s*\w+).*$', r'\1', str(x)).

Put the part you want to keep into capture group 1, then restore it in the replacement. In Python the pattern takes no surrounding slashes, and the backreference is written \1 rather than $1.
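
As a rough illustration using only the standard `re` module, the sketch below compares the attempt from the question with a capture-group pattern applied to a plain list. The names `unchanged`, `keep_first_word`, and `cleaned` are made up for this example.

```python
import re

names = ["smith,john s", "smith, john", "brown, bob s", "brown, bob"]

# The attempt from the question: in Python the slashes are literal
# characters, so the pattern never matches these strings and each one
# comes back unchanged.
unchanged = [re.sub(r"/.\s+[^\s,]+/", "", n) for n in names]

# Capture the comma, optional whitespace, and the first word; drop the
# rest of the string by replacing the whole match with group 1.
keep_first_word = re.compile(r"(,\s*\w+).*$")
cleaned = [keep_first_word.sub(r"\1", n) for n in names]

print(unchanged == names)
print(cleaned)
```

Note that `\w+` stops at the first non-word character, so a hyphenated first name would be cut at the hyphen; the `[^\s]*` variant from the first answer keeps it.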
