August 10, 2023, 23:25:23

Pandas extract only capturing first character

Question

I want to do an extract to capture all characters that match a regular expression and add those extracted characters to another column. When I run the code below, it only captures the first character. I want to capture all the letters, except W, and also no numbers or any dashes.

Here's the code:

df["type"] = df["callsign"].str.extract(r'([^W0-9-])')

Currently the data frame shows the below result.

callsign	type
1AB3-W9	A
23DC-W0	D

But I need it to produce:

callsign	type
1AB3-W9	AB
23DC-W0	DC
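
A minimal reproduction of the behaviour described above, assuming the two sample callsigns are the whole data frame (the frame construction and `expand=False` are assumptions for illustration, not part of the original code):

```python
import pandas as pd

# Sample data taken from the table above; building the frame this way is an assumption.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# The class [^W0-9-] matches a single character, and extract() keeps only
# the first match per string, so a single letter comes back for each row.
# expand=False returns a Series instead of a one-column DataFrame.
df["type"] = df["callsign"].str.extract(r"([^W0-9-])", expand=False)

print(df)
```

This is why only `A` and `D` show up: the pattern has no quantifier, and `extract()` never looks past the first match.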

Answer 1

Score: 1

Don't use extract(); use replace() to replace the unwanted characters with an empty string. Pass regex=True, because since pandas 2.0 str.replace() treats the pattern as a literal string by default.

df["type"] = df["callsign"].str.replace(r'[W0-9-]', '', regex=True)
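
A self-contained sketch of the replace approach on the sample data from the question. The frame construction is an assumption for illustration, and `regex=True` is passed explicitly so the pattern is treated as a regular expression regardless of the pandas version's default:

```python
import pandas as pd

# Sample callsigns from the question; the frame itself is an illustrative assumption.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# Delete every W, digit, and dash; whatever is left becomes the "type".
df["type"] = df["callsign"].str.replace(r"[W0-9-]", "", regex=True)

print(df)
```

Because this removes characters rather than matching them, it keeps every remaining letter no matter where it sits in the string.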

Answer 2

Score: 1

Assuming you want to extract the letters right before the -W, use:

df["type"] = df["callsign"].str.extract(r'([a-zA-Z]+)-W')

For the first set of letters that are not W, you're missing a +:

df["callsign"].str.extract(r'([^W0-9-]+)')
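
To see how the two patterns differ, here is a small sketch that runs both on the question's sample data plus one hypothetical callsign (`1AB3C-W9`, added only for illustration). The variable names and `expand=False` are assumptions as well:

```python
import pandas as pd

# First two rows come from the question; the third row is hypothetical.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0", "1AB3C-W9"]})

# Letters that sit immediately before "-W". In "1AB3-W9" a digit separates
# the letters from "-W", so there is no match and the result is NaN.
before_w = df["callsign"].str.extract(r"([a-zA-Z]+)-W", expand=False)

# First run of characters that are not W, a digit, or a dash.
# Only the first run is returned, so "1AB3C-W9" yields "AB", not "ABC".
first_run = df["callsign"].str.extract(r"([^W0-9-]+)", expand=False)

print(pd.DataFrame({"callsign": df["callsign"],
                    "before_w": before_w,
                    "first_run": first_run}))
```

On the question's own data, only the second pattern produces `AB` and `DC` for both rows. If the letters can be split across several runs, a pattern-per-run extract will not collect them all.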

Answer 3

Score: 0

As an alternative to the other valid answers, you can use findall():

df["callsign"].str.findall(r'([^W0-9-])')

This gives you a list with all the matches, which you can then join:

df["type"] = df["callsign"].str.findall(r'([^W0-9-])').str.join("")
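
A runnable sketch of the findall-then-join idea on the question's sample data. The frame construction and the intermediate `matches` variable are assumptions for illustration; the capture group is dropped here because, with a single group, `findall()` returns the same strings either way:

```python
import pandas as pd

# Sample callsigns from the question; the frame itself is an illustrative assumption.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# Step 1: collect every single-character match into a list per row.
matches = df["callsign"].str.findall(r"[^W0-9-]")

# Step 2: concatenate each row's list into one string.
df["type"] = matches.str.join("")

print(matches.tolist())
print(df)
```

Keeping the intermediate list around can be handy if you also need the number of matching characters per row, for example via `matches.str.len()`.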
