Pandas extract only capturing first character

Question

I want to extract all characters that match a regular expression and put them in another column. When I run the code below, it captures only the first character. I want to capture all the letters except W, with no digits or dashes.

Here's the code:

df["type"] = df["callsign"].str.extract(r'([^W0-9-])')

Currently the data frame shows the following result:

callsign type
1AB3-W9 A
23DC-W0 D

But I need it to produce:

callsign type
1AB3-W9 AB
23DC-W0 DC
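
The behavior described above can be reproduced with a minimal sketch, assuming a two-row frame built from the sample data (the variable name first_only is only illustrative). A character class with no quantifier matches exactly one character, and str.extract keeps only the first match in each string:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# [^W0-9-] matches a single character that is not W, a digit, or a dash.
# str.extract returns only the first match per row, so one letter comes back.
first_only = df["callsign"].str.extract(r'([^W0-9-])', expand=False)
print(first_only.tolist())
```

With expand=False and a single capture group, the result is a Series rather than a one-column DataFrame.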

Answer 1

Score: 1

Don't use extract(); use replace() to replace the unwanted characters with an empty string. Pass regex=True, because since pandas 2.0 str.replace treats the pattern as a literal string by default.

df["type"] = df["callsign"].str.replace(r'[W0-9-]', '', regex=True)
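
As a rough, self-contained sketch of this approach on the sample data from the question (the frame is rebuilt here for illustration), with regex=True passed explicitly so the pattern is treated as a regular expression regardless of the pandas version's default:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# Delete every W, digit, and dash; whatever remains is the wanted letters.
# regex=True is explicit because pandas 2.0 changed the default to False.
df["type"] = df["callsign"].str.replace(r'[W0-9-]', '', regex=True)
print(df)
```

Deleting the unwanted characters sidesteps the first-match limitation of extract entirely, since every character in the string is visited.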

Answer 2

Score: 1

Assuming you want to extract the letters right before the -W, use the pattern below. Note that it requires the letters to be immediately followed by -W, so a value such as 1AB3-W9, where a digit sits between the letters and the dash, yields NaN.

df["type"] = df["callsign"].str.extract(r'([a-zA-Z]+)-W')

For the first run of letters that are not W, you're missing a +:

df["callsign"].str.extract(r'([^W0-9-]+)')
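
The two patterns can be compared side by side in a small sketch, assuming the sample data from the question (before_w and first_run are illustrative names, not from the answer):

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# Letters immediately followed by "-W"; rows with no such run give NaN.
before_w = df["callsign"].str.extract(r'([a-zA-Z]+)-W', expand=False)

# First run of one or more characters that are not W, a digit, or a dash.
first_run = df["callsign"].str.extract(r'([^W0-9-]+)', expand=False)

print(before_w.tolist())
print(first_run.tolist())
```

On this data the first pattern finds nothing in 1AB3-W9, because the 3 separates AB from the dash, while the second pattern returns AB and DC. The second pattern still returns only the first run, so letters separated by digits would not all be captured.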

Answer 3

Score: 0

As an alternative to the other valid answers, you can use findall:

df["callsign"].str.findall(r'([^W0-9-])')

This gives you a list with all the matches for each row, which you can then join:

df["type"] = df["callsign"].str.findall(r'([^W0-9-])').str.join("")
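
A short sketch of the two steps, again assuming the sample data from the question (the intermediate name matches is illustrative):

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0"]})

# Step 1: collect every single-character match into a list per row.
matches = df["callsign"].str.findall(r'([^W0-9-])')

# Step 2: concatenate each row's list into one string.
df["type"] = matches.str.join("")
print(df)
```

Unlike extract, findall returns every non-overlapping match, so letters separated by digits or dashes are all kept.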

huangapple
  • Posted on August 10, 2023 at 23:25:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76877190.html