2023-05-24 22:50:45

Extract substrings from a column of strings and place them in a list

Question

I have the following data frame:

   a    b             x  
0  id1  abc 123 tr    2  
1  id2  abd1 124 tr   6 
2  id3  abce 126 af   9 
3  id4  abe 128 nm    12

From column b, for each item, I need to extract the substrings before the first space. Hence, I need the following result:

list_of_strings = [abc, abd1, abce, abe]

Please advise

Answer 1

Score: 2

Use a regex with `^\S+` (non-space characters anchored to the start of the string) and [`str.extract`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extract.html):

df['b'].str.extract(r'^(\S+)', expand=False)

Output:

0     abc
1    abd1
2    abce
3     abe
Name: b, dtype: object

For a list:

list_of_strings = df['b'].str.extract(r'^(\S+)', expand=False).tolist()
# ['abc', 'abd1', 'abce', 'abe']

[regex demo](https://regex101.com/r/R4BgiT/1)
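For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the data frame from the question (the column values are copied from the question; the construction code itself is an assumption, since the question only shows the printed frame) and applies the regex approach. It also shows a `str.split` variant, which is an alternative and not part of the answer above.

```python
import pandas as pd

# Rebuild the example frame shown in the question.
df = pd.DataFrame({
    "a": ["id1", "id2", "id3", "id4"],
    "b": ["abc 123 tr", "abd1 124 tr", "abce 126 af", "abe 128 nm"],
    "x": [2, 6, 9, 12],
})

# Approach from the answer: capture the leading run of non-whitespace characters.
via_regex = df["b"].str.extract(r"^(\S+)", expand=False).tolist()

# Alternative (not from the answer): split once on whitespace, keep the first piece.
via_split = df["b"].str.split(n=1).str[0].tolist()

print(via_regex)
print(via_split)
```

On this data both lists should be identical. The two approaches can differ on values that begin with whitespace: the anchored regex finds no match there and yields a missing value, while a whitespace split skips the leading blanks.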
