Extract substrings from a column of strings and place them in a list
Question
I have the following data frame:
     a            b   x
0  id1   abc 123 tr   2
1  id2  abd1 124 tr   6
2  id3  abce 126 af   9
3  id4   abe 128 nm  12
From column b, for each item, I need to extract the substrings before the first space. Hence, I need the following result:
list_of_strings = [abc, abd1, abce, abe]
Please advise
Answer 1
Score: 2
Use a regex with `^\S+` (one or more non-whitespace characters anchored to the start of the string) and [`str.extract`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extract.html):
df['b'].str.extract(r'^(\S+)', expand=False)
Output:
0     abc
1    abd1
2    abce
3     abe
Name: b, dtype: object
For a list:
list_of_strings = df['b'].str.extract(r'^(\S+)', expand=False).tolist()
# ['abc', 'abd1', 'abce', 'abe']
[Regex demo](https://regex101.com/r/R4BgiT/1)
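For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frame is rebuilt from the table in the question, on the assumption that column `b` holds the full strings such as `'abc 123 tr'`. The `str.split` variant is an alternative that is not part of the answer above; it is included only for comparison.

```python
import pandas as pd

# Data frame reconstructed from the question (column contents are assumed
# from the printed table: b holds the whole space-separated string).
df = pd.DataFrame({
    "a": ["id1", "id2", "id3", "id4"],
    "b": ["abc 123 tr", "abd1 124 tr", "abce 126 af", "abe 128 nm"],
    "x": [2, 6, 9, 12],
})

# Approach from the answer: capture the leading run of non-whitespace characters.
by_regex = df["b"].str.extract(r"^(\S+)", expand=False).tolist()

# Alternative (not from the answer): split once on whitespace, keep the first piece.
by_split = df["b"].str.split(n=1).str[0].tolist()

print(by_regex)
print(by_split)
```

On this data both approaches give the same list. They should differ only on values with leading whitespace: the anchored regex finds no match there and yields `NaN`, while a whitespace split skips the leading blanks.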