How to obtain multiple partial strings from a dataframe?
Question
I am trying to obtain multiple partial strings from my dataframe and put those partial strings as added columns to my dataframe. Below you will find a simple data sample:
df
    Serienummer
15  SAA VKS MSI A1 R 7,500 1
29  SAA VKS MSI A1P 7,500 1
36  SAA VKS MSI A1 R 14,370 5
I want to obtain the following dataframe:
    Serienummer                column1  column2  column3
15  SAA VKS MSI A1 R 7,500 1   A1       7,500    1
29  SAA VKS MSI A1P 7,500 1    A1       7,500    1
36  SAA VKS MSI A1 R 14,370 5  A1       14,370   5
Any help is appreciated.
Answer 1
Score: 0
With pd.Series.str.extract (to extract regex pattern groups) and pd.concat:
import pandas as pd

# Extract the three regex capture groups and append them to df as new columns
new_df = pd.concat(
    [df, df['Serienummer'].str.extract(r'(\b[A-Z]\d+)[\sA-Z]+(\d+,\d+) (\d+)', expand=True)],
    axis=1,
)
   Serienummer                0   1       2
0  SAA VKS MSI A1 R 7,500 1   A1  7,500   1
1  SAA VKS MSI A1P 7,500 1    A1  7,500   1
2  SAA VKS MSI A1 R 14,370 5  A1  14,370  5
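The extracted columns above are labelled 0, 1 and 2, while the question asks for column1, column2 and column3. One possible way to get those names directly is to use named groups in the same regex. The following is only a sketch: the sample frame is rebuilt by hand from the question's data, and the use of df.join instead of pd.concat is a choice made here, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question (index values 15, 29, 36 as shown there)
df = pd.DataFrame(
    {
        "Serienummer": [
            "SAA VKS MSI A1 R 7,500 1",
            "SAA VKS MSI A1P 7,500 1",
            "SAA VKS MSI A1 R 14,370 5",
        ]
    },
    index=[15, 29, 36],
)

# Same pattern as in the answer, but with named groups;
# str.extract uses the group names as column names
pattern = r"(?P<column1>\b[A-Z]\d+)[\sA-Z]+(?P<column2>\d+,\d+) (?P<column3>\d+)"
extracted = df["Serienummer"].str.extract(pattern)

# Join on the shared index to append the new columns to the original frame
new_df = df.join(extracted)
print(new_df)
```

The extracted values are strings; a row that does not match the pattern gets NaN in all three new columns.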