Split a column into values inside and outside brackets in Python
Question
I need to split (in Python code) my column "code" into 2 columns:
- "outside", with the value outside the brackets
- "inside", with the value inside the brackets
I'd also like to create a "prepared" column by adding a "+" separator after each letter that is followed by a number.
id | code | outside | inside | prepared |
---|---|---|---|---|
1 | -(83C24H) | - | 83C24H | 83C + 24H |
2 | 30(30C14H) | 30 | 30C14H | 30C + 14H |
3 | 25 | 25 | 0 | 0 |
Thank you!
Answer 1
Score: 1
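Try removing the bracketed part to get "outside" and capturing the text between the brackets to get "inside":
df['outside'] = df['code'].str.replace(r'\([^)]*\)', '', regex=True)
df['inside'] = df['code'].str.extract(r'\(([^)]+)', expand=False)
print(df)
Prints:
   id        code outside  inside
0   1   -(83C24H)       -  83C24H
1   2  30(30C14H)      30  30C14H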
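To run the snippet above end to end, the dataframe has to exist first. Here is a minimal self-contained sketch; the way `df` is built is an assumption based on the table in the question (in particular, that "code" holds strings).

```python
import pandas as pd

# Sample data reconstructed from the question's table (assumed to be strings).
df = pd.DataFrame({"id": [1, 2], "code": ["-(83C24H)", "30(30C14H)"]})

# Drop the bracketed part, leaving only what sits outside the brackets.
df["outside"] = df["code"].str.replace(r"\([^)]*\)", "", regex=True)

# Capture the text between the brackets; expand=False returns a Series.
df["inside"] = df["code"].str.extract(r"\(([^)]+)", expand=False)

print(df)
```

If "code" was read in as a numeric or mixed-type column, converting it first with `df["code"].astype(str)` may be needed before using the `.str` accessor.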
EDIT: With the updated dataframe, row 3 has no brackets. To match the table in the question, fall back to the whole code for "outside" and to '0' for "inside":
mask = df['code'].str.contains(r'\(.*\)', regex=True)
df['inside'] = df.loc[mask, 'code'].str.extract(r'\(([^)]+)', expand=False)
df['outside'] = df.loc[mask, 'code'].str.replace(r'\([^)]*\)', '', regex=True)
df['outside'] = df['outside'].fillna(df['code'])
df['inside'] = df['inside'].fillna('0')
print(df)
Prints:
   id        code  inside outside
0   1   -(83C24H)  83C24H       -
1   2  30(30C14H)  30C14H      30
2   3          25       0      25
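The question also asks for a "prepared" column, which the snippets above do not build. One possible approach, offered as a sketch rather than the answerer's method, is a zero-width regex that matches every position with a letter before it and a digit after it, and inserts " + " there. The sample dataframe and the '0' fallback for rows without brackets are assumptions taken from the question's table.

```python
import pandas as pd

# Sample data reconstructed from the question's table (assumed to be strings).
df = pd.DataFrame({"id": [1, 2, 3], "code": ["-(83C24H)", "30(30C14H)", "25"]})

# Text between the brackets; rows without brackets get '0', as in the question.
df["inside"] = df["code"].str.extract(r"\(([^)]+)", expand=False).fillna("0")

# Insert ' + ' wherever a letter is immediately followed by a digit.
# The lookbehind/lookahead pair matches the gap itself, so no characters are lost.
df["prepared"] = df["inside"].str.replace(
    r"(?<=[A-Za-z])(?=\d)", " + ", regex=True
)

print(df[["code", "inside", "prepared"]])
```

Because the pattern only matches the gap between a letter and a digit, values with a single term (or the '0' placeholder) pass through unchanged.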