Split a column with values inside and outside brackets in Python

Question


I need to split my "code" column into two columns in Python:

  • "outside", holding the value outside the brackets
  • "inside", holding the value inside the brackets

I would also like to create a "prepared" column by inserting a " + " separator after each letter that is followed by a number.

| id | code       | outside | inside | prepared  |
| 1  | -(83C24H)  | -       | 83C24H | 83C + 24H |
| 2  | 30(30C14H) | 30      | 30C14H | 30C + 14H |
| 3  | 25         | 25      | 0      | 0         |

Thank you!

Answer 1

Score: 1

Try:

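# Drop the bracketed part; what remains is the value outside the brackets
df['outside'] = df['code'].str.replace(r'\([^)]*\)', '', regex=True)
# Capture the text between the brackets; expand=False returns a Series
df['inside'] = df['code'].str.extract(r'\(([^)]+)', expand=False)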

print(df)

Prints:

   id        code outside  inside
0   1   -(83C24H)       -  83C24H
1   2  30(30C14H)      30  30C14H
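
As an alternative, both columns can be pulled out in one pass with named capture groups. This is only a sketch under a few assumptions: the sample frame is rebuilt from the question's table, the code values are stored as strings, and the names `pattern` and `parts` are chosen here for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (codes kept as strings)
df = pd.DataFrame({"id": [1, 2, 3], "code": ["-(83C24H)", "30(30C14H)", "25"]})

# One pattern, two named groups:
#   outside = everything before the first "("
#   inside  = text between "(" and ")" (the whole bracketed part is optional)
pattern = r"^(?P<outside>[^(]*)(?:\((?P<inside>[^)]*)\))?"
parts = df["code"].str.extract(pattern)

# Rows without brackets have no "inside" match; the question puts 0 there
parts["inside"] = parts["inside"].fillna("0")

# Attach the two new columns next to the original ones
df = df.join(parts)
print(df)
```

For the three sample rows this gives "outside" values of -, 30 and 25 and "inside" values of 83C24H, 30C14H and 0, which matches the table in the question. A code with text after the closing bracket would need a different pattern.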

EDIT: With the updated dataframe (which has a row without brackets):

# Rows whose code contains a bracketed part
mask = df['code'].str.contains(r'\(.*\)', regex=True)

df['inside'] = df.loc[mask, 'code'].str.extract(r'\(([^)]+)', expand=False)
df['outside'] = df.loc[mask, 'code'].str.replace(r'\([^)]*\)', '', regex=True)
# No brackets: the whole value counts as "outside" and "inside" becomes '0'
df['outside'] = df['outside'].fillna(df['code'])
df['inside'] = df['inside'].fillna('0')

print(df)

Prints:

   id        code  inside outside
0   1   -(83C24H)  83C24H       -
1   2  30(30C14H)  30C14H      30
2   3          25       0      25
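
The question also asks for a "prepared" column, which the code above does not build. One possible sketch, assuming the "inside" values are strings shaped like the ones in the question's table (digits followed by a letter, repeated), is a zero-width regex replace that inserts " + " wherever a letter is directly followed by a digit. The small frame below is made up from the question's sample values.

```python
import pandas as pd

# "inside" values as they appear in the question's table ('0' marks no brackets)
df = pd.DataFrame({"id": [1, 2, 3], "inside": ["83C24H", "30C14H", "0"]})

# Insert " + " at every position that has a letter on its left and a digit
# on its right. The lookbehind and lookahead only match a position, so no
# characters from the original string are consumed or lost.
df["prepared"] = df["inside"].str.replace(
    r"(?<=[A-Za-z])(?=\d)", " + ", regex=True
)

print(df)
```

With the sample values this turns 83C24H into 83C + 24H and 30C14H into 30C + 14H, and leaves 0 unchanged. If the codes can contain other separators or lowercase-only units, the character class would need adjusting.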

huangapple
  • Posted on June 8, 2023, 06:17:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76427427.html