Replace part of a string in column

Question

I want to delete part of a string in a pandas dataframe in Python. The column contains invoice lines with a period specified. See below. I want to delete the part between the [*].

Image dataset (https://i.stack.imgur.com/fdCnV.png)

I have tried str.replace: df['Description_GLEntry'].str.replace('[Jan 01, 2022 - Jan 31, 2022]', ''). This removes the text but leaves [-] behind, and it only targets that one period rather than any [*]. My plan was to make it work with a loop inside a function.
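The leftover [-] is what one would expect if the pattern is being treated as a regular expression, where unescaped square brackets define a character class instead of literal text. The sketch below reproduces this with Python's `re` module, on the assumption that `str.replace` in regex mode follows the same pattern syntax. The sample string is hypothetical, since the real data is only visible in the linked image.

```python
import re

# Hypothetical sample value (the real rows are only shown in the image).
text = "Hosting fee [Jan 01, 2022 - Jan 31, 2022]"

# Unescaped, the brackets form a character class, so every listed
# character (J, a, n, space, comma, digits 0-3) is removed wherever it
# occurs. The hyphen sits between two spaces, so it is read as a range
# operator and is not itself part of the class.
as_char_class = re.sub("[Jan 01, 2022 - Jan 31, 2022]", "", text)
print(as_char_class)  # Hostigfee[-]

# Escaping the pattern makes the bracketed text match literally.
as_literal = re.sub(re.escape("[Jan 01, 2022 - Jan 31, 2022]"), "", text)
print(as_literal)  # "Hosting fee " (note the trailing space)
```

Note that the character-class version also strips matching characters from the rest of the description, not only from the bracketed period.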

Answer 1

Score: 0

We can use the following regex replacement:

df["Description_GLEntry"] = df["Description_GLEntry"].str.replace(r'\s*\[\w{3} \d{2}, \d{4} - \w{3} \d{2}, \d{4}\]$', '', regex=True)

Here is a regex demo showing that the replacement logic works against your test data.
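As a rough way to check the pattern without the dataframe, the sketch below applies the same regular expression to a few made-up invoice lines using Python's `re` module. The sample descriptions and the name `PERIOD_SUFFIX` are assumptions for illustration and do not come from the original post.

```python
import re

# Pattern from the answer: optional leading whitespace, then a bracketed
# "Mon DD, YYYY - Mon DD, YYYY" period anchored at the end of the string.
PERIOD_SUFFIX = re.compile(r"\s*\[\w{3} \d{2}, \d{4} - \w{3} \d{2}, \d{4}\]$")

# Hypothetical invoice lines (not the asker's real data).
descriptions = [
    "Hosting fee [Jan 01, 2022 - Jan 31, 2022]",
    "Support plan [Feb 01, 2022 - Feb 28, 2022]",
    "One-off setup charge",
]

# Lines without a trailing period are returned unchanged.
cleaned = [PERIOD_SUFFIX.sub("", d) for d in descriptions]
print(cleaned)  # ['Hosting fee', 'Support plan', 'One-off setup charge']
```

Two properties of the pattern are worth keeping in mind: the `$` anchor means a period that appears in the middle of a description is left alone, and `\d{2}` requires two-digit days, so a value such as "Jan 1, 2022" would not match.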

Posted by huangapple on May 7, 2023, 19:48:03
Source: https://go.coder-hub.com/76193736.html