Replace part of a string in column
Question
I want to delete part of a string in a pandas DataFrame column in Python. The column contains invoice lines, each with a period specified in square brackets (see below). I want to delete that bracketed part, i.e. the [*].
Image dataset (https://i.stack.imgur.com/fdCnV.png)
I have tried the str.replace method:
    df['Description_GLEntry'].str.replace('[Jan 01, 2022 - Jan 31, 2022]', '')
This removes the text but leaves me with [-], and it does not delete every [*] in general. My plan was to make it work with a loop inside a function.
Answer 1
Score: 0
We can use the following regex replacement:
    df["Description_GLEntry"] = df["Description_GLEntry"].str.replace(r'\s*\[\w{3} \d{2}, \d{4} - \w{3} \d{2}, \d{4}\]$', '', regex=True)
Here is a regex demo showing that the replacement logic works against your test data.
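As a side note on why the original attempt left [-] behind: when str.replace interprets the pattern as a regex (the default in pandas before 2.0), an unescaped [...] is a character class, so it deletes the individual characters listed inside it instead of the literal bracketed text. The sketch below is only an illustration. The sample rows and the Description_clean column are made up, since the real data is only visible in the question's image, and the looser pattern (any trailing bracketed block) is an alternative to the date-specific one above, not the answer's exact method.

```python
import pandas as pd

# Hypothetical sample rows; the real data is only shown in the question's image.
df = pd.DataFrame({
    "Description_GLEntry": [
        "Subscription fee [Jan 01, 2022 - Jan 31, 2022]",
        "Support plan [Feb 01, 2022 - Feb 28, 2022]",
        "One-off setup",
    ]
})

# Regex approach: drop a trailing "[...]" block and the whitespace before it.
# \[ and \] are escaped so they match literal brackets, not a character class.
df["Description_clean"] = df["Description_GLEntry"].str.replace(
    r"\s*\[[^\]]*\]$", "", regex=True
)

# Literal approach: regex=False matches the exact text only, so it handles
# one specific period and leaves every other row untouched.
literal = df["Description_GLEntry"].str.replace(
    "[Jan 01, 2022 - Jan 31, 2022]", "", regex=False
)

print(df["Description_clean"].tolist())
print(literal.tolist())
```

Because the replacement is vectorised over the whole column, no explicit loop or helper function is needed. The looser pattern also removes brackets that do not contain a date range, so the stricter date-specific pattern is the safer choice if other bracketed text must be kept.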