Delete by regex rule

Question

I have some data, and I want to delete part of it with a regex rule.

I want to delete every character except the digits and the period that sits between two digits.

The data is as follows:

str1 = 'ABC.5,696.05'
str2 = 'xxx3,769.01'

The results should be '5696.05' and '3769.01'.

I tried re.sub(r'[^\d\.]', '', str1), but it does not delete the first '.'.
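
To make the problem concrete, here is a minimal reproduction of the attempt described above. The variable names attempt1 and attempt2 are only illustrative and do not come from the original post.

```python
import re

str1 = 'ABC.5,696.05'
str2 = 'xxx3,769.01'

# The negated character class keeps every digit and every dot,
# so the dot right after 'ABC' survives as well.
attempt1 = re.sub(r'[^\d.]', '', str1)
attempt2 = re.sub(r'[^\d.]', '', str2)

print(attempt1)  # .5696.05
print(attempt2)  # 3769.01
```

The second string happens to work only because it has no dot in front of the first digit.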

Answer 1

Score: 1

I'm not a regex expert, so I would simply chain methods:

>>> import re
>>> float(re.sub(r'^[^\d]+', '', str1).replace(',', ''))
5696.05

>>> float(re.sub(r'^[^\d]+', '', str2).replace(',', ''))
3769.01

The regex removes the non-numeric prefix at the start of the string, and a plain replace then removes the thousands separators.
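
The same chain can be wrapped in a small helper so it can be applied to many strings. This is only a sketch: the name parse_amount is hypothetical, and it assumes every input ends with the number itself, since float() raises ValueError if any other text is left over.

```python
import re

def parse_amount(text: str) -> float:
    """Strip the leading non-digit prefix, drop commas, convert to float."""
    # Remove everything before the first digit, e.g. 'ABC.' or 'xxx'.
    without_prefix = re.sub(r'^\D+', '', text)
    # Remove the thousands separators, then convert.
    return float(without_prefix.replace(',', ''))

samples = ['ABC.5,696.05', 'xxx3,769.01']
results = [parse_amount(s) for s in samples]
print(results)  # [5696.05, 3769.01]
```

Note that this returns floats rather than the strings asked for in the question; use str() on the result, or skip the float() call, if strings are needed.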

Answer 2

Score: 1

This can be done in two stages:

  1. Find the segment that starts and ends with a digit,
  2. Remove everything in that segment that is not a digit or a dot.

You can pass a callback to sub:

import re
print(re.sub(r'.*?(\d.+\d).*', lambda x: re.sub(r'[^\d.]|(?<!\d)\.|\.(?!\d)', '', x.group(1)), 'ABC.5,696.05'))
# 5696.05

Here the outer sub captures everything between the first and the last digit into group 1 and passes the match to the lambda, which cleans up that group.

The lambda removes:

  • anything that is not a digit or a dot: [^\d.]
  • dots that are not preceded by a digit: (?<!\d)\.
  • dots that are not followed by a digit: \.(?!\d)
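
The two stages can also be written out as a named function, which may be easier to read than the nested one-liner. This is a sketch under a few assumptions: the names SEGMENT, UNWANTED and clean_number are hypothetical, the segment pattern uses .* instead of .+, and returning an empty string when no segment is found is my own choice rather than part of the answer.

```python
import re

# Stage 1: the segment running from the first digit to the last digit.
SEGMENT = re.compile(r'\d.*\d')
# Stage 2: anything that is not a digit or a dot,
# plus dots that lack a digit on either side.
UNWANTED = re.compile(r'[^\d.]|(?<!\d)\.|\.(?!\d)')

def clean_number(text: str) -> str:
    match = SEGMENT.search(text)
    if match is None:
        return ''  # assumption: no two-digit segment means nothing to keep
    return UNWANTED.sub('', match.group())

for s in ('ABC.5,696.05', 'xxx3,769.01'):
    print(clean_number(s))
# 5696.05
# 3769.01
```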

huangapple
  • Published on April 13, 2023, 16:24:49
  • Link to this article: https://go.coder-hub.com/76003253.html