Regex - matching without trailing second decimal and digits

Question

My task is to find lines in a text file that match the following pattern: "SMC" followed by a space, 1-3 digits, a period, and 1-3 digits. My issue: it also matches when a second period and more digits follow.

import re

with open("caosDump.txt", 'r', encoding="cp1252") as inp, open("newCaosDump.txt", 'w') as output:
    for line in inp:
        if re.search(r'SMC\s\d{1,3}\.\d{1,3}', line):
            output.write(line)

I have tried many things such as positive/negative lookahead, word boundaries, etc., but nothing worked. Adding ^ and $ breaks the code.

It’s also returning the lines that contain SMC 14.08.040, but I don’t want to include these lines in my new text file. (Note: Using Python)

Answer 1

Score: 1

You need to add a negative lookahead like this:

if re.search(r'SMC\s\d{1,3}\.\d{1,3}(?!\.?\d)', line):

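The (?!\.?\d) lookahead will fail a match when the last 1-3 digits are immediately followed by a digit or by a . plus a digit. The bare-digit case matters because it stops \d{1,3} from backtracking to a shorter run of digits to satisfy the lookahead. A trailing . that is not followed by a digit (for example at the end of a sentence) does not block the match.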
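As a quick check of the lookahead's behavior, here is a minimal sketch that runs the pattern against a few strings. The sample lines are made up for illustration and are not taken from caosDump.txt.

```python
import re

# Pattern from the answer: the lookahead rejects a match whose last digit
# group is followed by another digit or by "." plus a digit.
pattern = re.compile(r'SMC\s\d{1,3}\.\d{1,3}(?!\.?\d)')

# Hypothetical sample lines (assumed for this example only).
samples = [
    "See SMC 14.08 for details",      # two-part reference
    "See SMC 14.08.040 for details",  # three-part reference
    "See SMC 14.08. for details",     # two-part reference, then a lone period
    "See SMC 14.0812 for details",    # more than 3 digits after the period
]

# Map each sample to whether the pattern is found anywhere in it.
results = {s: bool(pattern.search(s)) for s in samples}
for s, matched in results.items():
    print(f"{matched!s:5}  {s}")
```

Only the first and third samples should match: the three-part reference and the over-long digit run are both rejected by the lookahead.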

Note that the \b after SMC is redundant as you require a whitespace after the word. If SMC must be matched as a whole word, \b must be placed immediately before the word, i.e. r'\bSMC\s\d{1,3}\.\d{1,3}(?!\.?\d)'.

If there can be more than one whitespace character after the SMC word, use \s+ instead of \s.
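Putting the two optional tweaks together (a leading \b and \s+), a filter could look roughly like the sketch below. The helper name keep_matching and the sample lines are assumptions for illustration; in the original script the lines would come from the opened input file instead.

```python
import re

# \b keeps "XSMC" from matching; \s+ tolerates repeated whitespace after SMC.
pattern = re.compile(r'\bSMC\s+\d{1,3}\.\d{1,3}(?!\.?\d)')

def keep_matching(lines):
    """Return only the lines that contain a two-part SMC reference."""
    return [line for line in lines if pattern.search(line)]

# Hypothetical input lines (assumed for this example only).
sample = [
    "SMC   5.12 applies\n",     # several spaces after SMC
    "XSMC 5.12 applies\n",      # SMC is not a whole word here
    "SMC 14.08.040 applies\n",  # three-part reference
]

kept = keep_matching(sample)
print(kept)
```

Note that this is still a per-line test: a line that contains both a two-part and a three-part reference would be kept, because re.search only needs one position where the pattern succeeds.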

See the regex demo.

huangapple
  • Posted on May 11, 2023 at 10:16:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76223709.html