How to match a pattern with regex in Python and optimize the code

Question


I have a set of patterns that should match anywhere in the given input text:

import re

text_pattrn = """ Todays weather | Todays weather is | weather | Todays weather condition | weather condition """
input_text = """ Every one is eagerly waiting for the final match to be happened in this ground and all eyes on sky as todays weather condition is cloudy and rain may affect the game play"""
match = re.search(text_pattrn, input_text)

Several of the alternatives overlap. For example, "weather" is redundant, because any text that contains "Todays weather" also contains "weather". Any suggestion for optimizing the code would help a lot.

Answer 1

Score: 1


You could make the pattern case insensitive using re.I, make some parts optional so that you can shorten the alternatives, and put all the alternatives in a non-capturing group with word boundaries on the left and right (or keep the spaces if you want):

\b(?:Todays weather(?: is)?|weather|(?:Todays )?weather condition)\b

See a regex 101 demo.
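As a rough sketch of what this changes, the snippet below runs re.search with both the space-padded pattern from the question and the compact pattern above. The names spaced_pattern and compact_pattern are illustrative only and do not come from the original post.

```python
import re

input_text = (
    "Every one is eagerly waiting for the final match to be happened in this "
    "ground and all eyes on sky as todays weather condition is cloudy and rain "
    "may affect the game play"
)

# Pattern from the question: the spaces around each "|" are part of the
# alternatives, and matching is case sensitive.
spaced_pattern = (
    " Todays weather | Todays weather is | weather "
    "| Todays weather condition | weather condition "
)

# Compact pattern from the answer: optional parts, a non-capturing group,
# and word boundaries instead of literal spaces.
compact_pattern = (
    r"\b(?:Todays weather(?: is)?|weather|(?:Todays )?weather condition)\b"
)

spaced_match = re.search(spaced_pattern, input_text)
compact_match = re.search(compact_pattern, input_text, re.I)

print(repr(spaced_match.group()))   # ' weather '
print(repr(compact_match.group()))  # 'todays weather'
```

The space-padded pattern only finds " weather " (including the surrounding spaces), because "Todays" does not match the lowercase "todays" without re.I.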

If you want to print all matches, you can use re.findall:

import re

text_pattrn = r"\b(?:Todays weather(?: is)?|weather|(?:Todays )?weather condition)\b"
input_text = """ Every one is eagerly waiting for the final match to be happened in this ground and all eyes on sky as todays weather condition is cloudy and rain may affect the game play"""
print(re.findall(text_pattrn, input_text, re.I))

Output

['todays weather']
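The result is 'todays weather' rather than 'todays weather condition' because Python's re module tries the alternatives from left to right and stops at the first one that succeeds at a given position. If the longest phrase is what you want (an assumption about the goal, not something stated in the answer), one possible approach is to build the pattern from a list of phrases sorted longest first. The helper name build_pattern is hypothetical.

```python
import re


def build_pattern(phrases):
    """Build a case-insensitive regex that prefers the longest phrase."""
    # Drop duplicates, then sort longest first so that a phrase such as
    # "Todays weather condition" is tried before its prefix "Todays weather".
    unique = sorted(set(phrases), key=lambda p: (-len(p), p))
    alternation = "|".join(re.escape(p) for p in unique)
    return re.compile(rf"\b(?:{alternation})\b", re.I)


phrases = [
    "Todays weather",
    "Todays weather is",
    "weather",
    "Todays weather condition",
    "weather condition",
]

input_text = (
    "Every one is eagerly waiting for the final match to be happened in this "
    "ground and all eyes on sky as todays weather condition is cloudy and rain "
    "may affect the game play"
)

matches = build_pattern(phrases).findall(input_text)
print(matches)  # ['todays weather condition']
```

Keeping the phrases in a plain list also makes it easier to add or remove entries than editing one long pattern string by hand.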

  • Posted by huangapple on January 6, 2023, 13:56:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75027437.html