How to match a pattern with regex in Python and optimize the code
Question
I have a pattern that should match anywhere in the given input text:
import re
text_pattrn = """ Todays weather | Todays weather is | weather | Todays weather condition | weather condition """
input_text = """ Every one is eagerly waiting for the final match to be happened in this ground and all eyes on sky as todays weather condition is cloudy and rain may affect the game play"""
match = re.search(text_pattrn, input_text)
Several of the alternatives overlap. For example, "weather" is redundant because any text that matches "Todays weather" also matches "weather". Any suggestions for optimizing the code would be a great help.
Answer 1
Score: 1
You could make the pattern case insensitive using re.I, make some of the parts optional so that you can shorten the alternatives, and put all the alternatives in a non-capturing group with word boundaries on the left and right (or keep the spaces if you want):
\b(?:Todays weather(?: is)?|weather|(?:Todays )?weather condition)\b
See a regex 101 demo.
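To make the three ingredients concrete, here is a minimal sketch that uses a deliberately smaller pattern than the one above (the sample sentences are invented for illustration). It shows the optional group, the effect of re.I, and how \b keeps the pattern from matching inside a longer word:

```python
import re

# Smaller illustrative pattern: "weather", optionally followed by " condition".
pattern = re.compile(r"\bweather(?: condition)?\b", re.I)

# Optional part present: the longer phrase is returned.
with_suffix = pattern.findall("Weather condition looks fine")
print(with_suffix)      # ['Weather condition']

# Optional part absent, and case is ignored because of re.I.
without_suffix = pattern.findall("the WEATHER is fine")
print(without_suffix)   # ['WEATHER']

# \b requires a word boundary, so "weathered" does not match.
inside_word = pattern.findall("a weathered sign")
print(inside_word)      # []
```

Because the group is non-capturing, re.findall returns the whole match rather than the contents of the group.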
If you want to print all matches, you can use re.findall:
import re
text_pattrn = r"\b(?:Todays weather(?: is)?|weather|(?:Todays )?weather condition)\b"
input_text = """ Every one is eagerly waiting for the final match to be happened in this ground and all eyes on sky as todays weather condition is cloudy and rain may affect the game play"""
print(re.findall(text_pattrn, input_text, re.I))
Output:
['todays weather']
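Note that Python's regex alternation takes the first alternative that succeeds at a given position, not the longest one. That is why the result is 'todays weather' even though the text contains 'todays weather condition'. If the longest phrase is what you want, one possible approach is to try the longer phrases first. The sketch below is not part of the original answer: build_pattern is a hypothetical helper, and the input is a shortened version of the sentence from the question.

```python
import re

# The five phrases from the question.
phrases = [
    "Todays weather",
    "Todays weather is",
    "weather",
    "Todays weather condition",
    "weather condition",
]

def build_pattern(phrases):
    """Hypothetical helper: join the phrases into one alternation, longest first."""
    # Longest first, so a longer phrase is tried before any phrase it starts with.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape guards against phrases that contain regex metacharacters.
    body = "|".join(re.escape(p) for p in ordered)
    return re.compile(rf"\b(?:{body})\b", re.IGNORECASE)

pattern = build_pattern(phrases)
input_text = "all eyes on sky as todays weather condition is cloudy and rain may affect the game play"
matches = pattern.findall(input_text)
print(matches)  # ['todays weather condition']
```

If you only need to know whether any phrase occurs, rather than which one, the single word "weather" with word boundaries is enough, since every phrase in the list contains it.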