How to exclude line breaks from a regex match in Python?

Question

How can I make the regex below exclude matches that span multiple lines?

import re
reg = re.compile(r'\b(apple)(?:\W+\w+){0,4}?\W+(tree|plant|garden)')
reg.findall('my\napple tree in the garden')  # should match: 'apple' and 'tree' share a line
reg.findall('apple\ntree in the garden')     # should not match: a newline separates them

The first call should match; the second should not.
(Currently both match.)

Answer 1

Score: 1

Your \W matches newlines, because a newline is a non-word character. To exclude them, replace \W with [^\w\n], which matches any character that is neither a word character nor a newline:

import re
reg = re.compile(r'\b(apple)(?:[^\n\w]+\w+){0,4}?[^\n\w]+(tree|plant|garden)')
print(reg.findall('my\napple tree in the garden'))
#  [('apple', 'tree')]
print(reg.findall('apple\ntree in the garden'))
#  []
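
To see why the substitution works, here is a minimal sketch that builds both patterns from one template and compares them on the question's second input. The helper `build` and the names `original` and `fixed` are hypothetical, introduced only for this illustration.

```python
import re

def build(sep):
    """Build the question's pattern using `sep` as the between-words character class.

    `build` is a hypothetical helper for this comparison only.
    """
    return re.compile(rf'\b(apple)(?:{sep}+\w+){{0,4}}?{sep}+(tree|plant|garden)')

original = build(r'\W')      # \W accepts any non-word character, including "\n"
fixed = build(r'[^\w\n]')    # same class with "\n" removed

text = 'apple\ntree in the garden'
print(original.findall(text))  # the newline is consumed as a separator, so this matches
print(fixed.findall(text))     # the newline blocks the match

# The character classes in isolation:
print(bool(re.fullmatch(r'\W', '\n')))       # True
print(bool(re.fullmatch(r'[^\w\n]', '\n')))  # False
print(bool(re.fullmatch(r'[^\w\n]', ' ')))   # True
```

Note that [^\w\n] still matches a lone \r. If the input might use that as a line separator (an assumption about the data, not something the question mentions), it would need to be added to the class as well.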

huangapple
  • Posted on June 1, 2023 at 19:31:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76381414.html