Regex match symbol and ignore exact string
Question
Currently I have a regex like so:
(?<!&)#(?!8203;)
This will allow the capturing of most '#' for my case.
For instance, given the input he#ll#o, there would be 2 matches as expected.
Again, given the input he#ll#o&#8203;, there would be 2 matches as expected.
However, given the input &#&#&# or just #8203;#8203;#8203;, it will fail to find matches.
How do I modify the existing regular expression to ignore exactly '&#8203;', given that the preceding text may not be the end of a previous word or whitespace?
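The behaviour described above can be reproduced with a small sketch. The question does not name a regex flavor, so Python's re module is an assumption here, and the helper name count_matches is made up for illustration.

```python
import re

# The pattern from the question: a '#' that is not preceded by '&'
# and not followed by '8203;'. The two conditions are independent,
# so either one alone is enough to reject a '#'.
ORIGINAL = re.compile(r"(?<!&)#(?!8203;)")


def count_matches(pattern, text):
    """Return the number of non-overlapping matches of pattern in text."""
    return len(pattern.findall(text))


# The four inputs mentioned in the question.
samples = ["he#ll#o", "he#ll#o&#8203;", "&#&#&#", "#8203;#8203;#8203;"]

original_counts = {s: count_matches(ORIGINAL, s) for s in samples}
for text, n in original_counts.items():
    print(f"{text!r}: {n} match(es)")
```

The last two inputs produce no matches because every '#' in them trips one of the two separate lookarounds, even though the full '&#8203;' sequence never occurs.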
Answer 1
Score: 2
You can adjust the lookarounds to
#(?<!&#(?=8203;))
See the regex demo.
Details:
# - a # char
(?<!&#(?=8203;)) - a negative lookbehind that fails the match if, immediately on the left, there is a &# char sequence that is immediately followed by an 8203; char sequence.
An equivalent regex looks like
(?<!&(?=#8203;))#
See this regex demo. I'd use #(?<!&#(?=8203;)) since the lookbehind check is only triggered once the # char is found, and it is easier to look for a static char than to check the lookbehind pattern at each location in the string (as is the case with the second regex).
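As a quick check, both patterns can be run against the inputs from the question. This is only a sketch: it assumes Python's re module (the answer does not say which flavor the demos use), and the helper name match_positions is hypothetical.

```python
import re

# Both patterns are copied from the answer; only the harness is new.
HASH_FIRST = re.compile(r"#(?<!&#(?=8203;))")
LOOKBEHIND_FIRST = re.compile(r"(?<!&(?=#8203;))#")


def match_positions(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in pattern.finditer(text)]


# The four inputs mentioned in the question.
samples = ["he#ll#o", "he#ll#o&#8203;", "&#&#&#", "#8203;#8203;#8203;"]

hash_first_hits = {s: match_positions(HASH_FIRST, s) for s in samples}
lookbehind_first_hits = {s: match_positions(LOOKBEHIND_FIRST, s) for s in samples}

for text in samples:
    print(f"{text!r}: {hash_first_hits[text]} / {lookbehind_first_hits[text]}")
```

Only a '#' that sits inside the complete '&#8203;' sequence is skipped; a '#' that merely follows '&' or merely precedes '8203;' still matches, and both patterns report the same positions.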