regex101 - match all occurrences of a string only in the first line where it is found

Question

On the regex101 site I have the following text:

text3 text3 text3 text3 text3 text3

text1 text2 text1 text2 text1 text2

text1 text2 text1 text2 text1 text2

text1 text2 text1 text2 text1 text2

I want to match every occurrence of text2, but only on the first line that contains it:

text3 text3 text3 text3 text3 text3

text1 **text2** text1 **text2** text1 **text2**

text1 text2 text1 text2 text1 text2

text1 text2 text1 text2 text1 text2

How can I achieve this?

Answer 1

Score: 3

You can do this with regex flavours such as PCRE that support the \G anchor (which matches at the position where the previous match ended, or at the start of the string on the first attempt) and the single-line modifier (/s), which lets . match line breaks. The idea is to find the first target by anchoring to the start of the string and consuming as few characters as possible (^.*?), and then to allow further matches only where the previous match ended, without crossing a line break to reach them (\G[^\r\n]*?).

An expression to do that could look like this:

/(?:^.*?|\G[^\r\n]*?)\Ktext2/gs

\K simply drops the preceding part of each match from the result, so that text2 can be singled out without a capturing group.
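
For engines without \G and \K, the same logic can be emulated by hand. The following is only a sketch in Python, whose built-in re module supports neither token. It assumes that Pattern.match(text, pos), which anchors at pos, can stand in for \G, and that a capturing group can stand in for \K. The names FIRST, NEXT and first_line_matches are made up for this illustration.

```python
import re

TEXT = (
    "text3 text3 text3 text3 text3 text3\n"
    "text1 text2 text1 text2 text1 text2\n"
    "text1 text2 text1 text2 text1 text2\n"
    "text1 text2 text1 text2 text1 text2"
)

# Counterpart of \A[\s\S]*?\Ktext2: lazily skip anything, line breaks included.
FIRST = re.compile(r"[\s\S]*?(text2)")
# Counterpart of \G[^\r\n]*?\Ktext2: lazily skip anything except line breaks.
NEXT = re.compile(r"[^\r\n]*?(text2)")


def first_line_matches(text):
    """Return (start, end) spans of text2 on the first line that contains it."""
    spans = []
    m = FIRST.match(text)  # match() anchors at position 0, like \A
    while m:
        spans.append(m.span(1))  # group 1 plays the role of \K
        # match() with a start position anchors there, which mimics \G
        m = NEXT.match(text, m.end())
    return spans


if __name__ == "__main__":
    for start, end in first_line_matches(TEXT):
        print(start, end, TEXT[start:end])
```

The loop stops as soon as the next text2 can only be reached by crossing a line break, which is the same reason the \G branch stops matching in the PCRE expression.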

If you prefer to drop the single-line modifier (/s), . no longer matches line-break characters, so use a class that does match them, such as [\s\S]*?, in place of .*? for the initial match:

/(?:^[\s\S]*?|\G[^\r\n]*?)\Ktext2/g

If you need the multi-line modifier (/m), the caret ^ matches at the beginning of every line, so use the start-of-string anchor \A instead to find the initial target:

/(?:\A[\s\S]*?|\G[^\r\n]*?)\Ktext2/gm
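
The effect of these modifiers on ^, \A and . can be checked in isolation. This is a small sketch using Python's re module, where re.M and re.S are assumed to correspond to /m and /s. The sample string and variable names are invented for the illustration.

```python
import re

sample = "text3 text3\ntext1 text2\ntext1 text2"

# Without re.M, ^ matches only at the very start of the string.
caret_default = [m.start() for m in re.finditer(r"^text\d", sample)]

# With re.M, ^ also matches right after every line break.
caret_multiline = [m.start() for m in re.finditer(r"^text\d", sample, re.M)]

# \A always means start of string, whether or not re.M is set.
start_anchor = [m.start() for m in re.finditer(r"\Atext\d", sample, re.M)]

# Without re.S, . stops at the first line break.
dot_default = re.match(r".*", sample).group()

# With re.S, or with the class [\s\S], the match runs across line breaks.
dot_single_line = re.match(r".*", sample, re.S).group()
any_char_class = re.match(r"[\s\S]*", sample).group()

if __name__ == "__main__":
    print(caret_default, caret_multiline, start_anchor)
    print(repr(dot_default))
    print(dot_single_line == sample, any_char_class == sample)
```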

Answer 2

Score: 2

You can do it with a trick that uses a capturing group inside a lookahead (with the global flag disabled, of course).

The regex would be:

text2(?=(?:.*?(text2))*)


Note that if there are more than two separate occurrences to match, you'll need to select the .NET flavour, as only it keeps multiple captures for the same group.
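
The capture limitation can be seen outside .NET. Here is a sketch with Python's re module, which keeps only the last capture of a repeated group. The sample text is the one from the question; the variable names are invented for the illustration.

```python
import re

TEXT = (
    "text3 text3 text3 text3 text3 text3\n"
    "text1 text2 text1 text2 text1 text2\n"
    "text1 text2 text1 text2 text1 text2\n"
    "text1 text2 text1 text2 text1 text2"
)

pattern = re.compile(r"text2(?=(?:.*?(text2))*)")

# A single search, i.e. no global flag: only the first overall match is taken.
m = pattern.search(TEXT)

# The match itself is the first text2 on the first line that contains it.
first_span = m.span()

# Group 1 was repeated inside the lookahead. Python keeps only its last
# capture, which is the last text2 on the same line, because . does not
# cross line breaks here.
last_span = m.span(1)

if __name__ == "__main__":
    print(first_span, last_span)
```

The middle occurrence on that line is matched inside the lookahead but is not available afterwards, which is why an engine that records every capture of a group is needed once there are more than two occurrences.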

Posted by huangapple on 2023-05-18 01:21:34
When reposting, please keep the link to this article: https://go.coder-hub.com/76274680.html