Use a JavaScript regex group to match whatever the preceding groups don't match
Question
Can I use a JavaScript regex to capture whatever the first group(s) don't capture?
Here I have two (three) capturing groups:
/("[^\n"]*")|("[^\n"]*(?:\n|$))/g
What I would like is to place anything that did not match the first two groups into another group, like so:
/("[^\n"]*")|("[^\n"]*(?:\n|$))|(<CATCH ANYTHING ELSE HERE>)/g
Is this possible?
In practice, this would mean always matching the entire string, but being able to segment it.
(I am building a code editor, trying to parse strings in the code).
I tried adding (.*) but that always seemed to match before the other two.
EDIT:
To simplify, let's say I want to segment a string into two groups:
- All characters that are a
- All characters that are not a
Given the regex /(a)|(^\1)/g, I would assume the entire string would match, but this is not the case. Why? In more complex cases I assume using backreferences is better?
Answer 1
Score: 1
This expression does the job:
(regex1)|(regex2)|((?:(?!regex1|regex2).)*)
where:
- regex1 is your first regex
- regex2 is your second regex
- the last group matches one character at a time, repeating until one of the first two regexes would match at the current position. In particular, (?!...) is called a negative lookahead.
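As a rough sketch of how this template could be applied to the two string patterns from the question: the helper name tokenize and the token type labels are hypothetical, and two details differ from the template as written. The fallback group uses + instead of * so it never produces an empty match, and the s flag is added so that . also consumes newlines.

```javascript
// Source text of the two patterns from the question (String.raw keeps "\n" as backslash-n).
const closed = String.raw`"[^\n"]*"`;          // a string closed on the same line
const unclosed = String.raw`"[^\n"]*(?:\n|$)`; // a string left open at end of line

// (regex1)|(regex2)|((?:(?!regex1|regex2).)+)
const tokenRe = new RegExp(
  `(${closed})|(${unclosed})|((?:(?!${closed}|${unclosed}).)+)`,
  "gs"
);

// Hypothetical helper: splits the input into typed segments.
function tokenize(input) {
  const tokens = [];
  for (const m of input.matchAll(tokenRe)) {
    let type = "other";
    if (m[1] !== undefined) type = "string";
    else if (m[2] !== undefined) type = "unterminated";
    tokens.push({ type, text: m[0] });
  }
  return tokens;
}

console.log(tokenize('x = "hi" + "oops\ny'));
```

Because every character falls into exactly one of the three groups, concatenating the text of all tokens should reproduce the input, which is what a code editor needs for segmenting.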
In your example, you can use /(a)|((?:(?!a).)*)/g.
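A minimal sketch of the simplified case, assuming a hypothetical helper named segment. It deviates from the regex above in two ways that are my own adjustments: + replaces * so the fallback group cannot match the empty string (with *, an empty match is reported at the end of the input), and the s flag lets . match newlines.

```javascript
// Hypothetical helper: labels each match as "a" (group 1) or "other" (group 2).
function segment(input) {
  const re = /(a)|((?:(?!a).)+)/gs;
  const parts = [];
  for (const m of input.matchAll(re)) {
    parts.push({ type: m[1] !== undefined ? "a" : "other", text: m[0] });
  }
  return parts;
}

console.log(segment("baab"));
// Four segments: "b" (other), "a", "a", "b" (other)
```

Each a is reported as its own match because group 1 is (a) rather than (a+); whether to merge adjacent runs is a design choice left to the caller.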
P.S. Note that your regex is also wrong because ^ matches the beginning of a string! It only means "not" inside a character class such as [^a].
P.P.S. As pointed out in the comments, the backreference \1 won't work, because it refers to the actual text matched, not to the expression that caught it.
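To illustrate both postscripts, here is a small sketch of what the regex from the question does in practice. The variable names are only for illustration. In JavaScript, a backreference to a group that has not participated in the match matches the empty string, so the second alternative reduces to "empty match at the start of the string".

```javascript
// The question's regex: group 2 can only ever match "" at index 0.
const matches = [..."baab".matchAll(/(a)|(^\1)/g)].map((m) => m[0]);
console.log(matches); // ["", "a", "a"] : the "b" characters are never matched

// A backreference repeats the captured text, not the pattern:
const doubled = /^(a|b)\1$/;
console.log(doubled.test("aa")); // true  (\1 must be "a" again)
console.log(doubled.test("ab")); // false (the pattern (a|b) would allow "b", \1 does not)
```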