Regex: GetGroupNames returning more groups than expected
Question
I can't understand what is going on with my regex.
I have the following code:
string patternRegex = "^(?<GROUP_1>[1-4][0-9]|7[3-9]|8[0-9])(?<GROUP_2>(\\d{5}))$";
Regex regex = new Regex(patternRegex, RegexOptions.Compiled);
var groupNames = regex.GetGroupNames();
I expected GetGroupNames() to return two groups (GROUP_1 and GROUP_2), but in this case it returns four:
- 0
- 1
- GROUP_1
- GROUP_2
When I match the string 1395614, the group values are:
- 1395614
- 95614
- 13
- 95614
I already googled it and read somewhere that group 0 stands for the full match.
The GROUP_1 and GROUP_2 values are also correct.
But what does group 1 stand for (I'm not asking about GROUP_1 here)?
.NET Fiddle: https://dotnetfiddle.net/Bfgv9s
Answer 1
Score: 4
To avoid the default unnamed capturing group with index 1, rewrite your regex as:
string patternRegex = "^(?<GROUP_1>(?:[1-4][0-9]|7[3-9]|8[0-9]))(?<GROUP_2>(?:\\d{5}))$";
For the first capturing group GROUP_1, use (?:[1-4][0-9]|7[3-9]|8[0-9]) instead of [1-4][0-9]|7[3-9]|8[0-9]. This wrapper is optional, since that part of the original pattern contained no unnamed parentheses.
(?:...) is a non-capturing group: it lets us group part of the pattern without creating a capturing group, so the text matched by that part is not assigned to a numbered group.
For the second capturing group GROUP_2, use (?:\\d{5}) instead of (\\d{5}).
Again, (?:\\d{5}) is a non-capturing group. The unnamed (\\d{5}) in the original pattern is what produced group 1.
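To check the effect of this rewrite, here is a minimal sketch that prints the group names for the pattern from the question and for the rewritten one. The class name GroupNamesDemo and the helper Names are made up for illustration; only Regex.GetGroupNames() comes from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNamesDemo
{
    // Pattern from the question: the inner (\d{5}) is an unnamed capturing group.
    public const string Original =
        "^(?<GROUP_1>[1-4][0-9]|7[3-9]|8[0-9])(?<GROUP_2>(\\d{5}))$";

    // Pattern from this answer: every inner group is non-capturing.
    public const string Rewritten =
        "^(?<GROUP_1>(?:[1-4][0-9]|7[3-9]|8[0-9]))(?<GROUP_2>(?:\\d{5}))$";

    // Hypothetical helper: joins the names reported by GetGroupNames().
    public static string Names(string pattern)
    {
        return string.Join(", ", new Regex(pattern).GetGroupNames());
    }

    public static void Main()
    {
        Console.WriteLine(Names(Original));   // 0, 1, GROUP_1, GROUP_2
        Console.WriteLine(Names(Rewritten));  // 0, GROUP_1, GROUP_2
    }
}
```

Note that group 0 (the full match) is always reported, so even the rewritten pattern yields three names rather than two.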
Answer 2
Score: 3
Indeed, group 0 is the full match.
You also get a group for each pair of capturing parentheses.
So the inner (\d{5}) is captured as well, as group "1".
You can access the captured groups this way:
var match = regex.Match("1395614");
var group1 = match.Groups["GROUP_1"].Value;
To prevent that group from being captured, add ?: (which marks a non-capturing group) right after the opening parenthesis of the (\d{5}) group.
It should look like: (?:\d{5}).
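If you would rather leave the pattern untouched, another option (not mentioned in the answers above, so treat it as a suggestion) is RegexOptions.ExplicitCapture, which makes every unnamed pair of parentheses non-capturing. A minimal sketch, where the class name ExplicitCaptureDemo and the method Build are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExplicitCaptureDemo
{
    // Hypothetical helper: builds the regex from the question, unchanged,
    // but with ExplicitCapture so only named groups capture.
    public static Regex Build()
    {
        string pattern = "^(?<GROUP_1>[1-4][0-9]|7[3-9]|8[0-9])(?<GROUP_2>(\\d{5}))$";
        return new Regex(pattern, RegexOptions.Compiled | RegexOptions.ExplicitCapture);
    }

    public static void Main()
    {
        Regex regex = Build();

        // The unnamed group "1" no longer appears.
        Console.WriteLine(string.Join(", ", regex.GetGroupNames())); // 0, GROUP_1, GROUP_2

        Match match = regex.Match("1395614");
        Console.WriteLine(match.Groups["GROUP_1"].Value); // 13
        Console.WriteLine(match.Groups["GROUP_2"].Value); // 95614
    }
}
```

This keeps the pattern readable when it contains many grouping parentheses, at the cost of a behaviour that is set outside the pattern string itself.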