Why does my regex give me only one element instead of the complete group in Java?

Question

I am new here and this is my first post. I'm also new to Java and regex. So...

I am working in Java and writing a regex to match phone numbers in a chat that follow the format +XXX (XXX) XXX XXXX, written in digits and/or in words.

This is my regex:

"\\+?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){1,3}?[._\\-]?\\(?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3}\\)?[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3}[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){4}";

Then I use the .group() method to print the complete match, and that works fine. But if I try to print each group separately with this loop:

for (int i = 1; i <= match.groupCount(); i++) {
    if (match.group(i) != null) {
        System.out.println("Group " + i + ": " + match.group(i));
    }
}

then it prints only one digit (or word) from each group.

Why is that? And what do I have to do to print every element of every group?

I've also tried this regex:

"\\+?(zero{1,3}|uno{1,3}|due{1,3}|tre{1,3}|quattro{1,3}|cinque{1,3}|sei{1,3}|sette{1,3}|otto{1,3}|nove{1,3}|\\d{1,3})?[._\\-]?\\(?(zero{3}|uno{3}|due{3}|tre{3}|quattro{3}|cinque{3}|sei{3}|sette{3}|otto{3}|nove{3}|\\d{3})\\)?[._\\-]?(zero{3}|uno{3}|due{3}|tre{3}|quattro{3}|cinque{3}|sei{3}|sette{3}|otto{3}|nove{3}|\\d{3})[._\\-]?(zero{4}|uno{4}|due{4}|tre{4}|quattro{4}|cinque{4}|sei{4}|sette{4}|otto{4}|nove{4}|\\d{4})";

...and also this one:

"\\+?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{1,3})?[._\\-]?\\(?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{3})\\)?[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{3})[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{4})";

Thanks in advance!

Answer 1

Score: 1

You need to make your repeated groups non-capturing, (?: ... ), and then wrap each one, together with its quantifier, in a capturing group. Something like:

"\\+?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){1,3}?)[._\\- ]?\\(?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3})\\)?[._\\- ]?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3})[._\\- ]?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){4})";

Reasoning: if a capturing group matches multiple times because a quantifier is applied to it, only the last iteration is captured.
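To see that rule in isolation, here is a minimal sketch using a deliberately tiny pattern instead of the phone regex. The class name LastIterationDemo and the helper firstGroup are made up for this illustration; only java.util.regex is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastIterationDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Quantifier applied to the capturing group: every iteration
        // overwrites the capture, so only the last digit survives.
        System.out.println(firstGroup("(\\d){3}", "123"));      // prints 3

        // Quantifier inside a capturing wrapper around a non-capturing
        // group: the whole three-digit run is captured.
        System.out.println(firstGroup("((?:\\d){3})", "123"));  // prints 123
    }
}
```

The same thing happens in the original pattern: each `(zero|uno|...|\\d){3}` group runs three times and keeps only its third iteration.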

Also, I've added a space to the delimiter character classes, so the parts of the number can be separated by spaces.
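As a rough end-to-end sketch, the corrected pattern can be run through the loop from the question. The class name PhoneGroups, the constants DIGIT and SEP, the helper groups, and the sample input are all assumptions made for this example; the pattern is assembled from pieces only for readability and is meant to be equivalent to the single-string regex above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneGroups {
    // One digit, written as a numeral or as an Italian word (non-capturing).
    static final String DIGIT =
        "(?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d)";
    // Optional single delimiter; the space is the addition from the answer.
    static final String SEP = "[._\\- ]?";

    // Each capturing group wraps the repeated non-capturing DIGIT group.
    static final Pattern PHONE = Pattern.compile(
        "\\+?(" + DIGIT + "{1,3}?)" + SEP
        + "\\(?(" + DIGIT + "{3})\\)?" + SEP
        + "(" + DIGIT + "{3})" + SEP
        + "(" + DIGIT + "{4})");

    // Collects the non-null groups of the first match, as in the question's loop.
    static List<String> groups(String text) {
        List<String> out = new ArrayList<>();
        Matcher match = PHONE.matcher(text);
        if (match.find()) {
            for (int i = 1; i <= match.groupCount(); i++) {
                if (match.group(i) != null) {
                    out.add(match.group(i));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample input mixing numerals and words.
        List<String> g = groups("call me at +39 (3tre3) 123 quattro567");
        for (int i = 0; i < g.size(); i++) {
            System.out.println("Group " + (i + 1) + ": " + g.get(i));
        }
    }
}
```

For this sample input the four groups come out as 39, 3tre3, 123 and quattro567. Note that the pattern allows at most one delimiter between the four blocks and none between the digits inside a block, so words separated by spaces within a block would not match.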

huangapple
  • Posted on April 19, 2023, 18:03:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76053216.html