Why does my RegEx give me only one element instead of the complete group in Java?
Question
I am new here and this is my first post. I'm also new to Java and RegEx. So...
I am working in Java and writing a RegEx to match phone numbers in a chat, written in digits and/or in words, with this format: +XXX (XXX) XXX XXXX
This is my regex:
"\\+?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){1,3}?[._\\-]?\\(?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3}\\)?[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3}[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){4}";
Then I use the method .group() to print the complete match, and that works fine. But if I try to print each group separately with this loop:
for (int i = 1; i <= match.groupCount(); i++) {
    if (match.group(i) != null) {
        System.out.println("Group " + i + ": " + match.group(i));
    }
}
then it prints only one digit (or word) of every group.
Why is that? And what do I have to do to print every element of every group?
I've also tried this regex:
"\\+?(zero{1,3}|uno{1,3}|due{1,3}|tre{1,3}|quattro{1,3}|cinque{1,3}|sei{1,3}|sette{1,3}|otto{1,3}|nove{1,3}|\\d{1,3})?[._\\-]?\\(?(zero{3}|uno{3}|due{3}|tre{3}|quattro{3}|cinque{3}|sei{3}|sette{3}|otto{3}|nove{3}|\\d{3})\\)?[._\\-]?(zero{3}|uno{3}|due{3}|tre{3}|quattro{3}|cinque{3}|sei{3}|sette{3}|otto{3}|nove{3}|\\d{3})[._\\-]?(zero{4}|uno{4}|due{4}|tre{4}|quattro{4}|cinque{4}|sei{4}|sette{4}|otto{4}|nove{4}|\\d{4})";
...and also this one:
"\\+?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{1,3})?[._\\-]?\\(?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{3})\\)?[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{3})[._\\-]?(zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d{4})";
Thanks in advance!
Answer 1
Score: 1
You need to make your groups non-capturing, (?: ... ), and then wrap them in capturing ones. Something like:
"\\+?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){1,3}?)[._\\- ]?\\(?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3})\\)?[._\\- ]?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){3})[._\\- ]?((?:zero|uno|due|tre|quattro|cinque|sei|sette|otto|nove|\\d){4})";
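Reasoning: if a capturing group matches multiple times because of a quantifier, only its last occurrence is captured.
Also, I've added a space to the character classes used for the delimiters, so that numbers separated by spaces match too.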
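The difference can be seen in a small sketch. It uses a shortened alternation (uno|due|tre|\\d) instead of the full list of digit words, and the class and method names (GroupCaptureDemo, firstGroup) are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCaptureDemo {
    // Shortened digit alternation, for illustration only (assumption: the
    // full list of Italian digit words behaves the same way).
    static final String DIGIT = "uno|due|tre|\\d";

    // Quantifier applied to the capturing group itself:
    // group 1 is overwritten on every repetition and keeps only the last one.
    static final Pattern REPEATED = Pattern.compile("(" + DIGIT + "){3}");

    // Non-capturing group repeated inside a single capturing group:
    // group 1 spans all three repetitions.
    static final Pattern WRAPPED = Pattern.compile("((?:" + DIGIT + "){3})");

    // Returns group 1 of the first match, or null if there is no match.
    static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "unodue3";
        System.out.println(firstGroup(REPEATED, input)); // prints "3"
        System.out.println(firstGroup(WRAPPED, input));  // prints "unodue3"
    }
}
```

Both patterns match the same text overall; they differ only in what group(1) holds afterwards.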