Java regex returns only single match
Question
I have a file with the following content:
~LayerData
type="waypointlist"
type="waypointlistend"
type="track" name="Track1" color=#695cbb
type="trackpoint" latitude="43.5032064" longitude="16.4266248"
type="trackpoint" latitude="43.5071074767561" longitude="16.48329290000057"
type="trackend"
~EndLayerData
~LayerData
type="waypointlist"
type="waypointlistend"
type="track" name="Track2" color=#000000
type="trackpoint" latitude="43.51037193515589" longitude="16.491883500895977"
type="trackpoint" latitude="43.521582832754135" longitude="16.473187288140295"
type="trackend"
~EndLayerData
I'm extracting the blocks between ~LayerData and ~EndLayerData using:
Pattern p = Pattern.compile("(~LayerData(.|\n)*~EndLayerData)");
Matcher m = p.matcher(s);
As a result, m.group() gives me three items: the first two are identical and contain the complete file, and the last one is "\n". I expected to get the Track1 and Track2 blocks separately.
Answer 1
Score: 1
You could match ~LayerData followed by all lines that do not start with either ~LayerData or ~EndLayerData, using a negative lookahead.
^~LayerData(?:\R(?!~(?:End)?LayerData).*)*\R~EndLayerData
Explanation
^~LayerData
    Match ~LayerData at the start of a line (the example below compiles the pattern with Pattern.MULTILINE)
(?:
    Non-capturing group
\R(?!~(?:End)?LayerData)
    Match a newline and assert that what is directly to the right is not ~EndLayerData or ~LayerData
.*
    Match the rest of the line
)*
    Close the group and repeat it 0 or more times to consume all lines of the block
\R~EndLayerData
    Match a newline and ~EndLayerData
In Java, with the backslashes doubled inside the string literal:
String regex = "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData";
Example code
String regex = "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData";
String string = "...";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(0));
}
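To try the pattern end to end, here is a minimal self-contained sketch. The class name LayerBlocks, the helper extractBlocks and the shortened SAMPLE string are made up for illustration (the sample keeps only one trackpoint per track from the question's data); only the regex itself comes from the answer above. Note that \R needs Java 8 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LayerBlocks {
    // Hypothetical sample: the question's data, shortened to one trackpoint per track.
    static final String SAMPLE = String.join("\n",
            "~LayerData",
            "type=\"track\" name=\"Track1\" color=#695cbb",
            "type=\"trackpoint\" latitude=\"43.5032064\" longitude=\"16.4266248\"",
            "type=\"trackend\"",
            "~EndLayerData",
            "~LayerData",
            "type=\"track\" name=\"Track2\" color=#000000",
            "type=\"trackpoint\" latitude=\"43.51037193515589\" longitude=\"16.491883500895977\"",
            "type=\"trackend\"",
            "~EndLayerData");

    // MULTILINE lets ^ match at the start of every line, not only at the start of the input.
    private static final Pattern BLOCK = Pattern.compile(
            "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData",
            Pattern.MULTILINE);

    // Collects every ~LayerData ... ~EndLayerData block as a separate string.
    static List<String> extractBlocks(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK.matcher(input);
        while (matcher.find()) {
            blocks.add(matcher.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<String> blocks = extractBlocks(SAMPLE);
        System.out.println(blocks.size() + " blocks found");
        for (String block : blocks) {
            System.out.println("-----");
            System.out.println(block);
        }
    }
}
```

With this sample the loop finds two blocks, one containing Track1 and one containing Track2, because the lookahead stops each block before the next line that starts with ~LayerData or ~EndLayerData.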
Answer 2
Score: 0
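Try this pattern, which makes the quantifier lazy (*? instead of *) so that each match stops at the first ~EndLayerData: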
(~LayerData(.|\n)*?~EndLayerData)
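The difference between the greedy and the lazy quantifier can be checked with a small sketch like the one below. The class name LazyVsGreedy, the helper countMatches and the two-block SAMPLE string are assumptions made for this illustration. The third pattern, which uses the (?s) DOTALL flag so that . also matches newlines, is not from the answer; it is shown only as a commonly used alternative to (.|\n).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyVsGreedy {
    // Hypothetical, heavily shortened stand-in for the file in the question.
    static final String SAMPLE =
            "~LayerData\nname=\"Track1\"\n~EndLayerData\n"
          + "~LayerData\nname=\"Track2\"\n~EndLayerData\n";

    // Counts how many separate matches a pattern produces on the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Greedy: runs to the last ~EndLayerData, so both blocks end up in one match.
        System.out.println(countMatches("~LayerData(.|\\n)*~EndLayerData", SAMPLE));   // prints 1
        // Lazy: stops at the first ~EndLayerData, so each block is its own match.
        System.out.println(countMatches("~LayerData(.|\\n)*?~EndLayerData", SAMPLE));  // prints 2
        // Alternative (not from the answer): DOTALL flag instead of the (.|\n) alternation.
        System.out.println(countMatches("(?s)~LayerData.*?~EndLayerData", SAMPLE));    // prints 2
    }
}
```

One possible reason to prefer the DOTALL variant on large files is that a repeated group containing an alternation can recurse deeply in java.util.regex; whether that matters depends on the input size.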
Answer 3
Score: 0
Update:
Use the Code Generator under Tools on regex101 to get a language-specific regex:
String regex = "\\~LayerData(.|\\n)*?\\~EndLayerData";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(0));
}
Earlier Answer:
You are not getting the expected matches because the quantifier in your regex is greedy. It matches everything from the first "~LayerData" to the last "~EndLayerData", so the whole file is matched at once. Building the regex on regex101.com (which helps visualize the match) and using that should fix the issue.
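The three items reported in the question can be reproduced with a sketch along these lines. The class name GreedyGroups, the helper groupsOfFirstMatch and the shortened SAMPLE string are invented for the illustration; the regex is the one from the question. The three items appear to be the capture groups of a single match rather than three matches: group 0 is the whole match, group 1 is the outer parentheses, and group 2 is the last character captured by the repeated inner group (.|\n).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyGroups {
    // Hypothetical, shortened input with two blocks and no trailing newline.
    static final String SAMPLE =
            "~LayerData\nname=\"Track1\"\n~EndLayerData\n"
          + "~LayerData\nname=\"Track2\"\n~EndLayerData";

    // The greedy pattern from the question.
    static final String QUESTION_REGEX = "(~LayerData(.|\\n)*~EndLayerData)";

    // Returns group 0..groupCount() of the first match, or an empty array if nothing matches.
    static String[] groupsOfFirstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return new String[0];
        }
        String[] groups = new String[matcher.groupCount() + 1];
        for (int i = 0; i <= matcher.groupCount(); i++) {
            groups[i] = matcher.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = groupsOfFirstMatch(QUESTION_REGEX, SAMPLE);
        System.out.println(groups.length);            // prints 3
        System.out.println(groups[0].equals(SAMPLE)); // prints true: group 0 spans the whole input
        System.out.println(groups[1].equals(SAMPLE)); // prints true: group 1 is the outer parentheses
        System.out.println(groups[2].equals("\n"));   // prints true: last character taken by (.|\n)
    }
}
```

To get Track1 and Track2 separately, the pattern has to stop at the first ~EndLayerData, and the results are then read by calling find() in a loop instead of iterating over the groups of one match.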