英文:
Catastrophic Backtracking regular expression example in Java 9+
问题
有人是否有一个在Java 11中运行的正则表达式catastrophic backtracking示例吗?大多数常见示例(比如"(a+a+)+b")已经自Java 9起得到修复。最好的例子是一个没有回溯引用的例子,不知道在JDK 9+中是否可能。
在我们的应用程序中,我们有一个控制这种回溯的逻辑,并且为了测试该逻辑,我们使用了表达式"(x+x+)+y"。升级到JDK 11后,它不再引起需要的行为。
英文:
Does anyone have an example of a catastrophic backtracking in a regular expression, which works in Java 11? Most of the usual examples (like "(a+a+)+b") are fixed since java 9. The best would be one without back-references, no idea if it's possible in JDK 9+.
In our application we have a logic to control such backtracking, and to test that logic we used the expression "(x+x+)+y". After upgrading to JDK 11 it no longer causes the need behavior.
答案1
得分: 0
以下是翻译好的内容:
我正在使用 Java 14,回溯问题仍然存在。以下操作大约需要6秒钟才会失败。
String regex = "(a+a+a+a+a+a+)+b";
String str = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac";
System.out.println(str.matches(regex));
可以通过使用独立组 (?>(X))
来避免这个问题。一旦组在第一次匹配时成功,引擎会检查接下来的内容是否匹配 b
。如果不匹配,无需重复,因为回溯不会改变这个结果。所以它会立即失败(即使针对更大的表达式和/或测试字符串)。
String regex1 = "(?>(a+a+a+a+a+a+)+)b";
System.out.println(str.matches(regex1));
英文:
I am running java 14 and the backtracking issue still exists. It takes the following about 6 seconds to fail.
String regex = "(a+a+a+a+a+a+)+b";
String str = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac";
System.out.println(str.matches(regex));
This can be prevented by using independent group(s) (?>X)
. Once the group matches the first time, the engine checks to see if the following matches the b
. If it doesn't, no need to repeat as backtracking will not change that outcome. So it fails instantly (even for larger expressions and/or test strings).
String regex1 = "(?>(a+a+a+a+a+a+)+)b";
System.out.println(str.matches(regex1));
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论