Catastrophic backtracking error with any single character or number?
Question
First of all, I know the title is not as objective as it should be. I don't understand why the error below occurs with the Python "flavor" on the regex101 website.
Just to explain what I'm trying to do, I have to match any number after "item", followed by everything until "consumo estimado".
Regex:
^item\s*(\d{0,})(.*?)consumo
Example text:
> ITEM 1 – AGULHA DE PUNÇÃO
Agulha de punção 18 ga x 70 mm
Consumo Estimado Anual: 284
Ampla Participação
> ITEM 2 - CATETER ANGIOGRAFICO PIGTAIL
Cateter angiográfico diagnóstico pigtail 5f x 100 cm
Consumo Estimado Anual: 210
Ampla Participação
> ITEM 3 – Próteses Vasculares Dracon Reta 80 Cm
PROTESES VASCULARES ANELADA - Enxerto vascular reto constituído
em politetrafluoretileno (PTFE) extrudado e expandido construído com
suporte externo anelado que aumentam a resistência mecânica.
Tamanho
aproximado 8mm (diâmetro) x 70 -80 cm (comprimento)
Consumo Estimado Anual: 34
Ampla Participação
But after entering the word "consumo" followed by a space, I can't add anything else without getting "catastrophic backtracking".
Example Regex with error:
^item\s*(\d{0,})(.*?)consumo e
^item\s*(\d{0,})(.*?)consumo 1
The solution was to use .*? to capture everything between "consumo" and "estimado", which worked properly.
^item\s*(\d{0,})(.*?)consumo.*?estimado
Why is this error occurring? I couldn't find any explanation for it.
I already have a solution for the problem, but I just want to know why the error happened.
https://regex101.com/r/uqm7ra/1
Edit 1:
As suggested, I have added the link to the current saved regex with the problem.
Edit 2:
As suggested, I have also tried to follow the "meta" guidance when asking questions here on Stack Overflow. Thanks for the advice!
I hope the question is better now.
Answer 1
Score: 0
`\d{0,}` looks iffy, the regex engine will retry with fewer and fewer digits which can be catastrophic. Anchor it with `(\D.*?)?consumo` to prevent that.
Also, if you want a number, you mean `{1,}` (or the more idiomatic and brief `+`; similarly, `{0,}` is customarily written `*`).
^item\s*(\d+)(\D.*?)?consumo
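For anyone who wants to try this outside regex101, here is a minimal sketch of the pattern above in Python's `re` module, run against the first two items of the sample text from the question. The flags (`re.I`, `re.M`, `re.S`) are an assumption on my part: the sample is upper case and "Consumo" sits on a later line than "ITEM", so the pattern needs case-insensitive matching, `^` at each line start, and `.` matching newlines. The variable names are just for illustration.

```python
import re

# Sample lines copied from the question (first two items only).
text = """ITEM 1 – AGULHA DE PUNÇÃO
Agulha de punção 18 ga x 70 mm
Consumo Estimado Anual: 284
Ampla Participação
ITEM 2 - CATETER ANGIOGRAFICO PIGTAIL
Cateter angiográfico diagnóstico pigtail 5f x 100 cm
Consumo Estimado Anual: 210
Ampla Participação
"""

# Assumed flags (not stated in the question):
#   re.I -> "item" matches "ITEM", "consumo" matches "Consumo"
#   re.M -> "^" matches at the start of every line
#   re.S -> "." also matches newlines, so the lazy group can span lines
pattern = re.compile(r"^item\s*(\d+)(\D.*?)?consumo", re.I | re.M | re.S)

# Collect (item number, text between the number and "consumo") pairs.
# Group 2 is optional, so fall back to an empty string if it did not take part.
matches = [
    (m.group(1), (m.group(2) or "").strip())
    for m in pattern.finditer(text)
]
numbers = [number for number, _ in matches]

for number, description in matches:
    print(number, repr(description))
```

Each tuple holds the item number and the text that follows it up to the next "consumo".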