Regex to find named capturing groups with Go programming language
Question
I'm looking for a regex to find named capturing groups in (other) regex strings.
Example: I want to find (?P<country>m((a|b).+)n), (?P<city>.+) and (?P<street>(5|6)\. .+) in the following regex:
/(?P<country>m((a|b).+)n)/(?P<city>.+)/(?P<street>(5|6)\. .+)
I tried the following regex to find the named capturing groups:
var subGroups string = `(\(.+\))*?`
var prefixedSubGroups string = `.+` + subGroups
var postfixedSubGroups string = subGroups + `.+`
var surroundedSubGroups string = `.+` + subGroups + `.+`
var capturingGroupNameRegex *regexp.Regexp = regexp.MustCompile(
`(?U)` +
`\(\?P<.+>` +
`(` + prefixedSubGroups + `|` + postfixedSubGroups + `|` + surroundedSubGroups + `)` +
`\)`)
The (?U) flag makes greedy quantifiers (+ and *) non-greedy, and non-greedy quantifiers (*?) greedy. Details are in the Go regex documentation.
But it doesn't work because the parentheses are not matched correctly.
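To illustrate what the (?U) flag described above does on its own, here is a minimal sketch. The helper name firstMatch and the sample input are made up for this illustration; only regexp.MustCompile and FindString come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper for this illustration: it compiles
// pattern and returns the leftmost match in input ("" if there is none).
func firstMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	input := "(a)(b)"

	// Default: .+ is greedy and runs to the last closing parenthesis.
	fmt.Println(firstMatch(`\(.+\)`, input)) // (a)(b)

	// With (?U), .+ becomes non-greedy and stops at the first one.
	fmt.Println(firstMatch(`(?U)\(.+\)`, input)) // (a)

	// With (?U), the explicitly non-greedy .+? becomes greedy again.
	fmt.Println(firstMatch(`(?U)\(.+?\)`, input)) // (a)(b)
}
```

Neither setting makes the engine pair an opening parenthesis with its own closing one; it only changes how far a quantifier reaches.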
Answer 1
Score: 7
Matching arbitrarily nested parentheses correctly is not possible with regular expressions, because arbitrary (recursive) nesting cannot be described by a regular language.
Some modern regex flavors do support recursion (Perl, PCRE) or balanced matching (.NET), but Go's is not one of them: the docs explicitly say that Perl's (?R) construct is not supported by the RE2 library that Go's regex package appears to be based on. You need to build a recursive descent parser, not a regex.
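As a rough sketch of the non-regex approach, the following hand-written scanner tracks parenthesis depth explicitly. It is a simpler stand-in for a full recursive descent parser, not a complete regex parser. The function name findNamedGroups is made up for this example, and several things are assumptions: only the (?P<name>...) spelling is recognized, only outermost named groups are reported (a named group nested inside another is returned as part of its parent), and \Q...\E quoting and POSIX classes such as [[:alpha:]] are not handled specially.

```go
package main

import (
	"fmt"
	"strings"
)

// findNamedGroups returns the source text of every outermost named capturing
// group "(?P<name>...)" in pattern. It counts parenthesis depth by hand,
// skipping escaped characters and the contents of character classes.
// Hypothetical helper written for illustration only.
func findNamedGroups(pattern string) []string {
	var groups []string
	depth := 0       // current parenthesis nesting depth
	start := -1      // index where the current named group began, or -1
	startDepth := 0  // depth at which that named group was opened
	inClass := false // true while inside [...]

	for i := 0; i < len(pattern); i++ {
		switch c := pattern[i]; {
		case c == '\\':
			i++ // skip the escaped character, e.g. \( or \]
		case inClass:
			if c == ']' {
				inClass = false
			}
		case c == '[':
			inClass = true
			// A leading ^ and a ] directly after it are part of the class.
			if i+1 < len(pattern) && pattern[i+1] == '^' {
				i++
			}
			if i+1 < len(pattern) && pattern[i+1] == ']' {
				i++
			}
		case c == '(':
			depth++
			if start < 0 && strings.HasPrefix(pattern[i:], "(?P<") {
				start, startDepth = i, depth
			}
		case c == ')':
			if start >= 0 && depth == startDepth {
				groups = append(groups, pattern[start:i+1])
				start = -1
			}
			depth--
		}
	}
	return groups
}

func main() {
	pattern := `/(?P<country>m((a|b).+)n)/(?P<city>.+)/(?P<street>(5|6)\. .+)`
	for _, g := range findNamedGroups(pattern) {
		fmt.Println(g)
	}
	// Prints:
	// (?P<country>m((a|b).+)n)
	// (?P<city>.+)
	// (?P<street>(5|6)\. .+)
}
```

The depth counter is what a regular expression cannot express: it remembers how many groups are still open, so the closing parenthesis that ends a named group is found no matter how deeply its body nests.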