Getting Parse Tree of a Regex in Go

Question

I tried using the regexp/syntax package to access the individual tokens of a parsed regular expression, without success: the only thing I'm able to output is a simplified/optimized version of the regex.

Code:

package main

import (
	"fmt"
	"regexp/syntax"
)

func main() {
	p, e := syntax.Parse(`[0120-2]@[ab][0-9]`, 'i')

	fmt.Println(p)
	fmt.Println(e)
}

Output:

[0-2](?i:@)[A-Ba-b][0-9]
<nil>

Can someone give me a simple example of how to traverse and output its parse tree?

Answer 1

Score: 4

Parse is the right function to call. When you call fmt.Println(p), the parse tree is converted back to a string, which is why the output you see is just an equivalent regexp.

Parse returns a pointer to a syntax.Regexp struct. To traverse the parse tree, look at the Sub field of that struct, which lists the subexpressions as a slice of pointers to syntax.Regexp structs. For example:

// printSummary prints a node and its number of subexpressions,
// then recurses into each child.
func printSummary(r *syntax.Regexp) {
    fmt.Printf("%v has %d sub expressions\n", r, len(r.Sub))
    for i, s := range r.Sub {
        fmt.Printf("Child %d:\n", i)
        printSummary(s)
    }
}

See the regexp/syntax package reference for other fields worth inspecting; Op (the node type) and Rune (the matched runes for literals and character classes) are the main ones.

huangapple
  • Posted on December 14, 2013, 17:16:29
  • When reposting, please keep this link: https://go.coder-hub.com/20581527.html