Extract specific token out of ANTLR Parse Tree

Question

I'm trying to extract data from an ANTLR parse tree, but I don't fully grasp how this should be done correctly.

Let's say I have the following two SQL queries:

	// language=SQL
	val sql3 = """
	CREATE TABLE session(
	    id           uuid                    not null
	        constraint account_pk
	            primary key,
	    created     timestamp   default now() not null
	)
	""".trimIndent()

	// language=SQL
	val sql4 = """
	CREATE TABLE IF NOT EXISTS blah(
	    id           uuid                    not null
	        constraint account_pk
	            primary key,
	    created     timestamp   default now() not null
	)
	""".trimIndent()

Now I parse both of them:

	val visitor = Visitor()
	listOf(sql3, sql4).forEach { sql ->
	    val lexer = SQLLexer(CharStreams.fromString(sql))
	    val parser = SQLParser(CommonTokenStream(lexer))

	    visitor.visit(parser.sql())
	    println(visitor.tableName)
	}

In my visitor, when I visit the create table statement, I get the parse tree. Simply grabbing child 1 works for sql3 but not for sql4, because child 1 in sql4 is IF NOT EXISTS.

	class Visitor : SQLParserBaseVisitor<Unit>() {

	    var tableName = ""

	    override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
	        tableName = ctx?.getChild(1)?.text ?: ""
	        super.visitCreate_table_statement(ctx)
	    }

	}

Is there a way to find a specific token in the parse tree?

I'm assuming the payload has something to do with it, but since it is of type Any, I'm not sure what to check it against:

	override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
		ctx?.children?.forEach {
			if (it.payload.javaClass == SQLParser::Schema_qualified_nameContext) {
				tableName = it.text
			}
		}
		super.visitCreate_table_statement(ctx)
	}

EDIT: the .g4 files are from
https://github.com/pgcodekeeper/pgcodekeeper/tree/master/apgdiff/antlr-src

Answer 1

Score: 0

This seems to work: compare each child's payload class against the generated context class.

	override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
		ctx?.children?.forEach {
			if (it.payload.javaClass == Schema_qualified_nameContext::class.java) {
				tableName = it.text
			}
		}
		super.visitCreate_table_statement(ctx)
	}
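
A possible alternative to comparing payload classes is `ParserRuleContext.getRuleContext(Class, Int)` from the ANTLR runtime, which returns the i-th child of a given context type, or null if there is none. The sketch below is only an illustration: to stay self-contained it builds a tiny tree by hand, and `CreateTableContext`, `IfNotExistsContext`, `QualifiedNameContext`, `token`, `sampleStatement` and `tableNameOf` are hypothetical names, not part of the generated parser. With the real grammar you would presumably pass `SQLParser.Schema_qualified_nameContext::class.java` instead. ANTLR also normally generates an accessor method for each rule referenced inside a rule, so a typed getter may be available on the context, depending on how the grammar is written.

```kotlin
import org.antlr.v4.runtime.CommonToken
import org.antlr.v4.runtime.ParserRuleContext
import org.antlr.v4.runtime.tree.TerminalNode
import org.antlr.v4.runtime.tree.TerminalNodeImpl

// Stand-ins for generated context classes (hypothetical names for this sketch only).
class CreateTableContext : ParserRuleContext()
class IfNotExistsContext : ParserRuleContext()
class QualifiedNameContext : ParserRuleContext()

// Wraps a piece of text in a terminal node; token type 1 is an arbitrary placeholder.
fun token(text: String): TerminalNode = TerminalNodeImpl(CommonToken(1, text))

// Hand-builds "TABLE [IF NOT EXISTS] <name>" the way a parser might nest it.
fun sampleStatement(name: String, ifNotExists: Boolean): CreateTableContext {
    val stmt = CreateTableContext()
    stmt.addChild(token("TABLE"))
    if (ifNotExists) {
        val clause = IfNotExistsContext()
        listOf("IF", "NOT", "EXISTS").forEach { clause.addChild(token(it)) }
        stmt.addChild(clause)
    }
    val nameCtx = QualifiedNameContext()
    nameCtx.addChild(token(name))
    stmt.addChild(nameCtx)
    return stmt
}

// Looks the name up by context type, so the child index no longer matters.
fun tableNameOf(ctx: ParserRuleContext): String =
    ctx.getRuleContext(QualifiedNameContext::class.java, 0)?.text ?: ""

fun main() {
    println(tableNameOf(sampleStatement("session", ifNotExists = false)))
    println(tableNameOf(sampleStatement("blah", ifNotExists = true)))
}
```

Because the lookup is by type rather than by position, an optional clause in front of the name does not shift the result.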

For branching trees, walk down to the leaves:

	fun walkLeaves(
	    childTree: ParseTree = internalTree,
	    leave: (childTree: ParseTree) -> Unit
	) {

	    if (childTree.childCount == 0) {
	        if (!childTree.text?.trim().isNullOrBlank()) {
	            leave(childTree)
	        }
	    } else {
	        for (i in 0 until childTree.childCount) {
	            walkLeaves(childTree = childTree.getChild(i), leave = leave)
	        }
	    }
	}

	fun extractSQL(
	    childTree: ParseTree,
	    tokens: MutableList<String> = mutableListOf()
	): String {

	    walkLeaves(childTree = childTree) { leave ->
	        tokens.add(leave.text)
	    }

	    ...

	}
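
As a rough, self-contained variation on the leaf walk above, the sketch below collects leaf texts into a list instead of using a callback, and checks for `TerminalNode` rather than a child count of zero. It is not the elided body of `extractSQL`; `leafTexts` and `demoTree` are hypothetical names, the tree is built by hand instead of by the generated parser, and token type 1 is an arbitrary placeholder. Skipping the EOF token is an added assumption about what is usually wanted.

```kotlin
import org.antlr.v4.runtime.CommonToken
import org.antlr.v4.runtime.ParserRuleContext
import org.antlr.v4.runtime.Token
import org.antlr.v4.runtime.tree.ParseTree
import org.antlr.v4.runtime.tree.TerminalNode
import org.antlr.v4.runtime.tree.TerminalNodeImpl

// Collects the text of every non-blank terminal node, left to right, skipping EOF.
fun leafTexts(tree: ParseTree, out: MutableList<String> = mutableListOf()): List<String> {
    if (tree is TerminalNode) {
        if (tree.symbol.type != Token.EOF && tree.text.isNotBlank()) {
            out.add(tree.text)
        }
    } else {
        for (i in 0 until tree.childCount) {
            leafTexts(tree.getChild(i), out)
        }
    }
    return out
}

// Hand-built stand-in for a parse tree: root -> TABLE, (name -> blah), EOF.
fun demoTree(): ParserRuleContext {
    val name = ParserRuleContext()
    name.addChild(TerminalNodeImpl(CommonToken(1, "blah")))

    val root = ParserRuleContext()
    root.addChild(TerminalNodeImpl(CommonToken(1, "TABLE")))
    root.addChild(name)
    root.addChild(TerminalNodeImpl(CommonToken(Token.EOF, "<EOF>")))
    return root
}

fun main() {
    println(leafTexts(demoTree()).joinToString(" "))
}
```

Returning a list keeps the traversal separate from whatever is done with the tokens afterwards, such as joining them with spaces.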

huangapple
  • Posted on September 12, 2020 at 18:41:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63859469.html