Extract specific token out of ANTLR Parse Tree
Question
I'm trying to extract data from an ANTLR parse tree, but I don't fully grasp how this should be done correctly.
Let's say I have the following two SQL queries:
// language=SQL
val sql3 = """
    CREATE TABLE session(
        id           uuid                    not null
            constraint account_pk
                primary key,
        created     timestamp   default now() not null
    )
""".trimIndent()
// language=SQL
val sql4 = """
    CREATE TABLE IF NOT EXISTS blah(
        id           uuid                    not null
        constraint account_pk
            primary key,
        created     timestamp   default now() not null
    )
""".trimIndent()
Now I parse both of them:
val visitor = Visitor()
listOf(sql3, sql4).forEach { sql ->
    val lexer = SQLLexer(CharStreams.fromString(sql))
    val parser = SQLParser(CommonTokenStream(lexer))
    visitor.visit(parser.sql())
    println(visitor.tableName)
}
In my visitor, if I visit the create-table statement (visitCreate_table_statement), I get the parse tree. Just grabbing child 1 works for sql3, but not for sql4, since child 1 in sql4 is IF NOT EXISTS:
class Visitor : SQLParserBaseVisitor<Unit>() {
    var tableName = ""
    override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
        tableName = ctx?.getChild(1)?.text ?: ""
        super.visitCreate_table_statement(ctx)
    }
}
Is there a way to find a specific token in the parse tree?
I'm assuming the payload has something to do with it, but since it's of type Any, I'm not sure what to check it against:
override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
    ctx?.children?.forEach {
        if (it.payload.javaClass == SQLParser::Schema_qualified_nameContext) {
            tableName = it.text
        }
    }
    super.visitCreate_table_statement(ctx)
}
EDIT: the .g4 files are from https://github.com/pgcodekeeper/pgcodekeeper/tree/master/apgdiff/antlr-src
Answer 1
Score: 0
This seems to work:
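override fun visitCreate_table_statement(ctx: SQLParser.Create_table_statementContext?) {
    // For a rule context, payload is the context object itself,
    // so this compares each child's class with the wanted rule class.
    ctx?.children?.forEach {
        if (it.payload.javaClass == SQLParser.Schema_qualified_nameContext::class.java) {
            tableName = it.text
        }
    }
    super.visitCreate_table_statement(ctx)
}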
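If only the first direct child of a given rule type is needed, the ANTLR runtime's `ParserRuleContext.getRuleContext(Class, int)` performs a similar type-based lookup. The following is a minimal sketch, not the grammar's actual classes: `firstChildOfType`, `IfNotExistsCtx`, `NameCtx` and `buildDemoTree` are hypothetical names that stand in for the generated `SQLParser` contexts, so the snippet needs only the ANTLR runtime on the classpath.

```kotlin
import org.antlr.v4.runtime.CommonToken
import org.antlr.v4.runtime.ParserRuleContext
import org.antlr.v4.runtime.tree.TerminalNodeImpl

// Hypothetical helper (not part of ANTLR): first direct child of rule type T, or null.
inline fun <reified T : ParserRuleContext> ParserRuleContext.firstChildOfType(): T? =
    getRuleContext(T::class.java, 0)

// Stand-ins for generated context classes; the real ones come from SQLParser.
class IfNotExistsCtx(parent: ParserRuleContext) : ParserRuleContext(parent, -1)
class NameCtx(parent: ParserRuleContext) : ParserRuleContext(parent, -1)

// Builds a tiny hand-made tree with an optional child in front of the name child.
fun buildDemoTree(): ParserRuleContext {
    val create = ParserRuleContext()
    val ifNotExists = IfNotExistsCtx(create)
    ifNotExists.addChild(TerminalNodeImpl(CommonToken(1, "IF"))) // token type 1 is arbitrary
    val name = NameCtx(create)
    name.addChild(TerminalNodeImpl(CommonToken(2, "blah"))) // token type 2 is arbitrary
    create.addChild(ifNotExists)
    create.addChild(name)
    return create
}

fun main() {
    // The lookup is by type, so the optional child in front does not matter.
    println(buildDemoTree().firstChildOfType<NameCtx>()?.text) // prints "blah"
}
```

In the question's visitor this would read `tableName = ctx?.firstChildOfType<SQLParser.Schema_qualified_nameContext>()?.text ?: ""`. ANTLR also generates a typed accessor for each rule referenced in a rule body (here presumably `ctx.schema_qualified_name()`), which is worth checking in the generated `SQLParser` before writing a helper.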
For branching trees, walk down to the leaves:
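// internalTree is not defined in this snippet; it is presumably a property of the enclosing class.
fun walkLeaves(
    childTree: ParseTree = internalTree,
    leave: (childTree: ParseTree) -> Unit
) {
    if (childTree.childCount == 0) {
        // A node without children is a leaf; skip leaves whose text is blank.
        if (!childTree.text?.trim().isNullOrBlank()) {
            leave(childTree)
        }
    } else {
        // Otherwise recurse into every child, left to right.
        for (i in 0 until childTree.childCount) {
            walkLeaves(childTree = childTree.getChild(i), leave = leave)
        }
    }
}
fun extractSQL(
    childTree: ParseTree,
    tokens: MutableList<String> = mutableListOf()
): String {
    // Collect the text of every non-blank leaf.
    walkLeaves(childTree = childTree) { leave ->
        tokens.add(leave.text)
    }
    ...
}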
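To pick out one specific token rather than every leaf, the collected leaves can be filtered by token type. This is a rough sketch under stated assumptions, separate from the elided `extractSQL` body above: `terminals`, `buildLeafDemo` and the constants `CREATE`, `TABLE` and `IDENT` are hypothetical, and with a real grammar the token types would be the constants on the generated lexer.

```kotlin
import org.antlr.v4.runtime.CommonToken
import org.antlr.v4.runtime.ParserRuleContext
import org.antlr.v4.runtime.tree.ParseTree
import org.antlr.v4.runtime.tree.TerminalNode
import org.antlr.v4.runtime.tree.TerminalNodeImpl

// Hypothetical token types, used only for this demo.
const val CREATE = 1
const val TABLE = 2
const val IDENT = 3

// Hypothetical helper: all terminal nodes under `tree`, depth-first and left to right.
fun terminals(tree: ParseTree): List<TerminalNode> =
    if (tree is TerminalNode) listOf<TerminalNode>(tree)
    else (0 until tree.childCount).flatMap { terminals(tree.getChild(it)) }

// Builds a small hand-made tree: CREATE TABLE (name: blah)
fun buildLeafDemo(): ParserRuleContext {
    val root = ParserRuleContext()
    root.addChild(TerminalNodeImpl(CommonToken(CREATE, "CREATE")))
    root.addChild(TerminalNodeImpl(CommonToken(TABLE, "TABLE")))
    val name = ParserRuleContext(root, -1)
    name.addChild(TerminalNodeImpl(CommonToken(IDENT, "blah")))
    root.addChild(name)
    return root
}

fun main() {
    // Keep only the terminals whose token type is IDENT.
    val idents = terminals(buildLeafDemo())
        .filter { it.symbol.type == IDENT }
        .map { it.text }
    println(idents) // prints [blah]
}
```

Checking `tree is TerminalNode` separates tokens from rule contexts explicitly, whereas the `childCount == 0` test also matches rule contexts that happen to be empty.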