How to extract the line with a syntax error when parsing PlSQL using Antlr4

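Question

I am using the PlSql grammar file from this GitHub repository. When a PL/SQL file that I parse contains a syntax error, I want to print the offending line and underline the offending token. I have the following snippet to do so: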

// Nested inside an enclosing class; needs: import org.antlr.v4.runtime.*;
public static class UnderlineListener extends BaseErrorListener {

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer,
                            Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg,
                            RecognitionException e)
    {
        System.err.println("line " + line + ":" + charPositionInLine + " " + msg);
        underlineError(recognizer, (Token)offendingSymbol,
                       line, charPositionInLine);
    }

    // Prints the offending line, then a row of carets under the offending token.
    protected void underlineError(Recognizer<?, ?> recognizer,
                                  Token offendingToken, int line,
                                  int charPositionInLine) {
        // Assumes the listener is attached to a parser, whose input is a token stream.
        CommonTokenStream tokens =
            (CommonTokenStream)recognizer.getInputStream();
        // Full source text, as reported by the underlying character stream.
        String input = tokens.getTokenSource().getInputStream().toString();
        String[] lines = input.split("\n");
        String errorLine = lines[line - 1];   // ANTLR line numbers are 1-based
        System.err.println(errorLine);
        // Indent up to the column where the offending token starts.
        for (int i = 0; i < charPositionInLine; i++) System.err.print(" ");
        int start = offendingToken.getStartIndex();
        int stop = offendingToken.getStopIndex();
        if (start >= 0 && stop >= 0) {
            for (int i = start; i <= stop; i++) System.err.print("^");
        }
        System.err.println();
    }
}
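
The underlining step itself does not depend on ANTLR, so it can be isolated and checked on its own. Below is a minimal sketch under that assumption: `UnderlineSketch` and `underline(...)` are hypothetical names, not part of the ANTLR runtime, and the arguments are the same values the listener above already receives (1-based line, 0-based column, inclusive token start/stop character indexes).

```java
// Sketch only: UnderlineSketch and underline(...) are hypothetical names, not ANTLR API.
final class UnderlineSketch {

    /**
     * Returns the offending line followed by a marker line.
     *
     * @param input              full source text
     * @param line               1-based line number of the error
     * @param charPositionInLine 0-based column where the offending token starts
     * @param start              character index of the token's first character
     * @param stop               character index of the token's last character (inclusive)
     */
    static String underline(String input, int line, int charPositionInLine,
                            int start, int stop) {
        // Split on both Unix and Windows line endings; keep trailing empty lines.
        String[] lines = input.split("\r?\n", -1);
        if (line < 1 || line > lines.length) {
            return ""; // nothing sensible to show for an out-of-range line
        }
        StringBuilder out = new StringBuilder(lines[line - 1]).append('\n');
        for (int i = 0; i < charPositionInLine; i++) {
            out.append(' ');
        }
        // One caret per character of the offending token.
        if (start >= 0 && stop >= start) {
            for (int i = start; i <= stop; i++) {
                out.append('^');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "SELECT 1\nFORM dual;";
        // "FORM" occupies character indexes 9..12, on line 2 at column 0.
        System.err.println(underline(sql, 2, 0, 9, 12));
    }
}
```

Returning a string instead of printing directly keeps the formatting testable; the listener would only need to pass the result to System.err.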

While this works fine in most cases, some languages, such as PlSql, need special handling for case sensitivity. This means I had to use CaseChangingCharStream as follows:

CharStream s = CharStreams.fromPath(Paths.get("test.sql"));
CaseChangingCharStream upper = new CaseChangingCharStream(s, true);
Lexer lexer = new SomeSQLLexer(upper);
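
As a rough illustration of what the wrapper does: it only changes the characters the lexer sees while matching, one character at a time, and leaves the text of the stream itself alone, so character offsets in the upper-cased view line up with the original file. The sketch below mimics that with plain strings; `UpperViewSketch` and `upperView` are hypothetical names, not part of the ANTLR runtime.

```java
// Sketch only: mimics the per-character upper-casing with plain strings.
final class UpperViewSketch {

    /** Upper-cases each char independently, so the result has the same length. */
    static String upperView(String original) {
        StringBuilder sb = new StringBuilder(original.length());
        for (int i = 0; i < original.length(); i++) {
            sb.append(Character.toUpperCase(original.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "Select * From dual";
        String seen = upperView(original);
        System.out.println(seen);                               // what the lexer matches against
        System.out.println(seen.length() == original.length()); // offsets stay aligned
    }
}
```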

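Now, when I try to get the input text inside my UnderlineListener using String input = tokens.getTokenSource().getInputStream().toString();, I do not get the actual text of test.sql. This is because getInputStream() returns the CaseChangingCharStream object, which does not give the actual text of test.sql.

How do I get the actual file text in this case? One option is to pass the file content to the constructor of UnderlineListener, but I would prefer to keep the approach above, since it also works when CaseChangingCharStream is not used.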

Answer 1

Score: 1

I have found a workaround. The current implementation of CaseChangingCharStream.java has no getter, such as getCharStream(), for its final CharStream stream; field. Adding one lets us access the underlying CharStream object as follows:

CaseChangingCharStream modifiedCharStream = (CaseChangingCharStream) tokens.getTokenSource().getInputStream();
String input = modifiedCharStream.getCharStream().toString();
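
To make the workaround concrete without depending on the ANTLR jar, here is a minimal self-contained sketch. Everything in it is a stand-in: `MiniCharStream`, `StringStream`, `UpperCasingStream`, and `originalText` are hypothetical names that only imitate the relevant behaviour (a wrapper that does not override toString() and gains a getCharStream() getter). The instanceof check is an added assumption so that the same lookup also works when no wrapper is used.

```java
// Stand-ins for illustration only; these are NOT the ANTLR classes.
interface MiniCharStream {
    int size();
    char charAt(int index);                // stand-in for the lexer's lookahead
    String getText(int start, int stop);   // inclusive bounds
}

/** Plain stream over a string; toString() returns the whole text. */
final class StringStream implements MiniCharStream {
    private final String text;
    StringStream(String text) { this.text = text; }
    @Override public int size() { return text.length(); }
    @Override public char charAt(int index) { return text.charAt(index); }
    @Override public String getText(int start, int stop) { return text.substring(start, stop + 1); }
    @Override public String toString() { return text; }
}

/** Wrapper that upper-cases lookahead; toString() is deliberately not overridden. */
final class UpperCasingStream implements MiniCharStream {
    private final MiniCharStream stream;
    UpperCasingStream(MiniCharStream stream) { this.stream = stream; }
    @Override public int size() { return stream.size(); }
    @Override public char charAt(int index) { return Character.toUpperCase(stream.charAt(index)); }
    @Override public String getText(int start, int stop) { return stream.getText(start, stop); }

    /** The getter the answer proposes adding. */
    MiniCharStream getCharStream() { return stream; }
}

final class GetterSketch {
    /** Returns the original text whether or not the stream is wrapped. */
    static String originalText(MiniCharStream in) {
        if (in instanceof UpperCasingStream) {
            return ((UpperCasingStream) in).getCharStream().toString();
        }
        return in.toString();
    }

    public static void main(String[] args) {
        MiniCharStream plain = new StringStream("Select 1 From dual");
        MiniCharStream wrapped = new UpperCasingStream(plain);
        System.out.println(wrapped);               // default Object.toString(), not the file text
        System.out.println(originalText(wrapped)); // original text via the getter
        System.out.println(originalText(plain));   // same helper works without the wrapper
    }
}
```

An alternative that may avoid patching the class: getText(Interval) is part of the CharStream interface, and the reference CaseChangingCharStream delegates it to the wrapped stream, so requesting Interval.of(0, size() - 1) from the stream returned by getInputStream() should yield the original text in both cases. Treat this as untested here.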

huangapple
  • Posted on August 27, 2020, 15:05:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63610809.html