Antlr3: building parse tree for qualified names

Question


I couldn't find a question/answer that comes close to helping with my issue. Therefore, I am posting this question here.

I am trying to build a parse tree for qualified names. The example below shows what I mean.

E.g.,

  1. foo_boo.aaa.ccc1_c (image: https://i.stack.imgur.com/O5LAx.png)

Here I have dot-separated words. I am using antlr3 and below is my grammar.

parse
    :  expr
    ;


list_expr : <I removed the grammar here>
SimpleType : ('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
           ;

QualifiedType : SimpleType | SimpleType ('\.' SimpleType)+;


expr : list_expr
    | QualifiedType
    | union_expr;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = HIDDEN; } ;

Here, SimpleType is the rule for a single word. My requirement is to write the grammar for QualifiedType. The current rule given above does not work as expected (QualifiedType : SimpleType | SimpleType ('\.' SimpleType)+;). How do I write a correct grammar for qualified names (dot-separated words)?

Answer 1

Score: 0


Make QualifiedType a parser rule instead of a lexer rule:

qualifiedType : SimpleType ('.' SimpleType)*;

Also, '\.' does not need an escape: '.' is OK.
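To make the shape of that parser rule concrete, here is a minimal hand-written sketch in plain Java that accepts the same inputs as `SimpleType ('.' SimpleType)*`. It is not what ANTLR generates; the class name `QualifiedTypeSketch` and its methods are made up for illustration, and it ignores whitespace, which the real grammar sends to the hidden channel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a hand-written equivalent of
//   qualifiedType : SimpleType ('.' SimpleType)* ;
// It is not code generated by ANTLR and does not skip whitespace.
final class QualifiedTypeSketch {

    // Mirrors the first character set of the SimpleType lexer rule.
    static boolean isSimpleTypeStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Mirrors the second character set of SimpleType (digits allowed too).
    static boolean isSimpleTypePart(char c) {
        return isSimpleTypeStart(c) || (c >= '0' && c <= '9');
    }

    // Matches one SimpleType starting at pos and returns the index just past it.
    static int matchSimpleType(String input, int pos) {
        if (pos >= input.length() || !isSimpleTypeStart(input.charAt(pos))) {
            throw new IllegalArgumentException("expected SimpleType at index " + pos);
        }
        int end = pos + 1;
        while (end < input.length() && isSimpleTypePart(input.charAt(end))) {
            end++;
        }
        return end;
    }

    // One SimpleType, then zero or more ('.' SimpleType) pairs, then end of input.
    static List<String> qualifiedType(String input) {
        List<String> parts = new ArrayList<>();
        int pos = 0;
        int end = matchSimpleType(input, pos);
        parts.add(input.substring(pos, end));
        pos = end;
        while (pos < input.length() && input.charAt(pos) == '.') {
            end = matchSimpleType(input, pos + 1);
            parts.add(input.substring(pos + 1, end));
            pos = end;
        }
        if (pos != input.length()) {
            throw new IllegalArgumentException("unexpected character at index " + pos);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(qualifiedType("foo_boo.aaa.ccc1_c")); // prints [foo_boo, aaa, ccc1_c]
    }
}
```

The point of the sketch is the split of responsibilities: words are recognized character by character (the lexer's job), while the dot-separated sequence is recognized over whole words (the parser's job).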

EDIT

You'll have to set the output to AST and apply some tree rewrite rules to make it work properly. Here's a quick demo:

grammar T;

options {
  output=AST;
}

tokens {
  Root;
  QualifiedName;
}

parse
 : qualifiedType EOF -> ^(Root qualifiedType)
 ;

qualifiedType
 : SimpleType ('.' SimpleType)* -> ^(QualifiedName SimpleType+)
 ;

SimpleType
 : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')*
 ;

And if you now run the code:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.DOTTreeGenerator;
import org.antlr.stringtemplate.StringTemplate;

public class Main {
    public static void main(String[] args) throws Exception {
        TLexer lexer = new TLexer(new ANTLRStringStream("foo_boo.aaa.ccc1_c"));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        CommonTree tree = (CommonTree)parser.parse().getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

you'll get some DOT output, which corresponds to the following AST:

[Image: the resulting AST]
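For readers who want to see the tree shape without generating the parser, the following plain-Java sketch builds the structure that the two rewrite rules describe: a `Root` node with one `QualifiedName` child, whose children are the words with the dots dropped. The `AstShapeSketch` and `Node` classes are hypothetical stand-ins, not part of the ANTLR runtime, and the sketch splits on dots without validating each word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the tree described by
//   parse         : qualifiedType EOF -> ^(Root qualifiedType)
//   qualifiedType : SimpleType ('.' SimpleType)* -> ^(QualifiedName SimpleType+)
// Node is a made-up stand-in for a tree node; it is not ANTLR's CommonTree.
final class AstShapeSketch {

    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        // Renders the tree in a parenthesized, LISP-like notation.
        String toStringTree() {
            if (children.isEmpty()) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node child : children) {
                sb.append(' ').append(child.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Builds Root -> QualifiedName -> one leaf per word; the '.' separators are discarded.
    static Node build(String qualifiedName) {
        Node qualified = new Node("QualifiedName");
        for (String part : qualifiedName.split("\\.")) {
            qualified.add(new Node(part));
        }
        return new Node("Root").add(qualified);
    }

    public static void main(String[] args) {
        // prints (Root (QualifiedName foo_boo aaa ccc1_c))
        System.out.println(build("foo_boo.aaa.ccc1_c").toStringTree());
    }
}
```

Because the rewrite rule lists only `SimpleType+` on its right-hand side, the `'.'` tokens do not appear in the tree, which is why the sketch throws them away as well.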

huangapple
  • Posted on September 13, 2020, 16:01:59
  • When reposting, please keep this article's link: https://go.coder-hub.com/63868502.html