Why does ANTLR give the error "rule prog contains a closure with at least one alternative that can match an empty string" for my grammar?

Question

I am using ANTLR 4.12 as the parser generator in the IntelliJ IDE.
My grammar:

grammar new_g;

prog: (statements)+ | EOF ;

statements: simplestatement | compoundstatement ;

simplestatement:
  DOC paramlist  ;

compoundstatement:
  TABLE paramlist tabledata TABLE |
  QUIZ paramlist .*? QUIZ |
  QUESTION paramlist .*? QUESTION |
  ;

paramlist: '(' param (',' param)* ')' ;

param: VAR '=' STR ;

tabledata: tablerow(':'tablerow )*;

tablerow: (INT|FLOAT|VAR|STR)(',' (INT|FLOAT|VAR|STR))* ;
// tokens
DOC: '%doc';
TABLE: '%table';
//COMMENT:'%c' ;
//GRAPH: '%graph';
QUIZ: '%quiz';
QUESTION: '%question';
INT: [0-9]+ ;
FLOAT: [+-]?([0-9]*[.])?[0-9]+ ;
//GVAR:[a-zA-Z_][a-zA-Z0-9_]* ; 
VAR:  [a-zA-Z_][a-zA-Z0-9_]* ; // abc123 a_123_xyZ
STR: '"'[a-zA-Z0-9., ]*'"';
//COMMENT_STR: '% ' .* ; 
WS: [ \r\n\t]+ -> skip;
//COMMENT_STR: '%c' ([\n] {skip();})? [a-zA-Z0-9., ]* '%c';

How can I make sure that no subrule of the prog rule matches an empty string?

Answer 1

Score: 1

The | operator in an ANTLR grammar can be read as an OR.

Your compoundstatement is formatted so that each | reads more like an "alternative terminator" than a separator:

compoundstatement:
  TABLE paramlist tabledata TABLE |
  QUIZ paramlist .*? QUIZ |
  QUESTION paramlist .*? QUESTION |
  ;

If you use the ANTLR plugin for VSCode, it will reformat it as:

compoundstatement
    : TABLE paramlist tabledata TABLE
    | QUIZ paramlist .*? QUIZ
    | QUESTION paramlist .*? QUESTION
    |
    ;

This makes it pretty easy to see that the rule has a final alternative that matches "nothing" (a.k.a. an "empty string").

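This empty alternative is the source of your error: prog applies + to statements, statements can match compoundstatement, and compoundstatement can match nothing, so the loop could iterate without consuming any input.

Removing the trailing | corrects it: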

compoundstatement
    : TABLE paramlist tabledata TABLE
    | QUIZ paramlist .*? QUIZ
    | QUESTION paramlist .*? QUESTION
    ;
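
To see how the empty alternative propagates up to prog, here is a minimal Python sketch that computes which rules can match the empty string. It is only an illustration: the grammar is hand-encoded as a dictionary (the names GRAMMAR_WITH_EMPTY_ALT, GRAMMAR_FIXED and nullable_rules are made up for this example), the repetitions and the .*? wildcards are simplified away, and this is not how ANTLR itself performs the check.

```python
# Hand-encoded, simplified version of the grammar from the question.
# Assumption for illustration only: ANTLR does not represent grammars this way.
# Each rule maps to a list of alternatives; each alternative is a list of
# symbols. Uppercase names are tokens, lowercase names are parser rules.
GRAMMAR_WITH_EMPTY_ALT = {
    # "(statements)+" is modelled by its loop body, "statements".
    "prog": [["statements"], ["EOF"]],
    "statements": [["simplestatement"], ["compoundstatement"]],
    "simplestatement": [["DOC", "paramlist"]],
    "compoundstatement": [
        ["TABLE", "paramlist", "tabledata", "TABLE"],
        ["QUIZ", "paramlist", "QUIZ"],          # ".*?" omitted
        ["QUESTION", "paramlist", "QUESTION"],  # ".*?" omitted
        [],  # the trailing "|" creates this empty alternative
    ],
    "paramlist": [["LPAREN", "param", "RPAREN"]],
    "param": [["VAR", "EQ", "STR"]],
    "tabledata": [["tablerow"]],
    "tablerow": [["INT"], ["FLOAT"], ["VAR"], ["STR"]],
}

# Same grammar with the empty alternative of compoundstatement removed.
GRAMMAR_FIXED = dict(GRAMMAR_WITH_EMPTY_ALT)
GRAMMAR_FIXED["compoundstatement"] = [
    alt for alt in GRAMMAR_WITH_EMPTY_ALT["compoundstatement"] if alt
]


def nullable_rules(grammar):
    """Return the set of rules that can match the empty string."""
    nullable = set()
    changed = True
    while changed:  # iterate until no new nullable rule is found
        changed = False
        for rule, alternatives in grammar.items():
            if rule in nullable:
                continue
            # A rule is nullable if some alternative contains only nullable
            # rules; an empty alternative qualifies trivially. Tokens are
            # never in the set, so any alternative with a token is excluded.
            if any(all(sym in nullable for sym in alt) for alt in alternatives):
                nullable.add(rule)
                changed = True
    return nullable


print(sorted(nullable_rules(GRAMMAR_WITH_EMPTY_ALT)))
print(sorted(nullable_rules(GRAMMAR_FIXED)))
```

With the empty alternative present, compoundstatement, statements and prog all come out as nullable; once it is removed, no rule does.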

NOTE: reading | as OR also makes it clear that the prog rule itself:

prog: (statements)+ | EOF;

is probably not what you intend. It says that it matches one or more statements OR EOF.

It's good practice to have the start rule consume all input through EOF, so you probably want:

prog: statements+ EOF;

(Maybe also consider renaming statements to statement as the rule only matches a single statement.)
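
The difference between the two prog rules can be sketched with a toy matcher in Python. This is a hypothetical model, not ANTLR's parsing algorithm: each statement is reduced to a single "STMT" token, and the function names are invented for the example. It shows that the original rule can succeed while leaving input unread, whereas the version ending in EOF only succeeds when the whole input is consumed.

```python
def count_statements(tokens):
    """Count how many leading "STMT" tokens the input starts with."""
    count = 0
    while count < len(tokens) and tokens[count] == "STMT":
        count += 1
    return count


def match_prog_original(tokens):
    """Toy model of `prog: (statements)+ | EOF`.

    Returns the number of tokens consumed, or None if nothing matches.
    """
    count = count_statements(tokens)
    if count > 0:
        return count  # first alternative: one or more statements
    if tokens[:1] == ["EOF"]:
        return 1      # second alternative: EOF on its own
    return None


def match_prog_with_eof(tokens):
    """Toy model of `prog: statements+ EOF`.

    Returns the number of tokens consumed, or None if nothing matches.
    """
    count = count_statements(tokens)
    if count > 0 and tokens[count:count + 1] == ["EOF"]:
        return count + 1  # statements followed by EOF: all input consumed
    return None


tokens = ["STMT", "STMT", "JUNK", "EOF"]
print(match_prog_original(tokens))  # matches only the first two tokens
print(match_prog_with_eof(tokens))  # rejects the input because of "JUNK"
```

In this model the original rule accepts the first two statements and stops without looking at "JUNK", which is why trailing garbage may go unreported when the start rule does not end with EOF.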

huangapple
  • Posted on July 7, 2023, 06:32:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76632896.html