ANTLR Grammar not

Question

I started writing another C-like language a few days ago and I've gotten stuck here.

The "pointers" rule seems to be colliding with the * operator in the OP token and making the * operator not recognized in the "expr" rule, and same for the & operator with the "reference" rule. How can I fix this?

grammar C;

program
  : (include | var_decl | boigaCall ';' | func_decl | typedef ';')*;

stmt
  : if_stmt
  | repeat_stmt
  | var_decl
  | var_change
  | function_call ';'
  | return_stmt ';'
  | boigaCall ';'
  | switch_stmt
  | '{' stmt* '}';

if_stmt
  : if_part else_part?;

if_part
  : 'if' paren_expr stmt;

else_part
  : 'else' stmt;

repeat_stmt
  : 'repeat' '(' expr ')' stmt;

var_decl
  : type name=ID ('=' expr)? ';';

var_change
  : pointers? name=ID ('=' | VARIABLE_MODIFIER) expr ';';

func_decl
  : (inline='inline')? type recursion? noturbo? name=ID '(' functionArgs ')' stmt;

functionArgs
  : ((ID name=ID) (',' ID name=ID)*?)?;

recursion
  : '!';

noturbo
  : '?';

paren_expr
  : '(' expr ')';

function_call
  : ID '(' expr? (',' expr)* ')';

return_stmt
  : 'return' expr?;

typedef
  : 'typedef' structdef ID;

structdef
  : 'struct' '{' (structelem ';')+ '}';

switch_stmt
  : 'switch' paren_expr switch_chain;

switch_chain
  : '{' case_block+ default_block? '}';

case_block
  : 'case' expr ':' stmt* 'break' ';';

default_block
  : 'default' ':' stmt*;

structelem
  : typedName;

typedName
  : ID name=ID;

expr
  : pointers expr
  | term
  | expr OP expr
  | cast expr
  | '(' expr ')';

term
  : ID | INT | HEX | BIN | FLOAT | STRING | boigaCall | sizeOf | function_call | reference ID;

sizeOf
  : 'sizeof' '(' ID ')';

boigaCall
  : '__boiga' '(' STRING (',' expr)* ')';

cast
  : '(' type ')';

pointers
  : '*'+;

reference
  : '&';

type
  : ID pointers?;

include
  : '#include' (LIBRARY | STRING);

fragment DIGIT: [0-9];
fragment LETTER: [a-zA-Z];
fragment HEX_CHAR: [a-fA-F];

STRING        : '"' (~'"'|'\\"')* '"';
LIBRARY       : '<' [a-zA-Z.]* '>';
ID            : (LETTER | '_')+ (LETTER | '_' | DIGIT)*;
INT           : '-'? DIGIT+;
HEX           : '0x' (DIGIT | HEX_CHAR)+;
BIN           : '0b' ('0' | '1')+;
FLOAT         : '-'? DIGIT+ '.' DIGIT+;
VARIABLE_MODIFIER : OP '=';
OP            : '+' | '-' | '*' | '/' | '%' | '==' | '!=' | '<' | '<=' | '>' | '>=' | '&&' | '||' | '&' | '|' | '^' | '>>' | '<<';

COMMENT       : SINGLE_COMMENT | BLOCK_COMMENT;
SINGLE_COMMENT: '//' .*? '\n';
BLOCK_COMMENT : '/*' .*? '*/';
WS: ([ \t\r\n] | COMMENT)+ -> skip;

I tried making * and & their own tokens and using those tokens in the "pointers" and "reference" rules, but that only caused * and & to be recognized as operators again and no longer as pointers/references. I tested the "program" rule with var x = a*b; and var x = a&b;, which exercise the rule that is not working properly.

Answer 1

Score: 2

If you just move '*' out of OP, everything works fine. Your grammar created an implicit token when you used '*' inside the pointers rule, and that implicit token takes precedence over OP, so every * is lexed as that token and never as OP.

When you create a token rule for this specific literal, ANTLR recognizes it and does not create a duplicate implicit token. As a result, parser rules can refer to the token as either '*' or POW.

expr
  : term
  | pointers expr
  | expr (OP | POW) expr
  | cast expr
  | '(' expr ')';
POW: '*';
OP: '+' | '-' | '/' | '%' | '==' | '!=' | '<' | '<=' | '>' | '>=' | '&&' | '||' | '&' | '|' | '^' | '>>' | '<<';
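
To see why the original grammar never produces OP for *, here is a toy Python model of the lexer's tie-breaking. It is only a sketch, not ANTLR itself: it assumes the two rules that matter here (the longest match wins; on equal length the rule defined first wins) and treats implicit literal tokens as if they were defined before every explicit lexer rule. The `tokenize` helper, the shortened operator list, and the `AMP` rule (the same fix applied to &) are hypothetical names introduced for illustration.

```python
def tokenize(text, rules):
    """Toy lexer. rules is an ordered list of (token_name, [literals])."""
    tokens, pos = [], 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
            continue
        if ch.isalpha() or ch == "_":
            # Identifiers: consume a run of letters, digits and underscores.
            end = pos
            while end < len(text) and (text[end].isalnum() or text[end] == "_"):
                end += 1
            tokens.append(("ID", text[pos:end]))
            pos = end
            continue
        best = None  # (token_name, matched_literal)
        for name, literals in rules:
            for lit in literals:
                # Strictly longer matches replace the current best, so on a
                # tie the rule that appears earlier in the list is kept.
                if text.startswith(lit, pos) and (best is None or len(lit) > len(best[1])):
                    best = (name, lit)
        if best is None:
            raise ValueError(f"no rule matches {ch!r} at position {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


# Shortened operator list, enough to show the effect.
OPS = ["+", "-", "*", "/", "%", "==", "&&", "&"]

# Original grammar: the literals '*' and '&' used in parser rules become
# implicit tokens, modelled here as rules listed before OP.
original = [("'*'", ["*"]), ("'&'", ["&"]), ("OP", OPS)]

# The answer's fix, extended to '&' with a hypothetical AMP rule.
fixed = [
    ("POW", ["*"]),
    ("AMP", ["&"]),
    ("OP", [op for op in OPS if op not in ("*", "&")]),
]

print(tokenize("a*b", original))      # '*' is the implicit token, never OP
print(tokenize("a && b & c", fixed))  # '&&' is still OP, '&' is AMP
```

With the `fixed` rule list, expr would presumably need `(OP | POW | AMP)` between operands for & to work as a binary operator as well. Note also that VARIABLE_MODIFIER is defined as OP '=', so once * is moved out of OP, input such as *= would likely need POW added to that rule too.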

Input: a*b*c

[image not preserved]

Input: a***b

[image not preserved]

huangapple
  • Posted on May 18, 2023, 01:39:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76274811.html