Getting invalid token to Bison parser from yylex

Question

I wrote a lex/bison parser that uses named token rules in lex, such as
a {return Tok_A;}, with the matching token declaration in yacc:
%token Tok_A, followed by the grammar.
Everything works fine: if the input string is valid, the parser accepts it.
Now I am trying to make a more general parser that uses the alphabet characters directly in lex.
For some reason yacc reports an invalid token when I send the "a" character:

//parser.l
%{
#include "parser4.tab.h"
%}

%%
[a-h]	 {return *yytext;}
\n	 {return 0;}  /* EOF */
%%

//parser.y
%{
   extern void yyerror(char *);
   extern int yylex(void);
   #define YYDEBUG 1 
 %}
 
%token a 

%%

S : a {printf("S->a");}
%%

int main(void)
{
#if YYDEBUG
  yydebug = 1;
#endif
	if(!yyparse())
		printf("End of input reached\n");
	return 0;
}

void yyerror (char *s)
{
  /* fprintf (stderr, "%s\n", s); */
  printf("Incorrect derivation!\n");
}

When I compile and run the program and give it the input a, the output is:

Starting parse
Entering state 0
Stack now 0
Reading a token
a
Next token is token "invalid token" ()
Incorrect derivation!
Cleanup: discarding lookahead token "invalid token" ()
Stack now 0

I think the problem is in lex, in the rule that does return *yytext.
If I understand correctly, yacc and lex communicate through parser.tab.h, which defines the mapping from integers to token names. Named tokens start at 257; 0-255 are used for plain characters.
So should I somehow translate the token to ASCII in lex? I thought that when lex returns the "a" character directly, bison/yacc would understand it.

Answer 1

Score: 2

When you declare %token a, it defines a as the name of a token, which you could return from lex. But that is not the same as the character 'a'. If you want to use the character 'a' as a token in the grammar, you don't need to declare it, but you do need single quotes around it: write 'a', not a.

In your case, change the yacc grammar to

S : 'a' {printf("S->a");}

and it will work.

huangapple
  • Posted on May 28, 2023, 09:20:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76349610.html