How to solve unrecognized rule and fatal parse error

Question

I'm very new to lex, and have tried to make a scanner. Here's my code:

The definition and rule part

identifier              ([A-Za-z][0-9A-Za-z]*)
digit                   ([0-9])
integer                 ({digit}+)
float			({integer}"."[0-9]+)
delimiter               ([.,:;()\[\]{}])
arithmetic              ([+-*/])
relational              ([<>=])
string                  (\"(\"\"|[^"\n])*\")
commentLine		(\%[^\n]*)
commentLeft		({%)
commentRight		(%})

%x COMMENT

%option noyywrap

%%

 {delimiter;} 		{tokenChar(yytext[0]);}
 {arithmetic;}		{tokenChar(yytext[0]);}
 {relational;}		{tokenChar(yytext[0]);}

"<="                    { token('<='); }
">="                    { token('>='); }
"not="                  { token('not='); }
":=" 			{ token(':='); }

"not"                   { token('not'); }
"and"                   { token('and'); }
"or"                    { token('or'); }

"array"                 { token(ARRAY); }
"begin"                 { token(BEGIN); }
"bool"                  { token(BOOL); }
"char"                  { token(CHAR); }
"const"                 { token(CONST); }
"decreasing"            { token(DECREASING); }
"default"               { token(DEFAULT); }
"do"                    { token(DO); }
"else"                  { token(ELSE); }
"end"                   { token(END); }
"exit"                  { token(EXIT); }
"false"                 { token(FALSE); }
"for"                   { token(FOR); }
"function"              { token(FUNCTION); }
"get"                   { token(GET); }
"if"                    { token(IF); }
"int"                   { token(INT); }
"loop"                  { token(LOOP); }
"of"                    { token(OF); }
"put"                   { token(PUT); }
"procedure"             { token(PROCEDURE); }
"real"                  { token(REAL); }
"result"                { token(RESULT); }
"return"                { token(RETURN); }
"skip"                  { token(SKIP); }
"string"                { token(STRING); }
"then"                  { token(THEN); }
"true"                  { token(TRUE); }
"var"                   { token(VAR); }
"when"                  { token(WHEN); }

 {integer}	{
	tokenInteger("integer", atoi(yytext));
}

 {identifier} {
	tokenString("identifier", yytext);
	table -> insert(yytext);
}

 {float}	{
	tokenFloat("float", yytext);
 }

 {string}	 {
	char s[MAX_LINE_LENG] = {0};
	int idx = 0;
	for (int i = 1; i < yyleng - 1; ++i){
		if (yytext[i] == '"')
			++i;
		s[idx++] = yytext[i];
	}
	tokenString("string", s);
}

 {commentLine}	{
    LIST;
}

 {%	{ 
	LIST;
	BEGIN(COMMENT);
}

<COMMENT>[^\n]	{
	LIST;
}

<COMMENT>\n	{	
	LIST;
	printf("%d: %s", linenum, buf);
	linenum++;
	buf[0] = '\0';
}

<COMMENT> {commentRight}	{
	LIST;
	BEGIN(INITIAL);
}

\n  {
	LIST;
	printf("%d: %s", linenum++, buf);
	buf[0] = '\0';
}

[ \t]*  {
	LIST;
}

.   {
	LIST;
	printf("%d:%s\n", linenum+1, buf);
	printf("bad character:'%s'\n",yytext);
	exit(-1);
}

%%

And the main function:

int main(int argc, char **argv){
    FILE *yyin;
    if (argc > 0){
        yyin = fopen(argv[1], "r");
        if (!yyin){
            printf("Failed to open file %s\n", argv[1]);
            return 1;
        }
    }
    yylex();
    fclose(yyin);
    return 0;
}

Since I'm still trying, there should be a lot of mistakes. However, when I type lex scanner.l it only says:

> scanner.l:209: unrecognized rule
> scanner.l:209: fatal parse error

Line 209 is the %% after rules sections. Is my definition incorrect? Not sure where's the mistake.

Answer 1

Score: 2

You have made quite a few novice errors in your file, which are all typos. The best way of debugging this kind of problem is to cut out lines from your code until the error goes away (perhaps using the binary chop method). Doing this you will rapidly find exactly which lines are causing problems. I will not list every fault you made, but I can make this work in lex/flex quite easily in a few minutes.

The mistakes I found were:

  • the lex patterns MUST start in the first column. Some of yours had a space before them.
  • the lex action rules MUST NOT be in the first column. Some of yours did have characters in column one (particularly when it was a closing <kbd>}</kbd>).
  • The <kbd>}</kbd> is a lex meta-character (meaning it is part of the lex syntax). This means you have to be careful when using it in your patterns. You should escape <kbd>}</kbd> with a <kbd>\</kbd> when it should be matched as itself. In several places you have not.
  • The minus <kbd>-</kbd> is also a lex meta character in character sets. If the set is to match the actual <kbd>-</kbd> and does not indicate a character range then it also must be escaped by <kbd>\</kbd>.
  • You put a semicolon <kbd>;</kbd> in the name of a pattern, which is not correct. For example {delimiter;}.
  • Lex thinks you have two possible matches for a bare newline character <kbd>\n</kbd>. Check this carefully.
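
The points above can be illustrated with a small sketch. It is not the asker's full scanner: it assumes flex, covers only a few of the definitions, and `tokenChar()` here is a hypothetical stand-in for the helper of the same name in the question (the `LIST` macro, `linenum` and `buf` are left out entirely).

```lex
%{
/* Sketch only: tokenChar() is a placeholder for the asker's helper. */
#include <stdio.h>
static void tokenChar(int c) { printf("<'%c'>\n", c); }
%}

digit        [0-9]
integer      {digit}+
delimiter    [.,:;()\[\]\{\}]
arithmetic   [+\-*/]
relational   [<>=]

%option noyywrap
%x COMMENT

%%

{delimiter}      { tokenChar(yytext[0]); }
{arithmetic}     { tokenChar(yytext[0]); }
{relational}     { tokenChar(yytext[0]); }
{integer}        { printf("<integer:%s>\n", yytext); }

"{%"             { BEGIN(COMMENT); }
<COMMENT>"%}"    { BEGIN(INITIAL); }
<COMMENT>(.|\n)  { /* skip comment text */ }

[ \t\n]+         { /* skip whitespace */ }
.                { fprintf(stderr, "bad character: '%s'\n", yytext); }

%%

int main(void) { yylex(); return 0; }
```

Every pattern starts in column one, the name references are `{delimiter}` rather than `{delimiter;}`, the braces and the minus sign are escaped inside the character sets, and the comment delimiters are quoted so that `{` and `}` are matched literally. Saved as, say, sketch.l (an assumed file name), it should build with `flex sketch.l` followed by compiling the generated `lex.yy.c`.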

However, the BIGGEST MISTAKE you made, and one made by so many students, is to try and implement the whole thing IN ONE GO. Very poor. Experienced software engineers know to do it one piece at a time. Start with a small subset of your language, get that to work, and then add a few more statements and continue until it all works. Start, perhaps, with operators or keywords and add the other parts piecemeal. Then you know where the errors lie. This is why Stack Overflow likes small reproducible examples and not whole code bases.
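
As a rough illustration of that piecemeal approach, a first step might look like the sketch below. It assumes flex, and the choice of two keywords plus one operator and the `puts`/`printf` placeholder actions are arbitrary; the question's `token()` helpers are not used. The `main` also relies on the scanner's own global `yyin` rather than declaring a local variable of that name, and only opens a file when an argument was actually given.

```lex
%{
/* Step 1 of an incremental build: two keywords, one operator, identifiers. */
#include <stdio.h>
%}

%option noyywrap

%%

"begin"                 { puts("BEGIN"); }
"end"                   { puts("END"); }
":="                    { puts("ASSIGN"); }
[A-Za-z][0-9A-Za-z]*    { printf("identifier: %s\n", yytext); }
[ \t\n]+                { /* ignore whitespace for now */ }
.                       { printf("bad character: '%s'\n", yytext); }

%%

int main(int argc, char **argv)
{
    /* yyin is the scanner's global input stream; it defaults to stdin. */
    if (argc > 1) {
        yyin = fopen(argv[1], "r");
        if (!yyin) {
            fprintf(stderr, "Failed to open file %s\n", argv[1]);
            return 1;
        }
    }
    yylex();
    return 0;
}
```

Once this builds and scans a small input correctly, the next few rules (more keywords, then numbers, then strings and comments) can be added one group at a time, so that any new "unrecognized rule" message points at the lines just added.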

huangapple
  • Posted on May 7, 2023, 00:53:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76190082.html