Yacc for different data types

Question

I am trying to write a grammar that lets the user use the operator symbols they are used to for more than one kind of calculation. For example, A+B, where A and B are either matrices or numbers.
Here is the relevant part of the grammar:

q_term: fraction	
| q_term '+' fraction	{$$ = q_add($1,$3);}
| q_term '-' fraction	{$$ = q_sub($1,$3);}
| q_term '*' fraction	{$$ = q_mul($1,$3);}
| q_term '/' fraction	{$$ = q_div($1,$3);}
;

qm_term: q_matrix	
| qm_term '+' q_matrix	{$$ = qm_add($1,$3);}
| qm_term '-' q_matrix	{$$ = qm_sub($1,$3);}
| qm_term '*' q_matrix	{$$ = qm_mul($1,$3);}
;

It gives me a bunch of shift/reduce conflicts. I think that is because it sees the operator characters in more than one place.

How do I resolve the shift/reduce conflicts?

Edit:

Here is how the parser tells the difference between a matrix and a scalar:

q_term: fraction	
| q_term '+' fraction	{$$ = q_add($1,$3);}
| q_term '-' fraction	{$$ = q_sub($1,$3);}
| q_term '*' fraction	{$$ = q_mul($1,$3);}
| q_term '/' fraction	{$$ = q_div($1,$3);}
;

q_matrix: '[' q_term	{qm_temp = qm_create();  qm_append(qm_temp,$2,'c');}/* new q_matrix */
| q_matrix ',' q_term {qm_append(qm_temp,$3,'c');}	/* add a number to the current q_matrix row */
| q_matrix ';' q_term {qm_append(qm_temp,$3,'r');}	/* add a new row */
| q_matrix ']'	{qm_finish(qm_temp); $$ = qm_copy_matrix(qm_temp);} /* close the list */
;

fraction: INTEGER {$$ = q_new($1,1);} /* this converts a lone integer into a fraction */
| INTEGER '|' INTEGER {$$ = q_new($1,$3);}

Answer 1

Score: 2

In a language without variables (a simple calculator, for example), it may be possible to distinguish expressions of different types during the parse, provided that one type is never automatically coerced (cast) to another.

But realistically, it's going to be a nuisance to repeatedly type matrices out in full every time. You and your other users will very quickly demand some way to save a matrix constant as a named object. If named objects can also be scalars, then you will either have to insist that the name of an object somehow represent the type (for example, a matrix might have to be written with a capital letter or some such), or more likely you're going to end up not knowing during the parse whether a name is a scalar expression or a matrix expression. And at that point, any complicated grammar you might have built to try to distinguish the two types of expressions during the parse will suddenly become pointless.

So my advice is to save yourself the aggravation. The initial parse should just build an AST of some form, and you can then walk the tree to perform whatever semantic analysis you require, including resolving polymorphic operators and inserting automatic coercions, if any.
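To make that concrete, here is a minimal sketch in C of what the tree and the type-resolving walk could look like. Everything in it is hypothetical and not taken from the question: the `Node`, `Type` and `NodeKind` types, and the `leaf`, `binop` and `resolve` functions. The typing rules are assumptions too: matrix/matrix division is rejected because the original grammar has no `qm_div`, and a mixed scalar/matrix `*` is treated as scaling.

```c
#include <assert.h>
#include <stdlib.h>

/* Result types the semantic pass can assign (hypothetical names). */
typedef enum { T_SCALAR, T_MATRIX, T_ERROR } Type;
typedef enum { N_SCALAR, N_MATRIX, N_BINOP } NodeKind;

typedef struct Node {
    NodeKind kind;
    char op;                 /* '+', '-', '*' or '/' when kind == N_BINOP */
    struct Node *lhs, *rhs;  /* operands when kind == N_BINOP */
    Type type;               /* filled in by resolve() */
} Node;

/* Leaf node; a real parser would also store the fraction or matrix here. */
Node *leaf(NodeKind kind) {
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->kind = kind;
    return n;
}

/* Operator node; the parser does not care what the operand types are. */
Node *binop(char op, Node *lhs, Node *rhs) {
    Node *n = leaf(N_BINOP);
    n->op = op;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* Post-parse walk: decide which operation each operator node stands for. */
Type resolve(Node *n) {
    Type l, r;
    assert(n != NULL);
    switch (n->kind) {
    case N_SCALAR: n->type = T_SCALAR; break;
    case N_MATRIX: n->type = T_MATRIX; break;
    case N_BINOP:
        l = resolve(n->lhs);
        r = resolve(n->rhs);
        if (l == T_ERROR || r == T_ERROR)
            n->type = T_ERROR;                  /* propagate earlier errors */
        else if (l == T_SCALAR && r == T_SCALAR)
            n->type = T_SCALAR;                 /* q_add, q_sub, q_mul, q_div */
        else if (l == T_MATRIX && r == T_MATRIX)
            n->type = (n->op == '/') ? T_ERROR : T_MATRIX;  /* no qm_div */
        else
            n->type = (n->op == '*') ? T_MATRIX : T_ERROR;  /* assumed scaling */
        break;
    }
    return n->type;
}
```

With something like this in place, the grammar needs only one expression nonterminal, with actions along the lines of `$$ = binop('+', $1, $3);`, and the choice between `q_add` and `qm_add` is made when the tree is evaluated rather than by the parser.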


Ignorable Appendix

Although there is nothing wrong with your grammar for q_matrix, it strikes me as a little awkward because it doesn't really represent the syntactic structure of matrix constants. I would have written it slightly differently (also using the semantic values to store intermediate results instead of a global variable):

q_matrix: '[' q_rows ']'     { $$ = $2; }
        ;
q_rows  : q_row              { $$ = qm_create();
                               qm_append_row($$, $1); }
        | q_rows ';' q_row   { /* $$ = $1; */
                               /* ensure $1.cols() == $3.cols() */
                               qm_append_row($$, $3); }
        ;
q_row   : q_term             { $$ = qr_create();
                               qr_append_val($$, $1); }
        | q_row ',' q_term   { /* $$ = $1; */
                               qr_append_val($$, $3); }
        ;

In the above, I commented out both instances of $$ = $1;, since in the case of C language bison-generated parsers, that copy has already been done just before executing any action. If you change to another language, such as C++, you might need to include the explicit copy.

The code assumes that you have both matrix and row (or vector) objects. (Of course, a vector object could be a matrix object with one row, if you didn't want to go to the trouble of implementing two distinct types.) In the code above, a row is completed before being appended to the matrix; at this point, it is easy to check to make sure that the row being appended has the same number of columns as the accumulated matrix. I indicated this test with a comment, rather than try to suggest what action should be taken if the test fails.
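For illustration only, here is one way the row and matrix helpers could be written in C so that the column check happens inside `qm_append_row`. The function names come from the grammar above, but the signatures, the `Frac`, `Row` and `Matrix` structs, and the return codes are all assumptions of mine; the real fraction type is whatever `q_new` returns.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { long num, den; } Frac;   /* stand-in for the real fraction type */

typedef struct { Frac *vals; size_t cols, cap; } Row;
typedef struct { Row **rows; size_t nrows, cap; } Matrix;

/* Both constructors return NULL if allocation fails. */
Row *qr_create(void) { return calloc(1, sizeof(Row)); }
Matrix *qm_create(void) { return calloc(1, sizeof(Matrix)); }

/* Append one value to a row. Returns 0 on success, -2 if out of memory. */
int qr_append_val(Row *r, Frac v) {
    assert(r != NULL);
    if (r->cols == r->cap) {
        size_t cap = r->cap ? r->cap * 2 : 4;
        Frac *p = realloc(r->vals, cap * sizeof *p);
        if (p == NULL)
            return -2;
        r->vals = p;
        r->cap = cap;
    }
    r->vals[r->cols++] = v;
    return 0;
}

/* Append a completed row. Returns 0 on success, -1 if the row's column
   count differs from the rows already stored, -2 if out of memory.
   On success the matrix keeps the row pointer. */
int qm_append_row(Matrix *m, Row *r) {
    assert(m != NULL && r != NULL);
    if (m->nrows > 0 && m->rows[0]->cols != r->cols)
        return -1;
    if (m->nrows == m->cap) {
        size_t cap = m->cap ? m->cap * 2 : 4;
        Row **p = realloc(m->rows, cap * sizeof *p);
        if (p == NULL)
            return -2;
        m->rows = p;
        m->cap = cap;
    }
    m->rows[m->nrows++] = r;
    return 0;
}
```

In the `q_rows ';' q_row` action, a nonzero return could then be reported through your error routine, followed by bison's `YYERROR` macro if you want the parser to treat it as a syntax error. Whether that is the right reaction is a design decision I leave open.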

huangapple
  • Posted on 2020-01-06 22:42:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59614042.html