How does a C compiler read this operation?
Question
How does a C compiler read this operation i.e. what is the order of the operations?
z = (x*x + r*r) - (y*y + w*w)
I'm confused about whether it starts with the left parentheses or works on both at the same time.
Answer 1
Score: 3
The C standard does not specify the order of evaluation for most “structurally independent” operations. There might not even be any order of operations in the sense that one operation might be started in part, then another operation might be started in part, then the first operation might be completed, then the second. (For example, a compiler could implement 64-bit integer arithmetic using sequences of 32-bit operations.)
In the expression you give, `z = (x*x + r*r) - (y*y + w*w)`, the order of operations on the right side of the `=` does not matter if `x`, `r`, `y`, and `w` are ordinary variables: loading the value of `x` from memory will not affect loading the value of `r` or the other variables, so it does not matter which of these is done first. However, if these variables are qualified with `volatile` or are replaced with expressions that have side effects (for example, function calls that print to standard output), then the order can matter. In these cases, the C standard does not say which of them is evaluated first. The compiler can evaluate the second `r` first, then the first `x`, then the first `w`, then the second `y`, and so on.
There are some structural dependencies in the expression. The result of `x*x` must be evaluated before it can be added to the result of `r*r`, and the result of `x*x + r*r` must be evaluated before the result of `y*y + w*w` can be subtracted from it.
In practice, a compiler is likely to evaluate expressions using values it already has on hand. For example, using ordinary unqualified variables, if a recent prior statement were `q = y*y - w*w`, a good compiler would likely retain the values of `y*y` and `w*w` for reuse in `z = (x*x + r*r) - (y*y + w*w)`. So `y`, `w`, `y*y`, and `w*w` would be evaluated before `x`, `r`, `x*x`, and `r*r`. Conversely, a different prior statement could result in `x`, `r`, `x*x`, and `r*r` being evaluated earlier, and other combinations are also possible.
Answer 2
Score: 3
In most situations, a compiler may evaluate the subexpressions in any order it pleases, so long as all evaluations are done by the point where the value is to be used.
The old C99 standard explained this best, 6.5:
(The operators explicitly mentioned have special order of evaluation rules.)
> Except as specified later (for the function-call `()`, `&&`, `||`, `?:`, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
This is unspecified behavior, which means:
- The compiler might do things in any way it pleases.
- The compiler need not document how it does things.
- The compiler need not behave consistently throughout the program, even when encountering several identical sections of code.
- The programmer should not expect any certain outcome or write a program that relies on it.
In this specific case the programmer shouldn't expect a certain order in which the operands are evaluated/executed.
The usual way to illustrate this is to replace arithmetic operands with function calls. Taking your little equation, we can cook up a function with a name corresponding to each of those variables. Then do some "mocking" with dirty macros:
    #include <stdio.h>

    int x (void) { puts(__func__); return 1; }
    int r (void) { puts(__func__); return 2; }
    int y (void) { puts(__func__); return 3; }
    int w (void) { puts(__func__); return 3; }
    int* z (void){ puts(__func__); static int foo; return &foo; }

    #define x x()
    #define r r()
    #define y y()
    #define w w()
    #define z *z()

    int main (void)
    {
      z = (x*x + r*r) - (y*y + w*w);
    }
Here each operand in the expression results in a function call. The name of the function called will be printed. And when you try this out on multiple compilers or the same one with different optimization options, you'll find out that they may behave differently.
> In the case of gcc for example
It is still unspecified. Check out this example using gcc 12.2 vs gcc 5.1 with identical compiler options: https://godbolt.org/z/haPYefnb1
We should note that the assignment operator `=` in C requires the right operand to be evaluated before the value is updated, and yet gcc 5.1 chose to execute `z` first. What it did was store the returned address for later use. It is free to do so; this is conforming behavior.