C compiler optimization - how is it done between modules?
Question
Imagine two source files, file1.c and file2.c. They are compiled separately by the compiler and linked in the final step. Consider GCC with -Ofast optimization. Each file has a corresponding header file, and the headers include each other.
file1:
    //#include...
    int bool_check;
    int main(void) {
        bool_check = 1;
        while (bool_check) {
            do_something(); //Defined in another file
        }
    }
file2:
    void do_something() {
        extern int bool_check;
        bool_check = 0;
    }
In this example, file1 calls a function in file2.
- Is the compiler/optimizer allowed to optimize the code in file1 on the grounds that bool_check will never change, so the repeated read from memory is unnecessary?
- If no, how does it know that the function in file2 modifies the variable, if the files are compiled separately?
- If yes, isn't this a compiler bug?
This is not a great real-life use case, but it is good for demonstration.
Answer 1
Score: 2
> Is compiler/optimizer allowed to optimize the code in file1 saying bool_check will never get changed so we don't need to perform constant read-from-memory operation?

No. It can only make that optimization if it can prove that bool_check actually isn't modified by the call to do_something(), which is only possible if it can see the code.
> If no, how does it know that function in file2 modifies the variable, if files are compiled separately?

Normally, it doesn't know. So it has to assume that do_something() might modify bool_check, which means that it has to reload it from memory after every call.
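As a rough single-file sketch of the question's program: run_loop is a hypothetical stand-in for main() that counts iterations so the behavior can be checked. In the real two-file build, the body of do_something() would not be visible while file1.c is being compiled, which is exactly why the reload is required.

```c
#include <stdio.h>

int bool_check;                 /* global with external linkage, as in file1.c */

/* Stand-in for file2.c. In the question's setup this body lives in another
 * translation unit and cannot be seen while file1.c is compiled. */
void do_something(void) {
    bool_check = 0;
}

/* Hypothetical helper mirroring main() from the question.
 * Returns how many times the loop body ran. */
int run_loop(void) {
    int iterations = 0;
    bool_check = 1;
    while (bool_check) {        /* re-read after every call to do_something() */
        do_something();
        iterations++;
    }
    return iterations;
}
```

If the compiler had hoisted the read of bool_check out of the loop, the loop would never terminate, so a conforming compiler cannot do that without proof that the callee leaves the variable alone.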
In general, if the compiler can't see the code for a called function, it must assume that function might read or write every global variable in the program, as well as every static, local or dynamic object whose address has "escaped" and could possibly be known to the called function. So all preceding writes to such objects must be stored to memory before the call, and all subsequent reads must be loaded after the call.
For example:
    void other1(int *p);
    void other2(void);
    int foo(void) {
        int x, y, z;
        int *p = &z;
        other1(&x);
        x = 3;
        y = 5;
        *p = 7;
        other2();
        return x + y;
    }
After the call to other2(), the compiler has to reload x from memory. It is possible that the call to other1() stashed the pointer to x in some shared object that other2() can also access, in which case other2() might have modified x through that pointer. Likewise, the x=3 has to actually store the value 3 into memory and can't be optimized away.
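To make the "stashed pointer" scenario concrete, here is one possible pair of definitions for other1() and other2(). These bodies, and the variable stash, are assumptions invented for illustration; the answer only gives the declarations.

```c
#include <stddef.h>

/* Hypothetical shared object: remembers the last pointer given to other1(). */
static int *stash = NULL;

/* Assumed definition: lets the address of the caller's variable escape. */
void other1(int *p) {
    stash = p;
}

/* Assumed definition: modifies the caller's variable through the stash.
 * Only valid while the pointed-to object is still alive. */
void other2(void) {
    if (stash != NULL) {
        *stash += 10;
    }
}

int foo(void) {
    int x, y, z;
    int *p = &z;
    other1(&x);     /* address of x escapes */
    x = 3;          /* must really be stored: other2() may read it */
    y = 5;
    *p = 7;
    other2();       /* changes x from 3 to 13 behind foo's back */
    return x + y;   /* x must be reloaded here: 13 + 5 */
}
```

With these definitions foo() returns 18 rather than 8, which is why a compiler that cannot see other2() is not free to fold x + y into a constant.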
On the other hand, y=5 needn't store to memory, and y need not be reloaded after other2(); the address of y has never been taken and so no other code could "know" where it's located in order to modify it. In fact, y need not occupy memory at all; it can simply be replaced by an immediate constant in the return expression.
The same is true for z; even though its address has been taken, it has not been passed outside the function foo and so it has not "escaped". This logic is called escape analysis. So the compiler can optimize it away without changing the program's observable behavior.
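The result of this reasoning can be sketched by hand. Below, foo_as_optimized is a hypothetical hand-written equivalent of what an optimizer is permitted to reduce foo to, not actual compiler output; the bodies of other1() and other2() and the variable stash are again assumptions made up for illustration.

```c
#include <stddef.h>

static int *stash = NULL;                 /* hypothetical shared object */

void other1(int *p) { stash = p; }        /* assumed: lets &x escape */
void other2(void)   { if (stash != NULL) *stash += 10; }  /* assumed */

/* The function as written in the answer. */
int foo(void) {
    int x, y, z;
    int *p = &z;
    other1(&x);
    x = 3;
    y = 5;
    *p = 7;
    other2();
    return x + y;
}

/* Hand-written equivalent: y has become the constant 5, and z and p are
 * gone entirely because nothing outside foo could ever observe them.
 * x keeps its store and its reload because its address escaped. */
int foo_as_optimized(void) {
    int x;
    other1(&x);
    x = 3;          /* store kept */
    other2();
    return x + 5;   /* x reloaded, y folded to an immediate */
}
```

Both versions return the same value for any definitions of other1() and other2() that do not invoke undefined behavior, which is what makes the transformation legal.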
Having said all this, modern toolchains are capable of link time optimization (enabled with -flto in GCC and Clang), in which the compiler emits not only assembly code for each source file (translation unit) but also some intermediate representation (IR) that describes the code's semantics in more detail. All of this information is then available to the linker, which can re-run optimization and code generation, now with the entire program "visible" at once. At this point, the optimizer can see that bool_check is modified by the call to do_something(), though that doesn't really change anything because the compiler had already assumed that it was. What's more relevant is that the optimizer can see which variables are not modified, and optimize away any dead stores or unnecessary reloads.
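A minimal sketch of a case where whole-program visibility helps. The names counter, log_tick and sum_twice are hypothetical. Imagine log_tick() living in a second source file: compiled separately, the caller must reload counter after the call, whereas a link-time-optimized build can see that log_tick() never touches it.

```c
int counter = 5;        /* global that the callee could, in principle, modify */
int ticks = 0;

/* Imagine this function in a different translation unit.
 * It never writes to counter. */
void log_tick(void) {
    ticks++;
}

int sum_twice(void) {
    int a = counter;    /* first load */
    log_tick();         /* opaque call when compiled separately */
    int b = counter;    /* separate compilation: must reload counter here;
                           with the callee visible, the first load can be reused */
    return a + b;
}
```

Assuming the code is split into file1.c and file2.c, a build along the lines of gcc -O2 -flto file1.c file2.c gives the optimizer that visibility; whether it actually removes the reload depends on the compiler version and settings.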