gcc optimization: Increment on condition
Question
I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):
    int x = 0;

    void f(bool a)
    {
        if (a) {
            ++x;
        }
    }

    void f2(bool a)
    {
        x += a;
    }
Basically no transformation is done. That can be seen here: https://godbolt.org/z/1G3n4fxEK
Optimizing f to the code in f2 seems trivial, and no jump would be needed anymore. Is there a reason why gcc does not do this? Is it somehow still slower? I would assume it's never slower and sometimes faster, but I might be wrong.
Thanks
Answer 1
Score: 1
Such a substitution would be incorrect in a scenario where one thread calls f(1) while another thread calls f(0). If x is never actually accessed outside the first thread, there would be no race condition in the code as written, but the substitution would create one. If x is initially 1, nothing would prevent the code from being processed as:
- thread 1: read x (yields 1)
- thread 2: read x (yields 1)
- thread 1: write 2
- thread 2: write 1
This would cause x to be left holding the value 1 even though thread 1 has just written the value 2. Worse than that, if the function was invoked within a context like:
    x = 1;
    f(1);
    if (x != 1)
        launch_nuclear_missiles_if_x_is_1_and_otherwise_make_coffee();
a compiler might recognize that x will always equal 2 following the return from f(1), and thus make the function call unconditional.
To be sure, such substitution would rarely cause problems in real-world situations, but the Standard explicitly forbids transformations that could create race conditions where none would exist in the source code as written.
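The interleaving listed above can be replayed deterministically. The following is only a sketch: it models each "thread" as an explicit load followed by an explicit store instead of using real threads, and the names TransformedCall, replay_transformed_schedule and replay_original_schedule are hypothetical, not part of the original code.

```cpp
// Hypothetical model of the transformed f (x += a), split into the load and
// the store that the hardware would perform as two separate steps.
struct TransformedCall {
    bool a;          // argument passed to f
    int loaded = 0;  // value of x observed by this call

    explicit TransformedCall(bool arg) : a(arg) {}

    void load(const int& x) { loaded = x; }
    void store(int& x) const { x = loaded + (a ? 1 : 0); }
};

// Replays the schedule: T1 load, T2 load, T1 store, T2 store, with x == 1.
int replay_transformed_schedule() {
    int x = 1;
    TransformedCall t1{true};   // thread 1 calls f(1)
    TransformedCall t2{false};  // thread 2 calls f(0)
    t1.load(x);   // reads 1
    t2.load(x);   // reads 1
    t1.store(x);  // writes 2
    t2.store(x);  // writes 1, discarding thread 1's increment
    return x;
}

// Original f: the call with a == false performs no load and no store on x,
// so it cannot overwrite thread 1's result under any schedule.
int replay_original_schedule() {
    int x = 1;
    auto f = [&x](bool a) { if (a) { ++x; } };
    f(true);   // thread 1
    f(false);  // thread 2 never touches x
    return x;
}
```

Under this model, replay_transformed_schedule() returns 1 (the increment is lost), while replay_original_schedule() returns 2.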
Answer 2
Score: 0
I would have hoped the compiler would have changed f2 to f. Reading and writing memory may require slow transactions to acquire a copy of the memory location, and update other bus controllers about the state of that location (Invalid -> Shared -> Modified).
Jumping around an update based on a register value is quite cheap; especially with the efficacy of branch predictors.
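To make the difference in memory writes concrete, here is a rough sketch that counts source-level stores for both forms. CountedInt, f_branch, f_branchless and extra_stores_when_false are hypothetical names introduced for illustration; the counter only models "a write happened" and does not measure real cache-coherence traffic.

```cpp
// Hypothetical instrumentation: an int that records how many stores hit it.
struct CountedInt {
    int value = 0;
    int stores = 0;  // number of writes performed on `value`

    void add(int delta) {
        value += delta;
        ++stores;
    }
};

// Branching form, as in f: writes only when `a` is true.
void f_branch(CountedInt& x, bool a) {
    if (a) {
        x.add(1);
    }
}

// Branch-free form, as in f2: writes on every call, even when adding 0.
void f_branchless(CountedInt& x, bool a) {
    x.add(a ? 1 : 0);
}

// Calls both forms `calls` times with a == false and returns how many more
// stores the branch-free form issued than the branching form.
int extra_stores_when_false(int calls) {
    CountedInt branch;
    CountedInt branchless;
    for (int i = 0; i < calls; ++i) {
        f_branch(branch, false);
        f_branchless(branchless, false);
    }
    return branchless.stores - branch.stores;
}
```

With a == false on every call, the branching form issues no stores at all, so extra_stores_when_false(n) equals n. Each of those extra stores is a write that may need the cache line in the Modified state.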
Answer 3
Score: -2
A much simpler reason why that optimization isn't done is because bool is nothing but an alias for int. Specifically, nothing stops you from passing an arbitrary integer to your function:
int v = 5;
f2(*(bool *)&v);
// x = 5 here
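For comparison, here is a minimal sketch of what happens without the pointer cast, assuming standard C++ semantics. There, bool is a distinct type, and the ordinary int-to-bool conversion maps every non-zero value to true, so f2 adds at most 1. The helper increment_from_int is hypothetical. Reading an int object through a bool lvalue, as the cast above does, has undefined behavior in C++, so the result x = 5 is not something a compiler is required to preserve.

```cpp
#include <type_traits>

// bool and int are distinct types in C++.
static_assert(!std::is_same<bool, int>::value, "bool is not an alias for int");

int x = 0;  // same global as in the question

void f2(bool a)
{
    x += a;  // a promotes to 0 or 1
}

// Hypothetical helper: passes an arbitrary int through the normal
// int -> bool conversion and reports by how much x changed.
int increment_from_int(int v)
{
    const int before = x;
    f2(static_cast<bool>(v));  // any non-zero value becomes true
    return x - before;
}
```

Here increment_from_int(5) yields 1, not 5, and increment_from_int(0) yields 0.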