gcc optimization: Increment on condition

Question

I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):

int x = 0;

void f(bool a)
{
    if (a) {
        ++x;
    }
}

void f2(bool a)
{
    x += a;
}

Basically no transformation is done. That can be seen here: https://godbolt.org/z/1G3n4fxEK

Optimizing f to the code in f2 seems to be trivial and no jump would be needed anymore. However, I'm curious if there's a reason why this is not done by gcc? Is it somehow still slower or something? I would assume it's never slower and sometimes faster, but I might be wrong.

Thanks

Answer 1

Score: 1

Such a substitution would be incorrect in a scenario where one thread calls f(1) while another thread calls f(0). If x is never actually accessed outside the first thread, there would be no race condition in the code as written, but the substitution would create one. If x is initially 1, nothing would prevent the code from being processed as:

  1. thread 1: read x (yields 1)
  2. thread 2: read x (yields 1)
  3. thread 1: write 2
  4. thread 2: write 1

This would leave x holding the value 1 even though thread 1 has just written the value 2. Worse than that, if the function were invoked within a context like:

x = 1;
f(1);
if (x != 1)
  launch_nuclear_missiles_if_x_is_1_and_otherwise_make_coffee();

a compiler might recognize that x will always equal 2 following the return from f(1), and thus make the function call unconditional.

To be sure, such substitution would rarely cause problems in real-world situations, but the Standard explicitly forbids transformations that could create race conditions where none would exist in the source code as written.
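The interleaving in steps 1 to 4 can be replayed deterministically on a single thread. This is only a sketch: `TransformedCall`, `replay_lost_update` and `replay_original` are hypothetical names introduced for illustration, and the model assumes the transformed `x += a` is executed as a separate load and store.

```cpp
#include <cassert>

// Hypothetical model of one call to the transformed body `x += a`,
// split into the separate load and store a compiler would emit.
struct TransformedCall {
    int loaded = 0;

    void load(const int& x) { loaded = x; }               // read x
    void store(int& x, bool a) const { x = loaded + a; }  // write x back, even when a is false
};

// Replays steps 1-4 from the answer in a fixed order, so the result is deterministic.
int replay_lost_update() {
    int x = 1;                // x is initially 1
    TransformedCall thread1;  // models f(1)
    TransformedCall thread2;  // models f(0)

    thread1.load(x);          // 1. thread 1 reads x (1)
    thread2.load(x);          // 2. thread 2 reads x (1)
    assert(thread1.loaded == thread2.loaded);  // both hold the same stale value
    thread1.store(x, true);   // 3. thread 1 writes 2
    thread2.store(x, false);  // 4. thread 2 writes 1, overwriting the increment
    return x;
}

// The original f: when a is false, x is neither read nor written,
// so the f(0) call cannot disturb the increment.
int replay_original() {
    int x = 1;
    auto f = [&x](bool a) { if (a) { ++x; } };
    f(true);   // thread 1's call
    f(false);  // thread 2's call touches nothing
    return x;
}
```

Calling `replay_lost_update()` yields 1 (the increment is lost), while `replay_original()` yields 2.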

Answer 2

Score: 0

I would have hoped the compiler would change f2 into f. Reading and writing memory may require slow transactions to acquire a copy of the memory location and to update other bus controllers about the state of that location (Invalid -> Shared -> Modified). Jumping around an update based on a register value is quite cheap, especially given how effective branch predictors are.
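The difference in how often each form writes to x can be counted with a small model. This is a sketch under stated assumptions: `CountedInt`, `f_branch`, `f2_add` and `count_stores` are hypothetical names, and the counter tracks source-level stores only, not actual cache or bus traffic.

```cpp
#include <cassert>

// Hypothetical stand-in for the global x that counts every store made to it.
struct CountedInt {
    int value = 0;
    int stores = 0;

    int get() const { return value; }
    void set(int v) { value = v; ++stores; }
};

// Shape of f: the store happens only when a is true.
void f_branch(CountedInt& x, bool a) {
    if (a) {
        x.set(x.get() + 1);
    }
}

// Shape of f2: the store happens on every call, even when it adds 0.
void f2_add(CountedInt& x, bool a) {
    x.set(x.get() + a);
}

// Runs fn with `true` true_calls times and with `false` false_calls times,
// then reports how many stores were made.
int count_stores(void (*fn)(CountedInt&, bool), int true_calls, int false_calls) {
    assert(true_calls >= 0 && false_calls >= 0);
    CountedInt x;
    for (int i = 0; i < true_calls; ++i) fn(x, true);
    for (int i = 0; i < false_calls; ++i) fn(x, false);
    return x.stores;
}
```

With 2 true calls and 3 false calls, `count_stores(f_branch, 2, 3)` reports 2 stores and `count_stores(f2_add, 2, 3)` reports 5.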

Answer 3

Score: -2

A much simpler reason why that optimization isn't done is because bool is nothing but an alias for int. Specifically, nothing stops you from passing an arbitrary integer to your function:

int v = 5;
f2(*(bool *)&v);

// x = 5 here
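For comparison, here is a sketch assuming standard C++ (the question's code uses bool without including `<stdbool.h>`, so C++ is assumed). There, bool is a distinct type from int, and a converting cast yields only true or false, so `x += a` adds 0 or 1. The names `counter`, `add_flag` and `add_converted` are hypothetical. The pointer cast in the snippet above reads an int object through a bool lvalue, which the language leaves undefined, so an optimizer is not obliged to preserve whatever result it happens to produce.

```cpp
#include <cassert>
#include <type_traits>

// In standard C++, bool and int are different types.
static_assert(!std::is_same<bool, int>::value, "bool and int are distinct types");

int counter = 0;  // hypothetical stand-in for x

// Same shape as f2.
void add_flag(bool a) { counter += a; }

// Passes an int through a well-defined conversion: any nonzero value becomes true.
int add_converted(int v) {
    counter = 0;
    add_flag(static_cast<bool>(v));
    assert(counter == 0 || counter == 1);  // a converted bool adds at most 1
    return counter;
}
```

`add_converted(5)` returns 1, not 5, because the conversion normalizes the argument before the addition.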

huangapple
  • Posted on February 8, 2023 at 19:34:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75385201.html