Posted: 2023-02-08 19:34:26

gcc optimization: Increment on condition

Question

I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):

int x = 0;
void f(bool a)
{
    if (a) {
        ++x;
    }
}
void f2(bool a)
{
    x += a;
}

Basically no transformation is done. That can be seen here: https://godbolt.org/z/1G3n4fxEK

Optimizing f to the code in f2 seems to be trivial and no jump would be needed anymore. However, I'm curious if there's a reason why this is not done by gcc? Is it somehow still slower or something? I would assume it's never slower and sometimes faster, but I might be wrong.

Thanks

Answer 1

Score: 1

Such a substitution would be incorrect in a scenario where one thread calls f(1) while another thread calls f(0). If x is never actually accessed outside the first thread, there would be no race condition in the code as written, but the substitution would create one. If x is initially 1, nothing would prevent the code from being processed as:

thread 1: read x (yields 1)
thread 2: read x (yields 1)
thread 1: write 2
thread 2: write 1

This would cause x to be left holding the value 1 even though thread 1 has just written the value 2. Worse than that, if the function were invoked within a context like:

x = 1;
f(1);
if (x != 1)
  launch_nuclear_missiles_if_x_is_1_and_otherwise_make_coffee();

a compiler might recognize that x will always equal 2 following the return from f(1), and thus make the function call unconditional.

To be sure, such substitution would rarely cause problems in real-world situations, but the Standard explicitly forbids transformations that could create race conditions where none would exist in the source code as written.
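
The interleaving listed above can be replayed deterministically. The following is a minimal sketch, assuming C++14 or later; the names replay_branching and replay_branchless are hypothetical and not part of the original post. Each "thread" is reduced to explicit read and write steps on a plain int, so nothing actually runs concurrently. It only models the order of accesses.

```cpp
// Hypothetical model, not from the original post: the steps of each thread
// are written out by hand in the order given in the answer.

// f as written: the thread calling f(0) never touches x.
constexpr int replay_branching(int x)
{
    int t1 = x;   // thread 1, f(1): read x
                  // thread 2, f(0): condition is false, no access to x
    x = t1 + 1;   // thread 1: write x
    return x;
}

// f rewritten as x += a: the thread calling f(0) now reads and writes x too.
constexpr int replay_branchless(int x)
{
    int t1 = x;   // thread 1: read x
    int t2 = x;   // thread 2: read x
    x = t1 + 1;   // thread 1: write the incremented value
    x = t2 + 0;   // thread 2: write its stale copy back, losing the increment
    return x;
}

// Starting from x == 1, as in the answer.
static_assert(replay_branching(1) == 2, "increment survives");
static_assert(replay_branchless(1) == 1, "increment is lost");
```

In the branching version thread 2 performs no access at all, which is why the source as written has no race for this pair of calls.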

Answer 2

Score: 0

I would have hoped the compiler would have changed f2 to f. Reading and writing memory may require slow transactions to acquire a copy of the memory location and to notify other bus controllers about the state of that location (Invalid -> Shared -> Modified).
Jumping around an update based on a register value is quite cheap, especially given how effective branch predictors are.
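
To make the difference in memory traffic concrete, here is a rough sketch, assuming C++14 or later. The Counted struct and the add_if, add_always and count_stores helpers are hypothetical instrumentation, not part of the original post. The store count is only a proxy: the real cost of a store depends on the hardware and the cache state, and nothing is measured here.

```cpp
// Hypothetical instrumentation: records how often the shared value is stored to.
struct Counted {
    int value = 0;
    int stores = 0;
};

// Mirrors f: store only when the flag is set.
constexpr void add_if(Counted& x, bool a)
{
    if (a) {
        ++x.value;
        ++x.stores;
    }
}

// Mirrors f2: store on every call, even when a is false.
constexpr void add_always(Counted& x, bool a)
{
    x.value += a;
    ++x.stores;
}

// Runs one variant over a fixed pattern of flags and returns the store count.
template <typename Fn>
constexpr int count_stores(Fn fn)
{
    Counted x{};
    const bool flags[] = {false, false, true, false};
    for (bool a : flags) {
        fn(x, a);
    }
    return x.stores;
}

static_assert(count_stores(add_if) == 1, "one store for the single true flag");
static_assert(count_stores(add_always) == 4, "one store per call");
```

With mostly-false flags the branching form leaves the variable untouched on most calls, which is the case this answer has in mind.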

Answer 3

Score: -2

A much simpler reason why that optimization isn't done is because bool is nothing but an alias for int. Specifically, nothing stops you from passing an arbitrary integer to your function:

int v = 5;
f2(*(bool *)&v);
// x = 5 here
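
For comparison, here is a small sketch of what the language defines when an int is converted to bool without the pointer cast. The add_flag helper is a hypothetical stand-in for f2 that takes the counter as a parameter. Note that in C++ bool is a distinct type, and reading an int object through a bool lvalue as in the snippet above has undefined behavior, so the x = 5 outcome is not something the standard guarantees.

```cpp
#include <type_traits>

// bool is its own type in C++, not an alias for int.
static_assert(!std::is_same<bool, int>::value, "bool and int are different types");

// Hypothetical stand-in for f2: returns the updated counter instead of
// modifying a global, so the result can be checked at compile time.
constexpr int add_flag(int x, bool a)
{
    return x + a;   // a is promoted to int: false -> 0, true -> 1
}

// A well-defined conversion collapses any nonzero int to true.
static_assert(add_flag(0, static_cast<bool>(5)) == 1, "5 converts to true, adds 1");
static_assert(add_flag(0, static_cast<bool>(0)) == 0, "0 converts to false, adds 0");
```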
