Why does adding a volatile qualifier to a variable not prevent instruction reordering?

Question


I have a simple C++ code snippet as shown below:

int A;
int B;

void foo() {
    A = B + 1;
    // asm volatile("" ::: "memory");
    B = 0;
}

When I compile this code, the generated assembly is reordered: the store to B is emitted before the store to A, as follows:

foo():
        mov     eax, DWORD PTR B[rip]
        mov     DWORD PTR B[rip], 0
        add     eax, 1
        mov     DWORD PTR A[rip], eax
        ret
B:
        .zero   4
A:
        .zero   4

However, when I add a compiler memory barrier (the commented-out line in the C++ code), the instructions are not reordered. My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering, so I modified the code to make B volatile:

int A;
volatile int B;

void foo() {
    A = B + 1;
    B = 0;
}

To my surprise, the generated assembly code still shows reordered instructions. Can someone explain why the volatile qualifier did not prevent instruction reordering in this case?

The code is available on Godbolt.

Answer 1

Score: 4


> My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering.

That's a major oversimplification. Although the C++ standard doesn't define the semantics of volatile very explicitly (saying only that "accesses through volatile glvalues are evaluated strictly according to the rules of the abstract machine"), the unwritten rule is that volatile objects are treated as if some external entity (e.g. I/O hardware) may be reading and writing them asynchronously, and that both reads and writes are side effects that the external entity can observe. As such, each read or write of a volatile object (of machine word size or less) should result in the execution of exactly one load or store instruction.

From this it follows that loads and stores to volatile objects will not be reordered with each other. But in your program A is not volatile, so we assume that the external entity does not see it. Therefore it does not matter how the accesses to A are ordered with respect to accesses to B or anything else, and the compiler is free to reorder them. Instructions like add eax, 1 that do not access memory at all are also fair game; the external entity can't see the machine registers either.
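To illustrate the point about ordering among volatile accesses, here is a minimal sketch, assuming the question's foo() with A also declared volatile. That change and the check_foo helper are my additions, not part of the original code. Under the reasoning above, a compiler would be expected to emit the load of B, the store to A, and the store to B in program order.

```cpp
#include <cassert>

// Assumption: both objects are volatile, unlike the question's code,
// where only B was. Every access below is then an observable side effect.
volatile int A;
volatile int B;

void foo() {
    A = B + 1;  // volatile load of B, then volatile store to A
    B = 0;      // volatile store to B, expected to stay after the store to A
}

// Hypothetical helper: single-threaded check of the final values only.
// It cannot observe instruction order; inspect the assembly for that.
void check_foo() {
    B = 41;
    foo();
    assert(A == 42);
    assert(B == 0);
}
```

This only constrains the order in which the compiler emits the accesses. It does not make A or B safe to share between threads.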

Per your use of the concurrency tag, this is one of the many reasons that volatile is not the right approach for variables to be shared between threads: unlike the "external entity", another thread does have access to your non-volatile variables. In olden times prior to C++11, people used volatile because it was all there was, and you could make it work, with the use of explicit memory barrier functions, if you knew something about the way your compiler did optimizations (which was usually undocumented). Since C++11 we have std::atomic and that is the only right way to handle inter-thread sharing, but unfortunately the association with volatile lingers on in obsolete docs and the minds of old-timers. See https://stackoverflow.com/questions/2484980/why-is-volatile-not-considered-useful-in-multithreaded-c-or-c-programming?rq=3 for more.
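As a rough sketch of what the std::atomic approach can look like, assuming one thread publishes a plain int to another through a flag. The names payload, ready, producer, consumer and run_demo are made up for illustration and do not come from the question.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready{false};  // flag used to publish payload

void producer() {
    payload = 42;                                  // plain store
    ready.store(true, std::memory_order_release);  // may not move above the store to payload
}

int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        // spin until the producer has published
    }
    return payload;  // the acquire load synchronizes with the release store
}

// Hypothetical driver: runs both threads once and returns what the consumer saw.
int run_demo() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Here the release store and the acquire load provide the ordering between payload and ready that volatile on the flag alone would not give you.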

Also relevant: https://stackoverflow.com/questions/26307071/does-the-c-volatile-keyword-introduce-a-memory-fence?rq=2 (No, it does not, as you have discovered.)
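If all you want is the compiler-only barrier from the question without inline asm, std::atomic_signal_fence is the usual portable spelling. This is a sketch under that assumption; the function name foo_with_barrier is mine. The fence issues no CPU memory-ordering instructions, and GCC and Clang in practice treat it as a point across which they do not move memory accesses.

```cpp
#include <atomic>

int A;
int B;

void foo_with_barrier() {
    A = B + 1;
    // Compiler-only fence: restricts compiler reordering, emits no fence instruction.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    B = 0;
}
```

Like the asm barrier, this does not make the plain ints safe to share between threads; for that, use std::atomic.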

huangapple
  • Published on June 27, 2023, 21:06:11
  • Please keep this link when reposting: https://go.coder-hub.com/76565192.html