June 27, 2023, 21:06:11

Why does adding a volatile qualifier to a variable not prevent instruction reordering?

Question

I have a simple C++ code snippet as shown below:

int A;
int B;
void foo() {
    A = B + 1;
    // asm volatile("" ::: "memory");
    B = 0;
}

When I compile this code, the generated assembly code is reordered as follows:

foo():
        mov     eax, DWORD PTR B[rip]
        mov     DWORD PTR B[rip], 0
        add     eax, 1
        mov     DWORD PTR A[rip], eax
        ret
B:
        .zero   4
A:
        .zero   4

However, when I add a memory fence (commented line in the C++ code), the instructions are not reordered. My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering. So, I modified the code to add volatile to variable B:

int A;
volatile int B;
void foo() {
    A = B + 1;
    B = 0;
}

To my surprise, the generated assembly code still shows reordered instructions. Can someone explain why the volatile qualifier did not prevent instruction reordering in this case?

The code is available on Godbolt.

Answer 1

Score: 4

> My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering.

That's a major oversimplification. Although the C++ standard doesn't define the semantics of volatile very explicitly (saying only that "accesses are evaluated strictly according to the rules of the abstract machine"), the unwritten rule is that volatile objects are treated as if some external entity (e.g. I/O hardware) may be reading and writing them asynchronously, and that both reads and writes are side effects that the external entity can observe. As such, each read/write to a volatile object (of machine word size or less) should result in the execution of exactly one load/store instruction.
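To make the "exactly one load/store per access" point concrete, here is a minimal sketch. The names `device_status` and `cached_status` are hypothetical and only serve the illustration. With the volatile variable the compiler is expected to emit two separate loads; with the plain variable it may well fold both reads into one.

```cpp
#include <cassert>

volatile int device_status = 0;  // hypothetical: imagine hardware may change it at any time
int cached_status = 0;           // ordinary variable, no outside observer assumed

// Two volatile reads: the compiler should emit two load instructions.
int sum_two_volatile_reads() {
    int first = device_status;
    int second = device_status;
    return first + second;
}

// Two plain reads: the compiler is free to load once and reuse the value.
int sum_two_plain_reads() {
    int first = cached_status;
    int second = cached_status;
    return first + second;
}

// Single-threaded check of the returned values only; it says nothing
// about how many loads were emitted. Compare the assembly for that.
void check_reads() {
    device_status = 3;
    cached_status = 5;
    assert(sum_two_volatile_reads() == 6);
    assert(sum_two_plain_reads() == 10);
}
```

Compiling both functions with optimization enabled and comparing the output is the way to see the difference; the results are the same, only the number of loads differs.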

From this it follows that loads and stores to volatile objects will not be reordered with each other. But in your program A is not volatile, so we assume that the external entity does not see it. Therefore it does not matter how the accesses to A are ordered with respect to accesses to B or anything else, and the compiler is free to reorder them. Instructions like add eax, 1 that do not access memory at all are also fair game; the external entity can't see the machine registers either.
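As a sketch of what that implies for the snippet in the question: if A is declared volatile as well (a hypothetical variant, not something the asker wrote), the store to A and the store to B are both observable side effects, so the compiler should keep them in source order. This only constrains the compiler; it still does not make the code safe for sharing between threads.

```cpp
#include <cassert>

// Hypothetical variant of the question's code: A is now volatile too.
volatile int A;
volatile int B;

void foo() {
    int tmp = B;   // volatile load of B
    A = tmp + 1;   // volatile store to A; should stay ahead of the store to B
    B = 0;         // volatile store to B
}

// Single-threaded check of the resulting values; the ordering itself
// can only be confirmed by reading the generated assembly.
void check_foo() {
    B = 41;
    foo();
    assert(A == 42);
    assert(B == 0);
}
```

The `add` may still be scheduled anywhere between the load and the store to A, since it touches only a register.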

Since you tagged this question with concurrency: this is one of the many reasons that volatile is not the right approach for variables shared between threads. Unlike the "external entity", another thread does have access to your non-volatile variables. In olden times prior to C++11, people used volatile because it was all there was, and you could make it work, with the use of explicit memory barrier functions, if you knew something about the way your compiler did optimizations (which was usually undocumented). Since C++11 we have std::atomic and that is the only right way to handle inter-thread sharing, but unfortunately the association with volatile lingers on in obsolete docs and the minds of old-timers. See https://stackoverflow.com/questions/2484980/why-is-volatile-not-considered-useful-in-multithreaded-c-or-c-programming?rq=3 for more.
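For comparison, a minimal sketch of the std::atomic approach, assuming one writer thread and one reader thread. The names `payload`, `ready`, `publish`, and `try_consume` are made up for this example. The release store keeps the earlier write to `payload` from being moved after it, and the acquire load keeps the later read from being moved before it, for both the compiler and the CPU.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // plain data, published through the flag below
std::atomic<bool> ready{false};  // hypothetical flag name

// Writer side: write the data first, then publish it with a release store.
void publish(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Reader side: returns true and fills `out` only once the data is published.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = payload;
    return true;
}

// Single-threaded walk-through of the call sequence, for illustration only.
void demo() {
    int value = -1;
    assert(!try_consume(value));  // nothing published yet
    publish(7);
    assert(try_consume(value));
    assert(value == 7);
}
```

In a real program `publish` and `try_consume` would run on different threads; the walk-through above only shows the intended order of calls.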

Also relevant: https://stackoverflow.com/questions/26307071/does-the-c-volatile-keyword-introduce-a-memory-fence?rq=2 (No, it does not, as you have discovered.)
