编译器可以在成员未在中间使用但稍后更新时进行优化吗?

huangapple go评论69阅读模式
英文:

Can compiler optimize member access when not used in the interim but updated later

问题

我理解编译器优化可能会暴露已经存在的错误(假设编译器优化不会引入错误)。我也找到了一个类似的问题:https://stackoverflow.com/questions/2722302/can-compiler-optimization-introduce-bugs 并阅读了一篇建议的文章:危险的优化,它指出了编译器优化的陷阱。

我担心在这段代码中第二次检查可能会被优化掉;这种情况可能吗?

class Lock {
  void Acquire(bool is_exclusive);
  // 尝试将共享锁升级为独占锁,如果可能的话。
  bool TryUpgrade();
  void Release();
}

class Foo {
  bool condition_boolean = true; 
  Lock lock;

  void f1() {
    lock.Acquire(false);
    if (condition_boolean) {
      if (!lock.TryUpgrade()) {
        lock.Release();
        lock.Acquire(true);
      }

      // 重新检查条件,因为我们可能释放了锁来以独占方式获取它。
      if (condition_boolean) {  // 可能被优化掉。
        // 做一些事情;
        condition_boolean = false;
      }
    }

    lock.Release();
  }
};
英文:

I understand that compiler optimizations can expose already existing bugs (Given that compiler optimizations do not introduce bugs). I also found a similar question: https://stackoverflow.com/questions/2722302/can-compiler-optimization-introduce-bugs and read a suggested article: Dangerous Optimizations which points to the pitfalls of compiler optimization.

I'm worried that the second check could be optimized away in this code; is that possible?

class Lock {
  void Acquire(bool is_exclusive);
  // Upgrades shared lock to an exclusive lock, if possible. 
  bool TryUpgrade();
  void Release();
}

class Foo {
  bool condition_boolean = true; 
  Lock lock;

  void f1() {
    lock.Acquire(false);
    if (condition_boolean) {
      if (!lock.TryUpgrade()) {
        lock.Release();
        lock.Acquire(true);
      }

      // Recheck the condition as we may have released the lock to 
      // acquire it exclusively.
      if (condition_boolean) {  // Can be optimized away.
        // Do something;
        condition_boolean = false;
      }
    }

    lock.Release();
  }
};

答案1

得分: 3

假设Lock类被正确编写,以使其acquire/release函数包含相应类型的内存屏障,那么否定,第二个检查不能被优化掉。

这些优化属于C++标准中的“as-if rule”[C++20 N4860 intro.abstract p1]。抽象地说,代码必须按照原样执行。但是如果编译器可以证明,在假设程序在其他方面没有未定义行为的情况下,进行转换不会改变程序的可观察行为,那么编译器可以应用这些转换。(或者,如果程序有多种有效的执行方式,那么至少它必须确保它仍然以其中一种有效的方式运行。)

在多线程程序的情况下,“假设程序在其他方面没有未定义行为”经常出现在数据竞争中,应用于数据竞争。一个典型的例子是:

bool b;

void foo() {
    int x=0, y=0;
    if (b)
        x=1;
    if (b)
        y=1;
    assert(x == y);
}

在这里,编译器可以假定b在两次测试之间不会改变,并优化掉第二次测试,从而将代码转换为if (b) { x=1; y=1; }

你可能会说,“但是难道不可能有其他线程在两次测试之间修改了b吗?”嗯,如果是这样的话,你就会有一个数据竞争。数据竞争规则规定,对于对同一非原子变量的任何两次访问,其中至少一次是写入,程序必须通过同步来确保其中一次“happens-before”另一次[intro.races p21]。同步通常由互斥锁提供,或者由原子变量的获取加载操作提供,该操作读取由之前的释放存储写入的值。

在上面的程序中,编译器可以看到没有这样的同步操作是可能的,因为在两次读取b之间,代码中没有互斥锁操作或任何获取或释放操作。因此,对b的任何并发写入都将是数据竞争,这将是UB,因此编译器可以假定它不会发生。(如果编译器的假设被违反,那么可能会发生一些不好的事情 - 但这不是编译器的错,而是你调用UB的错。)

然而,在你发布的代码中,确实有同步发生的可能。一些其他线程可以在你暂时释放锁时获取锁,修改condition_boolean,然后释放锁。这将不包含数据竞争。假设与之前一样,Lock类被正确实现,那么你释放锁与其他线程的获取同步,意味着你对condition_boolean的第一次读操作发生在其他线程的写操作之前。同样,其他线程的释放与你的重新获取同步,所以其他线程的写操作发生在你的第二次读操作之前。

因此,所有对condition_boolean的操作都是通过happens-before完全有序的,因此不会发生数据竞争。此外,由于对condition_boolean的写入发生在第二次读取之前,第二次读取必须观察到新值[intro.races p13]。为了提供这种行为,编译器必须实际执行第二次读取;它不能将其优化掉。

再次强调,所有这些都是在假设Lock类被正确编写的情况下。如果它只是std::shared_mutex之类的东西的包装器,那么一切应该都没问题,因为标准定义了std::mutex等提供适当获取和释放语义的内容。如果你自己使用std::atomic编写了它,那么你需要确保你使用了正确的算法和适当的std::memory_order来在解锁和锁定例程之间提供同步。如果你在不使用std::atomic的情况下编写它,几乎可以肯定它是错误的,会导致数据竞争,无论你对condition_boolean做了什么都会导致程序出错。(而且,volatile不能替代std::atomic。)

尽管如此,即使如此,如果编译器可以证明违反这些规则对于一个否定未定义行为的程序没有影响,它仍然可以打破这些规则。作为一个简单的例子,可能可以通过编译器选项来指定程序将以单线程方式运行。在这种情况下,编译器可以将所有原子对象降级为普通对象,并删除所有互斥锁操作和内存屏障。然后它可以删除对condition_boolean的第二次检查。但当然,在这种情况下,这样做是完全正确的,因为实际上condition_boolean不可能发生改变。

英文:

Assuming that the Lock class is written properly, such that its acquire/release functions contain memory barriers of the corresponding types, then no, the second check cannot be optimized away.

Such optimizations are covered under the C++ standards "as-if rule" [C++20 N4860 intro.abstract p1]. Abstractly, the code must execute exactly as written. But the compiler can apply transformations if it can prove that, assuming the program is otherwise free of undefined behavior, the transformation would not change the observable behavior of the program. (Or, if the program has more than one valid way to execute, then at least it must ensure that it still behaves in one of those valid ways.)

In the case of multithreaded programs, the "assuming the program is otherwise free of undefined behavior" frequently comes into play, applied to data races. A typical example is:

bool b;

void foo() {
    int x=0, y=0;
    if (b)
        x=1;
    if (b)
        y=1;
    assert(x == y);
}

Here, the compiler may assume that b cannot change between the two tests, and optimize out the second one, thus turning the code into if (b) { x=1; y=1; }.

You might say, "but isn't it possible that some other thread modifies b in between?" Ah, well, if that were the case, you would have a data race. The data race rule says that for any two accesses to the same non-atomic variable, where at least one of them is a write, the program must have synchronization to ensure that one of them happens-before the other [intro.races p21]. Synchronization is typically provided by mutexes, or by an acquire load of an atomic variable that reads the value written by an earlier release store.

In the program above, the compiler can see no such synchronization is possible, because in between the two reads of b, the code contains no mutex operations nor any acquire or release operations. Therefore, any concurrent write of b would inevitably be a data race, which would be UB, so the compiler can assume it won't happen. (If the compiler's assumption is violated, then something bad will probably happen - but that's not the compiler's fault, it's your fault for invoking UB.)


However, in the code you posted, there is a way for synchronization to occur. Some other thread could take the lock while you have temporarily released it, modify condition_boolean, and then release the lock. This would be free of data races. Assuming as before that the Lock class is properly implemented, your release of the lock synchronizes with the other thread's acquisition, implying that your first read of condition_boolean happens before the other thread's write. Likewise, the other thread's release synchronizes with your re-acquisition, so the other thread's write happens before your second read.

Thus all the operations on condition_boolean are totally ordered by happens-before, so no data race occurs. Moreover, since the write to condition_boolean happens before the second read, the second read must observe the new value [intro.races p13]. To provide this behavior, the compiler must actually perform the second read; it cannot optimize it out.


Once again, all of this assumes that the Lock class is written correctly. If it is just a wrapper around std::shared_mutex or something like that, then everything should be fine, since the standard defines that std::mutex and friends provide the appropriate acquire and release semantics. If it's something you wrote yourself using std::atomics, then it's on you to ensure that you used a correct algorithm and appropriate std::memory_order to provide synchronization between your unlock and lock routines. If you wrote it without using std::atomic, then almost certainly it is wrong, causing a data race all by itself, and your program is broken regardless of what you do with condition_boolean. (And no, volatile does not substitute for std::atomic.)


Having said all that, even so, the compiler is free to break any of these rules if it can prove that it makes no difference to an otherwise UB-free program. As a simple example, it may be possible to specify via a compiler option that the program will run single-threaded. In this case, the compiler can demote all atomic objects to normal ones, and delete all mutex operations and memory barriers. It could also then delete the second check of condition_boolean. But of course, in such a case, it would be perfectly correct to do so, because condition_boolean in fact cannot change.

huangapple
  • 本文由 发表于 2023年7月18日 09:03:52
  • 转载请务必保留本文链接:https://go.coder-hub.com/76708936.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定