当循环变量在溢出时未定义时,可能存在哪些优化措施?

huangapple go评论73阅读模式
英文:

What optimizations are possible when loop variable is undefined on overflow?

问题

这篇文章介绍了一段C代码示例,编译器可以通过循环计数器的类型在溢出时将其优化。以下是你提到的代码段:

for (i = 0; i <= N; ++i) { ... }

在这个循环中,如果i在溢出时是未定义的,编译器可以假设循环将恰好执行N+1次,这样就可以应用广泛的循环优化。但是,如果变量在溢出时定义为回绕,那么编译器必须假设循环可能是无限的(当N为INT_MAX时会发生这种情况),这将禁用这些重要的循环优化。这对于64位平台特别影响,因为很多代码使用"int"作为归纳变量。

关于你的疑问:

  • 我理解如果i的类型在溢出时是未定义的,编译器可以假设i不会回绕,但为什么这意味着循环执行N+1次呢?难道在循环体中i不会被修改吗?
  • 即使有这个假设,基于此可以进行哪些优化呢?例如,如果N在编译时是未知的,循环就无法展开,对吗?
  • "广泛的循环优化"让我觉得我可能漏掉了很多东西。

请注意,我只会翻译你提供的内容,不会回答翻译的问题。

英文:

This article features an example snippet of C code which the compiler can optimize thanks to the fact that overflow is undefined for the type of the loop counter.

Here's the snippet with the comment
> c
&gt; for (i = 0; i &lt;= N; ++i) { ... }
&gt;

> In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.

I have quite a few doubts:

  • I understand that, if the type of i is undefined on overflow, the compiler can assume that i will not wrap around, but why would this imply that the loop runs N+1 times? Can't it be that i is altered in the body of the loop?
  • Even given that assumption, what optimization could it do based on that? If N is not known at compile-time, for instance, the loop can't be unrolled, can it?
  • "Broad range of loop optimizations" makes me think I'm missing a lot, here.

答案1

得分: 6

我理解了,如果i的类型在溢出时未定义,编译器可以假设i不会溢出,但为什么这意味着循环会运行N+1次呢?难道循环体中不能改变i的值吗?

首先回答第二个问题,作者在他们的例子中假设循环中不会改变i的值,并且循环中也不包含break或其他会终止循环的代码。这只是一个简单的例子,面向熟悉这些概念并能填补空白的知识渊博的读者。

回答第一个问题,i不会溢出并不意味着循环会运行N+1次。文章并没有这样说,它说的是它允许编译器假设。推理如下:

  • 如果N的值使得++i永远不会溢出,循环将恰好运行N+1次。
  • 如果N的值使得++i在某个点上溢出,行为是未定义的,语言标准允许任何行为。如果程序崩溃,那符合标准。如果N+1溢出并且测试i <= N始终为真,循环无限迭代,那也符合标准。如果循环运行N+1次,那也符合标准。作为编译器实现者,我们可以选择其中任何一种行为,而得到的行为将符合语言标准。

在这种情况下,我们可以自由选择,我们选择假设循环总是运行N+1次,因为这是一个简化的假设:它给出了非溢出情况和溢出情况的相同规范。

即使有这个假设,基于它可以进行哪些优化呢?例如,如果循环是for (int i = 0; i <= N; ++i) { sum += i; },我们可以优化为sum += N*(N+1)/2;(在考虑可能溢出的情况下计算),并完全移除循环。如果我们被要求假设i在溢出时会重新开始,我们就无法完全移除循环;我们需要一个测试来判断N的值是否使得循环退出或永远继续下去。(尽管标准中关于循环必须有进展的其他语言,这只是一个简化的例子。)

“广泛的循环优化”让我觉得我可能漏掉了很多东西,这是一个很宽泛的说法,没有列举出几个具体的优化,所以不要太担心这一部分。

英文:

> I understand that, if the type of i is undefined on overflow, the compiler can assume that i will not wrap around, but why would this imply that the loop runs N+1 times? Can't it be that i is altered in the body of the loop?

Answering the second question first, the author is taking it, for their example, that i is not altered in the loop and that the loop does not contain a break or other code that would terminate the loop. It is merely a briefly presented example aimed at a knowledgeable audience that is expected to be familiar with these things and to fill in the gaps.

Answering the first question, the fact that i will not wrap does not imply the loop will run N+1 times. The passage does not say that, it says it allows the compiler to assume that. The reasoning is this:

  • If N is such that ++i never overflows, the loop runs exactly N+1 times.
  • If N is such that ++i overflows at some point, the behavior is not defined, and the language standard allows any behavior. If the program crashes, that conforms to the standard. If N+1 wraps and the test i &lt;= N is always true, and the loop iterates forever, that conforms to the standard. If the loop runs N+1 times, that conforms to the standard. As the compiler implementor, we can choose any of these behaviors, and the resulting behavior will conform to the language standard.

Given a free choice, in this instance we choose to behave as if the loop always runs N+1 times because this is a simplifying assumption: It gives us the same specification for both the non-overflow case and the overflow case.

> - Even given that assumption, what optimization could it do based on that? If N is not known at compile-time, for instance, the loop can't be unrolled, can it?

If the loop is for (int i = 0; i &lt;= N; ++i) { sum += i; }, we can optimize to sum += N*(N+1)/2; (calulated with appropriate regard for possible overflow) and remove the loop entirely. If we were required to behave as if i wrapped upon overflow, we could not remove the loop entirely; we would need a test to see if the value of N were such that the loop exited or continued forever. (Other language in the standard about loops making progress notwithstanding; this is a simplified example.)

> - "Broad range of loop optimizations" makes me think I'm missing a lot, here.

It is a broad claim to make without listing several, so do not worry about that part too much.

答案2

得分: 3

在循环体中,i的值不会改变。假设在一个64位平台上,N的值可能太大,以至于循环无法正确运行(也许程序员知道N永远不会太大,但编译器不知道)。但是如果N太大,i会溢出,而溢出是不可能发生的。因此,编译器可以假设N永远不会太大,即使有时候它确实太大。

相比之下,

for (unsigned i = 0; i <= N; ++i) {

即使N是可表示的最大的unsigned值,i也不会溢出,而是会绕回到0,这是完全合法的。编译器不能再假设N永远不会太大,必须处理一个可能无法结束的循环。

关于优化,你可以在这里观看一个令人惊叹的示例:链接

如果编译器知道循环会结束,它可以使用一组更高效的指令。机器码会变得更长、更复杂,但如果这能为我们每次运行节省3.27皮秒,那又有谁在乎呢?

英文:

> Can't it be that i is altered in the body of the loop?

No, they mean this:

size_t N = getSize();
for (int i = 0; i &lt;= N; ++i) { 
   // i never changed here
}

Assuming a 64-bit platform, N might be too big for the loop to run correctly. (Perhaps the programmer knows that N is never too big, but the compiler doesn't). But if N is too big, i will overflow, and overflow "cannot" happen. So the compiler can assume that N is never too big after all -- even if it sometimes is.

By contrast,

for (unsigned i = 0; i &lt;= N; ++i) {

here, even if N is the maximum representable unsigned, i won't overflow, but rather wrap around, which is completely legal. The compiler no longer can assume that N is never too big, and must deal with a loop that potentially doesn't end.

> what optimization could it do based on that?

Just watch this in awe.

If the compiler know the loop is going to end, it is able to use a (presumably) more efficient set of instructions. The machine code becomes much longer and much more complicated, but who cares if it wins us 3.27 picoseconds per run?

答案3

得分: 2

这是一个简单的代码示例,可以在这个假设下进行优化,而且这个版本更简单,因为我们使用的是<而不是<=

int Foo(const std::vector<int> &vec) {
  int count = 0;
  for (int i = 0; i < vec.size(); i++) {
    count++;
  }
  return count;
}

显然,解释这段代码的正常方式是完全省略循环,并将其重写为:

int Foo(const std::vector<int> &vec) {
  return vec.size();
}

如果vec.size() <= INT_MAX,那么一切正常,没有什么棘手的问题需要考虑。但是,如果vec.size() > INT_MAX呢?简化/重写的版本是否仍然有效?这实际上是需要考虑的重要问题,因为在大多数合理的64位实现中,vec.size()将返回一个64位无符号整数size_t

在大向量的情况下,在INT_MAX次迭代之后会出现问题,因为如果i下溢,则行为是未定义的,这意味着我们可能会得到诸如无限循环之类的结果,这意味着我们简化的重写版本是无效的。然而,由于下溢是未定义的,这意味着编译器可以假设这不会发生,并且允许我们进行上述的简化。

在实际的代码中,这种优化是很重要的,因为人们经常出于习惯使用int作为循环迭代变量,尽管实际上它们是在迭代一个可以容纳size_t元素的容器。

英文:

Here is a simple example of code that can be optimized under this assumption, and this version is even simpler because we're using &lt; rather than &lt;=:

int Foo(const std::vector&lt;int&gt; &amp;vec) {
  int count = 0;
  for (int i = 0; i &lt; vec.size(); i++) {
    count++;
  }
  return count;
}

Obviously the normal way to interpret this code would be to omit the loop entirely, and just rewrite it as:

int Foo(const std::vector&lt;int&gt; &amp;vec) {
  return vec.size();
}

If vec.size() &lt;= INT_MAX then everything is normal and there's nothing tricky to think about. But what if vec.size() &gt; INT_MAX? Is the simplified/rewritten version still valid? This is actually important to consider because on most sane 64-bit implementations vec.size() will return a size_t which is a 64-bit unsigned integer.

In the huge-vector case there is a problem after INT_MAX iterations because if i underflows then the behavior is undefined, and this means that we could get things like an infinite loop, meaning our simplified rewritten version is not valid. However, since underflow is UB this means that the compiler can assume this won't happen, and it is permitted to make the simplification we made above.

In real world code this kind of optimization is important because it's extremely common for people to use things like an int loop iteration variable out of habit even though they're actually iterating over a container that can hold size_t elements.

huangapple
  • 本文由 发表于 2023年8月9日 02:13:08
  • 转载请务必保留本文链接:https://go.coder-hub.com/76862202.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定