2015年5月1日 03:19:51go评论119阅读模式

英文:

Is Go compiler smart enough to pick up microoptimization?

问题

我很好奇在使用微优化时是否有意义，比如：

当a是整数时，使用a / 2与a >> 1
使用a * 2与a << 1
使用a % 2与a & 1
还有其他类似的优化技巧

我知道任何一个像样的C编译器都足够处理这些。请不要提及过早优化，因为这些技巧是如此明显，以至于它们不仅仅是优化，更像是一种编码偏好的问题。

**附言：**我尝试进行了基准测试，计时差异在统计上并不显著。我不知道如何检查Go的字节码，所以感谢您的指点。

英文:

I am curious whether it makes sense to use microoptimizations like

a / 2 versus a >> 1 when a is an integer
a * 2 vs a << 1
a % 2 vs a & 1
and some others like these

I know that any decent C compiler is good enough handle this. Also please do not write about premature optimization, because these techniques are so obvious, that it is not even optimization and more like a matter of preferences how to write code.

P.S. I tried to do benchmarks and the difference in timing is not statistically significant. I do not know how to check go's bytecode so thank you for pointing it.

答案1

得分: 10

简短回答是，编译器会对这些进行优化。但是对于int和uint（以及其他有符号和无符号整数类型，如byte），优化方式略有不同。

在两种情况下，编译器都会避免使用乘法和除法指令，但对于无符号整数来说，只需要一条指令（对于有符号整数则需要少量指令）。这是因为对于无符号整数，你给出的语句对是完全等价的，但对于有符号整数来说并非如此。

更详细的回答如下：

以一个简单的程序为例：

package main
func main() {}
func div2(a int) {
        b := a / 2
        c := a >> 1
        _, _ = b, c
}
func mul2(a int) {
        b := a * 2
        c := a << 1
        _, _ = b, c
}
func mod2(a int) {
        b := a % 2
        c := a & 1
        _, _ = b, c
}

运行go build -gcflags="-S"将输出汇编代码，例如：

"".mod2 t=1 size=32 value=0 args=0x8 locals=0x0
        0x0000 00000 (…/opt.go:17)       TEXT    "".mod2+0(SB),4,$0-8
        …
        0x0000 00000 (…/opt.go:17)       MOVQ    "".a+8(FP),BX
        …
        0x0005 00005 (…/opt.go:18)       MOVQ    BX,AX
        0x0008 00008 (…/opt.go:18)       SARQ    $63,AX
        0x000c 00012 (…/opt.go:18)       MOVQ    BX,DX
        0x000f 00015 (…/opt.go:18)       SUBQ    AX,DX
        0x0012 00018 (…/opt.go:18)       ANDQ    $1,DX
        0x0016 00022 (…/opt.go:18)       ADDQ    AX,DX
        0x0019 00025 (…/opt.go:19)       ANDQ    $1,BX
        0x001d 00029 (…/opt.go:21)       RET     ,

这里的BX是参数，DX和BX似乎是两个结果（BX被重用作为其中一个结果）。它们略有不同，但只有几条指令的差异（请看源代码行号），没有任何除法或乘法指令（因此基本上与之前一样快）。这种差异是由于算法移位和逻辑移位以及Go对负值进行取模的方式造成的。

你可以通过将程序中的int改为uint来确认这一点，然后输出将包含以下内容：

        0x0008 00008 (…/opt.go:18)       ANDQ    $1,CX
        0x000c 00012 (…/opt.go:19)       ANDQ    $1,BX

即完全相同的指令。对于你给出的每个示例，这都是成立的。

英文:

Short answer, yes, the compiler optimises those.
But it does so slightly differently for int vs uint (and presumably any signed vs unsigned integer types such as byte).

In both cases multiplication and division instructions are avoided but it's only a single instruction for unsigned integers (and a small number of instructions for signed integers).
That's because your pairs of statments are only exactly equivalent for unsigned integers and not for signed integers.

Longer answer:

Taking a simple program like:

package main
func main() {}
func div2(a int) {
        b := a / 2
        c := a &gt;&gt; 1
        _, _ = b, c
}
func mul2(a int) {
        b := a * 2
        c := a &lt;&lt; 1
        _, _ = b, c
}
func mod2(a int) {
        b := a % 2
        c := a &amp; 1
        _, _ = b, c
}

and running go build -gcflags="-S" will give you assembly output such as:

&quot;&quot;.mod2 t=1 size=32 value=0 args=0x8 locals=0x0
        0x0000 00000 (…/opt.go:17)       TEXT    &quot;&quot;.mod2+0(SB),4,$0-8
        …
        0x0000 00000 (…/opt.go:17)       MOVQ    &quot;&quot;.a+8(FP),BX
        …
        0x0005 00005 (…/opt.go:18)       MOVQ    BX,AX
        0x0008 00008 (…/opt.go:18)       SARQ    $63,AX
        0x000c 00012 (…/opt.go:18)       MOVQ    BX,DX
        0x000f 00015 (…/opt.go:18)       SUBQ    AX,DX
        0x0012 00018 (…/opt.go:18)       ANDQ    $1,DX
        0x0016 00022 (…/opt.go:18)       ADDQ    AX,DX
        0x0019 00025 (…/opt.go:19)       ANDQ    $1,BX
        0x001d 00029 (…/opt.go:21)       RET     ,

Here BX is the argument and DX and BX appear to be the two results
(BX being reused as one of the results).
Here they are slightly different, but only by a few instructions
(look at the source line numbers shown)
and without any division or multiplication instructions
(so basically just as fast).
The difference is due to algorithmic vs logical shifts and how Go does mod for negative values.

You can confirm this by changing int to uint in the program and then the output contains things like:

        0x0008 00008 (…/opt.go:18)       ANDQ    $1,CX
        0x000c 00012 (…/opt.go:19)       ANDQ    $1,BX

i.e. the exact same instruction.
This is true for each of the examples you gave.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Go编译器是否足够智能，能够识别微观优化？

问题

答案1

如何将 workflow.Context 传递给 Cadence 活动（activity）？

Gin – 结构参数被验证为空。

Golang中的时间布局

无法在Golang中对RethinkDB运行过滤器。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。