Is branch prediction purely CPU behavior, or will the compiler give some hints?
Question
In the Go standard package src/sync/once.go, a recent revision changed this snippet:
if atomic.LoadUint32(&o.done) == 1 {
	return
}
// otherwise
...
to:
//if atomic.LoadUint32(&o.done) == 1 {
//	return
//}
if atomic.LoadUint32(&o.done) == 0 {
	...
}
The question is: with this change, the hot path is no longer explicit in the code. Does this change have a bad impact on branch prediction? Does the Go compiler provide any help on subsequent runs of this function, or is branch prediction handled entirely by the CPU?
Commit page: https://github.com/golang/go/commit/ca8354843ef9f30207efd0a40bb6c53e7ba86892
Answer 1
Score: 1
The particular commit you're talking about (found by Brits in a comment) is not an attempt to make use of branch prediction. It's using knowledge about how the Go compiler does inline expansion of small functions.
We're given the option of writing a function in this way:
func (o *Object) Operate() {
	if o.alreadyDone {
		return
	}
	// ... some code ...
}
Or, alternatively, writing it this way:
func (o *Object) Operate() {
	if !o.alreadyDone {
		o.reallyOperate()
	}
}
where o.reallyOperate takes over the "... some code ..." part.
If the "some code" part is more than a few instructions long and is written the way the original once.Do was, the Go compiler does not inline the function: every caller makes a real call to it. But when the function is as short as the replacement, the compiler can inline it, so the caller itself contains the test and branch and only then maybe calls the reallyOperate function.
Since sync.Once actually calls the function only once per Once object, and the rest of the time does not call it, this inline expansion means that no call is made on any Do call except the first one. This actually makes the code at the call site bigger (by one or two instructions), but since the call is normally not executed, the result is normally faster.