Are we Overusing Pass-by-Pointer in Go?
Question
This question is specific to function calls, and is directed towards the trustworthiness of the Go optimizer when passing structs by value vs by pointer.
If you're wondering when to use values vs pointers in struct fields, see: https://stackoverflow.com/questions/24452323/go-performance-whats-the-difference-between-pointer-and-value-in-struct
Please note: I've tried to word this so that it's easy for anyone to understand; as a result, some of the terminology is imprecise.
Some Inefficient Go Code
Let's assume that we have a struct:
type Vec3 struct {
    X, Y, Z float32
}
And we want to create a function that computes the cross product of two vectors. (For this question, the math isn't important.) There are several ways to go about this. A naive implementation would be:
func CrossOf(a, b Vec3) Vec3 {
    return Vec3{
        a.Y*b.Z - a.Z*b.Y,
        a.Z*b.X - a.X*b.Z,
        a.X*b.Y - a.Y*b.X,
    }
}
Which would be called via:
a := Vec3{1, 2, 3}
b := Vec3{2, 3, 4}
var c Vec3
// ...and later on (plain assignment, since c is already declared):
c = CrossOf(a, b)
This works fine, but in Go, it's apparently not very efficient. a and b are passed by value (copied) into the function, and the result is copied out again. Though this is a small example, the issues are more obvious if we consider large structs.
A more efficient implementation would be:
func (res *Vec3) CrossOf(a, b *Vec3) {
    // Cannot assign directly since we are using pointers. It's possible that a or b == res
    x := a.Y*b.Z - a.Z*b.Y
    y := a.Z*b.X - a.X*b.Z
    res.Z = a.X*b.Y - a.Y*b.X
    res.Y = y
    res.X = x
}
// usage
c.CrossOf(&a, &b)
This is harder to read and takes more space, but is more efficient. If the passed struct was very large, it would be a reasonable tradeoff.
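The aliasing concern in the code comment can be checked with a short sketch. The method crossOfNaive below is a hypothetical variant, not part of the original question, that writes the fields directly; it is there only to show what goes wrong when res and a point at the same struct.

```go
package main

import "fmt"

type Vec3 struct {
	X, Y, Z float32
}

// CrossOf stores a x b in res. X and Y are held in temporaries so the
// result stays correct even when res aliases a or b.
func (res *Vec3) CrossOf(a, b *Vec3) {
	x := a.Y*b.Z - a.Z*b.Y
	y := a.Z*b.X - a.X*b.Z
	res.Z = a.X*b.Y - a.Y*b.X
	res.Y = y
	res.X = x
}

// crossOfNaive is a hypothetical variant that assigns each field directly.
// When res aliases a or b, later lines read fields that were already overwritten.
func (res *Vec3) crossOfNaive(a, b *Vec3) {
	res.X = a.Y*b.Z - a.Z*b.Y
	res.Y = a.Z*b.X - a.X*b.Z
	res.Z = a.X*b.Y - a.Y*b.X
}

func main() {
	b := Vec3{2, 3, 4}

	safe := Vec3{1, 2, 3}
	safe.CrossOf(&safe, &b) // res and a are the same struct

	naive := Vec3{1, 2, 3}
	naive.crossOfNaive(&naive, &b) // same aliasing, no temporaries

	fmt.Println(safe)  // {-1 2 -1}, the expected cross product
	fmt.Println(naive) // {-1 10 -23}, corrupted by the aliasing
}
```

The by-value version never has this problem, because the function works on its own copies of a and b.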
For most people with a C-like programming background, it's intuitive to pass by reference, as much as possible, purely for efficiency.
In Go, it's intuitive to think that this is the best approach, but Go itself points out a flaw in this reasoning.
Go Is Smarter Than This
Here's something that works in Go, but cannot work in most low-level C-like languages:
func GetXAsPointer(vec Vec3) *float32 {
    return &vec.X
}
We allocated a Vec3, grabbed a pointer to the X field, and returned it to the caller. See the problem? In C, when the function returns, the stack will unwind, and the returned pointer would become invalid.
However, Go is garbage collected. It will detect that this float32 must continue to exist, and will allocate it (either the float32 or the entire Vec3) onto the heap instead of the stack.
Go requires escape analysis in order for this to work. It blurs the line between pass-by-value and pass-by-pointer.
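A minimal sketch of what this means in practice, assuming the Vec3 type from earlier with its third field named Z. The returned pointer refers to the function's private copy of the struct, so writing through it leaves the caller's value untouched. Building with go build -gcflags=-m should report the compiler's escape decisions (something like "moved to heap: vec", though the exact wording may differ between Go versions).

```go
package main

import "fmt"

type Vec3 struct {
	X, Y, Z float32
}

// GetXAsPointer returns a pointer into its own copy of vec. Because the
// pointer outlives the call, the compiler places that copy on the heap.
func GetXAsPointer(vec Vec3) *float32 {
	return &vec.X
}

func main() {
	v := Vec3{1, 2, 3}
	p := GetXAsPointer(v)
	*p = 42 // modifies the escaped copy, not v

	fmt.Println(v.X, *p) // 1 42
}
```

Note that escape analysis decides where a value lives; it does not change the fact that the argument was copied on the way in.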
It's well known that Go is designed for aggressive optimization. If it's more efficient to pass by reference, and the passed struct is not altered by the function, why shouldn't Go take the more efficient approach?
Thus, our efficient example could be rewritten as:
func (res *Vec3) CrossOf(a, b Vec3) {
    // a and b are copies, so writing to res cannot disturb them
    res.X = a.Y*b.Z - a.Z*b.Y
    res.Y = a.Z*b.X - a.X*b.Z
    res.Z = a.X*b.Y - a.Y*b.X
}
// usage
c.CrossOf(a, b)
Notice that this is more readable and, if we assume a compiler that aggressively converts pass-by-value into pass-by-pointer, just as efficient as before.
According to the docs, it's recommended to pass sufficiently large receivers using pointers, and to consider receivers in the same way as arguments: https://golang.org/doc/faq#methods_on_values_or_pointers
Go already does escape analysis on every variable to determine whether it is placed on the heap or the stack. So it seems more within the Go paradigm to only pass by pointer if the struct will be altered by the function. This will result in more readable and less bug-prone code.
Does the Go compiler optimize pass-by-value into pass-by-pointer automatically? It seems like it should.
So Here's the Question
For structs, when should we use pass-by-pointer vs pass-by-value?
Things that should be taken into account:
- For structs, is one actually more efficient than the other, or will they be optimized to be the same by the Go compiler?
- Is it bad practice to rely on the compiler to optimize this?
- Is it worse to pass-by-pointer everywhere, creating bug-prone code?
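One way to probe the first point empirically is a small benchmark. The following is a rough sketch, not a rigorous measurement: CrossByValue, CrossByPointer, and sink are names made up for this illustration, and it uses testing.Benchmark from a plain main program so it can be run with go run. Functions this small are likely to be inlined, so the two timings may come out nearly identical, and results will vary by Go version and hardware.

```go
package main

import (
	"fmt"
	"testing"
)

type Vec3 struct {
	X, Y, Z float32
}

// CrossByValue takes both operands and returns the result by value.
func CrossByValue(a, b Vec3) Vec3 {
	return Vec3{
		a.Y*b.Z - a.Z*b.Y,
		a.Z*b.X - a.X*b.Z,
		a.X*b.Y - a.Y*b.X,
	}
}

// CrossByPointer reads the operands and writes the result through pointers.
func CrossByPointer(res, a, b *Vec3) {
	x := a.Y*b.Z - a.Z*b.Y
	y := a.Z*b.X - a.X*b.Z
	res.Z = a.X*b.Y - a.Y*b.X
	res.Y = y
	res.X = x
}

// sink keeps the compiler from discarding the benchmarked work.
var sink Vec3

func main() {
	a, b := Vec3{1, 2, 3}, Vec3{2, 3, 4}

	byValue := testing.Benchmark(func(bm *testing.B) {
		for i := 0; i < bm.N; i++ {
			sink = CrossByValue(a, b)
		}
	})
	byPointer := testing.Benchmark(func(bm *testing.B) {
		for i := 0; i < bm.N; i++ {
			CrossByPointer(&sink, &a, &b)
		}
	})

	fmt.Println("by value:  ", byValue)
	fmt.Println("by pointer:", byPointer)
}
```

Repeating the experiment with a much larger struct is the more interesting test, since that is where copying costs would start to show.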
Answer 1
Score: 14
Short answer: Yes, you're overusing pass-by-pointer here.
Some quick math here... your struct consists of three float32 objects for a total of 96 bits. Assuming you're on a 64 bit machine, your pointer is 64 bits long, so in the best case you're saving yourself a paltry 32 bits of copying.
As the price of saving those 32 bits, you're forcing an extra lookup (it needs to follow the pointer and then read the original values). If the pointers escape, the objects have to be allocated on the heap instead of the stack, which means a whole bunch of extra overhead, extra work for the garbage collector, and reduced memory locality.
When writing highly performant code, you have to be aware of the potential costs of poor memory locality: cache misses can be extremely expensive. The latency of main memory can be 100x that of L1 cache.
Furthermore, because you're taking a pointer to the struct you're preventing the compiler from making a number of optimizations it might otherwise be able to make. For example, Go might implement register calling conventions in the future, which would be prevented here.
In a nutshell, saving 4 bytes of copying could cost you quite a bit, so yes, in this case you are overusing pass-by-pointer. I wouldn't use pointers just for efficiency unless the struct was 10x as large as this, and even then it's not always clear whether that is the right approach, given the potential for bugs caused by accidental modification.