Slices of structs vs. slices of pointers to structs
Question
I often work with slices of structs. Here's an example of such a struct:
type MyStruct struct {
    val1, val2, val3    int
    text1, text2, text3 string
    list                []SomeType
}
So I define my slices as follows:
[]MyStruct
Let's say I have about a million elements in there and I'm working heavily with the slice:
- I append new elements often. (The total number of elements is unknown.)
- I sort it every now and then.
- I also delete elements (although not as much as adding new elements).
- I read elements often and pass them around (as function arguments).
- The content of the elements themselves doesn't get changed.
My understanding is that this leads to a lot of shuffling around of the actual struct. The alternative is to create a slice of pointers to the struct:
[]*MyStruct
Now the structs stay where they are and we only deal with pointers, which I assume have a smaller footprint and will therefore make my operations faster. But now I'm giving the garbage collector a lot more work.
- Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?
- Should I worry about how much work I leave to the GC?
- Is the performance overhead of copying a struct vs. copying a pointer negligible?
- Maybe a million elements is not much. How does all of this change when the slice gets much bigger (but still fits in RAM, of course)?
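For concreteness, the sort and delete operations described above might look like the following for both layouts. This is only a sketch: `MyStruct` is trimmed down, and the helper names (`sortValues`, `sortPointers`, `deleteValue`, `deletePointer`) are hypothetical, not taken from the question.

```go
package main

import (
    "fmt"
    "sort"
)

// MyStruct is a trimmed-down stand-in for the struct in the question.
type MyStruct struct {
    val1  int
    text1 string
}

// sortValues sorts a slice of structs by val1; each swap moves whole structs.
func sortValues(s []MyStruct) {
    sort.Slice(s, func(i, j int) bool { return s[i].val1 < s[j].val1 })
}

// sortPointers sorts a slice of pointers by val1; each swap moves one pointer.
func sortPointers(s []*MyStruct) {
    sort.Slice(s, func(i, j int) bool { return s[i].val1 < s[j].val1 })
}

// deleteValue removes element i while preserving order.
func deleteValue(s []MyStruct, i int) []MyStruct {
    return append(s[:i], s[i+1:]...)
}

// deletePointer removes element i and clears the vacated tail slot so the
// backing array no longer keeps the removed struct reachable.
func deletePointer(s []*MyStruct, i int) []*MyStruct {
    copy(s[i:], s[i+1:])
    s[len(s)-1] = nil
    return s[:len(s)-1]
}

func main() {
    values := []MyStruct{{3, "c"}, {1, "a"}, {2, "b"}}
    sortValues(values)
    values = deleteValue(values, 1)
    fmt.Println(values)

    pointers := []*MyStruct{{3, "c"}, {1, "a"}, {2, "b"}}
    sortPointers(pointers)
    pointers = deletePointer(pointers, 1)
    for _, p := range pointers {
        fmt.Println(*p)
    }
}
```

The call sites are nearly identical in both cases; what differs is how many bytes each swap or shift has to move, and whether the elements are separate heap objects.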
Answer 1
Score: 68
Just got curious about this myself. Ran some benchmarks:
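// Save in a file ending in _test.go and run with: go test -bench=.
package bench // package name is arbitrary

import "testing"

type MyStruct struct {
    F1, F2, F3, F4, F5, F6, F7 string
    I1, I2, I3, I4, I5, I6, I7 int64
}

// BenchmarkAppendingStructs appends b.N struct values to a growing slice.
func BenchmarkAppendingStructs(b *testing.B) {
    var s []MyStruct
    for i := 0; i < b.N; i++ {
        s = append(s, MyStruct{})
    }
}

// BenchmarkAppendingPointers appends b.N struct pointers to a growing slice.
func BenchmarkAppendingPointers(b *testing.B) {
    var s []*MyStruct
    for i := 0; i < b.N; i++ {
        s = append(s, &MyStruct{})
    }
}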
Results:
BenchmarkAppendingStructs     1000000    3528 ns/op
BenchmarkAppendingPointers    5000000     246 ns/op
Takeaways: per operation we're in nanoseconds, which is probably negligible for small slices. Over a million appends, though, the measured per-op times add up to roughly 3.5 s for structs versus roughly 0.25 s for pointers.
By the way, I ran the benchmark again with slices pre-allocated to a capacity of 1000000, to eliminate the overhead of append() periodically copying the underlying array. Appending structs got about 1000 ns/op faster; appending pointers didn't change at all.
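The pre-allocated variant is not shown in the answer, so here is a rough sketch of what it might look like. It is a standalone program using `time` rather than the `testing` package, the helper names `appendStructs` and `appendPointers` are hypothetical, and `n` is set lower than the answer's 1000000 only to keep the run light. Timings will differ by machine and Go version.

```go
package main

import (
    "fmt"
    "time"
)

// MyStruct mirrors the struct used in the benchmark above.
type MyStruct struct {
    F1, F2, F3, F4, F5, F6, F7 string
    I1, I2, I3, I4, I5, I6, I7 int64
}

// appendStructs appends n zero-value structs. When prealloc is true the
// backing array is allocated once up front, so append never has to grow it.
func appendStructs(n int, prealloc bool) []MyStruct {
    var s []MyStruct
    if prealloc {
        s = make([]MyStruct, 0, n)
    }
    for i := 0; i < n; i++ {
        s = append(s, MyStruct{})
    }
    return s
}

// appendPointers does the same for a slice of pointers. Each element is
// still a separate heap allocation, whether or not the slice is pre-allocated.
func appendPointers(n int, prealloc bool) []*MyStruct {
    var s []*MyStruct
    if prealloc {
        s = make([]*MyStruct, 0, n)
    }
    for i := 0; i < n; i++ {
        s = append(s, &MyStruct{})
    }
    return s
}

func main() {
    const n = 200000 // arbitrary size chosen for this sketch

    for _, prealloc := range []bool{false, true} {
        start := time.Now()
        s := appendStructs(n, prealloc)
        structTime := time.Since(start)

        start = time.Now()
        p := appendPointers(n, prealloc)
        ptrTime := time.Since(start)

        fmt.Printf("prealloc=%-5v structs: %v (%d elems)  pointers: %v (%d elems)\n",
            prealloc, structTime, len(s), ptrTime, len(p))
    }
}
```

A one-shot timing like this is much noisier than `go test -bench`, so treat it as a way to see the shape of the difference rather than as a measurement.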
Answer 2
Score: 12
> Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?
No, it depends too much on all the other factors you've already mentioned.
The only real answer is: benchmark and see. Every case is different and all the theory in the world doesn't make a difference when you've got actual timings to work with.
(That said, my intuition would be to use pointers, and possibly a sync.Pool to aid the garbage collector: http://golang.org/pkg/sync/#Pool)
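As a minimal sketch of the sync.Pool idea, assuming elements removed from the slice are no longer referenced anywhere else: the names `structPool`, `acquire`, and `release` are hypothetical, and `MyStruct` is trimmed down for brevity.

```go
package main

import (
    "fmt"
    "sync"
)

// MyStruct is a trimmed-down stand-in for the struct in the question.
type MyStruct struct {
    val1, val2 int
    text1      string
}

// structPool hands out *MyStruct values, allocating a new one only when
// the pool has nothing to reuse.
var structPool = sync.Pool{
    New: func() interface{} { return new(MyStruct) },
}

// acquire takes a struct from the pool and fills in its fields.
func acquire(v int, text string) *MyStruct {
    m := structPool.Get().(*MyStruct)
    m.val1 = v
    m.text1 = text
    return m
}

// release zeroes the struct and returns it to the pool. The caller must
// not use m afterwards.
func release(m *MyStruct) {
    *m = MyStruct{}
    structPool.Put(m)
}

func main() {
    var s []*MyStruct
    s = append(s, acquire(1, "a"), acquire(2, "b"))

    // Delete the last element and hand its memory back to the pool.
    last := s[len(s)-1]
    s[len(s)-1] = nil
    s = s[:len(s)-1]
    release(last)

    s = append(s, acquire(3, "c")) // may or may not reuse the released struct
    fmt.Println(len(s), s[1].val1, s[1].text1)
}
```

The pool may drop its contents at any garbage collection, so it only reduces allocations when elements are created and deleted at a steady rate. Whether it helps here is something to benchmark.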
Answer 3
Score: 3
Unlike maps, slices, channels, functions, and methods, which behave like references when passed around, struct variables are passed by copy, which means more memory is copied or allocated behind the scenes. On the other hand, using fewer pointers results in less work for the garbage collector. From my perspective, I would think about three things: the complexity of the struct, the quantity of data to handle, and the functional need once you have created your variable (for example, does it need to be mutable when it's passed into a function?).
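The mutability point can be illustrated with a small sketch. The function names `renameByValue` and `renameByPointer` are made up for this example, and `MyStruct` is again trimmed down.

```go
package main

import "fmt"

// MyStruct is a trimmed-down stand-in for the struct in the question.
type MyStruct struct {
    val1  int
    text1 string
}

// renameByValue receives a copy, so the caller's struct is unaffected.
func renameByValue(m MyStruct) { m.text1 = "changed" }

// renameByPointer receives the address, so the caller's struct is modified.
func renameByPointer(m *MyStruct) { m.text1 = "changed" }

func main() {
    values := []MyStruct{{1, "original"}}

    renameByValue(values[0])
    fmt.Println(values[0].text1) // prints "original"

    // A []MyStruct can still be mutated in place by taking an element's address.
    renameByPointer(&values[0])
    fmt.Println(values[0].text1) // prints "changed"
}
```

Since the question says element contents never change, the copy semantics of `[]MyStruct` would not get in the way there; the choice then comes down mostly to copying cost versus allocation and GC cost.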