英文:
What is the point in setting a slice's capacity?
问题
在Golang中,我们可以使用内置的make()
函数来创建一个具有给定初始长度和容量的切片。
考虑以下代码,切片的长度设置为1,容量设置为3:
func main() {
var slice = make([]int, 1, 3)
slice[0] = 1
slice = append(slice, 6, 0, 2, 4, 3, 1)
fmt.Println(slice)
}
我惊讶地发现这个程序输出的结果是:
[1 6 0 2 4 3 1]
这让我想知道,如果append()
函数可以超过切片的容量,那么最初定义切片的容量有什么意义呢?设置一个足够大的容量是否会提高性能?
英文:
In Golang, we can use the builtin make()
function to create a slice with a given initial length and capacity.
Consider the following lines, the slice's length is set to 1, and its capacity 3:
func main() {
var slice = make([]int, 1, 3)
slice[0] = 1
slice = append(slice, 6, 0, 2, 4, 3, 1)
fmt.Println(slice)
}
I was surprised to see that this program prints:
> [1 6 0 2 4 3 1]
This got me wondering- what is the point of initially defining a slice's capacity if append()
can simply blow past it? Are there performance gains for setting a sufficiently large capacity?
答案1
得分: 38
一个切片实际上只是一种管理底层数组的高级方式。它会自动跟踪大小,并在需要时重新分配新的空间。
当你向切片追加元素时,运行时会在超过当前容量时将其容量翻倍。这需要复制所有元素。如果你在开始之前知道它将有多大,你可以通过一次性获取所有元素来避免一些复制操作和内存分配。
当你使用make
函数创建一个指定容量的切片时,你设置的是初始容量,而不是任何类型的限制。
请参阅这篇关于切片的博文,了解一些有关切片内部细节的有趣信息。
英文:
A slice is really just a fancy way to manage an underlying array. It automatically tracks size, and re-allocates new space as needed.
As you append to a slice, the runtime doubles its capacity every time it exceeds its current capacity. It has to copy all of the elements to do that. If you know how big it will be before you start, you can avoid a few copy operations and memory allocations by grabbing it all up front.
When you make
a slice providing capacity, you set the initial capacity, not any kind of limit.
See this blog post on slices for some interesting internal details of slices.
答案2
得分: 16
一个 slice
是一个对简单 array
的精妙抽象。你可以获得各种好用的功能,但在其核心深处,它仍然是一个 array
。(我以相反的顺序解释以下内容是有原因的)。因此,当你指定一个 capacity
为 3
时,实际上在内存中分配了一个长度为 3
的数组,你可以在不需要重新分配内存的情况下向其中 append
元素。这个属性在 make
命令中是可选的,但请注意,无论你选择与否,slice
总是会有一个 capacity
。如果你指定了一个 length
(它也总是存在的),那么 slice
将可以通过索引访问到该长度。剩余的 capacity
被隐藏在幕后,这样在使用 append
时就不需要分配全新的数组。
下面是一个示例来更好地解释这个机制。
s := make([]int, 1, 3)
底层的 array
将被分配为 3
个 int
类型的零值(即 0
):
[0,0,0]
然而,length
被设置为 1
,所以这个 slice
本身只会打印 [0]
,如果你尝试索引第二个或第三个值,它会引发 panic
,因为 slice
的机制不允许这样做。如果你对它执行 s = append(s, 1)
,你会发现它实际上已经被创建为包含 zero
值直到 length
的 slice
,你会得到 [0,1]
。此时,你可以在整个底层 array
填满之前再次执行 append
,而另一个 append
将强制它分配一个新的数组并将所有值复制过去,容量加倍。这实际上是一个相当昂贵的操作。
因此,对你的问题的简短回答是,预先分配 capacity
可以极大地提高代码的效率。特别是当 slice
要么非常大,要么包含复杂的 structs
(或两者兼有)时,因为 struct
的零值实际上是其每个字段的零值。这不是因为它可以避免分配这些值,因为它必须要分配,而是因为每次需要调整底层数组的大小时,append
都必须重新分配一个新的由这些零值填充的数组。
简短的示例代码:https://play.golang.org/p/LGAYVlw-jr
英文:
A slice
is a wonderful abstraction of a simple array
. You get all sorts of nice features, but deep down at its core, lies an array
. (I explain the following in reverse order for a reason). Therefore, if/when you specify a capacity
of 3
, deep down, an array of length 3
is allocated in memory, which you can append
up to without having it need to reallocate memory. This attribute is optional in the make
command, but note that a slice
will always have a capacity
whether or not you choose to specify one. If you specify a length
(which always exists as well), the slice
be indexable up to that length. The rest of the capacity
is hidden away behind the scenes so it does not have to allocate an entirely new array when append
is used.
Here is an example to better explain the mechanics.
s := make([]int, 1, 3)
The underlying array
will be allocated with 3
of the zero value of int
(which is 0
):
[0,0,0]
However, the length
is set to 1
, so the slice itself will only print [0]
, and if you try to index the second or third value, it will panic
, as the slice
's mechanics do not allow it. If you s = append(s, 1)
to it, you will find that it has actually been created to contain zero
values up to the length
, and you will end up with [0,1]
. At this point, you can append
once more before the entire underlying array
is filled, and another append
will force it to allocate a new one and copy all the values over with a doubled capacity. This is actually a rather expensive operation.
Therefore the short answer to your question is that preallocating the capacity
can be used to vastly improve the efficiency of your code. Especially so if the slice
is either going to end up very large, or contains complex structs
(or both), as the zero
value of a struct
is effectively the zero
values of every single one of its fields
. This is not because it would avoid allocating those values, as it has to anyway, but because append
would have to reallocate new array
s full of these zero values each time it would need to resize the underlying array.
Short playground example: https://play.golang.org/p/LGAYVlw-jr
答案3
得分: 12
正如其他人已经提到的,使用cap
参数可以避免不必要的内存分配。为了感受性能差异,想象一下你有一个包含随机值的[]float64
切片,并且想要一个新的切片,过滤掉小于某个值(比如0.5
)的元素。
朴素方法 - 无len或cap参数
func filter(input []float64) []float64 {
ret := make([]float64, 0)
for _, el := range input {
if el > 0.5 {
ret = append(ret, el)
}
}
return ret
}
更好的方法 - 使用cap参数
func filterCap(input []float64) []float64 {
ret := make([]float64, 0, len(input))
for _, el := range input {
if el > 0.5 {
ret = append(ret, el)
}
}
return ret
}
基准测试(n=10)
filter 131 ns/op 56 B/op 3 allocs/op
filterCap 56 ns/op 80 B/op 1 allocs/op
使用cap
使程序的速度提高了2倍以上,并将内存分配次数从3次减少到1次。那么在大规模情况下会发生什么呢?
基准测试(n=1,000,000)
filter 9630341 ns/op 23004421 B/op 37 allocs/op
filterCap 6906778 ns/op 8003584 B/op 1 allocs/op
速度差异仍然显著(约1.4倍),这要归功于减少了36次对runtime.makeslice
的调用。然而,更大的差异在于内存分配(减少了约4倍)。
更好的方法 - 校准cap参数
你可能已经注意到在第一个基准测试中,cap
使得整体内存分配变得更糟糕(80B vs 56B
)。这是因为你分配了10个槽位,但平均只需要其中的5个。这就是为什么不要不必要地设置cap
过高。根据你对程序的了解,你可以校准容量。在这种情况下,我们可以估计过滤后的切片将需要原始切片容量的50%。
func filterCalibratedCap(input []float64) []float64 {
ret := make([]float64, 0, len(input)/2)
for _, el := range input {
if el > 0.5 {
ret = append(ret, el)
}
}
return ret
}
毫不奇怪,这个校准的cap
分配的内存比之前的版本少了50%,所以在1百万个元素的情况下,相对于朴素实现,改进了约8倍。
另一种选择 - 直接访问而不是使用append
如果你想要进一步提高这样一个程序的性能,可以使用len
参数进行初始化(忽略cap参数),直接访问新的切片而不使用append,然后丢弃不需要的槽位。
func filterLen(input []float64) []float64 {
ret := make([]float64, len(input))
var counter int
for _, el := range input {
if el > 0.5 {
ret[counter] = el
counter++
}
}
return ret[:counter]
}
在大规模情况下,这比filterCap
快约10%。然而,除了更加复杂之外,这种模式在尝试校准内存需求时不提供与cap
相同的安全性。
- 使用
cap
校准时,如果低估了总容量需求,程序将在需要时自动分配更多内存。 - 使用这种方法时,如果低估了总长度需求,程序将失败。在这个例子中,如果初始化为
ret := make([]float64, len(input)/2)
,并且len(output) > len(input)/2
,那么在某个时刻程序将尝试访问不存在的槽位并引发panic。
英文:
As others have already said, using the cap
parameter can avoid unnecessary allocations. To give a sense of the performance difference, imagine you have a []float64
of random values and want a new slice that filters out values that are not above, say, 0.5
.
Naive approach - no len or cap param
func filter(input []float64) []float64 {
ret := make([]float64, 0)
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Better approach - using cap param
func filterCap(input []float64) []float64 {
ret := make([]float64, 0, len(input))
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Benchmarks (n=10)
filter 131 ns/op 56 B/op 3 allocs/op
filterCap 56 ns/op 80 B/op 1 allocs/op
Using cap
made the program 2x+ faster and reduced the number of allocations from 3 to 1. Now what happens at scale?
Benchmarks (n=1,000,000)
filter 9630341 ns/op 23004421 B/op 37 allocs/op
filterCap 6906778 ns/op 8003584 B/op 1 allocs/op
The speed difference is still significant (~1.4x) thanks to 36 fewer calls to runtime.makeslice
. However, the bigger difference is the memory allocation (~4x less).
Even better - calibrating the cap
You may have noticed in the first benchmark that cap
makes the overall memory allocation worse (80B vs 56B
). This is because you allocate 10 slots but only need, on average, 5 of them. This is why you don't want to set cap
unnecessarily high. Given what you know about your program, you may be able to calibrate the capacity. In this case, we can estimate that our filtered slice will need 50% as many slots as the original slice.
func filterCalibratedCap(input []float64) []float64 {
ret := make([]float64, 0, len(input)/2)
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Unsurprisingly, this calibrated cap
allocates 50% as much memory as its predecessor, so that's ~8x improvement on the naive implementation at 1m elements.
Another option - using direct access instead of append
If you are looking to shave even more time off a program like this, initialize with the len
parameter (and ignore the cap parameter), access the new slice directly instead of using append, then throw away all the slots you don't need.
func filterLen(input []float64) []float64 {
ret := make([]float64, len(input))
var counter int
for _, el := range input {
if el > .5 {
ret0+网站访问量 = el
counter++
}
}
return ret[:counter]
}
This is ~10% faster than filterCap
at scale. However, in addition to being more complicated, this pattern does not provide the same safety as cap
if you try and calibrate the memory requirement.
- With
cap
calibration, if you underestimate the total capacity required, then the program will automatically allocate more when it needs it. - With this approach, if you underestimate the total
len
required, the program will fail. In this example, if you initialize asret := make([]float64, len(input)/2)
, and it turns out thatlen(output) > len(input)/2
, then at some point the program will try to access a non-existent slot and panic.
答案4
得分: 0
每次向长度为len(mySlice) == cap(mySlice)
的切片中添加一个元素时,底层数据结构都会被替换为一个更大的结构。
fmt.Printf("原始容量:%v", cap(mySlice)) // 输出:8
mySlice = append(mySlice, myNewItem)
fmt.Printf("新容量:%v", cap(mySlice)) // 输出:16
在这里,通过赋值运算符,mySlice
被替换为一个新的切片,其中包含原始mySlice
的所有元素,加上myNewItem
,还有一些空间(容量)可以在不触发此调整大小的情况下进行扩展。
可以想象,这个调整大小的操作在计算上是非常复杂的。
通常情况下,如果你知道需要在mySlice
中存储多少个项目,就可以避免所有的调整大小操作。如果你有这个预知,你可以事先设置原始切片的容量,避免所有的调整大小操作。
(实际上,通常可以知道将向集合中添加多少个项目;特别是在将数据从一种格式转换为另一种格式时。)
英文:
Each time you add an item to a slice that has len(mySlice) == cap(mySlice)
, the underlying data structure is replaced with a larger structure.
fmt.Printf("Original Capacity: %v", cap(mySlice)) // Output: 8
mySlice = append(mySlice, myNewItem)
fmt.Printf("New Capacity: %v", cap(mySlice)) // Output: 16
Here, mySlice
is replaced (through the assignment operator) with a new slice containing all the elements of the original mySlice
, plus myNewItem
, plus some room (capacity) to grow without triggering this resize.
As you can imagine, this resizing operation is computationally non-trivial.
Quite often, all the resize operations can be avoided if you know how many items you will need to store in mySlice
. If you have this foreknowledge, you can set the capacity of the original slice upfront and avoid all the resize operations.
(In practice, it's quite often possible to know how many items will be added to a collection; especially when transforming data from one format to another.)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论