Golang: appending slices with or w/o allocation
Question
Go's append() function only allocates new slice data when the capacity of the given slice is not sufficient (see also: https://stackoverflow.com/a/28143457/802833). This can lead to unexpected behavior (at least for me as a Go newbie):
package main

import (
	"fmt"
)

func main() {
	a1 := make([][]int, 3)
	a2 := make([][]int, 3)
	b := [][]int{{1, 1, 1}, {2, 2, 2}, {3, 3, 3}}
	common1 := make([]int, 0)
	common2 := make([]int, 0, 12) // provide sufficient capacity
	common1 = append(common1, []int{10, 20}...)
	common2 = append(common2, []int{10, 20}...)
	idx := 0
	for _, k := range b {
		a1[idx] = append(common1, k...) // new slice is allocated
		a2[idx] = append(common2, k...) // no allocation
		idx++
	}
	fmt.Println(a1)
	fmt.Println(a2) // surprise!!!
}
output:
> [[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]
>
> [[10 20 3 3 3] [10 20 3 3 3] [10 20 3 3 3]]
https://play.golang.org/p/8PEqFxAsMt
So, what is the (idiomatic) way in Go to force allocation of new slice data, or more precisely, to make sure that the slice argument to append() remains unchanged?
Answer 1
Score: 9
You may have a wrong idea of how slices work in Go.
When you append elements to a slice, the call to append() returns a new slice value. If reallocation did not happen, both slice values (the one you passed to append() and the one it returned) share the same backing array, but they have different lengths; observe:
package main

import "fmt"

func main() {
	a := make([]int, 0, 10)
	b := append(a, 1, 2, 3)
	c := append(a, 4, 3, 2)
	fmt.Printf("a=%#v\nb=%#v\nc=%#v\n", a, b, c)
}
outputs:
a=[]int{}
b=[]int{4, 3, 2}
c=[]int{4, 3, 2}
So, len(a) == 0, len(b) == 3, len(c) == 3, and the second call to append() overwrote what the first one did because all the slices share the same underlying array.
As to reallocation of the backing array, the spec is clear:
> If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.
From this, it follows that:
- append() never copies the underlying storage if the capacity of the slice being appended to is sufficient.
- If there's not enough capacity, the array will be reallocated.
That is, given a slice s to which you want to append N elements, the reallocation won't be done iff cap(s) - len(s) ≥ N.
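The rule can be checked directly. The following is a minimal sketch, not part of the original answer; the helper names fits and sameArray are hypothetical and exist only for this illustration. It compares the address of the first element to see whether two slices start at the same place in the same backing array.

```go
package main

import "fmt"

// fits reports whether n more elements can be appended to s
// without append having to allocate a new backing array.
// (Hypothetical helper, named for this illustration only.)
func fits(s []int, n int) bool {
	return cap(s)-len(s) >= n
}

// sameArray reports whether two non-empty slices start at the same
// element of the same backing array.
// (Hypothetical helper, named for this illustration only.)
func sameArray(x, y []int) bool {
	return len(x) > 0 && len(y) > 0 && &x[0] == &y[0]
}

func main() {
	s := make([]int, 2, 4) // len 2, cap 4: room for two more elements

	t := append(s, 7, 8) // N = 2 fits, so t reuses s's backing array
	fmt.Println(fits(s, 2), sameArray(s, t)) // true true

	u := append(s, 7, 8, 9) // N = 3 does not fit, so u gets a new array
	fmt.Println(fits(s, 3), sameArray(s, u)) // false false
}
```

How much extra capacity a reallocation adds is up to the implementation, so the sketch only checks whether the array is shared, not the new capacity.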
Hence I suspect your problem is not about unexpected reallocation results but rather about the concept of slices as implemented in Go. The core idea to absorb is that append() returns the resulting slice value, which you're supposed to be using after the call unless you fully understand the repercussions.
I recommend starting with this to fully understand them.
Answer 2
Score: 0
Thanks for your feedback.
So the solution to gain control of the memory allocation is to do it explicitly (which reminds me that Go is more of a systems language than other (scripting) languages); see also https://play.golang.org/p/Id_wSZwb84:
package main

import (
	"fmt"
)

func main() {
	a1 := make([][]int, 3)
	a2 := make([][]int, 3)
	b := [][]int{{1, 1, 1}, {2, 2, 2}, {3, 3, 3}}
	common1 := make([]int, 0)
	common2 := make([]int, 0, 12) // provide sufficient capacity
	common1 = append(common1, []int{10, 20}...)
	common2 = append(common2, []int{10, 20}...)
	idx := 0
	for _, k := range b {
		a1[idx] = append(common1, k...) // new slice is allocated
		a2[idx] = make([]int, len(common2), len(common2)+len(k))
		copy(a2[idx], common2)          // copy & append could probably be
		a2[idx] = append(a2[idx], k...) // combined into a single copy step
		idx++
	}
	fmt.Println(a1)
	fmt.Println(a2)
}
output:
> [[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]
>
> [[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]