英文:
`append` complexity
问题
这个循环在Go编程语言中的计算复杂度是多少?
var a []int
for i := 0 ; i < n ; i++ {
a = append(a, i)
}
append
操作是线性时间复杂度的(每次追加都重新分配内存并复制所有内容),还是摊销常数时间复杂度的(类似于许多语言中的向量类的实现方式)?
英文:
What is the computational complexity of this loop in the Go programming language?
var a []int
for i := 0 ; i < n ; i++ {
a = append(a, i)
}
Does append
operate in linear time (reallocating memory and copying everything on each append), or in amortized constant time (like the way vector classes in many languages are implemnted)?
答案1
得分: 23
根据《Go编程语言规范》所述,append
内置函数在需要时会重新分配内存。
如果切片s
的容量不足以容纳额外的值,append
会分配一个新的足够大的切片,以适应现有的切片元素和额外的值。因此,返回的切片可能引用不同的底层数组。
具体的增长算法取决于实现方式。对于当前的gc
编译器算法,请参见Go runtime
包中的slice.go
源文件中的growslice
函数。它的摊销时间是常数。
在一部分中,增长切片的计算量如下:
newcap := old.cap
doublecap := newcap + newcap
if cap > doublecap {
newcap = cap
} else {
if old.len < 1024 {
newcap = doublecap
} else {
for newcap < cap {
newcap += newcap / 4
}
}
}
补充说明:
《Go编程语言规范》允许语言的实现者以多种方式实现append
内置函数。
例如,新的分配只需要“足够大”。分配的数量可以是节俭的,即分配最小必要的数量,也可以是慷慨的,即分配超过最小必要数量以最小化多次调整大小的成本。Go gc
编译器使用了一种慷慨的动态数组摊销常数时间算法。
以下代码展示了append
内置函数的两种合法实现。慷慨的常数函数实现了与Go gc
编译器相同的摊销常数时间算法。节俭的变量函数在初始分配填满后,每次都会重新分配和复制所有内容。这里使用Go append
函数和Go gccgo
编译器作为对照。
package main
import "fmt"
// 慷慨的重新分配
func constant(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
newcap := len(s) + len(x)
m := cap(s)
if m+m < newcap {
m = newcap
} else {
for {
if len(s) < 1024 {
m += m
} else {
m += m / 4
}
if !(m < newcap) {
break
}
}
}
tmp := make([]int, len(s), m)
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
// 节俭的重新分配
func variable(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
tmp := make([]int, len(s), len(s)+len(x))
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
func main() {
s := []int{0, 1, 2}
x := []int{3, 4}
fmt.Println("data ", len(s), cap(s), s, len(x), cap(x), x)
a, c, v := s, s, s
for i := 0; i < 4096; i++ {
a = append(a, x...)
c = constant(c, x...)
v = variable(v, x...)
}
fmt.Println("append ", len(a), cap(a), len(x))
fmt.Println("constant", len(c), cap(c), len(x))
fmt.Println("variable", len(v), cap(v), len(x))
}
输出:
gc:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
gccgo:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
总结一下,根据实现方式的不同,一旦初始容量填满,append
内置函数可能会在每次调用时重新分配或不重新分配。
参考资料:
《Go编程语言规范》
英文:
The Go Programming Language Specification says that the append
built-in function reallocates if necessary.
> Appending to and copying slices
>
> If the capacity of s is not large enough to fit the additional values,
> append allocates a new, sufficiently large slice that fits both the
> existing slice elements and the additional values. Thus, the returned
> slice may refer to a different underlying array.
The precise algorithm to grow the target slice, when necessary, for an append is implementation dependent. For the current gc
compiler algorithm, see the growslice
function in the Go runtime
package slice.go
source file. It's amortized constant time.
In part, the amount-to-grow slice computation reads:
newcap := old.cap
doublecap := newcap + newcap
if cap > doublecap {
newcap = cap
} else {
if old.len < 1024 {
newcap = doublecap
} else {
for newcap < cap {
newcap += newcap / 4
}
}
}
ADDENDUM
The Go Programming Language Specification allows implementors of the language to implement the append
built-in function in a number of ways.
For example, new allocations only have to be "sufficiently large". The amount allocated may be parsimonius, allocating the minimum necessary amount, or generous, allocating more than the minimum necessary amount to minimize the cost of resizing many times. The Go gc
compiler uses a generous dynamic array amortized constant time algorithm.
The following code illustrates two legal implementations of the append
built-in function. The generous constant function implements the same amortized constant time algorithm as the Go gc
compiler. The parsimonius variable function, once the initial allocation is filled, reallocates and copies everything every time. The Go append
function and the Go gccgo
compiler are used as controls.
package main
import "fmt"
// Generous reallocation
func constant(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
newcap := len(s) + len(x)
m := cap(s)
if m+m < newcap {
m = newcap
} else {
for {
if len(s) < 1024 {
m += m
} else {
m += m / 4
}
if !(m < newcap) {
break
}
}
}
tmp := make([]int, len(s), m)
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
// Parsimonious reallocation
func variable(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
tmp := make([]int, len(s), len(s)+len(x))
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
func main() {
s := []int{0, 1, 2}
x := []int{3, 4}
fmt.Println("data ", len(s), cap(s), s, len(x), cap(x), x)
a, c, v := s, s, s
for i := 0; i < 4096; i++ {
a = append(a, x...)
c = constant(c, x...)
v = variable(v, x...)
}
fmt.Println("append ", len(a), cap(a), len(x))
fmt.Println("constant", len(c), cap(c), len(x))
fmt.Println("variable", len(v), cap(v), len(x))
}
Output:
gc:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
gccgo:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
To summarize, depending on the implementation, once the initial capacity is filled, the append
built-in function may or may not reallocate on every call.
References:
> Appending to and copying slices
>
> If the capacity of s is not large enough to fit the additional values,
> append
allocates a new, sufficiently large slice that fits both the
> existing slice elements and the additional values. Thus, the returned
> slice may refer to a different underlying array.
>
> Append to a slice specification discussion
>
> The spec (at tip and 1.0.3) states:
>
> "If the capacity of s is not large enough to fit the additional
> values, append
allocates a new, sufficiently large slice that fits
> both the existing slice elements and the additional values. Thus, the
> returned slice may refer to a different underlying array."
>
> Should this be an "If and only if"? For example, if I know the
> capacity of my slice is sufficiently long, am I assured that I will
> not change the underlying array?
>
> Rob Pike
>
> Yes you are so assured.
runtime slice.go source file
答案2
得分: 0
它不会在每次追加时重新分配内存,并且在文档中明确说明:
> 如果 s 的容量不足以容纳额外的值,则 append 会分配一个新的足够大的切片,该切片既适合现有的切片元素,也适合额外的值。因此,返回的切片可能引用不同的底层数组。
因此,所询问的复杂度是摊销常数时间。
英文:
It doesn't reallocate on every append and it is quite explicitly stated in the docs:
> If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large slice that fits both the existing slice elements and the additional values. Thus, the returned slice may refer to a different underlying array.
Amortized constant time is thus the complexity asked about.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论