`append`复杂度

huangapple go评论92阅读模式
英文:

`append` complexity

问题

这个循环在Go编程语言中的计算复杂度是多少?

var a []int
for i := 0 ; i < n ; i++ {
  a = append(a, i)
}

append操作是线性时间复杂度的(每次追加都重新分配内存并复制所有内容),还是摊销常数时间复杂度的(类似于许多语言中的向量类的实现方式)?

英文:

What is the computational complexity of this loop in the Go programming language?

var a []int
for i := 0 ; i &lt; n ; i++ {
  a = append(a, i)
}

Does append operate in linear time (reallocating memory and copying everything on each append), or in amortized constant time (like the way vector classes in many languages are implemnted)?

答案1

得分: 23

根据《Go编程语言规范》所述,append内置函数在需要时会重新分配内存。

如果切片s的容量不足以容纳额外的值,append会分配一个新的足够大的切片,以适应现有的切片元素和额外的值。因此,返回的切片可能引用不同的底层数组。

具体的增长算法取决于实现方式。对于当前的gc编译器算法,请参见Go runtime包中的slice.go源文件中的growslice函数。它的摊销时间是常数。

在一部分中,增长切片的计算量如下:

newcap := old.cap
doublecap := newcap + newcap
if cap > doublecap {
    newcap = cap
} else {
    if old.len < 1024 {
        newcap = doublecap
    } else {
        for newcap < cap {
            newcap += newcap / 4
        }
    }
}

补充说明:

《Go编程语言规范》允许语言的实现者以多种方式实现append内置函数。

例如,新的分配只需要“足够大”。分配的数量可以是节俭的,即分配最小必要的数量,也可以是慷慨的,即分配超过最小必要数量以最小化多次调整大小的成本。Go gc编译器使用了一种慷慨的动态数组摊销常数时间算法。

以下代码展示了append内置函数的两种合法实现。慷慨的常数函数实现了与Go gc编译器相同的摊销常数时间算法。节俭的变量函数在初始分配填满后,每次都会重新分配和复制所有内容。这里使用Go append函数和Go gccgo编译器作为对照。

package main

import "fmt"

// 慷慨的重新分配
func constant(s []int, x ...int) []int {
    if len(s)+len(x) > cap(s) {
        newcap := len(s) + len(x)
        m := cap(s)
        if m+m < newcap {
            m = newcap
        } else {
            for {
                if len(s) < 1024 {
                    m += m
                } else {
                    m += m / 4
                }
                if !(m < newcap) {
                    break
                }
            }
        }
        tmp := make([]int, len(s), m)
        copy(tmp, s)
        s = tmp
    }
    if len(s)+len(x) > cap(s) {
        panic("unreachable")
    }
    return append(s, x...)
}

// 节俭的重新分配
func variable(s []int, x ...int) []int {
    if len(s)+len(x) > cap(s) {
        tmp := make([]int, len(s), len(s)+len(x))
        copy(tmp, s)
        s = tmp
    }
    if len(s)+len(x) > cap(s) {
        panic("unreachable")
    }
    return append(s, x...)
}

func main() {
    s := []int{0, 1, 2}
    x := []int{3, 4}
    fmt.Println("data    ", len(s), cap(s), s, len(x), cap(x), x)
    a, c, v := s, s, s
    for i := 0; i < 4096; i++ {
        a = append(a, x...)
        c = constant(c, x...)
        v = variable(v, x...)
    }
    fmt.Println("append  ", len(a), cap(a), len(x))
    fmt.Println("constant", len(c), cap(c), len(x))
    fmt.Println("variable", len(v), cap(v), len(x))
}

输出:

gc:

data     3 3 [0 1 2] 2 2 [3 4]
append   8195 9152 2
constant 8195 9152 2
variable 8195 8195 2

gccgo:

data     3 3 [0 1 2] 2 2 [3 4]
append   8195 9152 2
constant 8195 9152 2
variable 8195 8195 2

总结一下,根据实现方式的不同,一旦初始容量填满,append内置函数可能会在每次调用时重新分配或不重新分配。

参考资料:

动态数组

摊销分析

《Go编程语言规范》

Go runtime slice.go源文件

数组、切片(和字符串):'append'的机制

英文:

The Go Programming Language Specification says that the append built-in function reallocates if necessary.

> Appending to and copying slices
>
> If the capacity of s is not large enough to fit the additional values,
> append allocates a new, sufficiently large slice that fits both the
> existing slice elements and the additional values. Thus, the returned
> slice may refer to a different underlying array.

The precise algorithm to grow the target slice, when necessary, for an append is implementation dependent. For the current gc compiler algorithm, see the growslice function in the Go runtime package slice.go source file. It's amortized constant time.

In part, the amount-to-grow slice computation reads:

	newcap := old.cap
doublecap := newcap + newcap
if cap &gt; doublecap {
newcap = cap
} else {
if old.len &lt; 1024 {
newcap = doublecap
} else {
for newcap &lt; cap {
newcap += newcap / 4
}
}
}

ADDENDUM

The Go Programming Language Specification allows implementors of the language to implement the append built-in function in a number of ways.

For example, new allocations only have to be "sufficiently large". The amount allocated may be parsimonius, allocating the minimum necessary amount, or generous, allocating more than the minimum necessary amount to minimize the cost of resizing many times. The Go gc compiler uses a generous dynamic array amortized constant time algorithm.

The following code illustrates two legal implementations of the append built-in function. The generous constant function implements the same amortized constant time algorithm as the Go gc compiler. The parsimonius variable function, once the initial allocation is filled, reallocates and copies everything every time. The Go append function and the Go gccgo compiler are used as controls.

package main
import &quot;fmt&quot;
// Generous reallocation
func constant(s []int, x ...int) []int {
if len(s)+len(x) &gt; cap(s) {
newcap := len(s) + len(x)
m := cap(s)
if m+m &lt; newcap {
m = newcap
} else {
for {
if len(s) &lt; 1024 {
m += m
} else {
m += m / 4
}
if !(m &lt; newcap) {
break
}
}
}
tmp := make([]int, len(s), m)
copy(tmp, s)
s = tmp
}
if len(s)+len(x) &gt; cap(s) {
panic(&quot;unreachable&quot;)
}
return append(s, x...)
}
// Parsimonious reallocation
func variable(s []int, x ...int) []int {
if len(s)+len(x) &gt; cap(s) {
tmp := make([]int, len(s), len(s)+len(x))
copy(tmp, s)
s = tmp
}
if len(s)+len(x) &gt; cap(s) {
panic(&quot;unreachable&quot;)
}
return append(s, x...)
}
func main() {
s := []int{0, 1, 2}
x := []int{3, 4}
fmt.Println(&quot;data    &quot;, len(s), cap(s), s, len(x), cap(x), x)
a, c, v := s, s, s
for i := 0; i &lt; 4096; i++ {
a = append(a, x...)
c = constant(c, x...)
v = variable(v, x...)
}
fmt.Println(&quot;append  &quot;, len(a), cap(a), len(x))
fmt.Println(&quot;constant&quot;, len(c), cap(c), len(x))
fmt.Println(&quot;variable&quot;, len(v), cap(v), len(x))
}

Output:

gc:

data     3 3 [0 1 2] 2 2 [3 4]
append   8195 9152 2
constant 8195 9152 2
variable 8195 8195 2

gccgo:

data     3 3 [0 1 2] 2 2 [3 4]
append   8195 9152 2
constant 8195 9152 2
variable 8195 8195 2

To summarize, depending on the implementation, once the initial capacity is filled, the append built-in function may or may not reallocate on every call.

References:

Dynamic array

Amortized analysis

> Appending to and copying slices
>
> If the capacity of s is not large enough to fit the additional values,
> append allocates a new, sufficiently large slice that fits both the
> existing slice elements and the additional values. Thus, the returned
> slice may refer to a different underlying array.
>
> Append to a slice specification discussion
>
> The spec (at tip and 1.0.3) states:
>
> "If the capacity of s is not large enough to fit the additional
> values, append allocates a new, sufficiently large slice that fits
> both the existing slice elements and the additional values. Thus, the
> returned slice may refer to a different underlying array."
>
> Should this be an "If and only if"? For example, if I know the
> capacity of my slice is sufficiently long, am I assured that I will
> not change the underlying array?
>
> Rob Pike
>
> Yes you are so assured.

runtime slice.go source file

Arrays, slices (and strings): The mechanics of 'append'

答案2

得分: 0

它不会在每次追加时重新分配内存,并且在文档中明确说明:

> 如果 s 的容量不足以容纳额外的值,则 append 会分配一个新的足够大的切片,该切片既适合现有的切片元素,也适合额外的值。因此,返回的切片可能引用不同的底层数组。

因此,所询问的复杂度是摊销常数时间。

英文:

It doesn't reallocate on every append and it is quite explicitly stated in the docs:

> If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large slice that fits both the existing slice elements and the additional values. Thus, the returned slice may refer to a different underlying array.

Amortized constant time is thus the complexity asked about.

huangapple
  • 本文由 发表于 2013年3月29日 19:37:08
  • 转载请务必保留本文链接:https://go.coder-hub.com/15702419.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定