2017年2月11日 12:49:53go评论119阅读模式

英文:

go benchmark and gc: B/op alloc/op

问题

基准测试代码：

func BenchmarkSth(b *testing.B) {
    var x []int
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        x = append(x, i)
    }
}

结果：

BenchmarkSth-4    50000000    20.7 ns/op    40 B/op    0 allocs/op

问题：

40 B/op 是从哪里来的？（非常感谢任何追踪和说明的方法）
在没有分配的情况下，如何有 40 B/op？
B/op 和 allocs/op 哪一个会影响垃圾回收（GC），以及如何影响？
使用 append 函数真的可能有 0 B/op 吗？

英文:

benchmark code:

func BenchmarkSth(b *testing.B) {
	var x []int
	b.ResetTimer()
	for i := 0; i &lt; b.N; i++ {
		x = append(x, i)
	}
}

result:

BenchmarkSth-4    50000000    20.7 ns/op    40 B/op    0 allocs/op

question/s:

where did 40 B/op come from? (any way of tracing + instructions is greatly appreciated)
how is it possible to have 40 B/op while having 0 allocs?
which one affects GC and how? (B/op or allocs/op)
is it really possible to have 0 B/op using append?

答案1

得分: 5

《Go编程语言规范》

附加和复制切片

可变参数函数append将零个或多个值x附加到类型为S的切片s中，并返回结果切片，类型也为S。

append(s S, x ...T) S  // T是S的元素类型

如果s的容量不足以容纳附加的值，append会分配一个足够大的新底层数组，以容纳现有切片元素和附加的值。否则，append会重用底层数组。

对于你的示例，平均每个操作在需要时分配了[40, 41)字节的容量来增加切片的容量。容量增加使用了摊销的常数时间算法：当长度小于1024时，容量增加为原来的2倍，然后增加为原来容量的1.25倍。平均每个操作有[0, 1)次分配。

例如：

func BenchmarkMem(b *testing.B) {
    b.ReportAllocs()
    var x []int64
    var a, ac int64
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        c := cap(x)
        x = append(x, int64(i))
        if cap(x) != c {
            a++
            ac += int64(cap(x))
        }
    }
    b.StopTimer()
    sizeInt64 := int64(8)
    B := ac * sizeInt64 // 字节
    b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}

输出：

BenchmarkMem-4    50000000    26.6 ns/op    40 B/op    0 allocs/op
--- BENCH: BenchmarkMem-4
    bench_test.go:32: op 1 B 8 alloc 1 lx 1 cx 1
    bench_test.go:32: op 100 B 2040 alloc 8 lx 100 cx 128
    bench_test.go:32: op 10000 B 386296 alloc 20 lx 10000 cx 12288
    bench_test.go:32: op 1000000 B 45188344 alloc 40 lx 1000000 cx 1136640
    bench_test.go:32: op 50000000 B 2021098744 alloc 57 lx 50000000 cx 50539520

对于op = 50000000，

B/op = floor(2021098744 / 50000000) = floor(40.421974888) = 40
allocs/op = floor(57 / 50000000) = floor(0.00000114) = 0

阅读：

《Go切片：用法和内部机制》

《数组、切片（和字符串）：'append'的机制》

《'append'的复杂度》

要使append的B/op（和allocs/op）为零，请在附加之前分配具有足够容量的切片。

例如，使用var x = make([]int64, 0, b.N)：

func BenchmarkZero(b *testing.B) {
    b.ReportAllocs()
    var x = make([]int64, 0, b.N)
    var a, ac int64
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        c := cap(x)
        x = append(x, int64(i))
        if cap(x) != c {
            a++
            ac += int64(cap(x))
        }
    }
    b.StopTimer()
    sizeInt64 := int64(8)
    B := ac * sizeInt64 // 字节
    b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}

输出：

BenchmarkZero-4    100000000    11.7 ns/op    0 B/op    0 allocs/op
--- BENCH: BenchmarkZero-4
    bench_test.go:51: op 1 B 0 alloc 0 lx 1 cx 1
    bench_test.go:51: op 100 B 0 alloc 0 lx 100 cx 100
    bench_test.go:51: op 10000 B 0 alloc 0 lx 10000 cx 10000
    bench_test.go:51: op 1000000 B 0 alloc 0 lx 1000000 cx 1000000
    bench_test.go:51: op 100000000 B 0 alloc 0 lx 100000000 cx 100000000

注意，基准测试的CPU时间从大约26.6 ns/op减少到大约11.7 ns/op。

英文:

> The Go Programming Language Specification
>
> Appending to and copying slices
>
> The variadic function append appends zero or more values x to s of
> type S, which must be a slice type, and returns the resulting slice,
> also of type S.
>
> append(s S, x ...T) S // T is the element type of S
>
> If the capacity of s is not large enough to fit the additional values,
> append allocates a new, sufficiently large underlying array that fits
> both the existing slice elements and the additional values. Otherwise,
> append re-uses the underlying array.

For your example, on average, [40, 41) bytes per operation are allocated to increase the capacity of the slice when necessary. The capacity is increased using an amortized constant time algorithm: up to len 1024 increase to 2 times cap then increase to 1.25 times cap. On average, there are [0, 1) allocations per operation.

For example,

func BenchmarkMem(b *testing.B) {
	b.ReportAllocs()
	var x []int64
	var a, ac int64
	b.ResetTimer()
	for i := 0; i &lt; b.N; i++ {
		c := cap(x)
		x = append(x, int64(i))
		if cap(x) != c {
			a++
			ac += int64(cap(x))
		}
	}
	b.StopTimer()
	sizeInt64 := int64(8)
	B := ac * sizeInt64 // bytes
	b.Log(&quot;op&quot;, b.N, &quot;B&quot;, B, &quot;alloc&quot;, a, &quot;lx&quot;, len(x), &quot;cx&quot;, cap(x))
}

Output:

BenchmarkMem-4   	50000000	        26.6 ns/op	      40 B/op	       0 allocs/op
--- BENCH: BenchmarkMem-4
	bench_test.go:32: op 1 B 8 alloc 1 lx 1 cx 1
	bench_test.go:32: op 100 B 2040 alloc 8 lx 100 cx 128
	bench_test.go:32: op 10000 B 386296 alloc 20 lx 10000 cx 12288
	bench_test.go:32: op 1000000 B 45188344 alloc 40 lx 1000000 cx 1136640
	bench_test.go:32: op 50000000 B 2021098744 alloc 57 lx 50000000 cx 50539520

For op = 50000000,

B/op = floor(2021098744 / 50000000) = floor(40.421974888) = 40
allocs/op = floor(57 / 50000000) = floor(0.00000114) = 0

Read:

Go Slices: usage and internals

Arrays, slices (and strings): The mechanics of 'append'

'append' complexity

To have zero B/op (and zero allocs/op) for append, allocate a slice with sufficient capacity before appending.

For example, with var x = make([]int64, 0, b.N),

func BenchmarkZero(b *testing.B) {
	b.ReportAllocs()
	var x = make([]int64, 0, b.N)
	var a, ac int64
	b.ResetTimer()
	for i := 0; i &lt; b.N; i++ {
		c := cap(x)
		x = append(x, int64(i))
		if cap(x) != c {
			a++
			ac += int64(cap(x))
		}
	}
	b.StopTimer()
	sizeInt64 := int64(8)
	B := ac * sizeInt64 // bytes
	b.Log(&quot;op&quot;, b.N, &quot;B&quot;, B, &quot;alloc&quot;, a, &quot;lx&quot;, len(x), &quot;cx&quot;, cap(x))
}

Output:

BenchmarkZero-4   	100000000	        11.7 ns/op	       0 B/op	       0 allocs/op
--- BENCH: BenchmarkZero-4
	bench_test.go:51: op 1 B 0 alloc 0 lx 1 cx 1
	bench_test.go:51: op 100 B 0 alloc 0 lx 100 cx 100
	bench_test.go:51: op 10000 B 0 alloc 0 lx 10000 cx 10000
	bench_test.go:51: op 1000000 B 0 alloc 0 lx 1000000 cx 1000000
	bench_test.go:51: op 100000000 B 0 alloc 0 lx 100000000 cx 100000000

Note the reduction in benchmark CPU time from around 26.6 ns/op to around 11.7 ns/op.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

go基准测试和垃圾回收：B/op分配/操作

问题

答案1

Golang的Xml Unmarshal，没有值

What golang compiler will do when fmt.Println()

Golang BigQuery：将数组插入表中

$GOPATH目录和磁盘空间

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。