英文:
go benchmark and gc: B/op alloc/op
问题
基准测试代码:
func BenchmarkSth(b *testing.B) {
var x []int
b.ResetTimer()
for i := 0; i < b.N; i++ {
x = append(x, i)
}
}
结果:
BenchmarkSth-4 50000000 20.7 ns/op 40 B/op 0 allocs/op
问题:
- 40 B/op 是从哪里来的?(非常感谢任何追踪和说明的方法)
- 在没有分配的情况下,如何有 40 B/op?
- B/op 和 allocs/op 哪一个会影响垃圾回收(GC),以及如何影响?
- 使用 append 函数真的可能有 0 B/op 吗?
英文:
benchmark code:
func BenchmarkSth(b *testing.B) {
var x []int
b.ResetTimer()
for i := 0; i < b.N; i++ {
x = append(x, i)
}
}
result:
BenchmarkSth-4 50000000 20.7 ns/op 40 B/op 0 allocs/op
question/s:
- where did 40 B/op come from? (any way of tracing + instructions is greatly appreciated)
- how is it possible to have 40 B/op while having 0 allocs?
- which one affects GC and how? (B/op or allocs/op)
- is it really possible to have 0 B/op using append?
答案1
得分: 5
《Go编程语言规范》
附加和复制切片
可变参数函数append
将零个或多个值x
附加到类型为S
的切片s
中,并返回结果切片,类型也为S
。
append(s S, x ...T) S // T是S的元素类型
如果s
的容量不足以容纳附加的值,append
会分配一个足够大的新底层数组,以容纳现有切片元素和附加的值。否则,append
会重用底层数组。
对于你的示例,平均每个操作在需要时分配了[40, 41)字节的容量来增加切片的容量。容量增加使用了摊销的常数时间算法:当长度小于1024时,容量增加为原来的2倍,然后增加为原来容量的1.25倍。平均每个操作有[0, 1)次分配。
例如:
func BenchmarkMem(b *testing.B) {
b.ReportAllocs()
var x []int64
var a, ac int64
b.ResetTimer()
for i := 0; i < b.N; i++ {
c := cap(x)
x = append(x, int64(i))
if cap(x) != c {
a++
ac += int64(cap(x))
}
}
b.StopTimer()
sizeInt64 := int64(8)
B := ac * sizeInt64 // 字节
b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}
输出:
BenchmarkMem-4 50000000 26.6 ns/op 40 B/op 0 allocs/op
--- BENCH: BenchmarkMem-4
bench_test.go:32: op 1 B 8 alloc 1 lx 1 cx 1
bench_test.go:32: op 100 B 2040 alloc 8 lx 100 cx 128
bench_test.go:32: op 10000 B 386296 alloc 20 lx 10000 cx 12288
bench_test.go:32: op 1000000 B 45188344 alloc 40 lx 1000000 cx 1136640
bench_test.go:32: op 50000000 B 2021098744 alloc 57 lx 50000000 cx 50539520
对于op = 50000000
,
B/op = floor(2021098744 / 50000000) = floor(40.421974888) = 40
allocs/op = floor(57 / 50000000) = floor(0.00000114) = 0
阅读:
《Go切片:用法和内部机制》
《数组、切片(和字符串):'append'的机制》
《'append'的复杂度》
要使append
的B/op
(和allocs/op
)为零,请在附加之前分配具有足够容量的切片。
例如,使用var x = make([]int64, 0, b.N)
:
func BenchmarkZero(b *testing.B) {
b.ReportAllocs()
var x = make([]int64, 0, b.N)
var a, ac int64
b.ResetTimer()
for i := 0; i < b.N; i++ {
c := cap(x)
x = append(x, int64(i))
if cap(x) != c {
a++
ac += int64(cap(x))
}
}
b.StopTimer()
sizeInt64 := int64(8)
B := ac * sizeInt64 // 字节
b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}
输出:
BenchmarkZero-4 100000000 11.7 ns/op 0 B/op 0 allocs/op
--- BENCH: BenchmarkZero-4
bench_test.go:51: op 1 B 0 alloc 0 lx 1 cx 1
bench_test.go:51: op 100 B 0 alloc 0 lx 100 cx 100
bench_test.go:51: op 10000 B 0 alloc 0 lx 10000 cx 10000
bench_test.go:51: op 1000000 B 0 alloc 0 lx 1000000 cx 1000000
bench_test.go:51: op 100000000 B 0 alloc 0 lx 100000000 cx 100000000
注意,基准测试的CPU时间从大约26.6 ns/op减少到大约11.7 ns/op。
英文:
> The Go Programming Language Specification
>
> Appending to and copying slices
>
> The variadic function append appends zero or more values x to s of
> type S, which must be a slice type, and returns the resulting slice,
> also of type S.
>
> append(s S, x ...T) S // T is the element type of S
>
> If the capacity of s is not large enough to fit the additional values,
> append allocates a new, sufficiently large underlying array that fits
> both the existing slice elements and the additional values. Otherwise,
> append re-uses the underlying array.
For your example, on average, [40, 41) bytes per operation are allocated to increase the capacity of the slice when necessary. The capacity is increased using an amortized constant time algorithm: up to len 1024 increase to 2 times cap then increase to 1.25 times cap. On average, there are [0, 1) allocations per operation.
For example,
func BenchmarkMem(b *testing.B) {
b.ReportAllocs()
var x []int64
var a, ac int64
b.ResetTimer()
for i := 0; i < b.N; i++ {
c := cap(x)
x = append(x, int64(i))
if cap(x) != c {
a++
ac += int64(cap(x))
}
}
b.StopTimer()
sizeInt64 := int64(8)
B := ac * sizeInt64 // bytes
b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}
Output:
BenchmarkMem-4 50000000 26.6 ns/op 40 B/op 0 allocs/op
--- BENCH: BenchmarkMem-4
bench_test.go:32: op 1 B 8 alloc 1 lx 1 cx 1
bench_test.go:32: op 100 B 2040 alloc 8 lx 100 cx 128
bench_test.go:32: op 10000 B 386296 alloc 20 lx 10000 cx 12288
bench_test.go:32: op 1000000 B 45188344 alloc 40 lx 1000000 cx 1136640
bench_test.go:32: op 50000000 B 2021098744 alloc 57 lx 50000000 cx 50539520
For op = 50000000
,
B/op = floor(2021098744 / 50000000) = floor(40.421974888) = 40
allocs/op = floor(57 / 50000000) = floor(0.00000114) = 0
Read:
Go Slices: usage and internals
Arrays, slices (and strings): The mechanics of 'append'
To have zero B/op (and zero allocs/op) for append, allocate a slice with sufficient capacity before appending.
For example, with var x = make([]int64, 0, b.N)
,
func BenchmarkZero(b *testing.B) {
b.ReportAllocs()
var x = make([]int64, 0, b.N)
var a, ac int64
b.ResetTimer()
for i := 0; i < b.N; i++ {
c := cap(x)
x = append(x, int64(i))
if cap(x) != c {
a++
ac += int64(cap(x))
}
}
b.StopTimer()
sizeInt64 := int64(8)
B := ac * sizeInt64 // bytes
b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}
Output:
BenchmarkZero-4 100000000 11.7 ns/op 0 B/op 0 allocs/op
--- BENCH: BenchmarkZero-4
bench_test.go:51: op 1 B 0 alloc 0 lx 1 cx 1
bench_test.go:51: op 100 B 0 alloc 0 lx 100 cx 100
bench_test.go:51: op 10000 B 0 alloc 0 lx 10000 cx 10000
bench_test.go:51: op 1000000 B 0 alloc 0 lx 1000000 cx 1000000
bench_test.go:51: op 100000000 B 0 alloc 0 lx 100000000 cx 100000000
Note the reduction in benchmark CPU time from around 26.6 ns/op to around 11.7 ns/op.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论