英文:
Memory overhead of maps in Go
问题
在gccgo和gc中,Go maps每个条目的内存开销是多少?
英文:
map[byte]byte{0:10} should be using least 2 bytes, one for value and one per key. But as each hashmap implmentation, there is also a hidden cost per item.
What is the memory overhead per entry in Go maps in both gccgo and gc?
答案1
得分: 8
这是Nick程序的跨平台重新实现。我在其中发现了一些错误,并进行了更多的测量数据点的添加。
注意:为了允许更宽的“entries”范围,下面的测量映射是map[int16]byte
。
package main
import (
"fmt"
"runtime"
"unsafe"
)
func Alloc() uint64 {
var stats runtime.MemStats
runtime.GC()
runtime.ReadMemStats(&stats)
return stats.Alloc - uint64(unsafe.Sizeof(hs[0]))*uint64(cap(hs))
}
var hs = []*map[int16]byte{}
func main() {
hs := []*map[int16]byte{}
n := 1000
before := Alloc()
for i := 0; i < n; i++ {
h := map[int16]byte{}
hs = append(hs, &h)
}
after := Alloc()
emptyPerMap := float64(after-before) / float64(n)
fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, emptyPerMap)
hs = nil
k := 1
for p := 1; p < 16; p++ {
before = Alloc()
for i := 0; i < n; i++ {
h := map[int16]byte{}
for j := 0; j < k; j++ {
h[int16(j)] = byte(j)
}
hs = append(hs, &h)
}
after = Alloc()
fullPerMap := float64(after-before) / float64(n)
fmt.Printf("Bytes used for %d maps with %d entries: %d, bytes/map %.1f\n", n, k, after-before, fullPerMap)
fmt.Printf("Bytes per entry %.1f\n", (fullPerMap-emptyPerMap)/float64(k))
k *= 2
}
}
输出:
jnml@fsc-r630:~/src/tmp$ go build && ./tmp && go version && uname -a
Bytes used for 1000 empty maps: 146816, bytes/map 146.8
Bytes used for 1000 maps with 1 entries: 147040, bytes/map 147.0
Bytes per entry 0.2
Bytes used for 1000 maps with 2 entries: 147040, bytes/map 147.0
Bytes per entry 0.1
Bytes used for 1000 maps with 4 entries: 247136, bytes/map 247.1
Bytes per entry 25.1
Bytes used for 1000 maps with 8 entries: 439056, bytes/map 439.1
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 818688, bytes/map 818.7
Bytes per entry 42.0
Bytes used for 1000 maps with 32 entries: 1194688, bytes/map 1194.7
Bytes per entry 32.7
Bytes used for 1000 maps with 64 entries: 2102976, bytes/map 2103.0
Bytes per entry 30.6
Bytes used for 1000 maps with 128 entries: 4155072, bytes/map 4155.1
Bytes per entry 31.3
Bytes used for 1000 maps with 256 entries: 6698688, bytes/map 6698.7
Bytes per entry 25.6
Bytes used for 1000 maps with 512 entries: 14142976, bytes/map 14143.0
Bytes per entry 27.3
Bytes used for 1000 maps with 1024 entries: 51349184, bytes/map 51349.2
Bytes per entry 50.0
Bytes used for 1000 maps with 2048 entries: 102467264, bytes/map 102467.3
Bytes per entry 50.0
Bytes used for 1000 maps with 4096 entries: 157214816, bytes/map 157214.8
Bytes per entry 38.3
Bytes used for 1000 maps with 8192 entries: 407031200, bytes/map 407031.2
Bytes per entry 49.7
Bytes used for 1000 maps with 16384 entries: 782616864, bytes/map 782616.9
Bytes per entry 47.8
go version devel +83b0b94af636 Sat Mar 09 16:25:30 2013 +1100 linux/amd64
Linux fsc-r630 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
jnml@fsc-r630:~/src/tmp$
很高兴看到这些数字有所改善(大约提高了4倍)。发布版本(1.0.3)的数字只稍微高一些:
jnml@fsc-r630:~/src/tmp$ go build && ./tmp
Bytes used for 1000 empty maps: 144192, bytes/map 144.2
Bytes used for 1000 maps with 1 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 2 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 4 entries: 315648, bytes/map 315.6
Bytes per entry 42.9
Bytes used for 1000 maps with 8 entries: 436288, bytes/map 436.3
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 885824, bytes/map 885.8
Bytes per entry 46.4
Bytes used for 1000 maps with 32 entries: 1331264, bytes/map 1331.3
Bytes per entry 37.1
Bytes used for 1000 maps with 64 entries: 2292800, bytes/map 2292.8
Bytes per entry 33.6
Bytes used for 1000 maps with 128 entries: 4935920, bytes/map 4935.9
Bytes per entry 37.4
Bytes used for 1000 maps with 256 entries: 12164160, bytes/map 12164.2
Bytes per entry 47.0
Bytes used for 1000 maps with 512 entries: 29887808, bytes/map 29887.8
Bytes per entry 58.1
Bytes used for 1000 maps with 1024 entries: 56840768, bytes/map 56840.8
Bytes per entry 55.4
Bytes used for 1000 maps with 2048 entries: 108736064, bytes/map 108736.1
Bytes per entry 53.0
Bytes used for 1000 maps with 4096 entries: 184368752, bytes/map 184368.8
Bytes per entry 45.0
Bytes used for 1000 maps with 8192 entries: 431340576, bytes/map 431340.6
Bytes per entry 52.6
Bytes used for 1000 maps with 16384 entries: 815378816, bytes/map 815378.8
Bytes per entry 49.8
jnml@fsc-r630:~/src/tmp$
英文:
Here's a cross-platform reimplementation of Nick's program. It includes changes where I think it was flawed. It also adds more measured data points.
Note: To allow for a wider "entries" range, the measured map bellow is map[int16]byte
.
package main
import (
"fmt"
"runtime"
"unsafe"
)
func Alloc() uint64 {
var stats runtime.MemStats
runtime.GC()
runtime.ReadMemStats(&stats)
return stats.Alloc - uint64(unsafe.Sizeof(hs[0]))*uint64(cap(hs))
}
var hs = []*map[int16]byte{}
func main() {
hs := []*map[int16]byte{}
n := 1000
before := Alloc()
for i := 0; i < n; i++ {
h := map[int16]byte{}
hs = append(hs, &h)
}
after := Alloc()
emptyPerMap := float64(after-before) / float64(n)
fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, emptyPerMap)
hs = nil
k := 1
for p := 1; p < 16; p++ {
before = Alloc()
for i := 0; i < n; i++ {
h := map[int16]byte{}
for j := 0; j < k; j++ {
h[int16(j)] = byte(j)
}
hs = append(hs, &h)
}
after = Alloc()
fullPerMap := float64(after-before) / float64(n)
fmt.Printf("Bytes used for %d maps with %d entries: %d, bytes/map %.1f\n", n, k, after-before, fullPerMap)
fmt.Printf("Bytes per entry %.1f\n", (fullPerMap-emptyPerMap)/float64(k))
k *= 2
}
}
Output
jnml@fsc-r630:~/src/tmp$ go build && ./tmp && go version && uname -a
Bytes used for 1000 empty maps: 146816, bytes/map 146.8
Bytes used for 1000 maps with 1 entries: 147040, bytes/map 147.0
Bytes per entry 0.2
Bytes used for 1000 maps with 2 entries: 147040, bytes/map 147.0
Bytes per entry 0.1
Bytes used for 1000 maps with 4 entries: 247136, bytes/map 247.1
Bytes per entry 25.1
Bytes used for 1000 maps with 8 entries: 439056, bytes/map 439.1
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 818688, bytes/map 818.7
Bytes per entry 42.0
Bytes used for 1000 maps with 32 entries: 1194688, bytes/map 1194.7
Bytes per entry 32.7
Bytes used for 1000 maps with 64 entries: 2102976, bytes/map 2103.0
Bytes per entry 30.6
Bytes used for 1000 maps with 128 entries: 4155072, bytes/map 4155.1
Bytes per entry 31.3
Bytes used for 1000 maps with 256 entries: 6698688, bytes/map 6698.7
Bytes per entry 25.6
Bytes used for 1000 maps with 512 entries: 14142976, bytes/map 14143.0
Bytes per entry 27.3
Bytes used for 1000 maps with 1024 entries: 51349184, bytes/map 51349.2
Bytes per entry 50.0
Bytes used for 1000 maps with 2048 entries: 102467264, bytes/map 102467.3
Bytes per entry 50.0
Bytes used for 1000 maps with 4096 entries: 157214816, bytes/map 157214.8
Bytes per entry 38.3
Bytes used for 1000 maps with 8192 entries: 407031200, bytes/map 407031.2
Bytes per entry 49.7
Bytes used for 1000 maps with 16384 entries: 782616864, bytes/map 782616.9
Bytes per entry 47.8
go version devel +83b0b94af636 Sat Mar 09 16:25:30 2013 +1100 linux/amd64
Linux fsc-r630 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
jnml@fsc-r630:~/src/tmp$
It's nice to see the numbers are better (by a factor of about 4x). The numbers for the release version (1.0.3) are only slightly higher:
jnml@fsc-r630:~/src/tmp$ go build && ./tmp
Bytes used for 1000 empty maps: 144192, bytes/map 144.2
Bytes used for 1000 maps with 1 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 2 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 4 entries: 315648, bytes/map 315.6
Bytes per entry 42.9
Bytes used for 1000 maps with 8 entries: 436288, bytes/map 436.3
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 885824, bytes/map 885.8
Bytes per entry 46.4
Bytes used for 1000 maps with 32 entries: 1331264, bytes/map 1331.3
Bytes per entry 37.1
Bytes used for 1000 maps with 64 entries: 2292800, bytes/map 2292.8
Bytes per entry 33.6
Bytes used for 1000 maps with 128 entries: 4935920, bytes/map 4935.9
Bytes per entry 37.4
Bytes used for 1000 maps with 256 entries: 12164160, bytes/map 12164.2
Bytes per entry 47.0
Bytes used for 1000 maps with 512 entries: 29887808, bytes/map 29887.8
Bytes per entry 58.1
Bytes used for 1000 maps with 1024 entries: 56840768, bytes/map 56840.8
Bytes per entry 55.4
Bytes used for 1000 maps with 2048 entries: 108736064, bytes/map 108736.1
Bytes per entry 53.0
Bytes used for 1000 maps with 4096 entries: 184368752, bytes/map 184368.8
Bytes per entry 45.0
Bytes used for 1000 maps with 8192 entries: 431340576, bytes/map 431340.6
Bytes per entry 52.6
Bytes used for 1000 maps with 16384 entries: 815378816, bytes/map 815378.8
Bytes per entry 49.8
jnml@fsc-r630:~/src/tmp$
答案2
得分: 3
每个映射条目的开销不是一个固定值,因为它取决于每个映射条目的桶数。
关于映射内部结构的文章非常好:https://www.ardanlabs.com/blog/2013/12/macro-view-of-map-internals-in-go.html
Go映射的哈希表结构化为一个桶的数组。桶的数量始终等于2的幂。
...
映射的增长方式
当我们继续向映射中添加或删除键值对时,映射查找的效率开始下降。决定何时扩展哈希表的负载阈值基于以下四个因素:
% overflow:具有溢出桶的桶的百分比
bytes/entry:每个键值对使用的开销字节数
hitprobe:查找键时需要检查的条目数
missprobe:查找不存在的键时需要检查的条目数
例如,一个非常简单的基准测试可以显示出每个条目的开销在增加一个条目时的显著增加:
func Benchmark(b *testing.B) {
m := make(map[int64]struct{})
// 也会重置内存统计
b.ResetTimer()
for i := 0; i < b.N; i++ {
m[int64(i)] = struct{}{}
}
}
对106496个条目进行基准测试:
go test -bench . -benchtime 106496x -benchmem Benchmark-2 106495 65.7 ns/op 31 B/op 0 allocs/op
例如,每个条目31字节
现在增加一个条目的数量:
go test -bench . -benchtime 106497x -benchmem Benchmark-2 106497 65.7 ns/op 57 B/op 0 allocs/op
例如,每个条目57字节
增加一个条目的数量导致底层桶的数量翻倍,从而导致额外的开销。当添加更多条目时,开销将减少,直到再次将桶的数量翻倍。
英文:
Overhead per map entry is not a constant value, since it depends on a number of buckets per map entry.
There is a great article on the map internals: https://www.ardanlabs.com/blog/2013/12/macro-view-of-map-internals-in-go.html
> The hash table for a Go map is structured as an array of buckets. The number of buckets is always equal to a power of 2.
...
> How Maps Grow
> As we continue to add or remove key/value pairs from the map, the efficiency of the map lookups begin to deteriorate. The load threshold values that determine when to grow the hash table are based on these four factors:
> % overflow : Percentage of buckets which have an overflow bucket
> bytes/entry : Number of overhead bytes used per key/value pair
> hitprobe : Number of entries that need to be checked when looking up a key
> missprobe : Number of entries that need to be checked when looking up an absent key
For example a very simple benchmark can show a dramatic increase in overhead per entry when increasing number of entries just by 1:
func Benchmark(b *testing.B) {
m := make(map[int64]struct{})
// also resets mem stats
b.ResetTimer()
for i := 0; i < b.N; i++ {
m[int64(i)] = struct{}{}
}
}
Benching with 106496 entries:
go test -bench . -benchtime 106496x -benchmem
Benchmark-2 106495 65.7 ns/op 31 B/op 0 allocs/op
e.g. 31 bytes per entry
Now increase the number of entries by one:
go test -bench . -benchtime 106497x -benchmem
Benchmark-2 106497 65.7 ns/op 57 B/op 0 allocs/op
e.g. 57 bytes per entry
Increasing the number of entries by 1 resulted in the doubling of the number of underlying buckets, which resulted in an additional overhead. The overhead will decrease when more entries are added, until the number of buckets is doubled again.
答案3
得分: 2
似乎涉及到了一个缓冲区,并且只在需要时才会增长。不过我无法确定gccgo是否也是如此,我只是在playground上尝试了一下。基本上,它为空的映射分配了128字节,然后在需要时进行扩展。
你可以在这里看到:http://play.golang.org/p/RjohbSOq0x
英文:
It seems like there is a buffer involved, and it grows only when needed. I can't tell for gccgo, though, I just tried it on the playground. Basically, it allocates 128 bytes for the empty map, then it grows when needed.
You can see it here : http://play.golang.org/p/RjohbSOq0x
答案4
得分: 2
这是一个用来测量映射开销的实验。仅适用于Linux。
package main
import (
"fmt"
"io/ioutil"
"log"
"os"
"runtime"
"strconv"
"strings"
)
func ReadRss() int {
data, err := ioutil.ReadFile("/proc/self/statm")
if err != nil {
log.Fatal(err)
}
rss, err := strconv.Atoi(strings.Fields(string(data))[1])
if err != nil {
log.Fatal(err)
}
return rss * os.Getpagesize()
}
func main() {
hs := []*map[byte]byte{}
before := ReadRss()
n := 10000
for i := 0; i < n; i++ {
h := map[byte]byte{}
hs = append(hs, &h)
}
after := ReadRss()
empty_per_map := float64(after-before)/float64(n)
fmt.Printf("10000个空映射使用的字节数:%d,每个映射的字节数:%.1f\n", n, after-before, empty_per_map)
hs = nil
runtime.GC()
before = ReadRss()
for i := 0; i < n; i++ {
h := map[byte]byte{}
for j := byte(0); j < 100; j++ {
h[j] = j
}
hs = append(hs, &h)
}
after = ReadRss()
full_per_map := float64(after-before)/float64(n)
fmt.Printf("10000个带有100个条目的映射使用的字节数:%d,每个映射的字节数:%.1f\n", n, after-before, full_per_map)
fmt.Printf("每个条目的字节数:%.1f\n", (full_per_map - empty_per_map)/100)
}
在我的64位Linux机器上使用go 1.0.3打印出以下结果:
10000个空映射使用的字节数:1695744,每个映射的字节数:169.6
10000个带有100个条目的映射使用的字节数:43876352,每个映射的字节数:4387.6
每个条目的字节数:42.2
或者使用go 1.0:
10000个空映射使用的字节数:1388544,每个映射的字节数:138.9
10000个带有100个条目的映射使用的字节数:199323648,每个映射的字节数:19932.4
每个条目的字节数:197.9
内存测量是使用Linux操作系统而不是Go的内部内存统计信息进行的。由于以4k页面返回,所以内存测量是粗略的,因此创建了大量的映射。
所以大致上每个映射的成本约为170字节,每个条目的成本约为42字节(对于1.0来说更多)。
英文:
Here is an experiment to measure the overhead of maps. Works under Linux only.
package main
import (
"fmt"
"io/ioutil"
"log"
"os"
"runtime"
"strconv"
"strings"
)
func ReadRss() int {
data, err := ioutil.ReadFile("/proc/self/statm")
if err != nil {
log.Fatal(err)
}
rss, err := strconv.Atoi(strings.Fields(string(data))[1])
if err != nil {
log.Fatal(err)
}
return rss * os.Getpagesize()
}
func main() {
hs := []*map[byte]byte{}
before := ReadRss()
n := 10000
for i := 0; i < n; i++ {
h := map[byte]byte{}
hs = append(hs, &h)
}
after := ReadRss()
empty_per_map := float64(after-before)/float64(n)
fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, empty_per_map)
hs = nil
runtime.GC()
before = ReadRss()
for i := 0; i < n; i++ {
h := map[byte]byte{}
for j := byte(0); j < 100; j++ {
h[j] = j
}
hs = append(hs, &h)
}
after = ReadRss()
full_per_map := float64(after-before)/float64(n)
fmt.Printf("Bytes used for %d maps with 100 entries: %d, bytes/map %.1f\n", n, after-before, full_per_map)
fmt.Printf("Bytes per entry %.1f\n", (full_per_map - empty_per_map)/100)
}
It prints this on my 64 bit Linux machine using go 1.0.3
Bytes used for 10000 empty maps: 1695744, bytes/map 169.6
Bytes used for 10000 maps with 100 entries: 43876352, bytes/map 4387.6
Bytes per entry 42.2
Or using go 1.0
Bytes used for 10000 empty maps: 1388544, bytes/map 138.9
Bytes used for 10000 maps with 100 entries: 199323648, bytes/map 19932.4
Bytes per entry 197.9
The memory measurements are done using the Linux OS rather than Go's internal memory stats. The memory measurements are coarse as they are returned in 4k pages, hence the large number of maps created.
So in round numbers each map costs about 170 bytes and each entry costs 42 bytes using go 1.0.3 (much more for 1.0)
答案5
得分: 1
Bytes used for 1000 empty maps: 0, bytes/map 0.0
Bytes used for 1000 maps with 1 entries: 112192, bytes/map 112.2
Bytes per entry 112.2
Bytes used for 1000 maps with 2 entries: 113472, bytes/map 113.5
Bytes per entry 56.7
Bytes used for 1000 maps with 4 entries: 110912, bytes/map 110.9
Bytes per entry 27.7
Bytes used for 1000 maps with 8 entries: 112192, bytes/map 112.2
Bytes per entry 14.0
Bytes used for 1000 maps with 16 entries: 231600, bytes/map 231.6
Bytes per entry 14.5
Bytes used for 1000 maps with 32 entries: 413768, bytes/map 413.8
Bytes per entry 12.9
Bytes used for 1000 maps with 64 entries: 736920, bytes/map 736.9
Bytes per entry 11.5
Bytes used for 1000 maps with 128 entries: 1419624, bytes/map 1419.6
Bytes per entry 11.1
Bytes used for 1000 maps with 256 entries: 2735192, bytes/map 2735.2
Bytes per entry 10.7
Bytes used for 1000 maps with 512 entries: 5655168, bytes/map 5655.2
Bytes per entry 11.0
Bytes used for 1000 maps with 1024 entries: 10919888, bytes/map 10919.9
Bytes per entry 10.7
Bytes used for 1000 maps with 2048 entries: 21224528, bytes/map 21224.5
Bytes per entry 10.4
Bytes used for 1000 maps with 4096 entries: 42391024, bytes/map 42391.0
Bytes per entry 10.3
Bytes used for 1000 maps with 8192 entries: 84613344, bytes/map 84613.3
Bytes per entry 10.3
Bytes used for 1000 maps with 16384 entries: 169152560, bytes/map 169152.6
Bytes per entry 10.3
英文:
/* Hal3 Mon Jul 18 20:58:16 BST 2016 go version go1.5.1 linux/amd64
Bytes used for 1000 empty maps: 0, bytes/map 0.0
Bytes used for 1000 maps with 1 entries: 112192, bytes/map 112.2
Bytes per entry 112.2
Bytes used for 1000 maps with 2 entries: 113472, bytes/map 113.5
Bytes per entry 56.7
Bytes used for 1000 maps with 4 entries: 110912, bytes/map 110.9
Bytes per entry 27.7
Bytes used for 1000 maps with 8 entries: 112192, bytes/map 112.2
Bytes per entry 14.0
Bytes used for 1000 maps with 16 entries: 231600, bytes/map 231.6
Bytes per entry 14.5
Bytes used for 1000 maps with 32 entries: 413768, bytes/map 413.8
Bytes per entry 12.9
Bytes used for 1000 maps with 64 entries: 736920, bytes/map 736.9
Bytes per entry 11.5
Bytes used for 1000 maps with 128 entries: 1419624, bytes/map 1419.6
Bytes per entry 11.1
Bytes used for 1000 maps with 256 entries: 2735192, bytes/map 2735.2
Bytes per entry 10.7
Bytes used for 1000 maps with 512 entries: 5655168, bytes/map 5655.2
Bytes per entry 11.0
Bytes used for 1000 maps with 1024 entries: 10919888, bytes/map 10919.9
Bytes per entry 10.7
Bytes used for 1000 maps with 2048 entries: 21224528, bytes/map 21224.5
Bytes per entry 10.4
Bytes used for 1000 maps with 4096 entries: 42391024, bytes/map 42391.0
Bytes per entry 10.3
Bytes used for 1000 maps with 8192 entries: 84613344, bytes/map 84613.3
Bytes per entry 10.3
Bytes used for 1000 maps with 16384 entries: 169152560, bytes/map 169152.6
Bytes per entry 10.3
Mon Jul 18 20:58:25 BST 2016 */
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论