Pooling Maps in Golang
Question
Has anyone tried pooling maps in Go? I've read about pooling buffers, and I wonder whether similar reasoning makes it worthwhile to pool maps that are created and destroyed frequently, or whether there is some reason why, a priori, it would not be efficient. When a map is returned to the pool, one would have to iterate over it and delete every element. A popular recommendation, however, is to create a new map rather than delete the entries of an already allocated map and reuse it, which makes me think that pooling maps may not be that beneficial.
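For concreteness, the setup being asked about might look like the sketch below. It assumes the standard library's sync.Pool is used as the pool; mapPool, getMap, putMap and countWords are hypothetical names chosen for illustration, not taken from any existing package.

```go
package main

import (
	"fmt"
	"sync"
)

// mapPool hands out reusable map[string]int values.
// mapPool, getMap and putMap are hypothetical helpers for this sketch.
var mapPool = sync.Pool{
	New: func() interface{} { return make(map[string]int) },
}

// getMap returns an empty map, either recycled or freshly allocated.
func getMap() map[string]int {
	return mapPool.Get().(map[string]int)
}

// putMap deletes every key from m and returns it to the pool.
// This is the clearing step the question is concerned about.
func putMap(m map[string]int) {
	for k := range m {
		delete(m, k)
	}
	mapPool.Put(m)
}

// countWords is an example workload that needs a short-lived map.
func countWords(words []string) int {
	m := getMap()
	defer putMap(m)
	for _, w := range words {
		m[w]++
	}
	return len(m) // number of distinct words
}

func main() {
	fmt.Println(countWords([]string{"a", "b", "a"}))
	fmt.Println(countWords([]string{"c"}))
}
```

Whether this beats allocating a fresh map depends on the cost of the clearing loop relative to the cost of allocation, which is what the answers below measure.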
Answer 1
Score: 5
If your maps change a lot in size as entries are added and deleted, this will cause new allocations and there will be no benefit in pooling them.
If your maps keep the same size and only the values stored under existing keys change, pooling will be a successful optimization.
This works well when you read table-like structures, for instance CSV files or database tables. Every row contains exactly the same columns, so you never need to clear an entry.
The benchmark below shows no allocations when run with the command
go test -benchmem -bench .
package mappool

import "testing"

const SIZE = 1000000

func BenchmarkMap(b *testing.B) {
	// Fill the map once, outside the timed region.
	m := make(map[int]int)
	for i := 0; i < SIZE; i++ {
		m[i] = i
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Only update existing keys, so the map never grows.
		for j := 0; j < SIZE; j++ {
			m[j] = m[j] + 1
		}
	}
}
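The table-like case mentioned above can be sketched as follows. This is only an illustration, assuming the input is CSV text with a header line; eachRow is a hypothetical helper, and the sample data is made up. One map is allocated up front and its values are overwritten for every row, so nothing has to be cleared.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// eachRow (a hypothetical helper) parses CSV text whose first line is a
// header and calls fn once per data row. The same map is reused for every
// row: the keys (column names) never change, only the values are
// overwritten. The map passed to fn is only valid during the call.
func eachRow(data string, fn func(row map[string]string)) error {
	r := csv.NewReader(strings.NewReader(data))
	header, err := r.Read()
	if err != nil {
		return err
	}
	row := make(map[string]string, len(header))
	for {
		rec, err := r.Read()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		// csv.Reader rejects rows whose field count differs from the
		// header's, so rec[i] is always in range here.
		for i, col := range header {
			row[col] = rec[i]
		}
		fn(row)
	}
}

func main() {
	data := "name,city\nAda,London\nLinus,Helsinki\n"
	err := eachRow(data, func(row map[string]string) {
		fmt.Println(row["name"], "lives in", row["city"])
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

The trade-off is that callers must copy any values they want to keep, because the next row overwrites the map.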
Answer 2
Score: 2
As @Grzegorz Żur says, pooling helps if your maps don't change much in size. To test this, I wrote a benchmark in which pooling wins. The output on my machine is:
Pool time: 115.977µs
No-pool time: 160.828µs
Benchmark code:
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const BenchIters = 1000

func main() {
	// A single map stands in for the pool; it is reused on every iteration.
	pool := map[int]int{}
	poolTime := benchmark(func() {
		useMapForSomething(pool)
		// Return to pool by clearing the map.
		for key := range pool {
			delete(pool, key)
		}
	})
	// Without pooling, every iteration allocates a fresh map.
	nopoolTime := benchmark(func() {
		useMapForSomething(map[int]int{})
	})
	fmt.Println("Pool time:", poolTime)
	fmt.Println("No-pool time:", nopoolTime)
}

// useMapForSomething performs 1000 updates spread over at most 300 keys.
func useMapForSomething(m map[int]int) {
	for i := 0; i < 1000; i++ {
		m[rand.Intn(300)] += 5
	}
}

// benchmark measures how long f takes, on average.
func benchmark(f func()) time.Duration {
	start := time.Now()
	for i := 0; i < BenchIters; i++ {
		f()
	}
	return time.Since(start) / BenchIters
}
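Timing alone does not show where the difference comes from. As a rough complement, the sketch below counts allocations with testing.AllocsPerRun from the standard library instead of measuring wall-clock time. It is a variation on the benchmark above, not the original code: fillMap, pooledAllocs and freshAllocs are hypothetical names, and the keys are deterministic (i%300) rather than random. The exact counts depend on the Go version, so none are quoted here.

```go
package main

import (
	"fmt"
	"testing"
)

// fillMap is a stand-in workload: 1000 updates spread over 300 keys.
func fillMap(m map[int]int) {
	for i := 0; i < 1000; i++ {
		m[i%300] += 5
	}
}

// pooledAllocs reports the average number of allocations per run when
// one map is emptied and reused between runs.
func pooledAllocs() float64 {
	m := map[int]int{}
	return testing.AllocsPerRun(100, func() {
		fillMap(m)
		for k := range m {
			delete(m, k)
		}
	})
}

// freshAllocs reports the average number of allocations per run when
// every run starts from a new, empty map.
func freshAllocs() float64 {
	return testing.AllocsPerRun(100, func() {
		fillMap(map[int]int{})
	})
}

func main() {
	fmt.Println("allocs per run, reused map:", pooledAllocs())
	fmt.Println("allocs per run, fresh map: ", freshAllocs())
}
```

If the reused map reports fewer allocations per run than the fresh one, that is consistent with the timing result above: the cleared map keeps its storage, while a new map has to grow again each time.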