英文:
Why does Go map vs slice performance have 10x speed difference here
问题
我刚刚解决了欧拉计划中的第23个问题,但我注意到在性能方面,map[int]bool和[]bool之间存在很大的差异。
我有一个函数,用于计算一个数的真因子之和:
func divisorsSum(n int) int {
sum := 1
for i := 2; i*i <= n; i++ {
if n%i == 0 {
sum += i
if n/i != i {
sum += n / i
}
}
}
return sum
}
然后在main函数中,我这样写:
func main() {
start := time.Now()
defer func() {
elapsed := time.Since(start)
fmt.Printf("%s\n", elapsed)
}()
n := 28123
abundant := []int{}
for i := 12; i <= n; i++ {
if divisorsSum(i) > i {
abundant = append(abundant, i)
}
}
sums := map[int]bool{}
for i := 0; i < len(abundant); i++ {
for j := i; j < len(abundant); j++ {
if abundant[i]+abundant[j] > n {
break
}
sums[abundant[i]+abundant[j]] = true
}
}
sum := 0
for i := 1; i <= 28123; i++ {
if _, ok := sums[i]; !ok {
sum += i
}
}
fmt.Println(sum)
}
这段代码在我的电脑上运行需要450毫秒。但是,如果我将主要代码更改为使用bool切片而不是map,如下所示:
func main() {
start := time.Now()
defer func() {
elapsed := time.Since(start)
fmt.Printf("%s\n", elapsed)
}()
n := 28123
abundant := []int{}
for i := 12; i <= n; i++ {
if divisorsSum(i) > i {
abundant = append(abundant, i)
}
}
sums := make([]bool, n)
for i := 0; i < len(abundant); i++ {
for j := i; j < len(abundant); j++ {
if abundant[i]+abundant[j] < n {
sums[abundant[i]+abundant[j]] = true
} else {
break
}
}
}
sum := 0
for i := 0; i < len(sums); i++ {
if !sums[i] {
sum += i
}
}
fmt.Println(sum)
}
现在只需要40毫秒,速度不到之前的1/10。我原以为map应该具有更快的查找速度。这里的性能差异是怎么回事?
英文:
I just solved problem 23 on Project Euler, but I noticed a big difference between map[int]bool, and []bool in terms of performance.
I have a function that sums up the proper divisors of a number:
func divisorsSum(n int) int {
sum := 1
for i := 2; i*i <= n; i++ {
if n%i == 0 {
sum += i
if n/i != i {
sum += n / i
}
}
}
return sum
}
And then in main I do like this:
func main() {
start := time.Now()
defer func() {
elapsed := time.Since(start)
fmt.Printf("%s\n", elapsed)
}()
n := 28123
abundant := []int{}
for i := 12; i <= n; i++ {
if divisorsSum(i) > i {
abundant = append(abundant, i)
}
}
sums := map[int]bool{}
for i := 0; i < len(abundant); i++ {
for j := i; j < len(abundant); j++ {
if abundant[i]+abundant[j] > n {
break
}
sums[abundant[i]+abundant[j]] = true
}
}
sum := 0
for i := 1; i <= 28123; i++ {
if _, ok := sums[i]; !ok {
sum += i
}
}
fmt.Println(sum)
}
This code takes 450ms on my computer. But if I change the main code to below with slice of bool instead of map like this:
func main() {
start := time.Now()
defer func() {
elapsed := time.Since(start)
fmt.Printf("%s\n", elapsed)
}()
n := 28123
abundant := []int{}
for i := 12; i <= n; i++ {
if divisorsSum(i) > i {
abundant = append(abundant, i)
}
}
sums := make([]bool, n)
for i := 0; i < len(abundant); i++ {
for j := i; j < len(abundant); j++ {
if abundant[i]+abundant[j] < n {
sums[abundant[i]+abundant[j]] = true
} else {
break
}
}
}
sum := 0
for i := 0; i < len(sums); i++ {
if !sums[i] {
sum += i
}
}
fmt.Println(sum)
}
Now it takes only 40ms, below 1/10 of the speed from previous. I thought maps were supposed to have faster look ups. What is up with the performance difference here?
答案1
得分: 7
你可以对代码进行分析,但一般来说,有两个主要原因:
-
在第二个示例中,你预先分配了
sums
的所需大小。这意味着它不需要再扩展,这样非常高效,没有垃圾回收压力,也没有重新分配内存等。尝试提前创建具有所需大小的映射表,看看它能改善多少性能。 -
我不知道Go语言哈希表的内部实现,但一般来说,通过整数索引对数组/切片进行随机访问非常高效,而哈希表在此基础上增加了开销,特别是如果它对整数进行哈希(这可能是为了获得更好的分布)。
英文:
You can profile your code and see, but in general, there are two main reasons:
-
You pre-allocate
sums
in the second example to its desired size. This means it never has to grow, and all this is very efficient, there's no GC pressure, no reallocs, etc. Try creating the map with the desired size in advance and see how much it improves things. -
I don't know the internal implementation of Go's hash map, but in general, random access of an array/slice by integer index is super efficient, and a hash table adds overhead on top of it, especially if it hashes the integers (it might do so to create better distribution).
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论