为什么在迭代过程中向映射(map)添加项目会产生不一致的结果?

huangapple go评论108阅读模式
英文:

Why adding items to map during its iteration produce inconsistent result?

问题

根据Go Spec的说明:

如果在迭代过程中创建了映射条目,该条目可能会在迭代期间产生,也可能会被跳过。

因此,根据这个说明,我期望以下代码至少会打印出数字1,而要打印出多少个数字是不可预测的,并且每次运行程序时都会有所不同:

package main

import (
	"fmt"
)

func main() {

	test := make(map[int]int)

	test[1] = 1
	j := 2
	for i, v := range test {
		fmt.Println(i, v)
		test[j] = j
		j++
	}

	return

}

Go playground链接

在我的笔记本电脑上(Go版本1.8),最多打印到8,在playground上(仍然是版本1.8),只打印到3!我对playground的结果不太在意,因为它的go不是原生的,但我想知道为什么在我的本地环境中它从来没有打印超过8?即使我尝试在每次迭代中添加更多的项,以增加超过8的可能性,但没有任何区别。

编辑:基于@Schwern的答案,我自己的解释

当使用make函数创建映射时,如果没有指定大小参数,只会分配一个桶,并且在Go中,每个桶的大小为8个元素,因此当范围开始时,它看到映射只有一个桶,最多会迭代8次。如果我使用大于7的大小参数,例如make(map[int]int, 8),将创建两个桶,并且有可能在添加的项上进行超过8次的迭代。

英文:

From Go Spec:

> If map entries are created during iteration, that entry may be produced during the iteration or may be skipped.

So what I expect from that statement is that the following code should at least print number 1, and how many more numbers which are going to be printed is not predictable and is different each time you run the program:

package main

import (
	"fmt"
)

func main() {

	test := make(map[int]int)

	test[1] = 1
	j := 2
	for i, v := range test {
		fmt.Println(i, v)
		test[j] = j
		j++
	}

	return

}

Go playground link

On my own laptop (Go version 1.8) at maximum it prints till 8, in playground (still version 1.8) it prints exactly till 3!
I don't care much about the result from playground since its go is not vanilla but I wonder why on my local it never prints more than 8? even I tried to add more items in each iteration to make the possibility of going over 8 higher but there's no difference.

EDIT: my own explanation based on @Schwern 's answer

when the map is created with make function and without any size parameter only 1 bucket is assigned and in go each bucket has a size of 8 elements, so when the range starts it sees that the map has only 1 bucket and it will iterate at maximum 8 times. If I use a size parameter bigger than 7 like make(map[int]int, 8) two buckets is created and there would be possibility that I get more than 8 iterations over the added items.

答案1

得分: 1

这是大多数哈希表设计固有的问题。以下是一个简单的解释,省略了很多不必要的细节。

在底层,哈希表是一个数组。每个键都使用哈希函数映射到数组中的一个元素。例如,"foo"可能映射到元素8,"bar"可能映射到元素4,依此类推。一些元素是空的。

for k,v := range hash 遍历这个数组,以它们出现的任意顺序。为了避免碰撞攻击,顺序是不可预测的。

当你向哈希表添加元素时,它会添加到底层的数组中。它甚至可能需要分配一个新的、更大的数组。新的键可能会落在哈希表数组的任意位置,这是不可预测的。

因此,如果在遍历哈希表时添加更多的键值对,任何在当前索引之前放入数组的键值对都不会被看到;遍历已经过了那个点。在当前索引之后放入的任何键值对可能会被看到;遍历尚未到达那个点,但数组可能会被重新分配,键值对可能会被重新哈希。

但我想知道为什么在我的本地环境中它从来没有打印超过8个键值对

因为底层数组的长度可能是8。Go语言可能会以2的幂次分配底层数组,并且可能从8开始。range hash 可能会首先检查底层数组的长度,并且即使它增长了,也不会继续遍历。

长话短说:在遍历哈希表时不要添加键值对。

英文:

This is an issue inherent in the design of most hash tables. Here's a simple explanation hand waving a lot of unnecessary detail.

Under the hood, a hash table is an array. Each key is mapped onto an element in the array using a hash function. For example, "foo" might map to element 8, "bar" might map to element 4, and so on. Some elements are empty.

for k,v := range hash iterates through this array in whatever order they happen to appear. The ordering is unpredictable to avoid a collision attack.

When you add to a hash, it adds to the underlying array. It might even have to allocate a new, larger array. It's unpredictable where that new key will land in the hash's array.

So if you add more pairs while you're iterating through the hash, any pair that gets put into the array before the current index won't be seen; the iteration has already past that point. Anything that gets put after might be seen; the iteration has yet to reach that point, but the array might get reallocated and the pairs possibly rehashed.

> but I wonder why on my local it never prints more than 8

Because the underlying array is probably of length 8. Go likely allocates the underlying array in powers of 2 and probably starts at 8. The range hash probably starts by checking the length of the underlying array and will not go further, even if it's grown.

Long story short: don't add keys to a hash while iterating through it.

huangapple
  • 本文由 发表于 2017年6月9日 10:43:15
  • 转载请务必保留本文链接:https://go.coder-hub.com/44448472.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定