地图迭代对于随机选择键是否足够随机?

huangapple go评论91阅读模式
英文:

Is map iteration sufficiently random for randomly selecting keys?

问题

我可以依赖于映射的随机迭代顺序来实现 Web 应用程序中客户端的随机“配对”吗?我尝试过查找相关信息,但似乎找不到关于这种随机性的详细说明。

算法大致如下:

var clients map[Client]struct{}

func PairClient(c Client) (Client, error) {
    for m := range clients {
        if m != c {
            return m, nil
        }
    }
    return nil, fmt.Errorf("lobby: insufficient number of clients")
}

在连接的客户端数量超过1000时,这样做是否足够,还是应该维护一个单独的客户端切片,并从中随机选择?

英文:

Can I rely on the random iteration order of maps to implement a random "pairing" of clients in a web application? I've tried looking around but can't seem to find a breakdown of how random this randomness is.

The algorithm would look something like:

var clients map[Client]struct{}

func PairClient(c Client) (Client, error) {
    for m := range clients {
        if m != c {
            return m, nil
        }
    }
    return nil, fmt.Errorf("lobby: insufficient number of clients")
}

Would this be sufficient when there are >1000 connected clients, or should I maintain a separate slice of clients and randomly select from that?

答案1

得分: 8

尽管它被称为随机(随机化)(规范博客哈希映射源码另一个博客SO),但分布远非完美。

为什么呢?因为我们喜欢地图的速度快,而更好的随机分布往往需要更多的计算和/或更大的延迟。必须做出妥协。而且,这并不是为了提供一个质量好的“洗牌”功能,而是为了防止开发人员依赖于稳定的迭代顺序(因为即使没有明确的随机化,它也可能发生变化)。

但是这种分布有多好呢?很容易得到一个“感觉”。让我们创建一个包含10对键值对的映射,并开始多次迭代它。然后统计第一个索引(键)的分布:

m := map[int]int{}
for i := 0; i < 10; i++ {
    m[i] = i
}

dist := make([]int, 10)
for i := 0; i < 100000; i++ {
    for idx := range m {
        dist[idx]++
        break
    }
}

fmt.Println("Distribution:", dist)

输出结果(在Go Playground上尝试):

Distribution: [25194 24904 6196 6134 6313 6274 6297 6189 6189 6310]

前两个键(01)的遇到次数大约是其他键的4倍,而其他键的概率大致相同。

你可以看出这对于真正的(甚至是好的)随机来说是相当糟糕的,但这不是重点。它足以提供不同的迭代顺序(而且重要的是:它很快)。

英文:

Even though it said to be random (randomized) (spec, blog, hashmap source, another blog, SO), the distribution is far from being perfect.

Why? Because we like maps being fast, and better random distribution tends to require more computation and/or bigger delays. A compromise had to be made. And because the intention is not to provide a quality "shuffle" functionality by for range, but only to prevent developers relying on stable iteration order (because it could change even without explicit randomization).

But "how good" may this distribution be? Easy to get a "taste". Let's create a map of 10 pairs, and start iterating over it lot of times. And let's count the distribution of the very first index (key):

m := map[int]int{}
for i := 0; i &lt; 10; i++ {
	m[i] = i
}

dist := make([]int, 10)
for i := 0; i &lt; 100000; i++ {
	for idx := range m {
		dist[idx]++
		break
	}
}

fmt.Println(&quot;Distribution:&quot;, dist)

Output (try it on the Go Playground):

Distribution: [25194 24904 6196 6134 6313 6274 6297 6189 6189 6310]

The first 2 keys (0 and 1) were roughly encountered 4 times more than the rest which had roughly the same probability.

You can tell it's pretty bad for being true (or even good) random, but that's not the point. It's good enough to provide varying iteration order (and importantly: it's fast).

答案2

得分: 5

根据规范

>3. 对于映射的迭代顺序没有指定,并且不能保证从一次迭代到下一次迭代顺序相同。(...)

这基本上意味着映射的迭代顺序是未定义的。即使在默认的Go编译器中它是随机的,其他编译器或其他版本可能会有所不同。

英文:

From the spec:

>3. The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. (...)

Which essentially means that the order in which maps are iterated is undefined. Even if it's random right now with the default Go compiler, other compilers or other versions might differ.

huangapple
  • 本文由 发表于 2016年12月7日 22:15:23
  • 转载请务必保留本文链接:https://go.coder-hub.com/41019703.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定