July 22, 2021, 02:23:05 · go

Inner workings of `rand.Intn` function - GoLang

Question

Somehow, I happened to look at source code for Go on how it implements Random function when passed a length of array.

Here's the calling code

func randomFormat() string {
	formats := []string{
		"Hi, %v. Welcome!",
		"Great to see you, %v!",
		"Hail, %v! Well met!",
	}
	return formats[rand.Intn(len(formats))]
}

Go Source code: main part

func (r *Rand) Intn(n int) int {
	if n <= 0 {
		panic("invalid argument to Intn")
	}
	if n <= 1<<31-1 {
		return int(r.Int31n(int32(n)))
	}
	return int(r.Int63n(int64(n)))
}

Go Source code: reference part - Most of devs have this already on their machines or go repo.

// Int31n returns, as an int32, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int31n(n int32) int32 {
	if n <= 0 {
		panic("invalid argument to Int31n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int31() & (n - 1)
	}
	max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
	v := r.Int31()
	for v > max {
		v = r.Int31()
	}
	return v % n
}

// Int63n returns, as an int64, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int63n(n int64) int64 {
	if n <= 0 {
		panic("invalid argument to Int63n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int63() & (n - 1)
	}
	max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
	v := r.Int63()
	for v > max {
		v = r.Int63()
	}
	return v % n
}

func (r *Rand) Int31() int32 { return int32(r.Int63() >> 32) }
func (r *Rand) Int63() int64 { return r.src.Int63() }

type Source interface {
	Int63() int64
	Seed(seed int64)
}

I want to understand how the random function works encapsulating all inner functions. I am overwhelmed by the code and if someone has to plan the steps out in plain English what would those be?

For example, I don't get the logic for doing minus 1 in

if n <= 1<<31-1

Then, I don't get any of the head or toe of Int31n function

    if n&(n-1) == 0 { // n is power of two, can mask
        return r.Int31() & (n - 1)
    }
    max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
    v := r.Int31()
    for v > max {
        v = r.Int31()
    }
    return v % n

Answer 1

Score: 5

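This is more a question about algorithms than about Go, though there are some Go-specific parts involved. In any case, I'll start with the algorithm problem.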

Shrinking the range of a uniform random number generator

Suppose that we have a uniform-distribution random number generator that returns a number between, say, 0 and 7 inclusive. That is, it will, over time, return about the same number of 0s, 1s, 2s, ..., 7s, but with no apparent pattern between them.

Now, if we want a uniformly distributed random number between 0 and 7, this thing is perfect. That's what it returns. We just use it. But what if we want a uniformly distributed random number between 0 and 6 instead?

We could write:

func randMod7() int {
    return generate() % 7
}

so that if generate() returns 7 (which it has a 1 out of 8 chance of doing), we convert that value to zero. But then we'll get zero back 2 out of 8 times, instead of 1 out of 8 times. We'll get 1, 2, 3, 4, 5, and 6 back 1 out of 8 times, and zero 2 out of 8 times, on average: once for each actual zero, and once for each 7.
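To see the bias concretely, here is a small sketch that enumerates every output of a hypothetical 3-bit generator (values 0 through 7) and counts where each one lands after % 7. The helper name countMod is made up for this illustration and is not part of any library.

```go
package main

import "fmt"

// countMod tallies v % n for every v in [0, rangeSize), that is, for each
// possible output of a hypothetical generator with rangeSize equally
// likely values. (countMod is an illustrative helper, not a library call.)
func countMod(rangeSize, n int) []int {
	counts := make([]int, n)
	for v := 0; v < rangeSize; v++ {
		counts[v%n]++
	}
	return counts
}

func main() {
	// 8 possible outputs reduced mod 7: zero is hit twice (by 0 and by 7).
	fmt.Println(countMod(8, 7)) // [2 1 1 1 1 1 1]
}
```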

What we need to do, then, is throw away any occurrences of 7:

func randMod7() int {
    for {
        if i := generate(); i < 7 {
            return i
        }
        // oops, got 7, try again
    }
}

Now, if we had a uniform-random-number-generator named generate() that returned a value between 0 and (say) 11 (12 possible values) and we wanted a value between 0 and 3 (four possible values), we could just use generate() % 4, because the 12 possible results would fall into 3 groups of four with equal probability. If we wanted a value between 0 and 5 inclusive, we could use generate() % 6, because the 12 possible results would fall into two groups of 6 with equal probability. In fact, all we need to do is examine the prime factorization of the range of our uniform number generator to see what moduli work. The prime factors of 12 are 2, 2, 3; so 2, 3, 4, and 6 all work here. Any other modulus, such as generate() % 10, produces a biased result: 0 and 1 occur 2 out of 12 times, but 2 through 9 occur 1 out of 12 times. (Note: generate() % 12 also works, but is kind of pointless.)

In our particular case, we have two different uniform random number generators available. One, Int31(), produces values between 0 and 0x7fffffff (2147483647 decimal, or 2^31 - 1, or 1<<31 - 1) inclusive. The other, Int63(), produces values between 0 and 0x7fffffffffffffff (9223372036854775807, or 2^63 - 1, or 1<<63 - 1). These are ranges that hold 2^31 and 2^63 values respectively, and hence their prime factorization is 31 2s, or 63 2s.

What this means is that we can compute Int31() mod 2^k, for any integer k in zero to 31 inclusive, without messing up our uniformity. With Int63(), we can do the same with k ranging all the way up to 63.

Introducing the computer

Now, mathematically-and-computer-ly speaking, given any nonnegative integer n in [0..0x7fffffff] or [0..0x7fffffffffffffff], and a non-negative integer k in the right range (no more than 31 or 63 respectively), computing that integer n mod 2^k produces the same result as taking that integer and doing a bit-mask operation with the low k bits set. To get that number of set bits, we want to take 1<<k and subtract 1. If k is, say, 4, we get 1<<4 or 16. Subtracting 1, we get 15, or 0xf, which has four 1 bits in it.

So:

n % (1 << k)

and:

n & (1<<k - 1)

produce the same result. Concretely, when k==4, this is n%16 or n&0xf. When k==5 this is n%32 or n&0x1f. Try it for k==0 and k==63.
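A quick way to convince yourself is to test the identity for every k from 0 through 63 on a handful of values. This sketch uses uint64 so that 1<<63 is representable; sameAsMask and allMatch are illustrative names, and the sample values are arbitrary.

```go
package main

import "fmt"

// sameAsMask reports whether n % 2^k equals n & (2^k - 1).
// uint64 is used so that the shift 1<<63 does not overflow.
func sameAsMask(n uint64, k uint) bool {
	return n%(1<<k) == n&(1<<k-1)
}

// allMatch checks the identity for every k in 0..63 on a few sample values.
func allMatch() bool {
	samples := []uint64{0, 1, 15, 16, 12345, 0x7fffffff, 0x7fffffffffffffff}
	for _, n := range samples {
		for k := uint(0); k <= 63; k++ {
			if !sameAsMask(n, k) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(allMatch()) // true
}
```

For k==0 both sides are always 0 (n%1 and n&0), and for k==63 both sides leave any 63-bit value unchanged.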

Introducing Go-the-language

We're now ready to consider doing all of this in Go. We note that int (plain, unadorned int) is guaranteed to be able to hold values between -2147483648 and +2147483647 (-0x80000000 through +0x7fffffff). It may extend all the way to -0x8000000000000000 through +0x7fffffffffffffff.

Meanwhile, int32 always handles the smaller range and int64 always handles the larger range. The plain int is a different type from these other two, but implements the same range as one of the two. We just don't know which one.
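If you want to know which of the two ranges int has on a given build, the standard library exposes the width as the constant strconv.IntSize. The wrapper intBits below exists only for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// intBits returns the width of the plain int type on this platform.
// strconv.IntSize is a standard library constant; intBits is just a
// wrapper for this example.
func intBits() int {
	return strconv.IntSize
}

func main() {
	fmt.Println("int is", intBits(), "bits wide on this platform")
}
```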

Our Int31 implementation returns a uniformly distributed random number in the 0..0x7fffffff range. (It does this by returning the upper 31 bits of the 63-bit value from r.Int63(), though this is an implementation detail.) Our Int63 implementation returns a uniformly distributed random number in the 0..0x7fffffffffffffff range.

The Intn function you show here:

func (r *Rand) Intn(n int) int {
    if n <= 0 {
        panic("invalid argument to Intn")
    }
    if n <= 1<<31-1 {
        return int(r.Int31n(int32(n)))
    }
    return int(r.Int63n(int64(n)))
}

just picks one of the two functions, based on the value of n: if it's less than or equal to 0x7fffffff (1<<31 - 1), the result fits in int32, so it uses int32(n) to convert n to int32, calls r.Int31n, and converts the result back to int. Otherwise, the value of n exceeds 0x7fffffff, implying that int has the larger range and we must use the larger-range generator, r.Int63n. The rest is the same except for types.

The code could just do:

return int(r.Int63n(int64(n)))

every time, but on 32-bit machines, where 64-bit arithmetic may be slow, this might be slow. (There's a lot of may and might here and if you were writing this yourself today, you should start by profiling / benchmarking the code. The Go authors did do this, though this was many years ago; at that time it was worth doing this fancy stuff.)

More bit-manipulation

The insides of both functions Int31n and Int63n are quite similar; the main difference is the types involved, and then in a few places, the maximum values. Again, the reason for this is at least partly historical: on some (mostly old now) computers, the Int63n variant is significantly slower than the Int31n variant. (In some non-Go language, we might write these as generics and then have the compiler generate a type-specific version automatically.) So let's just look at the Int63n variant:

func (r *Rand) Int63n(n int64) int64 {
    if n <= 0 {
        panic("invalid argument to Int63n")
    }
    if n&(n-1) == 0 { // n is power of two, can mask
        return r.Int63() & (n - 1)
    }
    max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
    v := r.Int63()
    for v > max {
        v = r.Int63()
    }
    return v % n
}

The argument n has type int64, so that its value will not exceed 2^63-1 or 0x7fffffffffffffff or 9223372036854775807. But it could be negative, and negative values won't work right, so the first thing we do is test for that and panic if so. We also panic if the input is zero (this is something of a choice, but it's useful to note it now).

Next we have the n&(n-1) == 0 test. This is a test for powers of two, with one slight flaw, and it works in many languages (those that have bit-masking):

A power of two is always represented as a single set bit in the binary representation of a number. For instance, 1 is 00000001 in binary, 2 is 00000010, 4 is 00000100, and so on, through 128 being 10000000. (Since I only "drew" eight bits this series maxes out at 128.)
Subtracting 1 from that number causes a borrow: that bit goes to zero, and all the lesser bits become 1. For instance, binary 10000000 - 1 is 01111111.
AND-ing these two together produces zero if there was just the single bit set initially. If not (for instance, if we start with the value 130, or binary 10000010, so that subtracting 1 produces 10000001), there's no borrow out of the top bit, so the top bit is set in both inputs and therefore is set in the AND-ed result.

The slight flaw is that if the initial value is zero, then we have 0-1, which produces all-1s; 0&0xffffffffffffffff is zero too, but zero is not an integer power of two. (2^0 is 1, not 0.) This minor flaw is not important for our purpose here, because we already made sure to panic for this case: it just doesn't happen.
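The test and its one false positive can be seen side by side in a short sketch. maskTest and isPowerOfTwo are names chosen for this example; they do not appear in the Go source.

```go
package main

import "fmt"

// maskTest is the bare n&(n-1) == 0 check, including its false
// positive at n == 0.
func maskTest(n int64) bool { return n&(n-1) == 0 }

// isPowerOfTwo adds an explicit n > 0 guard; in Int63n the earlier
// panic for n <= 0 plays that role instead.
func isPowerOfTwo(n int64) bool { return n > 0 && maskTest(n) }

func main() {
	for _, n := range []int64{0, 1, 2, 3, 128, 130} {
		fmt.Printf("%3d (%08b): maskTest=%-5v isPowerOfTwo=%v\n",
			n, n, maskTest(n), isPowerOfTwo(n))
	}
}
```

Only the row for 0 differs between the two columns, which is the flaw described above.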

Now we have the most complicated line of all:

    max := int64((1 << 63) - 1 - (1<<63)%uint64(n))

The recurring 63s here are because we have a value range going from zero to 2^63-1. 1<<63 - 1 is (still, again, always) 9223372036854775807 or 0x7fffffffffffffff. Meanwhile, 1<<63, without 1 subtracted from it, is 9223372036854775808 or 0x8000000000000000. This value does not fit into int64 but it does fit into uint64. So if we turn n into a uint64, we can compute uint64(9223372036854775808) % uint64(n), which is what the % expression does. By using uint64 for this calculation, we ensure that it doesn't overflow.

But: what is this calculation all about? Well, go back to our example with a generate() that produces values in [0..7]. If we wanted a number in [0..5], we would have to discard both 6 and 7. That's what we're going for here: we want to find the value above which we should discard values.

If we were to take 8%6, we'd get 2. 8 is one bigger than the maximum that our 3-bit generate() would generate. 8%6 == 2 is the number of "high values" that we have to discard: 8-2 = 6 and we want to discard values that are 6 or more. Subtract 1 from this, and we get 7-2 = 5; we can accept numbers in this input range, from 0 to 5 inclusive.

So, this somewhat fancy calculation for setting max is just a way to find out what the maximum value we like is. Values that are greater than max need to be tossed out.

This particular calculation works nicely even if n is much smaller than the range our generator returns. For instance, suppose we had a four-bit generator, returning values in the [0..15] range, and we wanted a number in [0..2]. Our n is therefore 3 (to indicate that we want a number in [0..2]). We compute 16%3 to get 1. We then take 15 (our maximum output value, one less than 16) and subtract that 1 to get 14 as our maximum acceptable value. That is, we would allow numbers in [0..14], but exclude 15.

With a 63-bit generator returning values in [0..9223372036854775807], and n==3, we would set max to 9223372036854775805. That's what we want: it throws out the two biasing values, 9223372036854775806 and 9223372036854775807.
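The three worked examples can be reproduced with one small function. This sketch generalizes the max expression to an arbitrary generator size; maxAccept and its rangeSize parameter are assumptions made for the illustration, since the real code hard-codes 1<<31 and 1<<63.

```go
package main

import "fmt"

// maxAccept returns the largest generator output we keep when the
// generator yields rangeSize equally likely values (0..rangeSize-1)
// and we want a result in [0, n). uint64 is used so that 1<<63 fits.
func maxAccept(rangeSize, n uint64) uint64 {
	return rangeSize - 1 - rangeSize%n
}

func main() {
	fmt.Println(maxAccept(8, 6))     // 5
	fmt.Println(maxAccept(16, 3))    // 14
	fmt.Println(maxAccept(1<<63, 3)) // 9223372036854775805
}
```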

The remainder of the code simply does that:

    v := r.Int63()
    for v > max {
        v = r.Int63()
    }
    return v % n

We pick one Int63-range number. If it exceeds max, we pick another one and check again, until we pick one that is in the [0..max] range, inclusive of max.

Once we get a number that is in range, we use % n to shrink the range if needed. For instance, if the range is [0..2], we use v % 3. If v is (say) 14, 14%3 is 2. Our actual max is, again, 9223372036854775805, and whatever v is, between 0 and that, v%3 is between 0 and 2 and remains uniformly distributed, with no slight bias to 0 and 1 (9223372036854775806 would give us that one extra 0, and 9223372036854775807 would give us that one extra 1).
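Putting the pieces together, here is a scaled-down sketch that follows the same structure as Int63n but on a hypothetical 4-bit generator, so that the accepted values can be enumerated by hand. gen4, smallIntn, and acceptedCounts are invented for this example; only rand.Int63 is a real library call.

```go
package main

import (
	"fmt"
	"math/rand"
)

const rangeSize = 16 // a hypothetical 4-bit generator: values 0..15

// gen4 stands in for a small uniform generator by keeping the top
// 4 bits of a 63-bit value from math/rand.
func gen4() int { return int(rand.Int63() >> 59) }

// smallIntn mirrors the structure of Int63n on the 4-bit generator:
// mask for powers of two, otherwise reject values above max, then reduce.
func smallIntn(n int) int {
	if n <= 0 || n > rangeSize {
		panic("invalid argument to smallIntn")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return gen4() & (n - 1)
	}
	max := rangeSize - 1 - rangeSize%n
	v := gen4()
	for v > max {
		v = gen4()
	}
	return v % n
}

// acceptedCounts shows, without any randomness, how often each result
// is produced by the accepted values 0..max.
func acceptedCounts(n int) []int {
	max := rangeSize - 1 - rangeSize%n
	counts := make([]int, n)
	for v := 0; v <= max; v++ {
		counts[v%n]++
	}
	return counts
}

func main() {
	fmt.Println(acceptedCounts(3)) // [5 5 5]
	fmt.Println(smallIntn(3))      // some value in 0..2
}
```

With n==3 the accepted values are 0 through 14, so each of 0, 1, and 2 is produced by exactly five of them; the rejected value 15 is the one that would have tipped the balance toward 0.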

(Now repeat the above for int32 and 31 and 1<<31, for the Int31n function.)
