How to properly seed random number generator
Question
I am trying to generate a random string in Go and here is the code I have written so far:
package main
import (
	"bytes"
	"fmt"
	"math/rand"
	"time"
)
func main() {
	fmt.Println(randomString(10))
}
func randomString(l int) string {
	var result bytes.Buffer
	var temp string
	for i := 0; i < l; {
		if string(randInt(65, 90)) != temp {
			temp = string(randInt(65, 90))
			result.WriteString(temp)
			i++
		}
	}
	return result.String()
}
func randInt(min int, max int) int {
	rand.Seed(time.Now().UTC().UnixNano())
	return min + rand.Intn(max-min)
}
My implementation is very slow. Seeding using `time` brings the same random number for a certain time, so the loop iterates again and again. How can I improve my code?
Answer 1
Score: 298
Each time you set the same seed, you get the same sequence. So of course if you're setting the seed to the time in a fast loop, you'll probably call it with the same seed many times.
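To make the effect concrete, here is a minimal sketch (not the asker's code). The helper names `reseedEachCall` and `seedOnce` are hypothetical, and a fixed seed stands in for a clock value that has not changed between calls.

```go
package main

import (
	"fmt"
	"math/rand"
)

// reseedEachCall mimics reseeding before every draw: each iteration
// builds a generator from the same seed, so the sequence restarts.
func reseedEachCall(seed int64, n int) []int {
	out := make([]int, n)
	for i := range out {
		r := rand.New(rand.NewSource(seed)) // same seed, same first value
		out[i] = r.Intn(26)
	}
	return out
}

// seedOnce seeds a single generator and keeps drawing from it.
func seedOnce(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(26)
	}
	return out
}

func main() {
	fmt.Println("reseeded every call:", reseedEachCall(42, 5)) // one value repeated
	fmt.Println("seeded once:        ", seedOnce(42, 5))       // the sequence advances
}
```

The first slice repeats a single value because the generator is rebuilt from the same seed each time; the second advances through the sequence.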
In your case, as you're calling your `randInt` function until you have a different value, you're waiting for the time (as returned by `UnixNano`) to change.
As with all pseudo-random libraries, you have to set the seed only once, for example when initializing your program, unless you specifically need to reproduce a given sequence (which is usually only done for debugging and unit testing).
After that you simply call `Intn` to get the next random integer.
Move the `rand.Seed(time.Now().UTC().UnixNano())` line from the `randInt` function to the start of `main` and everything will be faster. And lose the `.UTC()` call since:
> UnixNano returns t as a Unix time, the number of nanoseconds elapsed since January 1, 1970 UTC.
Note also that I think you can simplify your string building:
package main
import (
	"fmt"
	"math/rand"
	"time"
)
func main() {
	// Seed once, at program start.
	rand.Seed(time.Now().UnixNano())
	fmt.Println(randomString(10))
}
func randomString(l int) string {
	bytes := make([]byte, l)
	for i := 0; i < l; i++ {
		// randInt(65, 90) yields 65..89 ('A'..'Y'); pass 91 as max to include 'Z'.
		bytes[i] = byte(randInt(65, 90))
	}
	return string(bytes)
}
// randInt returns a pseudo-random int in [min, max).
func randInt(min int, max int) int {
	return min + rand.Intn(max-min)
}
Answer 2
Score: 149
I don't understand why people are seeding with a time value. This has in my experience never been a good idea. For example, while the system clock is maybe represented in nanoseconds, the system's clock precision isn't nanoseconds.
This program should not be run on the Go playground but if you run it on your machine you get a rough estimate on what type of precision you can expect. I see increments of about 1000000 ns, so 1 ms increments. That's 20 bits of entropy that are not used. All the while the high bits are mostly constant!? Roughly ~24 bits of entropy over a day which is very brute forceable (which can create vulnerabilities).
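The program referred to above is not reproduced on this page. As a stand-in, here is a minimal sketch of one way to estimate the clock's step size; the helper name `smallestTick` is hypothetical, and results depend on the OS and hardware. It loops until the clock advances, so it should not be run where time is frozen.

```go
package main

import (
	"fmt"
	"time"
)

// smallestTick polls the wall clock until it has seen n forward changes
// of UnixNano and returns the smallest positive difference observed.
func smallestTick(n int) int64 {
	var smallest int64 = -1
	prev := time.Now().UnixNano()
	for seen := 0; seen < n; {
		now := time.Now().UnixNano()
		d := now - prev
		if d <= 0 {
			continue // clock has not advanced yet
		}
		if smallest < 0 || d < smallest {
			smallest = d
		}
		prev = now
		seen++
	}
	return smallest
}

func main() {
	fmt.Printf("smallest observed clock increment: %d ns\n", smallestTick(1000))
}
```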
The degree to which this matters to you will vary, but you can avoid the pitfalls of clock-based seed values by simply using `crypto/rand.Read` as the source for your seed. It will give you the non-deterministic quality that you are probably looking for in your random numbers (even if the actual implementation itself is limited to a set of distinct and deterministic random sequences).
import (
	crypto_rand "crypto/rand"
	"encoding/binary"
	math_rand "math/rand"
)
func init() {
	var b [8]byte
	_, err := crypto_rand.Read(b[:])
	if err != nil {
		panic("cannot seed math/rand package with cryptographically secure random number generator")
	}
	math_rand.Seed(int64(binary.LittleEndian.Uint64(b[:])))
}
As a side note, but in relation to your question: you can create your own `rand.Source` using this method to avoid the cost of having locks protecting the source. The `rand` package utility functions are convenient, but they also use locks under the hood to prevent the source from being used concurrently. If you don't need that, you can avoid it by creating your own `Source` and using it in a non-concurrent way. Regardless, you should NOT be reseeding your random number generator between iterations; it was never designed to be used that way.
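A minimal sketch of that side note, assuming a helper named `newLocalRand` (hypothetical): it seeds a private generator from `crypto/rand`. The returned `*rand.Rand` is not safe for concurrent use, so each goroutine would need its own.

```go
package main

import (
	crypto_rand "crypto/rand"
	"encoding/binary"
	"fmt"
	math_rand "math/rand"
)

// newLocalRand returns a generator backed by its own Source, seeded from
// crypto/rand. It bypasses the locked global source, so the caller must
// not share it between goroutines.
func newLocalRand() (*math_rand.Rand, error) {
	var b [8]byte
	if _, err := crypto_rand.Read(b[:]); err != nil {
		return nil, err
	}
	seed := int64(binary.LittleEndian.Uint64(b[:]))
	return math_rand.New(math_rand.NewSource(seed)), nil
}

func main() {
	r, err := newLocalRand()
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Intn(100))
}
```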
Edit: I used to work in ITAM/SAM and the client we built (then) used a clock-based seed. After a Windows update, a lot of machines in the company fleet rebooted at roughly the same time. This caused an involuntary DoS attack on upstream server infrastructure because the clients were using system uptime to seed randomness, and these machines ended up more or less randomly picking the same time slot to report in. They were meant to smear the load over a period of an hour or so, but that did not happen. Seed responsibly!
Answer 3
Score: 18
Just to toss it out for posterity: it can sometimes be preferable to generate a random string from an initial character set string. This is useful if the string is supposed to be entered manually by a human; excluding 0, O, 1, and l can help reduce user error.
var alpha = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789"
// generates a random string of fixed size
func srand(size int) string {
	buf := make([]byte, size)
	for i := 0; i < size; i++ {
		buf[i] = alpha[rand.Intn(len(alpha))]
	}
	return string(buf)
}
I typically set the seed inside an `init()` block. They're documented here: http://golang.org/doc/effective_go.html#init
Answer 4
Score: 15
OK, why so complex?
package main
import (
	"fmt"
	"math/rand"
	"time"
)
func main() {
	rand.Seed(time.Now().UnixNano())
	var bytes int
	for i := 0; i < 10; i++ {
		bytes = rand.Intn(6) + 1
		fmt.Println(bytes)
	}
	//fmt.Println(time.Now().UnixNano())
}
This is based on dystroy's code but fitted to my needs: a six-sided die (random ints 1 <= i <= 6).
func randomInt(min int, max int) int {
	// rand.Intn(n) returns a value in [0, n), so max-min+1 makes max inclusive.
	return min + rand.Intn(max-min+1)
}
Called as randomInt(1, 6), the function above does exactly the same thing.
I hope this information was of use.
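Off-by-one mistakes are easy to make with `rand.Intn`, so a quick sampling check can help. The sketch below assumes an inclusive-range helper named `rollRange` (hypothetical, not part of the answer) and tallies how often each face appears.

```go
package main

import (
	"fmt"
	"math/rand"
)

// rollRange returns a pseudo-random int in the closed interval [min, max].
// rand.Intn(n) yields [0, n), hence the +1. It panics if max < min.
func rollRange(min, max int) int {
	return min + rand.Intn(max-min+1)
}

func main() {
	counts := make(map[int]int)
	for i := 0; i < 6000; i++ {
		counts[rollRange(1, 6)]++
	}
	// Every key should lie in 1..6, each with a count of roughly 1000.
	for face := 1; face <= 6; face++ {
		fmt.Printf("%d: %d\n", face, counts[face])
	}
}
```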
Answer 5
Score: 10
The best way to non-deterministically seed the `math/rand` generator is with `hash/maphash` (playground):
package main
import (
	"fmt"
	"hash/maphash"
	"math/rand"
)
func main() {
	// A zero maphash.Hash picks a random seed on first use, so Sum64 is random per call.
	r := rand.New(rand.NewSource(int64(new(maphash.Hash).Sum64())))
	fmt.Println(r.Int())
}
Compared to `time.Now()`, `maphash` picks a fresh random seed each time, so identical seeds are very unlikely (even on different machines). Compared to `crypto/rand`, it is much faster, and is a one-liner.
Answer 6
Score: 8
With Go 1.20 (released February 2023), the proper way to seed the random number generator could also be... to do nothing.
If `Seed` is not called, the generator will be seeded randomly at program startup.
The proposal "`math/rand`: seed global generator randomly" was accepted (Oct. 2022) and implemented in:
- CL 443058: `math/rand`: auto-seed global source
> Implement proposal #54880, to automatically seed the global source.
>
> The justification for this not being a breaking change is that any use of the global source in a package's `init` function or exported API clearly must be valid - that is, if a package changes how much randomness it consumes at `init` time or in an exported API, that clearly isn't the kind of breaking change that requires issuing a v2 of that package.
>
> That kind of per-package change in the position of the global source is indistinguishable from seeding the global source differently. So if the per-package change is valid, so is auto-seeding.
>
> And then, of course, auto-seeding means that packages will be far less likely to depend on the specific results of the global source and therefore not break when those kinds of per-package changes happen in the future.
>
> `Seed(1)` can be called in programs that need the old sequence from the global source and want to restore the old behavior. Of course, those programs will still be broken by the per-package changes just described, and it would be better for them to allocate local sources rather than continue to use the global one.
From issue 20661 and CL 436955, note also that `math/rand.Read` is deprecated: for almost all use cases, `crypto/rand.Read` is more appropriate.
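As an illustration of `crypto/rand.Read`, here is a minimal sketch that produces a hex token. The function name `randomToken` and the choice of hex encoding are assumptions made for the example, not part of the linked issue.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomToken returns the hex encoding of n bytes read from crypto/rand,
// so the resulting string has 2*n characters.
func randomToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	tok, err := randomToken(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(tok)
}
```

No seeding is involved here at all, which sidesteps the question for cases where the values must be unpredictable.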
As noted here:
> One can use the `gosec` linter with `golangci-lint` like so, and watch for the `G404` code:
>
> golangci-lint run --disable-all --enable gosec
Answer 7
Score: 1
I tried the program below and saw a different string each time:
package main
import (
	"fmt"
	"math/rand"
	"time"
)
func RandomString(count int) {
	rand.Seed(time.Now().UTC().UnixNano())
	for count > 0 {
		x := Random(65, 91)
		fmt.Printf("%c", x)
		count--
	}
}
func Random(min, max int) int {
	return min + rand.Intn(max-min)
}
func main() {
	RandomString(12)
}
And the output on my console is
D:\james\work\gox>go run rand.go
JFBYKAPEBCRC
D:\james\work\gox>go run rand.go
VDUEBIIDFQIB
D:\james\work\gox>go run rand.go
VJYDQPVGRPXM
Answer 8
Score: 0
@[Denys Séguret] has posted the correct answer. But in my case I need a new seed every time, hence the code below.
In case you need quick functions, I use them like this:
func RandInt(min, max int) int {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	return r.Intn(max-min) + min
}
func RandFloat(min, max float64) float64 {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	return min + r.Float64()*(max-min)
}
Answer 9
Score: -1
It's nanoseconds; what are the chances of getting the same seed twice?
Anyway, thanks for the help. Here is my final solution based on all the input.
package main
import (
	"math/rand"
	"time"
)
func init() {
	rand.Seed(time.Now().UTC().UnixNano())
}
// generates a random string
func srand(min, max int, readable bool) string {
	var length int
	var char string
	if min < max {
		length = min + rand.Intn(max-min)
	} else {
		length = min
	}
	if !readable {
		char = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
	} else {
		char = "ABCDEFHJLMNQRTUVWXYZabcefghijkmnopqrtuvwxyz23479"
	}
	buf := make([]byte, length)
	for i := 0; i < length; i++ {
		// Intn(len(char)) covers every index; len(char)-1 would never pick the last character.
		buf[i] = char[rand.Intn(len(char))]
	}
	return string(buf)
}
// For testing only
func main() {
	println(srand(5, 5, true))
	println(srand(5, 5, true))
	println(srand(5, 5, true))
	println(srand(5, 5, false))
	println(srand(5, 7, true))
	println(srand(5, 10, false))
	println(srand(5, 50, true))
	println(srand(5, 10, false))
	println(srand(5, 50, true))
	println(srand(5, 10, false))
	println(srand(5, 50, true))
	println(srand(5, 10, false))
	println(srand(5, 50, true))
	println(srand(5, 4, true))
	println(srand(5, 400, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
	println(srand(6, 5, true))
}
Answer 10
Score: -1
If your aim is just to generate a string of random numbers, then I think it's unnecessary to complicate it with multiple function calls or resetting the seed every time.
The most important step is to call the seed function just once before actually running `rand.Intn(x)`. Seed uses the provided seed value to initialize the default Source to a deterministic state. So it is advisable to call it once, before the first call to the pseudo-random number generator.
Here is sample code creating a string of random numbers:
package main
import (
	"fmt"
	"math/rand"
	"time"
)
func main() {
	rand.Seed(time.Now().UnixNano())
	var s string
	for i := 0; i < 10; i++ {
		s += fmt.Sprintf("%d ", rand.Intn(7))
	}
	// Println avoids passing a non-constant string as a Printf format.
	fmt.Println(s)
}
The reason I used `Sprintf` is that it allows simple string formatting.
Also, in `rand.Intn(7)`, `Intn` returns, as an int, a non-negative pseudo-random number in [0,7).
Answer 11
Score: -1
Every time the `randInt()` function is called inside the for loop, a new seed is set from the current time and a sequence is generated from it. But because the for loop runs quickly, the time barely changes between calls, so the seed is often identical and the same sequence is generated again. Setting the seed once, outside the `randInt()` function, is enough.
package main
import (
	"bytes"
	"fmt"
	"math/rand"
	"time"
)
// r is seeded once, when the package is initialized.
var r = rand.New(rand.NewSource(time.Now().UTC().UnixNano()))
func main() {
	fmt.Println(randomString(10))
}
func randomString(l int) string {
	var result bytes.Buffer
	var temp string
	for i := 0; i < l; {
		// string(rune(n)) makes the int-to-character conversion explicit.
		if string(rune(randInt(65, 90))) != temp {
			temp = string(rune(randInt(65, 90)))
			result.WriteString(temp)
			i++
		}
	}
	return result.String()
}
func randInt(min int, max int) int {
	return min + r.Intn(max-min)
}
Answer 12
Score: -3
Small update due to a golang API change; please omit `.UTC()`:
time.Now().UTC().UnixNano() -> time.Now().UnixNano()
package main
import (
	"fmt"
	"math/rand"
	"time"
)
func main() {
	rand.Seed(time.Now().UnixNano())
	fmt.Println(randInt(100, 1000))
}
func randInt(min int, max int) int {
	return min + rand.Intn(max-min)
}