生成一个确定性的整数,使其与另一个整数没有重复。

huangapple go评论73阅读模式
英文:

Generating a deterministic int from another with no duplicates

问题

我正在寻找创建一个确定性数字生成函数,其中输入的数字将始终生成相同的数字,但没有两个数字会生成相同的结果。

例如:

1 -> 3
2 -> 5
3 -> 4
4 -> 2
5 -> 1

然而,我需要这个函数适用于所有可以由特定数据类型表示的数字,例如 int64。

这感觉像是一件非常简单的事情,或者完全不可能。是否有一种随机数生成方案可以保证这种分布,而无需创建一个包含所有可能数字的数组,随机排序,然后使用索引(同时使我耗尽内存)?

非常感谢
F

英文:

I'm looking to create a deterministic number generation function where the input number will always generate the same number, but no two numbers will end up generating the same result.

E.g.:

1 -> 3
2 -> 5
3 -> 4
4 -> 2
5 -> 1

However I need this to work for all numbers that can be represented by a specific datatype, e.g. an int64.

This feels like something that should either be really straightforward, or completely impossible. Is there some random number generation scheme that guarantees this sort of distribution without me having to create an array of all possible numbers, randomly sort, and then use the index (and making me run out of memory in the meantime)?

Many thanks
F

答案1

得分: 6

你需要的转换公式是:

f(P) = (mP + s) mod n

// n = 范围 - 对于 uint64 是 2^64
// s < 范围,即 < 2^64
// m = 必须与 n 互质

这是在 仿射密码 中使用的 模运算

mod 确保结果在所需范围内,s 是一个简单的偏移量,m 应该与 n 互质
互质意味着 nm 不应该有任何公共因子。
由于 n 是 2^64,它的唯一因子是数字 2 - 所以 m 基本上不能是偶数(即不能被 2 整除):

所以对于 uint64 范围:

var (
    m = uint64(39293)    // 一些非偶数
    s = uint64(75321908) // 2^64 以下的某个随机数
)

func transform(p uint64) uint64 {
    return p*m + s // 隐式地对 2^64 进行了模运算,由于类型的大小
}

这可能看起来像魔法,但你可以通过 uint16 来验证它的工作原理:

https://go.dev/play/p/EKB6SH3-SGu

(因为运行 uint64 需要相当多的资源 生成一个确定性的整数,使其与另一个整数没有重复。


更新

对于有符号数(即 int64),逻辑没有任何区别。由于我们知道 uint64int64 之间有一个唯一的一对一映射,一个方法就是将输入和输出从 uint64 转换为 int64,反之亦然:

// 原始的无符号版本
func transform(p uint64) uint64 {
    return m*p + s
}

func signedTransform(p int64) int64 {
    return int64(transform(uint64(p)))
}

这里是一个 int16 的示例,证明没有冲突:

https://go.dev/play/p/Fkw5FLMK0Fu

英文:

The transformation formula you need is:

f(P) = (mP + s) mod n

// n = range - so for uint64 2^64
// s &lt; range i.e. &lt; 2^64
// m = must be coprime with n

This is modular arithmetic as used in the Affine cipher.

The mod ensures it is within desired range, s is a simple shift and m should be coprime with n.
Coprime simply means n and m should not share any common factors.
Since n is 2^64 its only factors are the number 2 - so m should basically not be even (i.e. not divisible by 2):

So for uint64 range:

var (
    m = uint64(39293)    // some non-even number
    s = uint64(75321908) // some random number below 2^64
)

func transform(p uint64) uint64 {
    return p*m + s // implicitly mod&#39;ed 2^64 by the type&#39;s size
}

This may appear magic, but you can convince yourself it works with uint16:

https://go.dev/play/p/EKB6SH3-SGu

(as uint64 would take quite some resources to run 生成一个确定性的整数,使其与另一个整数没有重复。


Update:

For signed numbers (i.e. int64) the logic is no different. Since we know we have a unique one-to-one mapping with uint64 one approach would simply be to convert inputs & outputs from uint64 to int64 and vice versa:

// original unsigned version
func transform(p uint64) uint64 {
    return m*p + s
}

func signedTransform(p int64) int64 {
    return int64(transform(uint64(p)))
}

again here's a int16 example proving there's no collisions:

https://go.dev/play/p/Fkw5FLMK0Fu

答案2

得分: 3

添加到colm.anseo的答案中,这种映射也被称为线性同余生成器

Xn+1 = (aXn+c) mod m

当c ≠ 0时,正确选择的参数可以使周期等于m,对于所有的种子值。这将发生当且仅当:

  • m和c互质,
  • a-1可被m的所有质因数整除,
  • 如果m可被4整除,则a-1可被4整除。

这三个要求被称为Hull-Dobell定理。

对于64位LCG,来自Knuth的a=6364136223846793005和c=1442695040888963407看起来不错。

请注意,LCG映射是一对一的,它将整个[0...264-1]区域映射到自身。如果你想要,你可以反转它。作为RNG,它具有明显的跳跃能力。

英文:

To add to colm.anseo answer, this kind of mapping is also known as Linear congruential generator.

X<sub>n+1</sub> = (aX<sub>n</sub>+c) mod m

When c ≠ 0, correctly chosen parameters allow a period equal to m, for all seed values. This will occur if and only if:

  • m and c are relatively prime,
  • a-1 is divisible by all prime factors of m,
  • a-1 is divisible by 4 if m is divisible by 4.

These three requirements are referred to as the Hull–Dobell Theorem.

For 64bit LCG a=6364136223846793005 and c=1442695040888963407 from Knuth looks good.

Note, that LCG mapping is 1-to-1, it maps whole [0...2<sup>64</sup>-1] region to itself. You could invert it if you want. And as RNG it has distinctive ability for jump-ahead.

huangapple
  • 本文由 发表于 2022年5月10日 22:19:35
  • 转载请务必保留本文链接:https://go.coder-hub.com/72188054.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定