How to subdivide an integer hash into ranges

Question

I have an unsigned 64-bit number representing a mantissa, or fraction. It covers the range [0..1), where 0.0 maps to 0 and 0xffffff.. maps to a number "just before 1.0".

Now I want to split this range into equal buckets and answer the question: given a random number key, which part of the range does it fall into?

The intent is easier to see from the following code:

func BucketIndex(key, buckets uint64) uint64 {
    // Note: float64 keeps only 53 bits of mantissa, so the low bits of key are lost.
    return uint64(float64(key) / (math.Pow(2, 64) / float64(buckets)))
}

My attempt to "hack around this" was to split 2^64 in two, as if reducing the range to 32 bits, and to do the math in 64 bits:

// ~= key / ((1 << 64) / buckets)
return ((key >> 32) * buckets) >> 32

But the ranges are no longer equal.
For example, the one-third boundary (buckets==3) ends up at 0x5555555600000000 instead of 0x5555555555555556.
That's a sad story, so I'm asking: do you know a better method of finding (1 << 64) / buckets?
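
To make the mismatch concrete, here is a minimal sketch of the shortcut above. The name hackIndex is hypothetical, chosen only for this illustration. Because the shortcut drops the low 32 bits of key before multiplying, bucket boundaries can only fall on multiples of 2^32.

```go
package main

import "fmt"

// hackIndex is the 32-bit shortcut from the question (hypothetical name).
func hackIndex(key, buckets uint64) uint64 {
	return ((key >> 32) * buckets) >> 32
}

func main() {
	const buckets = 3
	for _, key := range []uint64{
		0x5555555555555556, // where bucket 1 should ideally start
		0x55555555ffffffff, // last key the shortcut still puts in bucket 0
		0x5555555600000000, // first key the shortcut puts in bucket 1
	} {
		fmt.Printf("%#x -> %d\n", key, hackIndex(key, buckets))
	}
}
```

The first two keys land in bucket 0 and only the third lands in bucket 1, so bucket 0 is slightly wider than one third of the range.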

Answer 1

Score: 2

If buckets is a (compile-time) constant, you can use a constant expression to calculate the bucket size: untyped constants have arbitrary precision. Otherwise you can use big.Int to calculate it at runtime and store the result (so you don't have to do big.Int calculations all the time).

Using a constant expression, at compile-time

To make integer division round up, add divisor - 1 to the dividend before dividing:

const (
    max        = math.MaxUint64 + 1
    buckets    = 3
    bucketSize = uint64((max + buckets - 1) / buckets)
)
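
As a sketch of how the constant might then be used, the bucket index is a single integer division. The helper name bucketIndex is an assumption made for this example and is not part of the answer.

```go
package main

import (
	"fmt"
	"math"
)

const (
	max        = math.MaxUint64 + 1 // 2^64 as an untyped constant
	buckets    = 3
	bucketSize = uint64((max + buckets - 1) / buckets)
)

// bucketIndex is a hypothetical helper: it maps key to a bucket in [0, buckets).
func bucketIndex(key uint64) uint64 {
	return key / bucketSize
}

func main() {
	fmt.Printf("bucketSize = %#x\n", bucketSize)
	for _, key := range []uint64{0, 0x5555555555555555, 0x5555555555555556, math.MaxUint64} {
		fmt.Printf("%#x -> %d\n", key, bucketIndex(key))
	}
}
```

Because bucketSize is rounded up, the largest key (math.MaxUint64) still maps to buckets - 1, so the index never goes out of range. The last bucket may be slightly smaller than the others.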

Using big.Int, at runtime

We can use the same logic with big.Int too. An alternative would be to use Int.DivMod() (instead of adding buckets - 1) and, if mod is greater than zero, increment the result by 1.

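// calcBucketSize returns ceil(max / buckets). Note that it modifies max in place.
func calcBucketSize(max, buckets *big.Int) uint64 {
    max = max.Add(max, buckets)
    max = max.Add(max, big.NewInt(-1))
    return max.Div(max, buckets).Uint64()
}

// max must be 2^64 (math.MaxUint64 + 1) to match the constant expression above.
var bucketSize = calcBucketSize(new(big.Int).Lsh(big.NewInt(1), 64), big.NewInt(3))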
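
The Int.DivMod() alternative mentioned above could look roughly like the sketch below. The names ceilDiv and bucketSizeFor are hypothetical, and the code assumes buckets >= 2, because for a single bucket the size (2^64) does not fit in a uint64.

```go
package main

import (
	"fmt"
	"math/big"
)

// ceilDiv returns ceil(x / y) for positive x and y, leaving both unchanged.
func ceilDiv(x, y *big.Int) *big.Int {
	q, m := new(big.Int).DivMod(x, y, new(big.Int))
	if m.Sign() > 0 {
		q.Add(q, big.NewInt(1))
	}
	return q
}

// bucketSizeFor returns ceil(2^64 / buckets). It assumes buckets >= 2,
// since for buckets == 1 the result (2^64) does not fit in a uint64.
func bucketSizeFor(buckets uint64) uint64 {
	max := new(big.Int).Lsh(big.NewInt(1), 64) // 2^64
	return ceilDiv(max, new(big.Int).SetUint64(buckets)).Uint64()
}

func main() {
	fmt.Printf("%#x\n", bucketSizeFor(3)) // prints 0x5555555555555556
}
```

Unlike the version that adds buckets - 1, this variant allocates fresh big.Int values for the quotient and remainder, so the caller's arguments are not modified.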

huangapple
  • Posted on 2023-02-06 22:18:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75362517.html