英文:
Why does my rolling adler32 checksum not work in go? (modulo arithmetic)
问题
我正在使用Go语言实现一个滚动哈希版本的adler32校验和。
这个答案对我验证数学计算很有帮助。然而,我在正确实现它的Go语言代码方面遇到了困难。
我编写了以下代码:
func roll(adler, n, leave, enter uint32) uint32 {
a := adler & 0xffff
b := adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
}
我对它进行了各种输入的测试,结果都很好,直到我决定在随机数据上运行它。这里有一个示例,在这个示例中它不起作用(我找到了几个类似的示例)。
令我困惑的是,相同的Python代码在这些输入上完美地工作:
def roll(adler, n, leave, enter):
a = adler & 0xffff
b = adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
为了确保,我包含了证明,证明了这在Python中是有效的。请注意,Python的校验和与Go的非滚动版本的校验和匹配(而且这部分直接来自Go的核心库)。
我研究了所有其他有问题的示例的结果,并发现我在校验和的最低有效位("a"位)上从未出错。而且,错误始终相同,等于0xe10000
。我怀疑Go语言在uint32整数上处理模运算的方式可能是导致这个问题的原因。
发生了什么情况,我该如何修复我的代码?
英文:
I am implementing, in go, a rolling version of the adler32 checksum.
This answer was helpful to double check my maths. However I am struggling at implementing it correctly in golang.
I wrote the following code:
func roll(adler, n, leave, enter uint32) uint32 {
a := adler & 0xffff
b := adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
}
It tested it on various inputs and it worked fine, until I decided to run it on random data. Here is a sample where it does not work (I found several of them).
What is baffling me is that the same code in python works perfectly on those inputs:
def roll(adler, n, leave, enter):
a = adler & 0xffff
b = adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
For good measure, I am including proof that this works in python. Note that the python checksum matches the non-rolling version of the go checksum (and that part is directly from the go core libraries).
I studied my results on all the other problematic samples, and found that I am never making a mistake on the least significant bits of the checksum (the "a" bits). Also, the error is consistently the same, equals to 0xe10000
. I suspect a peculiarity of how go handles modulo operations on uint32 integers to be the cause of this.
What is happening and how do I fix my code?
答案1
得分: 4
在Python中,整数是有符号的。而在golang版本中,你声明了所有的整数都是无符号的。这就是区别所在。
当一个较小的无符号数减去一个无符号数时,你会得到一个巨大的无符号数,它在除法运算中产生的余数与小的负差异不同。当你进行包装时,实际上是在加上2的32次方。2的32次方对65521取模得到225,即0xe1
,这就是为什么你在b
中看到了这种差异。在b
的计算中更容易发生包装,但如果a
在该步骤中非常小,也可能发生包装。
根据@samgak的评论,你还必须关注不同语言中有符号值的%
运算符的定义。因此,在进行% MOD
之前,适用于不同约定的解决方案是通过添加足够多的MOD
使值变为正数。对于a
,只需添加MOD
。对于b
,添加(1 + n * leave / MOD) * MOD
。
请注意确保中间值不会溢出。如果n*leave
足够大以包装所使用的整数类型,go中的代码可能会给出错误的结果。
英文:
The integers in Python are signed. You declared all the integers in the golang version to be unsigned. That's the difference.
When an unsigned number is subtracted from a smaller unsigned number, you get a huge unsigned number that gives a different remainder on division than the small negative difference would. When you wrap, you are effectively adding 2<sup>32</sup>. 2<sup>32</sup> mod 65521 is 225, or 0xe1
, which is why you're seeing that difference in b
. It's much more likely to wrap on the b
calculation, but it can happen for a
as well, if a
happens to be very small at that step.
Per the comment by @samgak, you also have to worry about the definition of the % operator in different languages for signed values. So the solution that works across different conventions would be to make the values positive by adding as many MOD
s as necessary, before doing the % MOD
. For a
, just add MOD
. For b
, add (1 + n * leave / MOD) * MOD
.
Take care to make sure that the intermediate values don't overflow. The code in go can give erroneous results if n*leave
is large enough to wrap the integer type being used.
答案2
得分: 0
我不了解go语言,但是这里有一些可能性:
b := adler >> 16 改为 b := (adler >> 16) & 0xffff
b = (b - n*leave - 1 + a) % MOD ... 如果括号中的表达式是负数会发生什么?
return b<<16 | a ... 检查运算符优先级;是 (b<<16)|a 还是 b<<(16|a)?
是32位机还是64位机?
英文:
I don't know go, but here are some possibilities:
b := adler >> 16 change to b := (adler >> 16) & 0xffff
b = (b - n*leave - 1 + a) % MOD ... what happens if expression in () is negative?
return b<<16 | a ... check operator precedence; (b<<16)|a or b<<(16|a) ?
32-bit machine or 64-bit?
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论