为什么我的滚动adler32校验和在Go语言中不起作用?(模运算)

huangapple go评论74阅读模式
英文:

Why does my rolling adler32 checksum not work in go? (modulo arithmetic)

问题

我正在使用Go语言实现一个滚动哈希版本的adler32校验和

这个答案对我验证数学计算很有帮助。然而,我在正确实现它的Go语言代码方面遇到了困难。

我编写了以下代码:

func roll(adler, n, leave, enter uint32) uint32 {
    a := adler & 0xffff
    b := adler >> 16

    a = (a + enter - leave) % MOD
    b = (b - n*leave - 1 + a) % MOD
    return b<<16 | a
}

我对它进行了各种输入的测试,结果都很好,直到我决定在随机数据上运行它。这里有一个示例,在这个示例中它不起作用(我找到了几个类似的示例)。

令我困惑的是,相同的Python代码在这些输入上完美地工作:

def roll(adler, n, leave, enter):
    a = adler & 0xffff
    b = adler >> 16

    a = (a + enter - leave) % MOD
    b = (b - n*leave - 1 + a) % MOD
    return b<<16 | a

为了确保,我包含了证明,证明了这在Python中是有效的。请注意,Python的校验和与Go的非滚动版本的校验和匹配(而且这部分直接来自Go的核心库)。

我研究了所有其他有问题的示例的结果,并发现我在校验和的最低有效位("a"位)上从未出错。而且,错误始终相同,等于0xe10000。我怀疑Go语言在uint32整数上处理模运算的方式可能是导致这个问题的原因。

发生了什么情况,我该如何修复我的代码?

英文:

I am implementing, in go, a rolling version of the adler32 checksum.

This answer was helpful to double check my maths. However I am struggling at implementing it correctly in golang.

I wrote the following code:

func roll(adler, n, leave, enter uint32) uint32 {
	a := adler &amp; 0xffff
	b := adler &gt;&gt; 16

	a = (a + enter - leave) % MOD
	b = (b - n*leave - 1 + a) % MOD
	return b&lt;&lt;16 | a
}

It tested it on various inputs and it worked fine, until I decided to run it on random data. Here is a sample where it does not work (I found several of them).

What is baffling me is that the same code in python works perfectly on those inputs:

def roll(adler, n, leave, enter):
    a = adler &amp; 0xffff
    b = adler &gt;&gt; 16

    a = (a + enter - leave) % MOD
    b = (b - n*leave - 1 + a) % MOD
    return b&lt;&lt;16 | a

For good measure, I am including proof that this works in python. Note that the python checksum matches the non-rolling version of the go checksum (and that part is directly from the go core libraries).

I studied my results on all the other problematic samples, and found that I am never making a mistake on the least significant bits of the checksum (the "a" bits). Also, the error is consistently the same, equals to 0xe10000. I suspect a peculiarity of how go handles modulo operations on uint32 integers to be the cause of this.

What is happening and how do I fix my code?

答案1

得分: 4

在Python中,整数是有符号的。而在golang版本中,你声明了所有的整数都是无符号的。这就是区别所在。

当一个较小的无符号数减去一个无符号数时,你会得到一个巨大的无符号数,它在除法运算中产生的余数与小的负差异不同。当你进行包装时,实际上是在加上2的32次方。2的32次方对65521取模得到225,即0xe1,这就是为什么你在b中看到了这种差异。在b的计算中更容易发生包装,但如果a在该步骤中非常小,也可能发生包装。

根据@samgak的评论,你还必须关注不同语言中有符号值的%运算符的定义。因此,在进行% MOD之前,适用于不同约定的解决方案是通过添加足够多的MOD使值变为正数。对于a,只需添加MOD。对于b,添加(1 + n * leave / MOD) * MOD

请注意确保中间值不会溢出。如果n*leave足够大以包装所使用的整数类型,go中的代码可能会给出错误的结果。

英文:

The integers in Python are signed. You declared all the integers in the golang version to be unsigned. That's the difference.

When an unsigned number is subtracted from a smaller unsigned number, you get a huge unsigned number that gives a different remainder on division than the small negative difference would. When you wrap, you are effectively adding 2<sup>32</sup>. 2<sup>32</sup> mod 65521 is 225, or 0xe1, which is why you're seeing that difference in b. It's much more likely to wrap on the b calculation, but it can happen for a as well, if a happens to be very small at that step.

Per the comment by @samgak, you also have to worry about the definition of the % operator in different languages for signed values. So the solution that works across different conventions would be to make the values positive by adding as many MODs as necessary, before doing the % MOD. For a, just add MOD. For b, add (1 + n * leave / MOD) * MOD.

Take care to make sure that the intermediate values don't overflow. The code in go can give erroneous results if n*leave is large enough to wrap the integer type being used.

答案2

得分: 0

我不了解go语言,但是这里有一些可能性:

b := adler >> 16 改为 b := (adler >> 16) & 0xffff

b = (b - n*leave - 1 + a) % MOD ... 如果括号中的表达式是负数会发生什么?

return b<<16 | a ... 检查运算符优先级;是 (b<<16)|a 还是 b<<(16|a)?

是32位机还是64位机?
英文:

I don't know go, but here are some possibilities:

b := adler &gt;&gt; 16 change to b := (adler &gt;&gt; 16) &amp; 0xffff

b = (b - n*leave - 1 + a) % MOD ... what happens if expression in () is negative?

return b&lt;&lt;16 | a ... check operator precedence; (b&lt;&lt;16)|a or b&lt;&lt;(16|a) ?

32-bit machine or 64-bit?

huangapple
  • 本文由 发表于 2016年12月6日 07:14:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/40985080.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定