How to efficiently hash (SHA-256) data in Go where only the last few bytes change
Question
Assuming you had 80 bytes of data and only the last 4 bytes were constantly changing, how would you efficiently hash the total 80 bytes using Go? In essence, the first 76 bytes are the same, while the last 4 bytes keep changing. Ideally, you want to keep a copy of the hash digest for the first 76 bytes and just keep changing the last 4.
Answer 1
Score: 14
You can try the following examples on the Go Playground. Benchmark results are at the end.
Note: the implementations below are not safe for concurrent use; I intentionally made them like this to be simpler and faster.
Fastest when using only the public API (always hashes all input)
The general concept and interface of Go's hash algorithms is the hash.Hash interface. This does not allow you to save the state of the hasher or to return or rewind to a saved state. So using the public hash APIs of the Go standard library, you always have to calculate the hash from scratch.
What the public API does offer is reusing an already constructed hasher to calculate the hash of a new input, using the Hash.Reset() method. This is nice because no (memory) allocations are needed to calculate multiple hash values. You may also take advantage of the optional slice that can be passed to Hash.Sum(), to which the current hash is appended. This is nice because no allocations are needed to receive the hash results either.
Here's an example that takes advantage of these:
type Cached1 struct {
hasher hash.Hash
result [sha256.Size]byte
}
func NewCached1() *Cached1 {
return &Cached1{hasher: sha256.New()}
}
func (c *Cached1) Sum(data []byte) []byte {
c.hasher.Reset()
c.hasher.Write(data)
return c.hasher.Sum(c.result[:0])
}
Test data
We'll use the following test data:
var fixed = bytes.Repeat([]byte{1}, 76)
var variantA = []byte{1, 1, 1, 1}
var variantB = []byte{2, 2, 2, 2}
var data = append(append([]byte{}, fixed...), variantA...)
var data2 = append(append([]byte{}, fixed...), variantB...)
var c1 = NewCached1()
First, let's get the reference results (to verify that our hasher works correctly):
fmt.Printf("%x\n", sha256.Sum256(data))
fmt.Printf("%x\n", sha256.Sum256(data2))
Output:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Now let's check our Cached1 hasher:
fmt.Printf("%x\n", c1.Sum(data))
fmt.Printf("%x\n", c1.Sum(data2))
Output is the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Even faster but may break (in future Go releases): hashes only the last 4 bytes
Now let's see a less flexible solution, which truly calculates the hash of the fixed first 76 bytes only once.
The hasher of the crypto/sha256 package is the unexported sha256.digest type (more precisely, a pointer to this type):
// digest represents the partial evaluation of a checksum.
type digest struct {
h [8]uint32
x [chunk]byte
nx int
len uint64
is224 bool // mark if this digest is SHA-224
}
A value of the digest struct type basically holds the current state of the hasher.
What we can do is feed the hasher the fixed first 76 bytes and then save this struct value. When we need to calculate the hash of some 80-byte data whose first 76 bytes are the same, we use this saved value as a starting point and then feed the varying last 4 bytes.
Note that it's enough to simply save this struct value, because it contains no pointers and no descriptor types such as slices or maps. Otherwise we would also have to make copies of those, but we're "lucky". So this solution would need adjustment if a future implementation of crypto/sha256 added, for example, a pointer or slice field.
Since sha256.digest is unexported, we can only use reflection (the reflect package) to achieve our goal, which inherently adds some delay to the computation.
Example implementation that does this:
type Cached2 struct {
origv reflect.Value
hasherv reflect.Value
hasher hash.Hash
result [sha256.Size]byte
}
func NewCached2(fixed []byte) *Cached2 {
h := sha256.New()
h.Write(fixed)
c := &Cached2{origv: reflect.ValueOf(h).Elem()}
hasherv := reflect.New(c.origv.Type())
c.hasher = hasherv.Interface().(hash.Hash)
c.hasherv = hasherv.Elem()
return c
}
func (c *Cached2) Sum(data []byte) []byte {
// Set state of the fixed hash:
c.hasherv.Set(c.origv)
c.hasher.Write(data)
return c.hasher.Sum(c.result[:0])
}
Testing it:
var c2 = NewCached2(fixed)
fmt.Printf("%x\n", c2.Sum(variantA))
fmt.Printf("%x\n", c2.Sum(variantB))
Output is again the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
So it works.
The "ultimate", fastest solution
Cached2 could be faster if reflection were not involved. If we want an even faster solution, we can simply copy the sha256.digest type and its methods into our own package, so we can use it directly without having to resort to reflection.
If we do this, we have access to the digest struct value, and we can simply make a copy of it like:
var d digest
// init d
saved := d
And restoring it is like:
d = saved
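To see why plain assignment is enough here, consider a toy struct (a hypothetical stand-in mirroring digest's value-only fields): in Go, assigning a struct that contains only arrays and scalars copies the entire state.

```go
package main

import "fmt"

// state is a toy struct with only value-type fields, like sha256's digest.
type state struct {
	h   [8]uint32
	len uint64
}

func main() {
	var d state
	d.h[0], d.len = 42, 76
	saved := d // plain assignment copies the full state
	d.h[0], d.len = 7, 80
	d = saved                  // restore: d is back to the snapshot
	fmt.Println(d.h[0], d.len) // prints: 42 76
}
```

If the struct contained a slice or pointer field, this assignment would share the referenced data instead of copying it, which is exactly why the "no descriptor types" caveat above matters.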
I simply "cloned" the crypto/sha256 package into my workspace and changed/exported the digest type as Digest, just for demonstration purposes. Then, using this mysha256.Digest type, I implemented Cached3 like this:
type Cached3 struct {
orig mysha256.Digest
result [sha256.Size]byte
}
func NewCached3(fixed []byte) *Cached3 {
var d mysha256.Digest
d.Reset()
d.Write(fixed)
return &Cached3{orig: d}
}
func (c *Cached3) Sum(data []byte) []byte {
// Make a copy of the fixed hash:
d := c.orig
d.Write(data)
return d.Sum(c.result[:0])
}
Testing it:
var c3 = NewCached3(fixed)
fmt.Printf("%x\n", c3.Sum(variantA))
fmt.Printf("%x\n", c3.Sum(variantB))
Output again is the same. So this works too.
Benchmarks
We can benchmark performance with this code:
func BenchmarkCached1(b *testing.B) {
for i := 0; i < b.N; i++ {
c1.Sum(data)
c1.Sum(data2)
}
}
func BenchmarkCached2(b *testing.B) {
for i := 0; i < b.N; i++ {
c2.Sum(variantA)
c2.Sum(variantB)
}
}
func BenchmarkCached3(b *testing.B) {
for i := 0; i < b.N; i++ {
c3.Sum(variantA)
c3.Sum(variantB)
}
}
Benchmark results (go test -bench . -benchmem):
BenchmarkCached1-4 1000000 1569 ns/op 0 B/op 0 allocs/op
BenchmarkCached2-4 2000000 926 ns/op 0 B/op 0 allocs/op
BenchmarkCached3-4 2000000 872 ns/op 0 B/op 0 allocs/op
Cached2 is approximately 41% faster than Cached1, which is quite noticeable and nice. Cached3 gives only a "little" performance boost compared to Cached2: another 6%. Cached3 is 44% faster than Cached1.
Also note that none of the solutions makes any allocations, which is also nice.
Conclusion
For that extra 40% or 44%, I would probably not go for the Cached2 or Cached3 solutions. Of course, it really depends on how important performance is to you. If it is important, I think the Cached2 solution presents a fine compromise between minimal added complexity and a noticeable performance gain. It does carry the risk that future Go implementations may break it; if that is a concern, Cached3 solves the problem by copying the current implementation (and also improves performance a little).