英文:
How to check if []byte is all zeros in go
问题
有没有一种方法可以检查一个字节切片是否为空或为0,而不需要检查每个元素或使用反射?
theByteVar := make([]byte, 128)
if "theByteVar is empty or zeroes" {
doSomething()
}
我找到了一种看起来很奇怪的解决方案,即保留一个空的字节数组进行比较。
theByteVar := make([]byte, 128)
emptyByteVar := make([]byte, 128)
// 填充任意值
theByteVar[1] = 2
if reflect.DeepEqual(theByteVar, emptyByteVar) == false {
doSomething(theByteVar)
}
肯定有更好/更快的解决方案。
更新:对于1000次循环进行了一些比较,使用反射的方法是最差的...
Equal Loops: 1000 in true in 19.197μs
Contains Loops: 1000 in true in 34.507μs
AllZero Loops: 1000 in true in 117.275μs
Reflect Loops: 1000 in true in 14.616277ms
英文:
Is there a way to check if a byte slice is empty or 0 without checking each element or using reflect?
theByteVar := make([]byte, 128)
if "theByteVar is empty or zeroes" {
doSomething()
}
One solution which seems weird that I found was to keep an empty byte array for comparison.
theByteVar := make([]byte, 128)
emptyByteVar := make([]byte, 128)
// fill with anything
theByteVar[1] = 2
if reflect.DeepEqual(theByteVar,empty) == false {
doSomething(theByteVar)
}
For sure there must be a better/quicker solution.
Thanks
UPDATE did some comparison for 1000 loops and the reflect way is the worst by far...
Equal Loops: 1000 in true in 19.197µs
Contains Loops: 1000 in true in 34.507µs
AllZero Loops: 1000 in true in 117.275µs
Reflect Loops: 1000 in true in 14.616277ms
答案1
得分: 7
将其与仅包含零的另一个切片进行比较,这需要读取(和比较)2个切片。
在这里使用单个for循环将更有效:
for _, v := range theByteVar {
if v != 0 {
doSomething(theByteVar)
break
}
}
如果您确实需要在多个地方使用它,请将其封装在一个实用函数中:
func allZero(s []byte) bool {
for _, v := range s {
if v != 0 {
return false
}
}
return true
}
然后使用它:
if !allZero(theByteVar) {
doSomething(theByteVar)
}
英文:
Comparing it with another slice containing only zeros, that requires reading (and comparing) 2 slices.
Using a single for loop will be more efficient here:
for _, v := range theByteVar {
if v != 0 {
doSomething(theByteVar)
break
}
}
If you do need to use it in multiple places, wrap it in a utility function:
func allZero(s []byte) bool {
for _, v := range s {
if v != 0 {
return false
}
}
return true
}
And then using it:
if !allZero(theByteVar) {
doSomething(theByteVar)
}
答案2
得分: 4
另一种解决方案借鉴了C语言的一个思想,可以通过在Go中使用unsafe
包来实现。
这个思路很简单,不再逐个检查[]byte
中的每个字节,而是在每个步骤中检查byte[i:i+8]
的值,它是一个uint64
值。通过这样做,我们可以一次检查8个字节,而不是在每次迭代中只检查一个字节。
下面的代码只是展示了这个思路,不是最佳实践。
func IsAllBytesZero(data []byte) bool {
n := len(data)
// 将n向下取整到最接近的8的倍数
// 通过清除最后3位来实现。
nlen8 := n &^ 0b111
i := 0
for ; i < nlen8; i += 8 {
b := *(*uint64)(unsafe.Pointer(&data[i]))
if b != 0 {
return false
}
}
for ; i < n; i++ {
if data[i] != 0 {
return false
}
}
return true
}
基准测试
测试用例:
只测试最坏情况(所有元素都为零)
方法:
- IsAllBytesZero:使用
unsafe
包的解决方案 - NaiveCheckAllBytesAreZero:循环迭代整个字节数组并进行检查的方法。
- CompareAllBytesWithFixedEmptyArray:使用
bytes.Compare
解决方案,并预先分配固定大小的空字节数组。 - CompareAllBytesWithDynamicEmptyArray:使用
bytes.Compare
解决方案,没有预先分配固定大小的空字节数组。
结果
BenchmarkIsAllBytesZero10-8 254072224 4.68 ns/op
BenchmarkIsAllBytesZero100-8 132266841 9.09 ns/op
BenchmarkIsAllBytesZero1000-8 19989015 55.6 ns/op
BenchmarkIsAllBytesZero10000-8 2344436 507 ns/op
BenchmarkIsAllBytesZero100000-8 1727826 679 ns/op
BenchmarkNaiveCheckAllBytesAreZero10-8 234153582 5.15 ns/op
BenchmarkNaiveCheckAllBytesAreZero100-8 30038720 38.2 ns/op
BenchmarkNaiveCheckAllBytesAreZero1000-8 4300405 291 ns/op
BenchmarkNaiveCheckAllBytesAreZero10000-8 407547 2666 ns/op
BenchmarkNaiveCheckAllBytesAreZero100000-8 43382 27265 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10-8 415171356 2.71 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100-8 218871330 5.51 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray1000-8 56569351 21.0 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10000-8 6592575 177 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100000-8 567784 2104 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10-8 64215448 19.8 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100-8 32875428 35.4 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray1000-8 8580890 140 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10000-8 1277070 938 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100000-8 121256 10355 ns/op
总结
假设我们讨论的是稀疏零字节数组的情况。根据基准测试,如果性能是一个问题,那么朴素的检查解决方案将是一个糟糕的选择。如果你不想在项目中使用unsafe
包,那么考虑使用bytes.Compare
解决方案并预先分配空数组作为替代方案。
一个有趣的观点是,来自unsafe
包的性能差异很大,但它基本上优于上述所有其他解决方案。我认为这与CPU缓存机制有关。
英文:
Another solution borrows an idea from C. It could be achieved by using the unsafe
package in Go.
The idea is simple, instead of checking each byte from []byte, we can check the value of byte[i:i+8]
, which is a uint64
value, in each steps. By doing this we can check 8 bytes instead of checking only one byte in each iteration.
Below codes are not best practice but only show the idea.
func IsAllBytesZero(data []byte) bool {
n := len(data)
// Round n down to the nearest multiple of 8
// by clearing the last 3 bits.
nlen8 := n & ^0b111
i := 0
for ; i < nlen8; i += 8 {
b := *(*uint64)(unsafe.Pointer(&data[i]))
if b != 0 {
return false
}
}
for ; i < n; i++ {
if data[i] != 0 {
return false
}
}
return true
}
Benchmark
Testcases:
Only test for worst cases (all elements are zero)
Methods:
- IsAllBytesZero:
unsafe
package solution - NaiveCheckAllBytesAreZero: a loop to iterate the whole byte array and check it.
- CompareAllBytesWithFixedEmptyArray: using
bytes.Compare
solution with pre-allocated fixed size empty byte array. - CompareAllBytesWithDynamicEmptyArray: using
bytes.Compare
solution without pre-allocated fixed size empty byte array.
Results
BenchmarkIsAllBytesZero10-8 254072224 4.68 ns/op
BenchmarkIsAllBytesZero100-8 132266841 9.09 ns/op
BenchmarkIsAllBytesZero1000-8 19989015 55.6 ns/op
BenchmarkIsAllBytesZero10000-8 2344436 507 ns/op
BenchmarkIsAllBytesZero100000-8 1727826 679 ns/op
BenchmarkNaiveCheckAllBytesAreZero10-8 234153582 5.15 ns/op
BenchmarkNaiveCheckAllBytesAreZero100-8 30038720 38.2 ns/op
BenchmarkNaiveCheckAllBytesAreZero1000-8 4300405 291 ns/op
BenchmarkNaiveCheckAllBytesAreZero10000-8 407547 2666 ns/op
BenchmarkNaiveCheckAllBytesAreZero100000-8 43382 27265 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10-8 415171356 2.71 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100-8 218871330 5.51 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray1000-8 56569351 21.0 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10000-8 6592575 177 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100000-8 567784 2104 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10-8 64215448 19.8 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100-8 32875428 35.4 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray1000-8 8580890 140 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10000-8 1277070 938 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100000-8 121256 10355 ns/op
Summary
Assumed that we're talking about the condition in sparse zero byte array. According to the benchmark, if performance is an issue, the naive check solution would be a bad idea. And, if you don't want to use unsafe
package in your project, then consider using bytes.Compare
solution with pre-allocated empty array as an alternative.
An interesting point could be pointed out is that the performance comes from unsafe
package varies a lot, but it basically outperform all other solution mentioned above. I think it was relevant to the CPU cache mechanism.
答案3
得分: 3
你可以使用bytes.Equal或bytes.Contains来将其与一个零初始化的字节切片进行比较,参见https://play.golang.org/p/mvUXaTwKjP。我还没有检查过性能,但希望已经进行了优化。如果需要的话,你可以尝试其他解决方案并比较性能指标。
英文:
You can possibly use bytes.Equal or bytes.Contains to compare with a zero initialized byte slice, see https://play.golang.org/p/mvUXaTwKjP, I haven't checked for performance, but hopefully it's been optimized. You might want to try out other solutions and compare the performance numbers, if needed.
答案4
得分: 2
我认为在循环内部使用binary or
而不是if condition
会更好(更快):
func isZero(bytes []byte) bool {
b := byte(0)
for _, s := range bytes {
b |= s
}
return b == 0
}
通过使用前面答案中提到的uint64
的方法,我们甚至可以进一步优化这个函数:
func isZero(bytes []byte) bool {
var b uint64
for _, s := range bytes {
b |= uint64(s)
}
return b == 0
}
希望对你有帮助!
英文:
I think it is better (faster) if binary or
is used instead of if condition
inside loop:
func isZero(bytes []byte) bool {
b := byte(0)
for _, s := range bytes {
b |= s
}
return b == 0
}
One can optimize this even more by using idea with uint64
mentioned in previous answers
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论