How to check if []byte is all zeros in go

huangapple go评论87阅读模式
英文:

How to check if []byte is all zeros in go

问题

有没有一种方法可以检查一个字节切片是否为空或为0,而不需要检查每个元素或使用反射?

theByteVar := make([]byte, 128)

if "theByteVar is empty or zeroes" {
   doSomething()
}

我找到了一种看起来很奇怪的解决方案,即保留一个空的字节数组进行比较。

theByteVar := make([]byte, 128)
emptyByteVar := make([]byte, 128)

// 填充任意值
theByteVar[1] = 2

if reflect.DeepEqual(theByteVar, emptyByteVar) == false {
   doSomething(theByteVar)
}

肯定有更好/更快的解决方案。

更新:对于1000次循环进行了一些比较,使用反射的方法是最差的...

Equal Loops: 1000 in true in 19.197μs
Contains Loops: 1000 in true in 34.507μs
AllZero Loops: 1000 in true in 117.275μs
Reflect Loops: 1000 in true in 14.616277ms
英文:

Is there a way to check if a byte slice is empty or 0 without checking each element or using reflect?

theByteVar := make([]byte, 128)

if "theByteVar is empty or zeroes" {
   doSomething()
}

One solution which seems weird that I found was to keep an empty byte array for comparison.

theByteVar := make([]byte, 128)
emptyByteVar := make([]byte, 128)

// fill with anything
theByteVar[1] = 2

if reflect.DeepEqual(theByteVar,empty) == false {
   doSomething(theByteVar)
}

For sure there must be a better/quicker solution.

Thanks

UPDATE did some comparison for 1000 loops and the reflect way is the worst by far...

Equal Loops: 1000 in true in 19.197µs
Contains Loops: 1000 in true in 34.507µs
AllZero Loops: 1000 in true in 117.275µs
Reflect Loops: 1000 in true in 14.616277ms

答案1

得分: 7

将其与仅包含零的另一个切片进行比较,这需要读取(和比较)2个切片。

在这里使用单个for循环将更有效:

for _, v := range theByteVar {
    if v != 0 {
        doSomething(theByteVar)
        break
    }
}

如果您确实需要在多个地方使用它,请将其封装在一个实用函数中:

func allZero(s []byte) bool {
    for _, v := range s {
        if v != 0 {
            return false
        }
    }
    return true
}

然后使用它:

if !allZero(theByteVar) {
    doSomething(theByteVar)
}
英文:

Comparing it with another slice containing only zeros, that requires reading (and comparing) 2 slices.

Using a single for loop will be more efficient here:

for _, v := range theByteVar {
    if v != 0 {
        doSomething(theByteVar)
        break
    }
}

If you do need to use it in multiple places, wrap it in a utility function:

func allZero(s []byte) bool {
	for _, v := range s {
		if v != 0 {
			return false
		}
	}
	return true
}

And then using it:

if !allZero(theByteVar) {
    doSomething(theByteVar)
}

答案2

得分: 4

另一种解决方案借鉴了C语言的一个思想,可以通过在Go中使用unsafe包来实现。

这个思路很简单,不再逐个检查[]byte中的每个字节,而是在每个步骤中检查byte[i:i+8]的值,它是一个uint64值。通过这样做,我们可以一次检查8个字节,而不是在每次迭代中只检查一个字节。

下面的代码只是展示了这个思路,不是最佳实践。

func IsAllBytesZero(data []byte) bool {
	n := len(data)

    // 将n向下取整到最接近的8的倍数
    // 通过清除最后3位来实现。
    nlen8 := n &^ 0b111
	i := 0

	for ; i < nlen8; i += 8 {
		b := *(*uint64)(unsafe.Pointer(&data[i]))
		if b != 0 {
			return false
		}
	}

	for ; i < n; i++ {
		if data[i] != 0 {
			return false
		}
	}

	return true
}

基准测试

测试用例:

只测试最坏情况(所有元素都为零)

方法:

  • IsAllBytesZero:使用unsafe包的解决方案
  • NaiveCheckAllBytesAreZero:循环迭代整个字节数组并进行检查的方法。
  • CompareAllBytesWithFixedEmptyArray:使用bytes.Compare解决方案,并预先分配固定大小的空字节数组。
  • CompareAllBytesWithDynamicEmptyArray:使用bytes.Compare解决方案,没有预先分配固定大小的空字节数组。

结果

BenchmarkIsAllBytesZero10-8                             	254072224	         4.68 ns/op
BenchmarkIsAllBytesZero100-8                            	132266841	         9.09 ns/op
BenchmarkIsAllBytesZero1000-8                           	19989015	        55.6 ns/op
BenchmarkIsAllBytesZero10000-8                          	 2344436	       507 ns/op
BenchmarkIsAllBytesZero100000-8                         	 1727826	       679 ns/op
BenchmarkNaiveCheckAllBytesAreZero10-8                  	234153582	         5.15 ns/op
BenchmarkNaiveCheckAllBytesAreZero100-8                 	30038720	        38.2 ns/op
BenchmarkNaiveCheckAllBytesAreZero1000-8                	 4300405	       291 ns/op
BenchmarkNaiveCheckAllBytesAreZero10000-8               	  407547	      2666 ns/op
BenchmarkNaiveCheckAllBytesAreZero100000-8              	   43382	     27265 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10-8         	415171356	         2.71 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100-8        	218871330	         5.51 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray1000-8       	56569351	        21.0 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10000-8      	 6592575	       177 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100000-8     	  567784	      2104 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10-8       	64215448	        19.8 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100-8      	32875428	        35.4 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray1000-8     	 8580890	       140 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10000-8    	 1277070	       938 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100000-8   	  121256	     10355 ns/op

总结

假设我们讨论的是稀疏零字节数组的情况。根据基准测试,如果性能是一个问题,那么朴素的检查解决方案将是一个糟糕的选择。如果你不想在项目中使用unsafe包,那么考虑使用bytes.Compare解决方案并预先分配空数组作为替代方案。

一个有趣的观点是,来自unsafe包的性能差异很大,但它基本上优于上述所有其他解决方案。我认为这与CPU缓存机制有关。

英文:

Another solution borrows an idea from C. It could be achieved by using the unsafe package in Go.

The idea is simple, instead of checking each byte from []byte, we can check the value of byte[i:i+8], which is a uint64 value, in each steps. By doing this we can check 8 bytes instead of checking only one byte in each iteration.

Below codes are not best practice but only show the idea.

func IsAllBytesZero(data []byte) bool {
	n := len(data)

    // Round n down to the nearest multiple of 8
    // by clearing the last 3 bits.
    nlen8 := n &amp; ^0b111
	i := 0

	for ; i &lt; nlen8; i += 8 {
		b := *(*uint64)(unsafe.Pointer(&amp;data[i]))
		if b != 0 {
			return false
		}
	}

	for ; i &lt; n; i++ {
		if data[i] != 0 {
			return false
		}
	}

	return true
}

Benchmark

Testcases:

Only test for worst cases (all elements are zero)

Methods:

  • IsAllBytesZero: unsafe package solution
  • NaiveCheckAllBytesAreZero: a loop to iterate the whole byte array and check it.
  • CompareAllBytesWithFixedEmptyArray: using bytes.Compare solution with pre-allocated fixed size empty byte array.
  • CompareAllBytesWithDynamicEmptyArray: using bytes.Compare solution without pre-allocated fixed size empty byte array.

Results

BenchmarkIsAllBytesZero10-8                             	254072224	         4.68 ns/op
BenchmarkIsAllBytesZero100-8                            	132266841	         9.09 ns/op
BenchmarkIsAllBytesZero1000-8                           	19989015	        55.6 ns/op
BenchmarkIsAllBytesZero10000-8                          	 2344436	       507 ns/op
BenchmarkIsAllBytesZero100000-8                         	 1727826	       679 ns/op
BenchmarkNaiveCheckAllBytesAreZero10-8                  	234153582	         5.15 ns/op
BenchmarkNaiveCheckAllBytesAreZero100-8                 	30038720	        38.2 ns/op
BenchmarkNaiveCheckAllBytesAreZero1000-8                	 4300405	       291 ns/op
BenchmarkNaiveCheckAllBytesAreZero10000-8               	  407547	      2666 ns/op
BenchmarkNaiveCheckAllBytesAreZero100000-8              	   43382	     27265 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10-8         	415171356	         2.71 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100-8        	218871330	         5.51 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray1000-8       	56569351	        21.0 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10000-8      	 6592575	       177 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100000-8     	  567784	      2104 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10-8       	64215448	        19.8 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100-8      	32875428	        35.4 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray1000-8     	 8580890	       140 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10000-8    	 1277070	       938 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100000-8   	  121256	     10355 ns/op

Summary

Assumed that we're talking about the condition in sparse zero byte array. According to the benchmark, if performance is an issue, the naive check solution would be a bad idea. And, if you don't want to use unsafe package in your project, then consider using bytes.Compare solution with pre-allocated empty array as an alternative.

An interesting point could be pointed out is that the performance comes from unsafe package varies a lot, but it basically outperform all other solution mentioned above. I think it was relevant to the CPU cache mechanism.

答案3

得分: 3

你可以使用bytes.Equal或bytes.Contains来将其与一个零初始化的字节切片进行比较,参见https://play.golang.org/p/mvUXaTwKjP。我还没有检查过性能,但希望已经进行了优化。如果需要的话,你可以尝试其他解决方案并比较性能指标。

英文:

You can possibly use bytes.Equal or bytes.Contains to compare with a zero initialized byte slice, see https://play.golang.org/p/mvUXaTwKjP, I haven't checked for performance, but hopefully it's been optimized. You might want to try out other solutions and compare the performance numbers, if needed.

答案4

得分: 2

我认为在循环内部使用binary or而不是if condition会更好(更快):

func isZero(bytes []byte) bool {
	b := byte(0)
	for _, s := range bytes {
		b |= s
	}
	return b == 0
}

通过使用前面答案中提到的uint64的方法,我们甚至可以进一步优化这个函数:

func isZero(bytes []byte) bool {
	var b uint64
	for _, s := range bytes {
		b |= uint64(s)
	}
	return b == 0
}

希望对你有帮助!

英文:

I think it is better (faster) if binary or is used instead of if condition inside loop:

func isZero(bytes []byte) bool {
b := byte(0)
for _, s := range bytes {
b |= s
}
return b == 0
}

One can optimize this even more by using idea with uint64 mentioned in previous answers


</details>

huangapple
  • 本文由 发表于 2017年8月4日 20:05:59
  • 转载请务必保留本文链接:https://go.coder-hub.com/45506424.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定