Compare string and byte slice in Go without copy


Question

What is the best way to check that a Go string and a byte slice contain the same bytes? The simplest approach, str == string(byteSlice), is inefficient because it copies byteSlice first.

I was looking for a version of bytes.Equal(a, b []byte) that takes a string as one of its arguments, but could not find anything suitable.

Answer 1

Score: 14

Starting with Go 1.5, the compiler optimizes the string(bytes) conversion when its result is compared to a string, using a stack-allocated temporary. Since Go 1.5, therefore,

str == string(byteSlice)

is the canonical and efficient way to compare a string to a byte slice.
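
As a minimal sketch of the comparison described in this answer, the conversion can be wrapped in a small helper. The name sameBytes is hypothetical (it is not a standard library function), and whether the conversion is actually free of copying depends on the compiler version in use.

```go
package main

import "fmt"

// sameBytes is a hypothetical helper name, not a standard library function.
// It reports whether s and b contain the same bytes, relying on the plain
// string(b) conversion inside the comparison.
func sameBytes(s string, b []byte) bool {
	return s == string(b)
}

func main() {
	b := []byte("hello")
	fmt.Println(sameBytes("hello", b)) // same bytes
	fmt.Println(sameBytes("help", b))  // different bytes
	fmt.Println(sameBytes("", nil))    // empty string vs. nil slice
}
```

Keeping the conversion directly inside the == expression, rather than storing string(b) in a variable first, is what gives the compiler the chance to apply the optimization.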

Answer 2

Score: 8

If you're comfortable with the risk that this could break in a later release (unlikely, though), you can use unsafe:

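import (
	"strings"
	"unsafe"
)

// unsafeCompare orders a against b without copying. It views b as a string;
// viewing a as a []byte would read a capacity word a string header lacks.
func unsafeCompare(a string, b []byte) int {
	bbp := *(*string)(unsafe.Pointer(&b))
	return strings.Compare(a, bbp)
}

// unsafeEqual reports whether a and b hold the same bytes, without copying.
func unsafeEqual(a string, b []byte) bool {
	bbp := *(*string)(unsafe.Pointer(&b))
	return a == bbp
}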

Benchmarks:

// using:
//	aaa = strings.Repeat("a", 100)
//	bbb = []byte(strings.Repeat("a", 99) + "b")

// go 1.5
BenchmarkCopy-8         20000000                75.4 ns/op
BenchmarkPetersEqual-8  20000000                83.1 ns/op
BenchmarkUnsafe-8       100000000               12.2 ns/op
BenchmarkUnsafeEqual-8  200000000               8.94 ns/op
// go 1.4
BenchmarkCopy           10000000                233  ns/op
BenchmarkPetersEqual    20000000                72.3 ns/op
BenchmarkUnsafe         100000000               15.5 ns/op
BenchmarkUnsafeEqual    100000000               10.7 ns/op
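
On newer toolchains the same zero-copy view can be expressed with the unsafe.String and unsafe.SliceData functions instead of a pointer cast. The sketch below assumes Go 1.20 or later; the helper names bytesAsString and zeroCopyEqual are hypothetical and do not come from the answer above.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesAsString returns a string that shares b's backing array (no copy).
// Hypothetical helper; the caller must not modify b while the returned
// string is still in use, because strings are assumed to be immutable.
func bytesAsString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// zeroCopyEqual reports whether s and b contain the same bytes.
func zeroCopyEqual(s string, b []byte) bool {
	return s == bytesAsString(b)
}

func main() {
	fmt.Println(zeroCopyEqual("gopher", []byte("gopher")))
	fmt.Println(zeroCopyEqual("gopher", []byte("golang")))
	fmt.Println(zeroCopyEqual("", nil))
}
```

The shared-memory caveat applies to every unsafe variant here: the string is only valid as long as nobody writes to the slice.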

Answer 3

Score: 8

> The Go Programming Language Specification
>
> String types
>
> A string type represents the set of string values. A string value is a
> (possibly empty) sequence of bytes. The predeclared string type is
> string.
>
> The length of a string s (its size in bytes) can be discovered using
> the built-in function len. A string's bytes can be accessed by integer
> indices 0 through len(s)-1.

For example,

package main

import "fmt"

// equal reports whether s and b contain the same bytes, comparing them
// one index at a time without converting or copying either value.
func equal(s string, b []byte) bool {
	if len(s) != len(b) {
		return false
	}
	for i, x := range b {
		if x != s[i] {
			return false
		}
	}
	return true
}

func main() {
	s := "equal"
	b := []byte(s)
	fmt.Println(equal(s, b))
	s = "not" + s
	fmt.Println(equal(s, b))
}

Output:

true
false

Answer 4

Score: 0

There is no reason to reach for the unsafe package just to compare a []byte and a string. The Go compiler is now clever enough to optimize such conversions.

Here's a benchmark:

BenchmarkEqual-8                172135624                6.96 ns/op <--
BenchmarkUnsafe-8               179866616                6.65 ns/op <--
BenchmarkUnsafeEqual-8          175588575                6.85 ns/op <--
BenchmarkCopy-8                 23715144                 47.3 ns/op
BenchmarkPetersEqual-8          24709376                 47.3 ns/op

Just convert a byte slice to a string and compare:

var (
	aaa = strings.Repeat("a", 100)
	bbb = []byte(strings.Repeat("a", 99) + "b")
)

func BenchmarkEqual(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = aaa == string(bbb)
	}
}
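
One way to check the claim on a given toolchain, besides timing, is to count heap allocations with testing.AllocsPerRun from the standard library. The sketch below reuses the aaa and bbb values from the benchmark; allocsPerCompare and sink are hypothetical names introduced for illustration, and the reported number depends on the Go version, so no expected value is shown.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var (
	aaa = strings.Repeat("a", 100)
	bbb = []byte(strings.Repeat("a", 99) + "b")

	// sink keeps the comparison result alive so the call is not discarded.
	sink bool
)

// allocsPerCompare is a hypothetical helper. It reports the average number
// of heap allocations made per evaluation of aaa == string(bbb).
func allocsPerCompare() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = aaa == string(bbb)
	})
}

func main() {
	fmt.Println("allocations per comparison:", allocsPerCompare())
}
```

A result of zero would suggest the conversion is not copying to the heap on that compiler; a nonzero result would suggest it still is.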


Posted by huangapple on July 18, 2015 at 02:58:53
Link to this article: https://go.coder-hub.com/31482900.html