2015年12月1日 20:51:20go评论119阅读模式

英文:

Why strings.HasPrefix is faster than bytes.HasPrefix?

问题

在我的代码中，我有这样的基准测试：

const STR = "abcd"
const PREFIX = "ab"
var STR_B = []byte(STR)
var PREFIX_B = []byte(PREFIX)
func BenchmarkStrHasPrefix(b *testing.B) {
    for i := 0; i < b.N; i++ {
        strings.HasPrefix(STR, PREFIX)
    }
}
func BenchmarkBytHasPrefix(b *testing.B) {
    for i := 0; i < b.N; i++ {
        bytes.HasPrefix(STR_B, PREFIX_B)
    }
}

我对结果有点困惑：

BenchmarkStrHasPrefix-4    300000000    4.67 ns/op
BenchmarkBytHasPrefix-4    200000000    8.05 ns/op

为什么会有高达2倍的差异？

谢谢。

英文:

In my code, I have such benchmark:

const STR = &quot;abcd&quot;
const PREFIX = &quot;ab&quot;
var STR_B = []byte(STR)
var PREFIX_B = []byte(PREFIX)
func BenchmarkStrHasPrefix(b *testing.B) {
	for i := 0; i &lt; b.N; i++ {
		strings.HasPrefix(STR, PREFIX)
	}
}
func BenchmarkBytHasPrefix(b *testing.B) {
	for i := 0; i &lt; b.N; i++ {
		bytes.HasPrefix(STR_B, PREFIX_B)
	}
}

And I am little bit confused about results:

BenchmarkStrHasPrefix-4    300000000    4.67 ns/op
BenchmarkBytHasPrefix-4    200000000    8.05 ns/op

Why there is up to 2x difference?

Thanks.

答案1

得分: 17

主要原因是bytes.HasPrefix()和strings.HasPrefix()的调用成本不同。正如@tomasz在他的评论中指出的那样，strings.HasPrefix()默认是内联的，而bytes.HasPrefix()不是。

另一个原因是参数类型不同：bytes.HasPrefix()接受两个切片（2个切片描述符），而strings.HasPrefix()接受两个字符串（2个字符串头）。切片描述符包含一个指针和两个int：长度和容量，参见reflect.SliceHeader。字符串头只包含一个指针和一个int：长度，参见reflect.StringHeader。

如果我们在基准函数中手动内联HasPrefix()函数，消除调用成本（两者都为零），我们可以证明这一点。通过内联它们，将不会进行函数调用（到它们）。

HasPrefix()的实现：

// HasPrefix tests whether the byte slice s begins with prefix.
func HasPrefix(s, prefix []byte) bool {
    return len(s) >= len(prefix) && Equal(s[0:len(prefix)], prefix)
}
// HasPrefix tests whether the string s begins with prefix.
func HasPrefix(s, prefix string) bool {
    return len(s) >= len(prefix) && s[0:len(prefix)] == prefix
}

内联它们后的基准函数：

func BenchmarkStrHasPrefix(b *testing.B) {
    s, prefix := STR, PREFIX
    for i := 0; i < b.N; i++ {
        _ = len(s) >= len(prefix) && s[0:len(prefix)] == prefix
    }
}
func BenchmarkBytHasPrefix(b *testing.B) {
    s, prefix := STR_B, PREFIX_B
    for i := 0; i < b.N; i++ {
        _ = len(s) >= len(prefix) && bytes.Equal(s[0:len(prefix)], prefix)
    }
}

运行这些基准测试，你会得到非常接近的结果：

BenchmarkStrHasPrefix-2 300000000 5.88 ns/op
BenchmarkBytHasPrefix-2 200000000 6.17 ns/op

内联基准测试结果之间的微小差异的原因可能是两个函数都通过对string和[]byte操作数进行切片来测试前缀的存在。由于string是可比较的，而字节切片不是，相比于BenchmarkStrHasPrefix()，BenchmarkBytHasPrefix()需要额外调用bytes.Equal()（额外的函数调用还包括对其参数进行复制：2个切片头）。

其他可能稍微影响最初不同结果的因素：BenchmarkStrHasPrefix()中使用的参数是常量，而BenchmarkBytHasPrefix()中使用的参数是变量。

你不必过多担心性能差异，这两个函数都只需要几个纳秒就能完成。

请注意bytes.Equal()的“实现”：

func Equal(a, b []byte) bool // ../runtime/asm_$GOARCH.s

在某些平台上可能会被内联，从而不会产生额外的调用成本。

英文:

The main reason is the difference in calling cost of bytes.HasPrefix() and strings.HasPrefix(). As @tomasz pointed out in his comment, strings.HashPrefix() is inlined by default while bytes.HasPrefix() is not.

Further reason is the different parameter types: bytes.HasPrefix() takes 2 slices (2 slice descriptors). strings.HasPrefix() takes 2 strings (2 string headers). Slice descriptors contain a pointer and 2 ints: length and capacity, see reflect.SliceHeader. String headers contain only a pointer and an int: the length, see reflect.StringHeader.

We can prove this if we manually inline the HasPrefix() functions in the benchmark functions, so we eliminate the calling costs (zero both). By inlining them, no function call will be made (to them).

HasPrefix() implementations:

// HasPrefix tests whether the byte slice s begins with prefix.
func HasPrefix(s, prefix []byte) bool {
	return len(s) &gt;= len(prefix) &amp;&amp; Equal(s[0:len(prefix)], prefix)
}
// HasPrefix tests whether the string s begins with prefix.
func HasPrefix(s, prefix string) bool {
	return len(s) &gt;= len(prefix) &amp;&amp; s[0:len(prefix)] == prefix
}

Benchmark functions after inlining them:

func BenchmarkStrHasPrefix(b *testing.B) {
	s, prefix := STR, PREFIX
	for i := 0; i &lt; b.N; i++ {
		_ = len(s) &gt;= len(prefix) &amp;&amp; s[0:len(prefix)] == prefix
	}
}
func BenchmarkBytHasPrefix(b *testing.B) {
	s, prefix := STR_B, PREFIX_B
	for i := 0; i &lt; b.N; i++ {
		_ = len(s) &gt;= len(prefix) &amp;&amp; bytes.Equal(s[0:len(prefix)], prefix)
	}
}

Running these you get very close results:

BenchmarkStrHasPrefix-2 300000000                5.88 ns/op
BenchmarkBytHasPrefix-2 200000000                6.17 ns/op

The reason for the small difference in the inlined benchmarks may be that both functions test the presence of the prefix by slicing the string and []byte operand. And since strings are comparable while byte slices are not, BenchmarkBytHasPrefix() requires an additional function call to bytes.Equal() compared to BenchmarkStrHasPrefix() (and the extra function call also includes making copies of its parameters: 2 slice headers).

Other things that may slightly contribute to the original different results: arguments used in BenchmarkStrHasPrefix() are constants, while parameters used in BenchmarkBytHasPrefix() are variables.

You shouldn't worry much about the performance difference, both functions complete in just a few nanoseconds.

Note that the "implementation" of bytes.Equal():

func Equal(a, b []byte) bool // ../runtime/asm_$GOARCH.s

This may be inlined in some platforms resulting in no additional call costs.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

为什么 strings.HasPrefix 比 bytes.HasPrefix 更快？

问题

答案1

如何在Golang中找到两个切片的交集结果

字节切片操作

如何分析一堆个人资料？

Using custom linker script with Go build

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。