Which optimisations does the Go 1.6 compiler apply when converting between []byte and string or vice versa?

Question

I know that converting from a []byte to a string, or vice versa, results in a copy of the underlying array being made. This makes sense to me, from the point of view of strings being immutable.

Then I read here [1] that the compiler applies two optimisations in specific cases:

"The first optimization avoids extra allocations when []byte keys are used to lookup entries in map[string] collections: m[string(key)]."

This makes sense because the conversion is only scoped to the square brackets, so no risk of mutating the string there.

"The second optimization avoids extra allocations in for range clauses where strings are converted to []byte: for i,v := range []byte(str) {...}."

This makes sense because, once again, there is no way of mutating the string here.

Further optimisations are also mentioned as being on the todo list (I am not sure which todo list is being referred to), so my question is:

Do any other such (further) optimisations exist in Go 1.6 and, if so, what are they?

[1] http://devs.cloudimmunity.com/gotchas-and-common-mistakes-in-go-golang/index.html#string_byte_slice_conv

Answer 1

Score: 9

[]byte to string

For []byte to string conversion, the compiler generates a call to the internal runtime.slicebytetostringtmp function (link to source) when it can prove

> that the string form will be discarded before the calling goroutine
> could possibly modify the original slice or synchronize with another
> goroutine.

runtime.slicebytetostringtmp returns a string referring to the actual []byte bytes, so it does not allocate. The comment in the function says

// First such case is a m[string(k)] lookup where
// m is a string-keyed map and k is a []byte.
// Second such case is "<"+string(b)+">" concatenation where b is []byte.
// Third such case is string(b)=="foo" comparison where b is []byte.

In short, for a b []byte:

  • map lookup m[string(b)] does not allocate
  • "<"+string(b)+">" concatenation does not allocate for the conversion
  • string(b)=="foo" comparison does not allocate
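As a rough illustration of where the first and third forms tend to show up in ordinary code, here is a minimal sketch. The helper names countKnown and isHello are hypothetical and not taken from the runtime or the answer; the point is only that the conversion appears directly inside the index expression or the comparison, and that the resulting string is never stored.

```go
package main

import "fmt"

// countKnown reports how many of the byte-slice tokens are present in set.
// The conversion sits directly inside the index expression, which is the
// m[string(k)] form described in the runtime comment.
func countKnown(set map[string]bool, tokens [][]byte) int {
	n := 0
	for _, tok := range tokens {
		if set[string(tok)] { // lookup only; the string is not kept
			n++
		}
	}
	return n
}

// isHello compares a byte slice with a constant string; the converted
// string is discarded as soon as the comparison is done.
func isHello(b []byte) bool {
	return string(b) == "hello"
}

func main() {
	set := map[string]bool{"go": true, "gopher": true}
	tokens := [][]byte{[]byte("go"), []byte("rust"), []byte("gopher")}
	fmt.Println(countKnown(set, tokens))  // 2
	fmt.Println(isHello([]byte("hello"))) // true
}
```

Assigning the converted string to a variable first (s := string(tok); set[s]) is a different shape of code, and whether the compiler can still avoid the copy there is not something the runtime comment promises.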

The second optimization was implemented on 22 Jan 2015 and is included in go1.6.

The third optimization was implemented on 27 Jan 2015 and is also included in go1.6.

So, for example, in the following:

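package main

// bs holds the bytes of "hallo": 104 'h', 97 'a', 108 'l', 108 'l', 111 'o'.
var bs []byte = []byte{104, 97, 108, 108, 111}

func main() {
    // The converted string is only used for the comparison and then discarded.
    x := string(bs) == "hello"
    println(x) // false, because "hallo" != "hello"
}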

the comparison does not cause allocations in go1.6.
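One way to check a claim like this on your own toolchain is testing.AllocsPerRun from the standard library. The sketch below assumes the standard gc compiler; the names key, sink, compareAllocs and storeAllocs are made up for the illustration. The key is deliberately longer than 32 bytes, because the runtime can serve short non-escaping conversions from a small stack buffer, which would hide the difference.

```go
package main

import (
	"fmt"
	"testing"
)

// key is longer than 32 bytes so that a copying conversion cannot be
// satisfied from a small temporary buffer on the stack.
var key = []byte("a-key-that-is-deliberately-longer-than-32-bytes")

// sink is a package-level variable; storing into it forces the converted
// string to outlive the expression.
var sink string

// compareAllocs measures allocations per run when the converted string is
// used only in a comparison.
func compareAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		if string(key) == "hello" {
			panic("unexpected match")
		}
	})
}

// storeAllocs measures allocations per run when the converted string is kept.
func storeAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = string(key)
	})
}

func main() {
	fmt.Println("compare:", compareAllocs())
	fmt.Println("store:  ", storeAllocs())
}
```

With the optimization in effect, the comparison should report zero allocations per run, while the version that stores the string should report at least one, since the bytes must be copied to keep the string immutable.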

String to []byte

Similarly, the runtime.stringtoslicebytetmp function (link to source) says:

// Return a slice referring to the actual string bytes.
// This is only for use by internal compiler optimizations
// that know that the slice won't be mutated.
// The only such case today is:
// for i, c := range []byte(str)

so for i, c := range []byte(str) does not allocate, but you already knew that.
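For completeness, a small sketch of the range form in use. The function countByte is a hypothetical helper, not something from the runtime. Ranging over []byte(s) visits individual bytes, whereas ranging over the string itself would decode runes, so the conversion is the natural way to write a byte-wise loop.

```go
package main

import "fmt"

// countByte counts occurrences of c in s by ranging over []byte(s), so each
// iteration sees one byte rather than one decoded rune.
func countByte(s string, c byte) int {
	n := 0
	for _, b := range []byte(s) {
		if b == c {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countByte("hello, world", 'l')) // 3
}
```

The loop body only reads b, which matches the condition in the runtime comment that the slice is never mutated.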

Posted on 10 March 2016 at 17:08:55
Original link: https://go.coder-hub.com/35911953.html
