Why can Go's strings.Builder String() convert a []byte to a string without a reflect.StringHeader?


Question


Go's strings package defines a Builder type which has a String() method:

func (b *Builder) String() string {
    return *(*string)(unsafe.Pointer(&b.buf))
}

As the reflect package indicates, a byte slice is represented by a SliceHeader plus an associated data block pointed to by the header's Data field:

type SliceHeader struct {
  Data uintptr
  Len  int
  Cap  int
}

and a string is represented by a StringHeader plus an associated data block pointed to by the header's Data field:

type StringHeader struct {
  Data uintptr
  Len  int
}

The SliceHeader has three fields (Data, Len and Cap) and the StringHeader has only two (Data and Len), so how can one convert a byte slice to a string directly like this?

*(*string)(unsafe.Pointer(&buf))

In my understanding, we should write the following:

func byte2str(b []byte) string {
  hdr := (*reflect.SliceHeader)(unsafe.Pointer(&b))
  return *(*string)(unsafe.Pointer(&reflect.StringHeader{Data: hdr.Data, Len: hdr.Len}))
}

Answer 1

Score: 4


Let's break it down.

The expression &buf is a pointer to the slice, which is to say a pointer to the slice header.

The expression unsafe.Pointer(&buf) is a conversion from the slice header pointer to an unsafe.Pointer. The unsafe.Pointer type has a magical property: it can be converted to and from any other pointer type.

The expression (*string)(unsafe.Pointer(&buf)) is a conversion from a slice header pointer to a string header pointer. A pointer to a string is a pointer to the string header.

At this point, we have a string header pointer that's actually pointing to a slice header. That might seem bad at first glance, but all is good. The memory layout of a string header is a prefix of the memory layout for a slice header.
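The prefix relationship can be checked directly. The following is a minimal sketch, not the runtime's own definitions: sliceHeader, stringHeader, headersSharePrefix and bytesToString are hypothetical names introduced here to mirror the layouts described in the question.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader and stringHeader are local, illustrative mirrors of the
// layouts shown in the question; they are assumptions for this sketch,
// not the runtime's types.
type sliceHeader struct {
	Data unsafe.Pointer
	Len  int
	Cap  int
}

type stringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// headersSharePrefix reports whether Data and Len sit at the same offsets
// in both layouts and the string layout is the shorter of the two.
func headersSharePrefix() bool {
	var sl sliceHeader
	var st stringHeader
	return unsafe.Offsetof(sl.Data) == unsafe.Offsetof(st.Data) &&
		unsafe.Offsetof(sl.Len) == unsafe.Offsetof(st.Len) &&
		unsafe.Sizeof(st) < unsafe.Sizeof(sl)
}

// bytesToString reads the first two words of the slice header as a string.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	fmt.Println(headersSharePrefix())               // true
	fmt.Println(bytesToString([]byte("hello")))     // hello
}
```

Because the cast only ever reads the leading Data and Len words, the trailing Cap word of the slice header is never touched.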

The expression *(*string)(unsafe.Pointer(&buf)) dereferences the string header pointer to get the string. This operation copies the Data and Len fields from the slice header into a string header; Cap is ignored, and the underlying bytes are not copied.
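To see that only the header is copied, one can compare the data pointers of the slice and the resulting string. This is a rough sketch assuming Go 1.20 or later (for unsafe.StringData and unsafe.SliceData); bytesToString and sharesBackingArray are hypothetical helper names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString is the same cast as in the answer, wrapped in a
// hypothetical helper.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// sharesBackingArray reports whether s and b start at the same address.
func sharesBackingArray(s string, b []byte) bool {
	return unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	// Capacity is deliberately larger than length to show Cap is ignored.
	b := make([]byte, 5, 64)
	copy(b, "hello")

	s := bytesToString(b)
	fmt.Println(s)                        // hello
	fmt.Println(len(s), cap(b))           // 5 64
	fmt.Println(sharesBackingArray(s, b)) // true: header copied, bytes shared
}
```

Since the string and the slice share memory, writing to the slice afterwards would change the string's contents, which is why this trick is only safe when the bytes are never modified again.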

This code would break if the slice header fields were reordered without a corresponding change to the string header fields, but that is extremely unlikely, since the standard library itself relies on this layout.
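For new code that should not depend on header layout at all, Go 1.20 and later provide unsafe.String and unsafe.SliceData. The sketch below is one possible way to write the same zero-copy conversion with them; bytesToStringUnsafe is a hypothetical name, and the same caveat applies that the bytes must not be modified afterwards.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringUnsafe is a hypothetical helper; it requires Go 1.20+.
// unsafe.SliceData yields a pointer to the first element of b, and
// unsafe.String pairs that pointer with an explicit length.
func bytesToStringUnsafe(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := []byte("builder")
	fmt.Println(bytesToStringUnsafe(b))         // builder
	fmt.Println(bytesToStringUnsafe(nil) == "") // true: nil pointer with length 0 is allowed
}
```

This version states the intent (pointer plus length) explicitly instead of reinterpreting one header as another.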

huangapple
  • Posted on January 9, 2022 at 13:55:30
  • Please keep this link when reposting: https://go.coder-hub.com/70638795.html