Why can Go's strings.Builder String() convert a []byte to a string without a reflect.StringHeader?
Question
Go's strings package defines a Builder type, which has a String() method:
func (b *Builder) String() string {
	return *(*string)(unsafe.Pointer(&b.buf))
}
As the reflect package indicates, a byte slice is represented by a SliceHeader plus an associated data block pointed to by the header's Data field:
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
A string is represented by a StringHeader plus an associated data block pointed to by the header's Data field:
type StringHeader struct {
	Data uintptr
	Len  int
}
SliceHeader has three fields (Data, Len and Cap) while StringHeader has only two (Data and Len), so how can a byte slice be converted to a string directly like this?
*(*string)(unsafe.Pointer(&buf))
In my understanding, we should write the following instead:
func byte2str(b []byte) string {
	hdr := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	return *(*string)(unsafe.Pointer(&reflect.StringHeader{Data: hdr.Data, Len: hdr.Len}))
}
Answer 1
Score: 4
Let's break it down.
The expression &buf is a pointer to the slice header: a pointer to a slice is, in effect, a pointer to its header.
The expression unsafe.Pointer(&buf) is a conversion from the slice header pointer to an unsafe.Pointer. The unsafe.Pointer type has a magical property: it can be converted to and from any other pointer type.
The expression (*string)(unsafe.Pointer(&buf)) is a conversion from a slice header pointer to a string header pointer. A pointer to a string is a pointer to the string's header.
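At this point, we have a string header pointer that's actually pointing to a slice header. That might seem bad at first glance, but all is good. The memory layout of a string header is a prefix of the memory layout of a slice header: both start with Data followed by Len, so reading a string header from that address sees valid fields and simply never looks at Cap.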
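The prefix claim can be checked mechanically. The following is a minimal sketch, not part of the standard library: the helper name stringIsPrefixOfSlice is made up for illustration, and it only compares the offsets and sizes of the fields declared in reflect.SliceHeader and reflect.StringHeader.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// stringIsPrefixOfSlice is a hypothetical helper. It reports whether the
// fields of reflect.StringHeader sit at the same offsets, with the same
// sizes, as the leading fields of reflect.SliceHeader.
func stringIsPrefixOfSlice() bool {
	var sl reflect.SliceHeader
	var st reflect.StringHeader
	return unsafe.Offsetof(sl.Data) == unsafe.Offsetof(st.Data) &&
		unsafe.Offsetof(sl.Len) == unsafe.Offsetof(st.Len) &&
		unsafe.Sizeof(sl.Data) == unsafe.Sizeof(st.Data) &&
		unsafe.Sizeof(sl.Len) == unsafe.Sizeof(st.Len) &&
		unsafe.Sizeof(st) <= unsafe.Sizeof(sl)
}

func main() {
	fmt.Println("string header is a prefix of slice header:", stringIsPrefixOfSlice())
}
```

Note that this inspects the declared header structs from the reflect package, which describe the runtime representation; it does not prove anything about future Go versions.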
The expression *(*string)(unsafe.Pointer(&buf)) dereferences the string header pointer to get the string. This operation copies the Data and Len fields from the slice header into a string header; the underlying bytes are not copied.
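To make the "header is copied, bytes are not" point concrete, here is a small sketch. The function bytesToString is a hypothetical helper that mirrors the cast from the question; the mutation in main is there only to show that the string and the slice share one backing array, and it is not something real code should do to a string.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString is a hypothetical helper that reinterprets the slice header
// of b as a string header. Only Data and Len are read; Cap is ignored.
// The returned string shares b's backing array.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := make([]byte, 5, 64) // Cap is deliberately larger than Len
	copy(buf, "hello")

	s := bytesToString(buf)
	fmt.Println(s, len(s), cap(buf))

	// Illustration only: writing through buf is visible through s,
	// because no bytes were copied during the conversion.
	buf[0] = 'j'
	fmt.Println(s)
}
```

This sharing is harmless inside strings.Builder because the builder only ever appends to its buffer and never rewrites bytes that an earlier String() call has already exposed.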
This code will break if the slice header fields are reordered without a corresponding change to the string header fields, but that's never going to happen.
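For code that would rather not depend on the header layout at all, Go 1.20 added unsafe.String and unsafe.SliceData, which express the same zero-copy conversion directly. The sketch below assumes Go 1.20 or newer; the name viewAsString is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// viewAsString is a hypothetical helper that returns a string sharing b's
// backing array, without relying on how slice and string headers are laid out.
// Requires Go 1.20 or newer. The caller must not modify b afterwards.
func viewAsString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	buf := append(make([]byte, 0, 64), "hello, builder"...)
	fmt.Println(viewAsString(buf))
}
```

The trade-off is the same as with the pointer cast: no allocation and no copy, in exchange for the promise that the bytes are never changed while the string is in use.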