Why does the position of `[0]byte` in the struct matter?

Question


A value of type [0]byte in Go should not take up any memory. Yet these two structs have different sizes:

type bar2 struct {
	A int
	_ [0]byte
}

type bar3 struct {
	_ [0]byte
	A int
}

So why does the position of [0]byte matter here?

By the way, I used the unsafe.Sizeof() function to check the struct sizes. See the full example.
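The linked example did not survive on this page. A minimal sketch of such a check might look like the following; the helper name `sizes` is made up for illustration, and the exact numbers depend on the platform's int size.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar2 struct {
	A int
	_ [0]byte
}

type bar3 struct {
	_ [0]byte
	A int
}

// sizes returns the sizes in bytes of bar2, bar3 and a plain int.
// (Hypothetical helper, not part of the original question.)
func sizes() (s2, s3, intSize uintptr) {
	return unsafe.Sizeof(bar2{}), unsafe.Sizeof(bar3{}), unsafe.Sizeof(int(0))
}

func main() {
	s2, s3, si := sizes()
	fmt.Println("int size: ", si)
	fmt.Println("bar2 size:", s2) // zero-size field last
	fmt.Println("bar3 size:", s3) // zero-size field first
}
```

With the standard Go compiler, bar3 should come out the same size as int, while bar2 is larger.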

Answer 1

Score: 8


This is due to a subtle padding rule.

First, allow me to rename the structs and fields slightly so that they are easier to talk about:

type bar1 struct {
	A [0]byte
	I int
}

type bar2 struct {
	I int
	A [0]byte
}

This of course doesn't change the sizes and offsets, as can be verified on the Go Playground (the output below is from a platform where int is 4 bytes):

bar1 size:     4
bar1.A offset: 0
bar1.I offset: 0

bar2 size:     8
bar2.I offset: 0
bar2.A offset: 4
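The Playground link is not preserved here, so this is a sketch of a program that could print such a listing. The `layout` type and the `layouts` helper are assumptions added for illustration. On a platform with 8-byte ints the sizes and the bar2.A offset double.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar1 struct {
	A [0]byte
	I int
}

type bar2 struct {
	I int
	A [0]byte
}

// layout records the size of a struct and the offsets of its A and I fields.
// (Hypothetical helper type for this sketch.)
type layout struct {
	size, offA, offI uintptr
}

// layouts measures bar1 and bar2 with the unsafe package.
func layouts() (l1, l2 layout) {
	var b1 bar1
	var b2 bar2
	l1 = layout{unsafe.Sizeof(b1), unsafe.Offsetof(b1.A), unsafe.Offsetof(b1.I)}
	l2 = layout{unsafe.Sizeof(b2), unsafe.Offsetof(b2.A), unsafe.Offsetof(b2.I)}
	return l1, l2
}

func main() {
	l1, l2 := layouts()
	fmt.Println("bar1 size:    ", l1.size)
	fmt.Println("bar1.A offset:", l1.offA)
	fmt.Println("bar1.I offset:", l1.offI)
	fmt.Println()
	fmt.Println("bar2 size:    ", l2.size)
	fmt.Println("bar2.I offset:", l2.offI)
	fmt.Println("bar2.A offset:", l2.offA)
}
```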

The size of a value of type [0]byte is zero, so in bar1 it is perfectly valid to reserve no space for the first field (bar1.A) and to lay out bar1.I at offset 0.

The question is: why can't the compiler do the same in the second case (with bar2)?

A field's address must lie after the memory reserved for the previous field. In bar1 the first field, bar1.A, has size 0, so the second field can have offset 0 without "overlapping" the first.

In bar2, the second field cannot have an address (and therefore an offset) that overlaps the first field, so its offset cannot be less than the size of int: 4 bytes on 32-bit architectures and 8 bytes on 64-bit architectures.

So far so good. But since bar2.A has zero size, why can't bar2 be just the size of int, that is 4 bytes (or 8 on 64-bit)?

Because it is perfectly valid to take the address of a field (or variable) that has size 0.

For bar2, the compiler therefore has to insert 4 (or 8) bytes of padding; otherwise the address of the bar2.A field would point outside the memory reserved for the bar2 value.

As an example, without padding a value of type bar2 might have address 0x100 and size 4, so the memory reserved for it spans 0x100 .. 0x103. The address of bar2.A would be 0x104, which is outside the struct's memory. With an array of this struct (e.g. x [5]bar2) starting at 0x100, the address of x[0] would be 0x100, the address of x[0].A would be 0x104, and the address of the next element x[1] would also be 0x104, but that is the address of a different struct value! Not cool.
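The array scenario can be checked directly. The sketch below (the helper `addrs` is made up for illustration) converts the relevant addresses to integers and confirms that, with the padding in place, &x[0].A falls inside x[0] rather than on x[1]. The concrete addresses printed will differ from run to run.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar2 struct {
	I int
	A [0]byte
}

// addrs returns the addresses of x[0], x[0].A and x[1] as integers.
// (Hypothetical helper; the integers are only used for comparison.)
func addrs(x *[5]bar2) (elem0, fieldA, elem1 uintptr) {
	elem0 = uintptr(unsafe.Pointer(&x[0]))
	fieldA = uintptr(unsafe.Pointer(&x[0].A))
	elem1 = uintptr(unsafe.Pointer(&x[1]))
	return elem0, fieldA, elem1
}

func main() {
	var x [5]bar2
	e0, a, e1 := addrs(&x)
	fmt.Printf("&x[0]:   %#x\n", e0)
	fmt.Printf("&x[0].A: %#x\n", a)
	fmt.Printf("&x[1]:   %#x\n", e1)
	// With the trailing padding, the zero-size field stays inside element 0.
	fmt.Println("&x[0].A inside x[0]:", a >= e0 && a < e1)
}
```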

To avoid this, the compiler inserts padding (4 or 8 bytes, depending on the architecture), so that taking the address of bar2.A never yields an address outside the struct's memory. Such an address could otherwise cause problems for garbage collection: if only the address of bar2.A is kept, and no pointer to the struct or to its other fields, the whole struct must not be collected, yet no pointer would point into its memory area, so collecting it would seem valid. A single byte of padding would be enough to keep the address inside the struct, but the compiler rounds the struct's size up to a multiple of its alignment, so the padding becomes 4 (or 8) bytes. The alignment follows from Spec: Size and alignment guarantees:

> For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.
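As a rough check of that guarantee, the sketch below compares the alignment of bar2 with that of its int field and tests whether the size is a multiple of the alignment. The helper `sizeAndAlign` is an assumption for illustration, and the "multiple of alignment" property is how the standard compiler behaves rather than something quoted from the spec above.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar2 struct {
	I int
	A [0]byte
}

// sizeAndAlign returns the size and alignment of bar2 and the alignment
// of its int field. (Hypothetical helper for this sketch.)
func sizeAndAlign() (size, align, intAlign uintptr) {
	var b bar2
	return unsafe.Sizeof(b), unsafe.Alignof(b), unsafe.Alignof(b.I)
}

func main() {
	size, align, intAlign := sizeAndAlign()
	fmt.Println("bar2 size:     ", size)
	fmt.Println("bar2 alignment:", align)
	fmt.Println("int alignment: ", intAlign)
	fmt.Println("size is a multiple of alignment:", size%align == 0)
}
```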

If this reasoning is right, adding another int field at the end should make both structs the same size:

type bar1 struct {
	I int
	A [0]byte
	X int
}

type bar2 struct {
	A [0]byte
	I int
	X int
}

And indeed both are 8 bytes on 32-bit architectures (16 bytes on 64-bit); try it on the Go Playground:

bar1 size:     8
bar1.I offset: 0
bar1.A offset: 4
bar1.X offset: 4

bar2 size:     8
bar2.A offset: 0
bar2.I offset: 0
bar2.X offset: 4
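A small program along these lines can confirm the claim. The helper `sizes3` is made up for this sketch, and the absolute values again depend on the int size of the platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar1 struct {
	I int
	A [0]byte
	X int
}

type bar2 struct {
	A [0]byte
	I int
	X int
}

// sizes3 returns the sizes of the three-field bar1 and bar2 and of int.
// (Hypothetical helper, not from the original answer.)
func sizes3() (s1, s2, intSize uintptr) {
	return unsafe.Sizeof(bar1{}), unsafe.Sizeof(bar2{}), unsafe.Sizeof(int(0))
}

func main() {
	var b1 bar1
	var b2 bar2
	s1, s2, _ := sizes3()
	fmt.Println("bar1 size:", s1, "offsets I/A/X:",
		unsafe.Offsetof(b1.I), unsafe.Offsetof(b1.A), unsafe.Offsetof(b1.X))
	fmt.Println("bar2 size:", s2, "offsets A/I/X:",
		unsafe.Offsetof(b2.A), unsafe.Offsetof(b2.I), unsafe.Offsetof(b2.X))
}
```

Since the zero-size field is no longer last in either struct, no trailing padding is needed and both sizes are simply two ints.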

See related question: https://stackoverflow.com/questions/34219232/struct-has-different-size-if-the-field-order-is-different/34219916#34219916

Answer 2

Score: 1


The reason is "holes":

> Holes are the unused spaces added by the compiler to ensure that the
> following field or element is properly aligned relative to the start
> of the struct or array [1]

For example (numbers based on whatever hardware the Go Playground is using):

struct {bool; float64; int16} // 24 bytes
struct {float64; bool; int16} // 16 bytes

You can verify the layout of a struct using:

  • unsafe.Alignof returns the required alignment of its argument's type
  • unsafe.Offsetof computes the offset of a field relative to the start of its enclosing struct, including holes
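As a sketch of how these two functions expose the holes, the program below names the two anonymous structs from the example `padded` and `compact` (both names, the field names, and the helper `holeSizes` are assumptions for illustration) and prints their layouts. The exact numbers depend on the platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded puts the 8-byte field between two small ones.
type padded struct {
	a bool
	b float64
	c int16
}

// compact puts the largest field first.
type compact struct {
	b float64
	a bool
	c int16
}

// holeSizes returns the sizes of padded and compact.
// (Hypothetical helper for this sketch.)
func holeSizes() (paddedSize, compactSize uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(compact{})
}

func main() {
	var p padded
	var c compact
	ps, cs := holeSizes()
	fmt.Println("padded  size:", ps, "align:", unsafe.Alignof(p), "offsets a/b/c:",
		unsafe.Offsetof(p.a), unsafe.Offsetof(p.b), unsafe.Offsetof(p.c))
	fmt.Println("compact size:", cs, "align:", unsafe.Alignof(c), "offsets b/a/c:",
		unsafe.Offsetof(c.b), unsafe.Offsetof(c.a), unsafe.Offsetof(c.c))
}
```

Gaps between consecutive offsets that are larger than the preceding field's size are the holes.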

[1] Donovan, A. A. A. and Kernighan, B. W., 2016. The Go Programming Language. 1st ed. New York: Addison-Wesley, p. 354.

huangapple
  • Posted on July 12, 2017 at 11:46:13