Inconsistent results when type punning uint64_t with union and bit-field

Question

I am using an anonymous struct in union as follows:

#include <cstdint>

using time64_t = uint64_t;
using bucket_t = uint64_t;

union clock_test {
    time64_t _time64;

    struct {
        bucket_t _bucket5 : 10;     // bucket:5  1024
        bucket_t _bucket4 : 8;      // bucket:4  256
        bucket_t _bucket3 : 6;      // bucket:3  64
        bucket_t _bucket2 : 6;      // bucket:2  64
        bucket_t _bucket1 : 6;      // bucket:1  64
        bucket_t _bucket0 : 6;      // bucket:0  64
    };
};

With bucket_t = uint64_t it works as expected, but with using bucket_t = uint16_t or using bucket_t = uint32_t I get puzzling results.

I use the same test code for all cases:

clock_test clk;
clk._time64 = 168839113046;

For bucket_t = uint64_t, clk is:

_bucket5   342	// unsigned __int64
_bucket4	26	// unsigned __int64
_bucket3	38	// unsigned __int64
_bucket2	15	// unsigned __int64
_bucket1	29	// unsigned __int64
_bucket0	 2	// unsigned __int64

For bucket_t = uint32_t, clk is:

_bucket5   342	// unsigned int
_bucket4	26	// unsigned int
_bucket3	38	// unsigned int
_bucket2	15	// unsigned int
_bucket1	39	// unsigned int
_bucket0	 0	// unsigned int

For bucket_t = uint16_t, clk is:

_bucket5    342	// unsigned short
_bucket4	152	// unsigned short
_bucket3	 15	// unsigned short
_bucket2	 39	// unsigned short
_bucket1	  0	// unsigned short
_bucket0	  0	// unsigned short

...
I am using VS Code with clang, where this issue shows up clearly.

Answer 1

Score: 3

You get inconsistent results because bit-field members are normally not packed across the boundaries of their declared type.
The type of each member matters, because it determines where padding is inserted:

// 168839113046 in binary
// type punned with bucket_t = unsigned short (assuming 16-bit)
00100111 01001111 10011000 01101001 01010110
  |    |   |    | |      |       |         |
  |    |   |    | |      |       01 01010110 // _bucket5 = 342
  |    |   |    | |      | ######            // padding to 16-bit bounds
  |    |   |    | 10011000                   // _bucket4 = 152
  |    |   001111                            // _bucket3 = 15
  |    | ##                                  // padding to 16-bit bounds
  100111                                     // _bucket2 = 39
...                                          // _bucket1 = 0
                                             // _bucket0 = 0

Whenever the compiler can't fit the next bit-field member into the current 16-bit object, it inserts padding and places that member at the start of the next 16-bit object.
This changes the values you read, because you're reading bits at different positions. If your bit-field members all had a 64-bit type, this wouldn't happen, since all 42 bits fit into a single 64-bit object.
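
The padding rule described above can be simulated without any bit-fields. The following is only a sketch under stated assumptions: the names field_offsets, read_fields, and check_simulated_layouts are hypothetical, and the rule it encodes (fields allocated from the least significant bit upward, never crossing a boundary of the declared type) is what the compiler in the question appears to do, not something the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed allocation rule (implementation-defined in reality): fields are
// placed from the least significant bit upward, and a field that would cross
// a `unit_bits` boundary starts at the next unit instead, leaving padding.
std::vector<unsigned> field_offsets(const std::vector<unsigned>& widths,
                                    unsigned unit_bits) {
    std::vector<unsigned> offsets;
    unsigned pos = 0;
    for (unsigned w : widths) {
        const unsigned used = pos % unit_bits;
        if (used + w > unit_bits) {
            pos += unit_bits - used;  // skip the padding bits
        }
        offsets.push_back(pos);
        pos += w;
    }
    return offsets;
}

// Reads each field from `value` at the simulated offsets.
// Assumes every width is below 64 and every offset + width is at most 64.
std::vector<std::uint64_t> read_fields(std::uint64_t value,
                                       const std::vector<unsigned>& widths,
                                       unsigned unit_bits) {
    const std::vector<unsigned> offsets = field_offsets(widths, unit_bits);
    std::vector<std::uint64_t> out;
    for (std::size_t i = 0; i < widths.size(); ++i) {
        const std::uint64_t mask = (std::uint64_t{1} << widths[i]) - 1;
        out.push_back((value >> offsets[i]) & mask);
    }
    return out;
}

// Compares the simulation with the values reported in the question.
void check_simulated_layouts() {
    using values = std::vector<std::uint64_t>;
    const std::vector<unsigned> widths{10, 8, 6, 6, 6, 6};
    const std::uint64_t t = 168839113046ULL;
    assert((read_fields(t, widths, 64) == values{342, 26, 38, 15, 29, 2}));
    assert((read_fields(t, widths, 32) == values{342, 26, 38, 15, 39, 0}));
    assert((read_fields(t, widths, 16) == values{342, 152, 15, 39, 0, 0}));
}
```

Calling check_simulated_layouts() passes for all three unit sizes, which matches the debugger output in the question: the bits themselves never change, only the offsets the fields are read from.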

Non-Standard and Undefined Behavior

That being said, your code is not valid standard C++.

  1. Anonymous structs are not standard C++; they only work because of a compiler extension (GCC, Clang, and MSVC all provide one).
  2. The layout and alignment of bit-field members is completely implementation-defined, so you might not get the same results with different compilers.
  3. Using a union for type punning like this is undefined behavior in C++; you can only access the active member of the union, with some exceptions.

To get consistent results, use shifts and masks:

time64_t data = 168839113046;
(data >>  0) & ((1u << 10) - 1)     // = 342
(data >> 10) & ((1u <<  8) - 1)     // = 26
(data >> 18) & ((1u <<  6) - 1)     // = 38
// ...

This will give you consistent results everywhere. It works by shifting the data to the right, and then using the bitwise AND operator with a mask to keep only the lowest N bits.
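
As a fuller illustration of the shift-and-mask approach, here is a minimal sketch that extracts all six fields. The names extract_bits, buckets, split_buckets, and check_buckets are hypothetical helpers introduced for this example, and the offsets assume the fields are laid out back to back from bit 0 with the widths 10, 8, 6, 6, 6, 6 from the question.

```cpp
#include <cassert>
#include <cstdint>

// Returns the `width`-bit field of `value` starting at bit `shift`
// (bit 0 is the least significant bit). Assumes 0 < width < 64.
constexpr std::uint64_t extract_bits(std::uint64_t value, unsigned shift,
                                     unsigned width) {
    return (value >> shift) & ((std::uint64_t{1} << width) - 1);
}

// Plain struct holding the decoded fields; no bit-fields and no union.
struct buckets {
    std::uint64_t b5, b4, b3, b2, b1, b0;
};

// Offsets are the running sum of the widths: 0, 10, 18, 24, 30, 36.
constexpr buckets split_buckets(std::uint64_t t) {
    return buckets{
        extract_bits(t, 0, 10),   // _bucket5
        extract_bits(t, 10, 8),   // _bucket4
        extract_bits(t, 18, 6),   // _bucket3
        extract_bits(t, 24, 6),   // _bucket2
        extract_bits(t, 30, 6),   // _bucket1
        extract_bits(t, 36, 6),   // _bucket0
    };
}

// Reproduces the uint64_t results from the question.
void check_buckets() {
    const buckets b = split_buckets(168839113046ULL);
    assert(b.b5 == 342 && b.b4 == 26 && b.b3 == 38);
    assert(b.b2 == 15 && b.b1 == 29 && b.b0 == 2);
}
```

Because the offsets are written out explicitly, the result does not depend on how a particular compiler allocates bit-fields, and no inactive union member is ever read.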

huangapple
  • Published on July 3, 2023, 22:00:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76605488.html