Is the code below well formed, in particular regarding aliasing rules?
Question
The template function below is part of a sequence generator. Instead of manual shifts, I came up with the following union-based solution to make the operations more explicit. It works on all the compilers I tested. Godbolt link.
However, despite working in practice, I am afraid it violates aliasing rules, which means it might break in the future or on a compiler other than GCC and Clang.
Strictly in view of the C++ standard: is the code below well formed? Does it incur undefined behavior?
template <int BITS>
uint64_t flog2(uint64_t num) {
    constexpr uint64_t MAXNUM = (uint64_t(1) << BITS);
    if (num < MAXNUM) return num;
    union FP {
        double dbl;
        struct {
            uint64_t man: 52;
            uint32_t exp: 11;
            uint32_t sign: 1;
        };
        struct {
            uint64_t xman: 52-BITS;
            uint32_t xexp: 11+BITS;
            uint32_t xsgn: 1;
        };
    };
    FP fp;
    fp.dbl = num;
    fp.exp -= 1023-1+BITS;
    return fp.xexp;
}
Thanks!
Answer 1
Score: 3
First of all, the program is syntactically ill-formed in ISO standard C++. Anonymous struct members are not part of standard C++ (in contrast to C); they are an extension. In ISO standard C++ the struct member must be given a name and accessed through that name.
I'll ignore that for the rest of the answer and pretend you were accessing through such a name.
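For illustration, a minimal sketch of what "accessing through such a name" could look like. The names Fields, Wide, f and x are hypothetical and do not appear in the question. Naming the members only removes the reliance on the extension; it does not change which union member is active.

```cpp
#include <cstdint>

// Sketch only: Fields, Wide, f and x are made-up names for the
// question's two unnamed structs.
template <int BITS>
union FP {
    double dbl;
    struct Fields {
        std::uint64_t man : 52;
        std::uint32_t exp : 11;
        std::uint32_t sign : 1;
    } f;
    struct Wide {
        std::uint64_t xman : 52 - BITS;
        std::uint32_t xexp : 11 + BITS;
        std::uint32_t xsgn : 1;
    } x;
};
```

With this declaration the question's accesses would be spelled fp.f.exp and fp.x.xexp.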
Technically it is not an aliasing violation, but reading an inactive member of the union object is undefined behavior, and that is what happens in
    fp.exp -= 1023-1+BITS;
The types don't really matter for this (in contrast to aliasing). A union has at most one active member at any time: the last one that was either explicitly created or written with a member access/assignment expression. In your case fp.dbl = num; makes dbl the active member and the only one that may be read from.
The standard makes one exception, for the common initial sequence of standard-layout class type members of a union: there the non-active member may be read as if it were the active one. That does not help here. dbl is not of class type at all, and even your two struct members have a non-empty common initial sequence only for BITS == 0.
In practice, however, compilers typically support this kind of type punning explicitly, probably for compatibility with C, where it is allowed.
Of course, even setting all of this aside, the layout of bit-fields and the representations of the involved types are completely implementation-defined, so you can't expect this to be generally portable.
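One way to avoid the inactive-member read is to copy the bytes of the double into an integer and go back to explicit shifts. The following is a minimal sketch, not a drop-in replacement: the name flog2_memcpy is made up, and it assumes a 64-bit IEEE 754 double whose byte order matches that of uint64_t. It is meant to perform the same arithmetic as the question's function on such platforms.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Sketch only: flog2_memcpy is a hypothetical name.
template <int BITS>
std::uint64_t flog2_memcpy(std::uint64_t num) {
    static_assert(BITS >= 0 && BITS < 52, "BITS must fit inside the mantissa");
    static_assert(std::numeric_limits<double>::is_iec559 &&
                      sizeof(double) == sizeof(std::uint64_t),
                  "assumes a 64-bit IEEE 754 double");

    constexpr std::uint64_t MAXNUM = std::uint64_t(1) << BITS;
    if (num < MAXNUM) return num;

    const double dbl = static_cast<double>(num);
    std::uint64_t bits;
    // Copying the object representation is well defined, unlike
    // reading an inactive union member.
    std::memcpy(&bits, &dbl, sizeof bits);

    // Rebias the exponent field (bits 52..62), as fp.exp -= 1023-1+BITS does.
    bits -= std::uint64_t(1023 - 1 + BITS) << 52;
    // Keep the exponent plus the top BITS mantissa bits, as reading fp.xexp does.
    // The sign bit is zero because num is unsigned.
    return bits >> (52 - BITS);
}
```

In C++20, std::bit_cast<std::uint64_t>(dbl) could replace the memcpy call. The representation of double is still implementation-defined, which is why the static_assert is there.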
Answer 2
Score: 1
It is undefined behaviour to read from a union member that was not the most recently written one.
Furthermore, the layout of bit-fields is implementation-defined.
Hence, from a strict C++ Standard view, this code both invokes undefined behaviour (by reading exp after writing dbl) and relies on implementation-defined behaviour in assuming that the bit-field layout corresponds to the double floating-point representation (which, by the way, is also implementation-defined).
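If the dependence on the double representation is the concern, the same kind of value can be computed with integer arithmetic alone. This is a rough sketch under a made-up name (flog2_int), not the asker's method. It truncates instead of rounding, so its results may differ from the double-based version for inputs above 2^53, where the conversion to double rounds.

```cpp
#include <cstdint>

// Sketch only: flog2_int is a hypothetical name.
template <int BITS>
std::uint64_t flog2_int(std::uint64_t num) {
    static_assert(BITS >= 0 && BITS < 52, "BITS out of range");

    constexpr std::uint64_t MAXNUM = std::uint64_t(1) << BITS;
    if (num < MAXNUM) return num;

    // e = floor(log2(num)); num >= 1 here, so the loop is well defined.
    int e = 0;
    for (std::uint64_t n = num; n > 1; n >>= 1) ++e;

    // The BITS bits just below the leading one bit (e >= BITS holds here).
    const std::uint64_t top = (num >> (e - BITS)) & (MAXNUM - 1);
    // Shifted exponent in the high part, mantissa bits in the low part.
    return (std::uint64_t(e + 1 - BITS) << BITS) | top;
}
```

The loop could be replaced by a count-leading-zeros operation such as std::countl_zero in C++20. It is written out here to keep the sketch independent of the language version.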