Safe usage of `reinterpret_cast`
Question
If I have an aggregate of integral types and I want to create an instance of it filled with random values, is the `reinterpret_cast` used here safe?

    template <typename T>
    auto random() -> T {
        static auto random_device = std::random_device{};
        static auto generator = std::mt19937(random_device());
        static auto distribution = std::uniform_int_distribution<std::uint8_t>{};
        auto bytes = std::array<std::uint8_t, sizeof(T)>{};
        for (auto i = std::size_t{0}; i < bytes.size(); ++i) {
            bytes[i] = distribution(generator);
        }
        return *reinterpret_cast<T *>(bytes.data());
    }

Answer 1
Score: 4
<s>Firstly, this will not compile in the general case. You cannot `reinterpret_cast` into an arbitrary type. If `T` is not a pointer type or an integral type, this will not compile. To fix this you probably want to return</s>

    return *reinterpret_cast<T*>(bytes.data());

But do not do that either. If `T` is not `((un)signed) char` or `std::byte`, this is undefined behavior.
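The correct way to implement such a function is to first make sure the target type `T` satisfies the requirements for doing this. I will over-restrict it by requiring that it is trivial:

    #include <cstddef>
    #include <random>
    #include <type_traits>

    // Return a reference to one shared engine. With `decltype(auto)` and a plain
    // `return generator;` the return type would be deduced as a value, not a
    // reference, so the reference type is spelled out here.
    auto get_mt19937() -> std::mt19937& {
        static auto random_device = std::random_device{};
        static auto generator = std::mt19937(random_device());
        return generator;
    }

    template <typename T>
    auto random() -> T {
        static_assert(std::is_trivial_v<T>, "T must be trivial");
        // This is bad. It will instantiate a new mt19937 engine for each type `T`.
        //static auto random_device = std::random_device{};
        //static auto generator = std::mt19937(random_device());
        auto& generator = get_mt19937();
        // No need for this to be static. Character types are not among the result
        // types the standard permits here, so draw an unsigned int in [0, 255].
        auto distribution = std::uniform_int_distribution<unsigned int>{0u, 255u};
        auto return_val = T{};
        // reinterpret_cast'ing to std::uint8_t* might not be legal, because
        // std::uint8_t is not guaranteed to be a character type.
        auto* bytes = reinterpret_cast<unsigned char*>(&return_val);
        for (auto i = std::size_t{0}; i < sizeof(T); ++i) {
            bytes[i] = static_cast<unsigned char>(distribution(generator));
        }
        return return_val;
    }
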
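As an alternative sketch (not part of the original answer), the question's byte-array approach can be kept if the bytes are copied into a `T` with `std::memcpy` instead of being reinterpreted in place. The helper names `from_bytes`, `random_with`, and `check_from_bytes`, as well as the `Pixel` aggregate, are hypothetical. The distribution draws `unsigned int` values in [0, 255] on the assumption that character types should be avoided as its result type.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <random>
#include <type_traits>

// Hypothetical helper: build a T from raw bytes by copying, not by casting.
template <typename T>
auto from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) -> T {
    static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");
    static_assert(std::is_default_constructible_v<T>, "T must be default constructible");
    T value{};
    // memcpy into an existing T object never reads the bytes through a T lvalue.
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}

// Hypothetical helper: fill a byte array from a caller-supplied engine,
// then convert it with from_bytes.
template <typename T, typename Engine>
auto random_with(Engine& engine) -> T {
    auto distribution = std::uniform_int_distribution<unsigned int>{0u, 255u};
    auto bytes = std::array<unsigned char, sizeof(T)>{};
    for (auto& byte : bytes) {
        byte = static_cast<unsigned char>(distribution(engine));
    }
    return from_bytes<T>(bytes);
}

// Example aggregate of integral types (an assumption for illustration).
struct Pixel {
    std::uint8_t r, g, b, a;
};

// Usage sketch: bytes that are all 0xFF must yield an all-ones 32-bit value.
inline void check_from_bytes() {
    auto ones = std::array<unsigned char, sizeof(std::uint32_t)>{};
    ones.fill(0xFF);
    assert(from_bytes<std::uint32_t>(ones) == 0xFFFFFFFFu);
}
```

Passing the engine as a parameter is a design choice made here so that callers can share one engine or seed it for reproducible tests; a call would look like `auto p = random_with<Pixel>(engine);`.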
Answer 2
Score: 2
For a generic type `T`, this may not be safe.
According to the C++ reference, a `reinterpret_cast` is allowed if:
> 5) Any object pointer type T1* can be converted to another object
> pointer type cv T2*. This is exactly equivalent to static_cast<cv
> T2*>(static_cast<cv void*>(expression)) (which implies that if T2's
> alignment requirement is not stricter than T1's, the value of the
> pointer does not change and conversion of the resulting pointer back
> to its original type yields the original value). In any case, the
> resulting pointer may only be dereferenced safely if allowed by the
> type aliasing rules (see below).
So you can cast from `std::uint8_t*` to `T*`. The problem is: when may you dereference the resulting pointer to `T`? This is not allowed for every type `T`.
The same page says that you may access an object through an `AliasedType` if:
> AliasedType is std::byte, (since C++17) char, or unsigned char: this
> permits examination of the object representation of any object as an
> array of bytes.
So, if you restrict your routine to `T` being `std::byte`, `char`, or `unsigned char` (as stated in the second quote above), then yes, you can safely dereference the pointer in the `return` statement, and your function should be safe.
However, you should consider using `std::byte` to store a generic raw-memory buffer.
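To illustrate the direction that the quoted aliasing rule does permit, here is a minimal sketch that reads the object representation of a value through a `const std::byte*` and stores it in a `std::byte` buffer. The names `object_bytes` and `check_object_bytes` are hypothetical, and the code assumes C++17 or later.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: copy out the object representation of `value`.
template <typename T>
auto object_bytes(const T& value) -> std::array<std::byte, sizeof(T)> {
    static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");
    // Examining an object as bytes is the case the quoted rule allows:
    // the aliased type here is std::byte.
    const auto* first = reinterpret_cast<const std::byte*>(&value);
    auto out = std::array<std::byte, sizeof(T)>{};
    for (auto i = std::size_t{0}; i < sizeof(T); ++i) {
        out[i] = first[i];
    }
    return out;
}

// Usage sketch: every byte of 0x01010101 is 0x01 regardless of endianness.
inline void check_object_bytes() {
    const auto bytes = object_bytes(std::uint32_t{0x01010101u});
    for (const auto b : bytes) {
        assert(b == std::byte{0x01});
    }
}
```

Note the asymmetry: viewing a `T` as bytes is covered by the rule, while viewing a byte array as a `T` (as the question does) is not.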