在C++17中仍然允许XOR链接列表吗?

huangapple go评论73阅读模式
英文:

Are XOR linked lists still allowed in C++17?

问题

XOR 链表 在指针运算中使用了一种在我看来看起来可疑的方式,考虑到C++17引入的语义变化(例如在自C++17以来,具有正确地址和类型的指针仍然始终是有效指针吗?中讨论)。它们现在会引起未定义行为吗?如果是的话,是否可以使用launder 来修复?

编辑:

维基百科文章中仅包含有关指针和整数之间转换的简短说明。我默认假设(现在明确陈述),指针首先被转换为足够大以容纳它们的整数类型,然后对整数进行XOR运算。因此,在操作理论下列出的XOR属性保证只有一次从指针获得的整数会被转换回指针。根据标准,从指针到整数的实际映射可以是任意的注入。我不依赖于除此之外的任何假设。

标准是否允许在C++17之前以及自C++17以来使用它们并访问仍然存在的对象?

英文:

XOR linked lists use pointer arithmetic in a way that looks suspicious to me given the changes in semantics introduced in C++17 (and discussed e.g. in Is a pointer with the right address and type still always a valid pointer since C++17?). Do they cause undefined behavior now? If so, can they be saved with launder?

EDIT:

The Wikipedia article contains just a short note about converting between pointers and integers. I tacitly assumed (and now am making it explicitly stated) that the pointers are first converted to and integer type of sufficient size to fit them, then XORing is done on the integers. The properties of XOR listed under Theory of operation thus guarantee that only integers obtained once from pointers will be converted back to them. The actual mapping from pointers to integers can be, per the standard, an arbitrary injection. I don't rely on any assumption beyond that.

Does the standard allow to use them and access the still existing objects? Before C++17? Since C++17?

答案1

得分: 5

这是你要翻译的内容:

  • It's implementation-defined, and still valid in C++17, at least for GCC.

    • 这是实现定义的,在C++17中仍然有效,至少对于GCC而言。
  • You cannot perform an xor operation between two pointers directly; you would have to go through reinterpret_cast<std::intptr_t>.

    • 你不能直接在两个指针之间执行异或操作;你必须通过reinterpret_cast<std::intptr_t>
  • The effect of this conversion (and back) is implementation-defined.

    • 此转换的效果(以及逆转换)是实现定义的
  • Implementation-defined means that the compiler must document what happens. What GCC provides is:

    • 实现定义 意味着编译器必须记录发生的情况。GCC提供的是:

    A cast from pointer to integer discards [...], otherwise the bits are unchanged.

    A cast from integer to pointer discards [...], otherwise the bits are unchanged.

    When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined.

    当从指针转换为整数,然后再次转换回指针时,结果指针必须引用与原始指针相同的对象,否则行为是未定义的。

  • From this description, we can conclude that:

    • 从这个描述中,我们可以得出结论:
  1. the user ensures that the object at the address stays the same between storing pointers in an XOR list and retrieving them

    • 用户确保在将指针存储在XOR列表中并检索它们之间,地址上的对象保持不变
    • if the pointed-to object changes during this process, casting the integer back to a pointer is undefined behavior
      • 如果在此过程中所指的对象发生更改,则将整数强制转换回指针是未定义的行为
  2. converting one past the end pointers, null pointers, and invalid pointers back and forth isn't explained here, but "preserves the value" according to the C++ standard

    • 反复转换超出末尾的指针、空指针和无效指针在这里没有解释,但根据C++标准,“保留值”。

Implications for an XOR-list

对于XOR列表的影响

Generally, this should make XOR-lists implementable, as long as we reproduce the same pointers that we stored, and don't "rug pull" nodes while there are XORed pointers to them.
一般来说,只要我们复制了我们存储的相同指针,并且在有XOR指向它们的指针的情况下不删除节点,这应该使XOR列表可实现。

std::intptr_t ptr;

// STORE:
// - bit-cast both operands and XOR them
// - store result in ptr
ptr = reinterpret_cast<std::intptr_t>(prev) ^ reinterpret_cast<std::intptr_t>(next);

// LOAD:
// - XOR stored ptr and bit-cast to node*
node* next = reinterpret_cast<node*>(ptr ^ reinterpret_cast<std::intptr_t>(prev));
// valid dereference, because at the address 'next', we still store the same object
*next;
std::intptr_t ptr;

// 存储:
// - 对两个操作数执行位强制转换并对它们执行异或操作
// - 将结果存储在ptr中
ptr = reinterpret_cast<std::intptr_t>(prev) ^ reinterpret_cast<std::intptr_t>(next);

// 加载:
// - 对存储的ptr执行异或操作并进行位强制转换,得到node*
node* next = reinterpret_cast<node*>(ptr ^ reinterpret_cast<std::intptr_t>(prev));
// 有效的解引用,因为在地址'next'上,我们仍然存储相同的对象
*next;

As stated in the documentation, next "must reference the same object as the original pointer", so we can assume that next is now a pointer to an object, if such a pointer was originally stored in ptr.
正如文档中所述,next "必须引用与原始指针相同的对象",所以我们可以假设,如果最初在ptr中存储了这样的指针,那么next现在是一个指向对象的指针。

However, it would be UB if we stored the XORed next pointer, began the lifetime of a new object where next points to, and then un-XORed the address and converted back to a pointer type.
然而,如果我们存储了XOR后的next指针,开始了一个新对象的生命周期,其中next指向,然后取消XOR地址并转换回指针类型,那将是未定义的行为。

英文:

It's implementation-defined, and still valid in C++17, at least for GCC. You cannot perform an xor operation between two pointers directly; you would have to go through reinterpret_cast&lt;std::intptr_t&gt;. The effect of this conversion (and back) is implementation-defined.

Implementation-defined means that the compiler must document what happens. What GCC provides is:

> A cast from pointer to integer discards [...], otherwise the bits are unchanged.
>
> A cast from integer to pointer discards [...], otherwise the bits are unchanged.
>
> When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined.

See https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

From this description, we can conclude that:

  1. the user ensures that the object at the address stays the same between storing pointers in an XOR list and retrieving them
    • if the pointed-to object changes during this process, casting the integer back to a pointer is undefined behavior
  2. converting one past the end pointers, null pointers, and invalid pointers back and forth isn't explained here, but "preserves the value" according to the C++ standard

Implications for an XOR-list

Generally, this should make XOR-lists implementable, as long as we reproduce the same pointers that we stored, and don't "rug pull" nodes while there are XORed pointers to them.

std::intptr_t ptr;

// STORE:
// - bit-cast both operands and XOR them
// - store result in ptr
ptr = reinterpret_cast&lt;std::intptr_t&gt;(prev) ^ reinterpret_cast&lt;std::intptr_t&gt;(next);

// LOAD:
// - XOR stored ptr and bit-cast to node*
node* next = reinterpret_cast&lt;node*&gt;(ptr ^ reinterpret_cast&lt;std::intptr_t&gt;(prev));
// valid dereference, because at the address &#39;next&#39;, we still store the same object
*next;

As stated in the documentation, next "must reference the same object as the original pointer", so we can assume that next is now a pointer to an object, if such a pointer was originally stored in ptr.

However, it would be UB if we stored the XORed next pointer, began the lifetime of a new object where next points to, and then un-XORed the address and converted back to a pointer type.

答案2

得分: 2

根据我所知,reinterpret_caststd::uintptr_t 之间的转换应该是可以的。

英文:

As far as I know, reinterpret_cast to and from std::uintptr_t should be fine.

答案3

得分: 2

XOR链接列表在C++17及以后仍然有效,前提是类型uintptr_t存在。

虽然一个表示特定地址的指针不一定是指向驻留在该地址的对象的指针,但在[expr.reinterpret.cast]/5中有这样的描述:

> [...] 将指针转换为足够大小的整数(如果在实现中存在的话),然后再转换回相同的指针类型,将具有其原始值;指针和整数之间的映射在其他情况下是由实现定义的。【注:除了在6.7.4.3中描述的情况外,此类转换的结果不会是一个安全派生的指针值。— 结束注释

这告诉我们,虽然reinterpret_cast通常不会给您一个指向对象的指针,但有一种特殊情况,其中操作数是先前从reinterpret_cast中获得的整数值的情况。往返的指针值是原始指针值,这意味着如果原始指针值指向一个对象,结果也指向该对象(当然,前提是对象仍然存在)。

但是,注意告诉我们,在C++17中,转换的结果可能不是一个“安全派生的指针值”。这意味着一旦您执行指针到整数的转换,而不保留该整数的副本,而是仅存储其值与其他整数的异或,实现允许对指向的对象进行垃圾收集,因为(在某种技术意义上,我不会在这里深入讨论)指向的对象不再“可访问”。当您稍后通过另一个异或操作重新构造原始整数值,然后尝试将其reinterpret_cast回原始指针时,该指针值不是“安全派生的”,因此在某些理论实现中被视为无效(因为实现可能已经对其进行了垃圾收集)。但是,如果您的实现具有“宽松的指针安全性”,那么这无关紧要;指针仍然有效。设计意图是,垃圾回收实现将自己定义为具有“严格的指针安全性”。

实际上,尽管一些垃圾回收的C++实现显然存在,但没有一个实际的实现符合标准中规定的“严格指针安全性”概念。因此,基于这个原因,严格指针安全性的概念将在C++23中被废除。您可以放心,假设uintptr_t存在,XOR链接列表是有效的。

英文:

XOR linked lists are still valid in C++17 and beyond, assuming that the type uintptr_t exists.

While it's true that a pointer that represents a particular address is not necessarily a pointer to an object that resides at that address, there is [expr.reinterpret.cast]/5:

> [...] A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in 6.7.4.3, the result of such a conversion will not be a safely-derived pointer
value. — end note ]

What this tells us is that, while reinterpret_cast may not in general give you a pointer to an object, there is a special case where the operand is an integer value that was previously obtained from reinterpret_casting a pointer operand. The pointer value resulting from the round trip is the original pointer value, meaning that if the original pointer value pointed to an object, the result also points to that object (assuming the object still exists, of course).

But, the note tells us that, in C++17, the result of the conversion may not be a "safely-derived pointer value". What that means is that once you perform the pointer to integer conversion and do not keep a copy of that integer, but instead only store its value XOR other integers, the implementation is allowed to garbage collect the pointee because (in some technical sense that I won't get into here) the pointee is no longer "reachable". When you later reconstitute the original integer value by another XOR operation, and then attempt to reinterpret_cast it back to the original pointer, that pointer value is not "safely-derived" and is therefore considered invalid under some theoretical implementations (because the implementation might have garbage collected it already). But, if your implementation has "relaxed pointer safety", then this doesn't matter; the pointer is still valid. The design intent was that garbage-collected implementations would define themselves as having "strict pointer safety".

In practice, no implementation actually has "strict pointer safety" as specified in the standard, even though some garbage-collected C++ implementations do apparently exist. For this reason, the concept of strict pointer safety will be abolished in C++23. You can rest assured that XOR linked lists are valid, assuming that uintptr_t exists.

huangapple
  • 本文由 发表于 2023年7月14日 05:33:58
  • 转载请务必保留本文链接:https://go.coder-hub.com/76683373.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定