

Are XOR linked lists still allowed in C++17?







XOR linked lists use pointer arithmetic in a way that looks suspicious to me given the changes in semantics introduced in C++17 (and discussed e.g. in Is a pointer with the right address and type still always a valid pointer since C++17?). Do they cause undefined behavior now? If so, can they be saved with launder?


The Wikipedia article contains just a short note about converting between pointers and integers. I tacitly assumed (and am now stating explicitly) that the pointers are first converted to an integer type large enough to hold them, and that the XOR is then done on the integers. The properties of XOR listed under Theory of operation thus guarantee that only integers that were once obtained from pointers will be converted back to pointers. The actual mapping from pointers to integers can be, per the standard, an arbitrary injection. I don't rely on any assumption beyond that.
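A minimal sketch of that integer-only scheme, assuming std::uintptr_t exists on the implementation and that converting the same pointer twice yields the same integer. The helper names to_bits, make_link, other_bits and check_identity are hypothetical and only serve the illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the only pointer-to-integer conversion in the scheme.
inline std::uintptr_t to_bits(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// The link field of a node: XOR of the integers of its two neighbours.
inline std::uintptr_t make_link(const void* prev, const void* next) {
    return to_bits(prev) ^ to_bits(next);
}

// Given a link and one neighbour, recover the integer of the other neighbour.
// (a ^ b) ^ a == b for unsigned integers, so the result is exactly an integer
// that was obtained from a pointer earlier; no new value is invented.
inline std::uintptr_t other_bits(std::uintptr_t link, const void* known) {
    return link ^ to_bits(known);
}

// Self-check of the identity for two arbitrary objects.
inline void check_identity(const void* a, const void* b) {
    assert(other_bits(make_link(a, b), a) == to_bits(b));
    assert(other_bits(make_link(a, b), b) == to_bits(a));
}
```

Only the value returned by other_bits is ever converted back to a pointer, which is the property the question relies on.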

Does the standard allow using them and accessing the still-existing objects? Before C++17? Since C++17?


Score: 5




It's implementation-defined, and still valid in C++17, at least for GCC. You cannot perform an XOR operation between two pointers directly; you would have to go through reinterpret_cast<std::intptr_t>. The effect of this conversion (and back) is implementation-defined.

Implementation-defined means that the compiler must document what happens. What GCC provides is:

> A cast from pointer to integer discards [...], otherwise the bits are unchanged.
> A cast from integer to pointer discards [...], otherwise the bits are unchanged.
> When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined.

See https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

From this description, we can conclude that:

  1. the user must ensure that the object at the address stays the same between storing pointers in an XOR list and retrieving them
    • if the pointed-to object changes during this process, casting the integer back to a pointer is undefined behavior
  2. converting one-past-the-end pointers, null pointers, and invalid pointers back and forth isn't covered by the GCC documentation, but the round trip "preserves the value" according to the C++ standard

Implications for an XOR-list

Generally, this should make XOR-lists implementable, as long as we reproduce the same pointers that we stored, and don't "rug pull" nodes while there are XORed pointers to them.

    std::intptr_t ptr;
    // STORE (prev and next are both known here):
    // - convert both operands to integers and XOR them
    // - store the result in ptr
    ptr = reinterpret_cast<std::intptr_t>(prev) ^ reinterpret_cast<std::intptr_t>(next);

    // LOAD (later, in a scope where only ptr and prev are known):
    // - XOR the stored ptr with prev and convert the result to node*
    node* next = reinterpret_cast<node*>(ptr ^ reinterpret_cast<std::intptr_t>(prev));
    // valid dereference, because the same object still lives at the address 'next'
    *next;

As stated in the documentation, next "must reference the same object as the original pointer", so we can assume that next is now a pointer to an object, if such a pointer was originally stored in ptr.

However, it would be UB if we stored the XORed next pointer, began the lifetime of a new object where next points to, and then un-XORed the address and converted back to a pointer type.
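The store and load steps above can be put together into a small traversable list. This is only a sketch under stated assumptions: it uses std::uintptr_t rather than std::intptr_t, the names Node, bits, from_bits, append and collect are hypothetical, and the caller owns the nodes and keeps them alive (no "rug pull") for as long as the list is used:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Node {
    int value;
    std::uintptr_t link;  // bits(prev) ^ bits(next)
};

inline std::uintptr_t bits(Node* p) { return reinterpret_cast<std::uintptr_t>(p); }
inline Node* from_bits(std::uintptr_t b) { return reinterpret_cast<Node*>(b); }

// Append `node` at the end of the list. `head` and `tail` are null for an empty list.
// The nodes are owned by the caller and must outlive every traversal.
inline void append(Node*& head, Node*& tail, Node* node) {
    node->link = bits(tail) ^ bits(nullptr);       // prev = old tail, next = null
    if (tail) {
        // The old tail had a null next; replace that contribution with `node`.
        tail->link ^= bits(nullptr) ^ bits(node);
    } else {
        head = node;
    }
    tail = node;
}

// Walk the list starting at one of its ends and collect the values.
inline std::vector<int> collect(Node* end) {
    std::vector<int> out;
    Node* prev = nullptr;
    Node* cur = end;
    while (cur) {
        out.push_back(cur->value);
        // Un-XOR with the node we came from: this reproduces an integer that
        // was obtained from a Node* (or from a null pointer) in append().
        Node* next = from_bits(cur->link ^ bits(prev));
        prev = cur;
        cur = next;
    }
    return out;
}
```

Calling collect on the head walks the list forwards and calling it on the tail walks it backwards, since the same link field serves both directions.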


Score: 2



As far as I know, reinterpret_cast to and from std::uintptr_t should be fine.


Score: 2








XOR linked lists are still valid in C++17 and beyond, assuming that the type uintptr_t exists.

While it's true that a pointer that represents a particular address is not necessarily a pointer to an object that resides at that address, there is [expr.reinterpret.cast]/5:

> [...] A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in 6.7.4.3, the result of such a conversion will not be a safely-derived pointer value. — end note ]

What this tells us is that, while reinterpret_cast may not in general give you a pointer to an object, there is a special case where the operand is an integer value that was previously obtained from reinterpret_casting a pointer operand. The pointer value resulting from the round trip is the original pointer value, meaning that if the original pointer value pointed to an object, the result also points to that object (assuming the object still exists, of course).
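As a small illustration of that special case, the following sketch converts a pointer to std::uintptr_t and straight back, then accesses the object through the result. The function names round_trip and read_through_round_trip are hypothetical, and the sketch assumes that std::uintptr_t exists and that the object is still alive when the pointer is used:

```cpp
#include <cassert>
#include <cstdint>

// Convert a pointer to an integer of sufficient size and back to the same
// pointer type. Per [expr.reinterpret.cast]/5 the result has the original value.
template <class T>
T* round_trip(T* p) {
    std::uintptr_t as_integer = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<T*>(as_integer);
}

// Read an object through a pointer that went through the round trip.
inline int read_through_round_trip(int& object) {
    int* q = round_trip(&object);
    assert(q == &object);  // same pointer value as before the conversion
    return *q;             // fine as long as `object` is still alive
}
```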

But, the note tells us that, in C++17, the result of the conversion may not be a "safely-derived pointer value". What that means is that once you perform the pointer to integer conversion and do not keep a copy of that integer, but instead only store its value XOR other integers, the implementation is allowed to garbage collect the pointee because (in some technical sense that I won't get into here) the pointee is no longer "reachable". When you later reconstitute the original integer value by another XOR operation, and then attempt to reinterpret_cast it back to the original pointer, that pointer value is not "safely-derived" and is therefore considered invalid under some theoretical implementations (because the implementation might have garbage collected it already). But, if your implementation has "relaxed pointer safety", then this doesn't matter; the pointer is still valid. The design intent was that garbage-collected implementations would define themselves as having "strict pointer safety".

In practice, no implementation actually has "strict pointer safety" as specified in the standard, even though some garbage-collected C++ implementations do apparently exist. For this reason, the concept of strict pointer safety was removed in C++23. You can rest assured that XOR linked lists are valid, assuming that uintptr_t exists.

  • Posted on July 14, 2023, 05:33:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76683373.html



