标准库容器如何知道超出末尾迭代器之前的内容?

huangapple go评论67阅读模式
英文:

How do standard library containers know what's before the past-the-end iterator?

问题

我试图实现自己的 listmap,但在 end() 方法上遇到了问题,因为这些容器在内存中不是连续的。

经过一些研究,我最终将 end() 实现为一个返回持有 nullptr 的迭代器的方法(https://stackoverflow.com/questions/44767368/)。这对于我的大部分实现来说已经足够了,因为它验证了当列表为空时 list::begin() == list::end()(空列表没有节点,所以 head 指向 nullptr)。

然而,我发现可以在超出末尾的迭代器上调用 operator--() 来指向容器的实际最后一个元素。

int main() {
  {
    std::map<int, char> m = {{1, 'a'}, {2, 'b'}, {3, 'c'}};
    auto it = m.end();
    --it;
    std::cout << it->first << "\n";  // 输出: 3
    std::cout << it->second << "\n"; // 输出: c
  }
  {
    std::list<int> l{ 1, 2, 8};
    auto it = l.end();
    --it;
    std::cout << *it << "\n"; // 输出: 8
  }
}

这是如何实现的呢?在 gcc 中 _List_iteratoroperator--() 的实现将当前节点指向前一个节点:

_Self&
operator--() _GLIBCXX_NOEXCEPT
{
  _M_node = _M_node->_M_prev;
  return *this;
}

但是,如果节点在列表之外,这是如何可能的呢?这个节点甚至存在吗?这是否意味着额外的节点被分配了?您是否真的允许对超出末尾的迭代器调用 operator--()

这是 gcc 的 std::listbegin()end() 方法,如果有帮助的话:

/**
  * 返回一个只读(常量)迭代器,指向 %list 中的第一个元素。迭代按普通元素顺序进行。
  */
 _GLIBCXX_NODISCARD
 const_iterator
 begin() const _GLIBCXX_NOEXCEPT
 { return const_iterator(this->_M_impl._M_node._M_next); }

 /**
  * 返回一个读/写迭代器,指向 %list 中的最后一个元素的下一个位置。迭代按普通元素顺序进行。
  */
 _GLIBCXX_NODISCARD
 iterator
 end() _GLIBCXX_NOEXCEPT
 { return iterator(&this->_M_impl._M_node); }
英文:

I'm trying to implement my own list and map, and I've been having trouble with the end() method since these containers aren't contiguous in memory.

After some research, I've ended up implementing the end() as a method that returns an iterator that holds nullptr. (https://stackoverflow.com/questions/44767368/). That is enough for most of my implementation, as it verifies that list::begin() == list::end() when list is empty (empty list has no nodes so head points to nullptr).

However, I've seen that it's possible to call operator--() on past-the-end iterators to point to the actual last element of the container.

int main() {
  {
    std::map&lt;int, char&gt; m = {{1, &#39;a&#39;}, {2, &#39;b&#39;}, {3, &#39;c&#39;}};
    auto it = m.end();
    --it;
    std::cout &lt;&lt; it-&gt;first &lt;&lt; &quot;\n&quot;;  // output: 3
    std::cout &lt;&lt; it-&gt;second &lt;&lt; &quot;\n&quot;; // output: c
  }
  {
    std::list&lt;int&gt; l{ 1, 2, 8};
    auto it = l.end();
    --it;
    std::cout &lt;&lt; *it &lt;&lt; &quot;\n&quot;; // output: 8
  }
}

How is this achieved? The implementation of operator--() of _List_iterator in the gcc changes the current node to point to the previous one:

      _Self&amp;
      operator--() _GLIBCXX_NOEXCEPT
      {
	_M_node = _M_node-&gt;_M_prev;
	return *this;
      }

But how is this possible if the node is out of the list? How does that node even exist? Does this mean that an extra node is allocated? Are you actually allowed to call operator--() to a past-the-end iterator?

These are the begin() and end() methods of gcc's std::list, in case it helps:

     /**
       *  Returns a read-only (constant) iterator that points to the
       *  first element in the %list.  Iteration is done in ordinary
       *  element order.
       */
      _GLIBCXX_NODISCARD
      const_iterator
      begin() const _GLIBCXX_NOEXCEPT
      { return const_iterator(this-&gt;_M_impl._M_node._M_next); }

      /**
       *  Returns a read/write iterator that points one past the last
       *  element in the %list.  Iteration is done in ordinary element
       *  order.
       */
      _GLIBCXX_NODISCARD
      iterator
      end() _GLIBCXX_NOEXCEPT
      { return iterator(&amp;this-&gt;_M_impl._M_node); }

答案1

得分: 1

"_M_node->_M_prev"指向最后一个节点。

这个节点是如何存在的?

与任何节点存在的方式一样。在标准列表的情况下,它将使用提供的分配器创建。

这是否意味着会额外分配一个节点?

这似乎是一个合理的结论。这种模式被称为“哨兵节点”。

您是否真的可以对过去末尾的迭代器调用operator--()?

可以。

英文:

> But how is this possible if the node is out of the list?

Simply, _M_node-&gt;_M_prev points to the last node.

> How does that node even exist?

In the same way that any node exists. In the case of standard list, it would be created with the provided allocator.

> Does this mean that an extra node is allocated?

That seems like a reasonable conclusion. This pattern is called "sentinel node".

> Are you actually allowed to call operator--() to a past-the-end iterator?

Yes.

答案2

得分: 1

标准容器使用了一种巧妙(或聪明,这取决于你的角度)的技巧。 (简化了代码部分)。

你从一个“base”节点类型开始。这只是你的前向和后向指针。

struct ListNodeBase {
  ListNodeBase* next;
  ListNodeBase* prev;
};

列表节点扩展了基本节点类型,用于存储实际数据。指针和数据之间的这种分离有点有用!

template<typename T>
struct ListNode : public ListNodeBase {
  T data;
};

遍历链表仅使用ListNodeBase。要访问数据,对象会向上转型为ListNode。

template<typename T>
struct ListIter {
   ListNodeBase* current;

   // 但在这里,我们向上转型为节点类型以访问数据。
   T& operator*() const
      { return static_cast< ListNode<T>* >(current)->data; }
   T* operator->() const
      { return &static_cast< ListNode<T>* >(current)->data; }

   // 只是标准的列表遍历...
   // 再次强调,这仅操作ListNodeBase。
   ListIter<T>& operator++() {
      current = current->next;
      return *this;
   }

   ListIter<T>& operator--() {
      current = current->prev;
      return *this;
   }
};

技巧在于有一个ListNodeBase类型的节点(它是列表容器),其余节点都是ListNode类型。

template<typename T> 
struct list : private ListNodeBase {

   typedef ListIter<T> iterator;

   list() 
   {
     // 当为空时,存储指向自身的指针
     this->next = this;
     this->prev = this;
   }

   iterator begin() 
     { return (iterator){ this->next }; }

   iterator end() 
     { return (iterator){ this }; }
};

换句话说,列表容器哨兵节点(它不存储任何数据,因为它使用基本节点类型,而不是实际节点)。

英文:

The standard containers use a bit of a dirty trick (or clever, depending on the way you look at it). (Simplifying the code somewhat).

You start out with an 'base' node type. This is just your forward and backwards pointers.

struct ListNodeBase {
  ListNodeBase* next;
  ListNodeBase* prev;
};

The list node extends the base node type with the actual data to store. This separation between the pointers and the data is kinda useful!

template&lt;typename T&gt;
struct ListNode : public ListNodeBase {
  T data;
};

Iteration across the linked list ONLY uses ListNodeBase. To access the data, the objects are upcast to ListNode.

template&lt;typename T&gt;
struct ListIter {
   ListNodeBase* current;

   // Here however we upcast to the node type to access the data. 
   T&amp; operator*() const
      { return static_cast&lt; ListNode&lt;T&gt;* &gt;(current)-&gt;data; }
   T* operator-&gt;() const
      { return &amp;static_cast&lt; ListNode&lt;T&gt;* &gt;(current)-&gt;data; }

   // just your bog standard list traversal.. 
   // again, this only operates on the 
   ListIter&lt;T&gt;&amp; operator++() {
      current = current-&gt;next;
      return *this;
   }

   ListIter&lt;T&gt;&amp; operator--() {
      current = current-&gt;prev;
      return *this;
   }
};

The trick though is that there is one node of ListNodeBase (which is the list container), and the rest of the nodes are ListNode<T>.

template&lt;typename T&gt; 
struct list : private ListNodeBase {

   typedef ListIter&lt;T&gt; iterator;


   list() 
   {
     // when empty, store pointers to itself 
     this-&gt;next = this;
     this-&gt;prev = this;
   }

   iterator begin() 
     { return (iterator){ this-&gt;next }; }

   iterator end() 
     { return (iterator){ this }; }
};

In other words, the list container is the sentinel node (It just doesn't store any data, because it uses the base node type, and node the actual node).

huangapple
  • 本文由 发表于 2023年7月7日 03:05:12
  • 转载请务必保留本文链接:https://go.coder-hub.com/76631848.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定