Why is looping over a container and modifying it considered bad practice?

Question

I would like to understand why it is considered bad practice to do this. I do get the "desired" output (1, 2, 4, 5) when printing the vector.

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    for (auto it = numbers.begin(); it != numbers.end(); ++it)
    {
        // Accessing invalidated iterator leads to undefined behavior
        if (*it == 3)
        {
            numbers.erase(it); // Invalidates the iterator
        }
    }

    std::for_each(numbers.begin(), numbers.end(), [](int x) { std::cout << x << '\n'; });
}

I know I could also do:

numbers.erase(std::remove(numbers.begin(), numbers.end(), 3), numbers.end());

But I want to understand why the first approach can lead to undefined behavior and what is happening underneath.

Answer 1

Score: 8

numbers.erase(it) invalidates the iterator it, so it can no longer be used for anything. It no longer "points" to an element that exists in the vector. The iterator has been "disconnected" from the vector.

You should instead use the iterator that the erase function returns as the "next" iterator (and skip ++it).


For a simple example of what can happen, think about the case where you remove the last element from the vector. You then increment the iterator to point to... where? Incrementing the iterator will not make it equal to the end iterator, and since it is "disconnected" from the vector it used to belong to, there is really no way to tell exactly what or where it might "point" to.

So, to continue with the example of erasing the last element, you will do an invalid increment and then an invalid comparison, which will likely be true (since the invalid iterator will not be the same as the vector's end iterator), and the loop can continue indefinitely.

But even just erasing an element in the middle invalidates the iterator, so no matter which element you erase, using the same iterator again leads to undefined behavior (which is an error, not merely bad practice).


For completeness, here's the loop as it should be:

for (auto it = numbers.begin(); it != numbers.end(); /* Empty */)
{
    if (*it == 3)
    {
        // Erase the element, and get an iterator to the element after the erased
        // Or if it's the last element then get the end iterator
        it = numbers.erase(it);
    }
    else
    {
        // Not erasing element, increase iterator to point to the next element
        ++it;
    }
}
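
As a rough sketch of how that loop could be packaged and exercised, the snippet below wraps it in a function. The name erase_value and its parameters are made up for illustration and do not appear in the question; the loop body is the same as above.

```cpp
#include <vector>

// Hypothetical helper (name chosen for this sketch): erases every element
// equal to `value` from `numbers` in place.
void erase_value(std::vector<int>& numbers, int value)
{
    for (auto it = numbers.begin(); it != numbers.end(); /* Empty */)
    {
        if (*it == value)
        {
            // erase returns a valid iterator to the element that followed
            // the erased one, or end() if the last element was erased.
            it = numbers.erase(it);
        }
        else
        {
            // Nothing erased, so advancing is safe.
            ++it;
        }
    }
}
```

Called on a vector holding {1, 2, 3, 3, 4, 3}, this should leave {1, 2, 4}: consecutive matches and a match in the last position are both handled, because the iterator is only advanced when nothing was erased.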

Answer 2

Score: 6

This is an example of what makes Undefined Behavior so insidious and difficult to deal with: your code may appear to work fine when it is in fact broken, and when you change something apparently unrelated, all hell will break loose, making it very hard to figure out where the actual problem is.

Because std::vector iterators are (usually) implemented as pointers into the vector, an operation like erase that invalidates the iterator will still leave the iterator as an apparently valid pointer; it just points at the wrong place. So if you use it, it is easy for it to appear to work, though subtle (or not so subtle) problems may appear later on. For your specific case, look at what happens if your input vector is something like:

std::vector<int> numbers = {1, 2, 3, 3, 4, 5};

In this case, after erasing the first 3, the iterator will (on a typical implementation) point at the second 3, which has shifted into the erased slot. The loop increment then moves on to the 4, so the second 3 is never erased.
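
The same skipping effect can be reproduced without undefined behavior by using an index instead of an iterator. This is only a sketch to make the shift visible: the function name buggy_erase_by_index is invented here, and the index-based loop is a stand-in for the iterator loop in the question, not the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): makes the same
// "erase, then always advance" mistake as the question's loop, but with an
// index, so the behavior is well defined and the skipped element is visible.
std::vector<int> buggy_erase_by_index(std::vector<int> numbers, int value)
{
    for (std::size_t i = 0; i < numbers.size(); ++i)
    {
        if (numbers[i] == value)
        {
            // Every element after position i shifts left by one, so the
            // next element now sits at index i ...
            numbers.erase(numbers.begin() + static_cast<std::ptrdiff_t>(i));
            // ... and the unconditional ++i of the loop steps over it.
        }
    }
    return numbers;
}
```

With the input {1, 2, 3, 3, 4, 5} this returns {1, 2, 3, 4, 5}: only the first 3 is removed, which is the same outcome the stale iterator tends to produce when it "appears to work".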

Answer 3

Score: 2

"Why is looping over a container and modifying it considered bad practice?" - Because modifying the container potentially invalidates iterators and (potentially) indices. So the iteration suddenly doesn't iterate over what it thought it was iterating over and chaos ensues.
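
Erasing is not the only modification that does this. As a small sketch (the function name grows_past_capacity is made up for illustration), appending to a std::vector beyond its current capacity forces a reallocation, and a reallocation invalidates every iterator, pointer, and reference into the vector:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): appends zeros until the
// vector holds more elements than its starting capacity allowed, then reports
// whether the capacity grew. A capacity change means the storage was
// reallocated, so any iterator obtained before the call is now invalid.
bool grows_past_capacity(std::vector<int>& v)
{
    const std::size_t old_capacity = v.capacity();

    while (v.size() <= old_capacity)
    {
        v.push_back(0);
    }

    return v.capacity() > old_capacity;
}
```

The sketch deliberately compares capacities rather than dereferencing an old iterator, since touching the stale iterator would itself be undefined behavior.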

huangapple
  • Posted on May 30, 2023, 07:52:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76360873.html