How can I order a vector using another vector?

Question

This question is similar to the one posed here; however, the answer there does not work for my problem, which is slightly different.

What I am trying to do would best be shown in code:

//this would be a copy version:
int main(){
   std::vector<uint32_t> order = {0,2,5,6,9,10,1,3,4,7,8,11};
   std::vector<uint32_t> example = {0,1,2,3,4,5,6,7,8,9,10,11};
   std::vector<uint32_t> output(order.size());
   for(uint32_t i = 0; i < order.size(); ++i){
       output[i] = example[order[i]];
   }
}
//output comes out to {0,2,5,6,9,10,1,3,4,7,8,11}

However, when I try to implement an in-place version using the reorder code from the link above as such:

void reorder(std::vector<uint32_t> &v, std::vector<uint32_t> const &order )  {
    for ( int s = 1, d; s < order.size(); ++ s ) {
        for ( d = order[s]; d < s; d = order[d] ) ;
        if ( d == s ) while ( d = order[d], d != s ) std::swap( v[s], v[d] );
    }
}

int main(){
   std::vector<uint32_t> order = {0,2,5,6,9,10,1,3,4,7,8,11};
   std::vector<uint32_t> example = {0,1,2,3,4,5,6,7,8,9,10,11};
   reorder(example, order);
}
//example = {0,6,1,7,8,2,3,9,10,4,5,11}

How could I implement an in-place version of what I am trying to accomplish without copying memory?

Edit

To be clear, the vector example in the code can hold arbitrary elements; I only initialized it the way I did to make the result easy to check. This would be entirely valid:

std::vector<uint32_t> example = {300,12,21,34,47,15,61,57,82,94,1,2};
  • order and example will always contain the same number of elements.
  • It is not guaranteed that order and example store the same datatype.
  • It is guaranteed that order stores unique values.
  • It is also guaranteed that order will always have the datatype uint32_t.
  • order always contains the values 0 to n-1, where n is the size of example: each of those numbers exactly once, and nothing outside that range (as in the example code).
  • The actual arrangement of those values is entirely arbitrary; it has nothing to do with even or odd indices and can come in any order.
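
As a reference point for the constraints listed above, the copy-based version can be written generically, so that the element type of example is independent of order. This is only a sketch: the names apply_order_copy and is_index_permutation are hypothetical (they do not appear in the question), and it deliberately uses O(N) extra storage, so it is useful as an oracle for checking an in-place version rather than as an answer to the question itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if order holds each index 0..n-1 exactly once.
inline bool is_index_permutation(const std::vector<std::uint32_t> &order)
{
    std::vector<bool> seen(order.size(), false);
    for (std::uint32_t idx : order)
    {
        if (idx >= order.size() || seen[idx]) return false; // out of range or repeated
        seen[idx] = true;
    }
    return true;
}

// Hypothetical helper: copy-based reordering, output[i] = data[order[i]].
template <typename T>
std::vector<T> apply_order_copy(const std::vector<T> &data,
                                const std::vector<std::uint32_t> &order)
{
    assert(data.size() == order.size());
    assert(is_index_permutation(order));

    std::vector<T> output;
    output.reserve(data.size());
    for (std::uint32_t idx : order) output.push_back(data[idx]);
    return output;
}
```

With the order from the question and example = {300,12,21,34,47,15,61,57,82,94,1,2}, apply_order_copy returns {300,21,15,61,94,1,12,34,47,57,82,2}, which is the result an in-place version has to reproduce.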

Answer 1

Score: 3

I've come up with a simple algorithm to do this in O(N) time with O(1) additional storage. Unsurprisingly, it does destroy the contents of order in the process. I don't think you can get around that without either increasing time complexity or storage complexity.

template <typename T>
void reorder(std::vector<T> &v, std::vector<uint32_t> &order)
{
    for (size_t i = 0; i < v.size(); i++)
    {
        size_t next = order[i];      // find index of next output
        while (next < i)             // resolve chain of moves
        {
            next = order[next];
        }
        std::swap(v[i], v[next]);    // exchange data
        order[i] = next;             // record where previous data moved to
    }
}

The basic approach is that you iterate through the ordering one at a time, and you consider every position prior to the current position as solved. You will never modify the ordering or the data for solved positions.

Except for the trivial case where order is a sorted vector, you will inevitably need to move a piece of data out of the way, so that you can bring in the required value at that position. There are two possible scenarios:

  1. The value is already in the right place, or is in some "future" position; or
  2. The value points at a "past" position, so we know it has been moved one or more times such that it's now in a "future" position.

So, at a very basic level, when you discover that for some i the value of order[i] is less than i, you know the data was moved, and that it moved to the position order[order[i]]. From that position, it may have been moved again. Keep applying exactly the same test until you end up with an index that is greater than or equal to i. That is where the data moved to.

The secret to making this O(N) is that, after resolving the final "moved-to" position, you do one final step which is to overwrite order[i] with that new position. That means whatever searching you had to do at this position will never be repeated.
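
To make the chain resolution and the flattening step concrete, here is a small instrumented copy of the loop. It is a sketch only: reorder_traced is a hypothetical name, and the only change from reorder above is that it records the resolved index at each step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical instrumented copy of reorder: returns, for each step i,
// the resolved index that v[i] was swapped with.
template <typename T>
std::vector<std::size_t> reorder_traced(std::vector<T> &v,
                                        std::vector<std::uint32_t> &order)
{
    assert(v.size() == order.size());
    std::vector<std::size_t> resolved;
    resolved.reserve(v.size());
    for (std::size_t i = 0; i < v.size(); i++)
    {
        std::size_t next = order[i];
        while (next < i) next = order[next];          // follow recorded moves
        std::swap(v[i], v[next]);
        order[i] = static_cast<std::uint32_t>(next);  // flatten the search
        resolved.push_back(next);
    }
    return resolved;
}

// Trace for v = {'a','b','c'}, order = {2,0,1}:
//   i=0: order[0]=2                  -> swap v[0],v[2] -> {c,b,a}, order[0]=2
//   i=1: order[1]=0 (<1), order[0]=2 -> swap v[1],v[2] -> {c,a,b}, order[1]=2
//   i=2: order[2]=1 (<2), order[1]=2 -> swap v[2],v[2] (no-op),    order[2]=2
```

In the trace, step i=2 needs only one hop because step i=1 already overwrote order[1] with the final position 2; without that write it would have to walk 1, 0, 2.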

Now, it's not at all obvious that this results in linear time complexity, so don't feel bad if it's a bit mind-bending. I found it hard to convince myself, and before writing this answer I actually had to prove it experimentally by looking for the worst-case total searches for all permutations of order. That was also useful to verify the algorithm's correctness.

The reasoning really boils down to the fact that the more "hops" a single value makes, the fewer hops are possible for other values. By collapsing the result of any search you ensure that search is also O(1) in future. Crucially, that means the process of chaining one extra hop is also O(1) because successive "hops" for the same data value will always point back to a search that was flattened when the data was moved.


Here's the little test harness that verifies all permutations and counts the total number of search indirections performed:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

typedef int Data;

template <typename T>
size_t reorder(std::vector<T> &v, std::vector<uint32_t> &order)
{
    size_t indirections = 0;
    for (size_t i = 0; i < v.size(); i++)
    {
        size_t next = order[i];
        while (next < i)
        {
            // std::cout << "search " << i << " : hop to " << next << "\n";
            next = order[next];
            indirections++;
        }
        std::swap(v[i], v[next]);
        order[i] = next;
    }
    return indirections;
}

size_t count_worst_case_indirections(size_t size)
{
    size_t max_indirections = 0;

    std::vector<uint32_t> order_perm(size);
    std::vector<uint32_t> order_worst;
    std::vector<uint32_t> order;
    std::vector<Data> data(size);
    std::vector<Data> expected;
    expected.reserve(size);

    // Test all possible orderings
    std::iota(order_perm.begin(), order_perm.end(), 0);
    do
    {
        // Reset initial data and generate expected result
        order = order_perm;
        std::iota(data.begin(), data.end(), 0);
        expected.clear();
        for (auto i : order) expected.push_back(data[i]);

        // Run test
        size_t indirections = reorder(data, order);
        if (indirections > max_indirections)
        {
            max_indirections = indirections;
            order_worst = order_perm;
        }

        // Throw if result is invalid
        if (data != expected) throw "ALGORITHM IS BROKEN";
    } while (std::next_permutation(order_perm.begin(), order_perm.end()));

    std::cerr << "worst order : ";
    for (auto x : order_worst) std::cerr << x << ' ';
    std::cerr << "\n";

    return max_indirections;
}

int main()
{
    for (size_t size = 1; size < 12; size++)
    {
        size_t max_indirections = count_worst_case_indirections(size);
        std::cout << "Size " << size << " : " << max_indirections << "\n";
    }
}

stdout:

Size 1 : 0
Size 2 : 1
Size 3 : 2
Size 4 : 3
Size 5 : 4
Size 6 : 5
Size 7 : 6
Size 8 : 7
Size 9 : 8
Size 10 : 9
Size 11 : 10

stderr:

worst order : 0 
worst order : 1 0 
worst order : 1 2 0 
worst order : 1 2 3 0 
worst order : 1 2 3 4 0 
worst order : 1 2 3 4 5 0 
worst order : 1 2 3 4 5 6 0 
worst order : 1 2 3 4 5 6 7 0 
worst order : 1 2 3 4 5 6 7 8 0 
worst order : 1 2 3 4 5 6 7 8 9 0 
worst order : 1 2 3 4 5 6 7 8 9 10 0

For every size tested, the worst case is N-1 search indirections, which is clearly O(N) in searches. If you comment out the line order[i] = next; in reorder, you will see that you get a worst case of O(N²).

If you then set up a single-fire experiment with a worst-case ordering (there are many worst-case orderings), and uncomment the line of output in the search loop, you'll see exactly why it's important to flatten the search.
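
A single-fire experiment of that kind might look like the sketch below. The names count_hops and shifted_order are hypothetical, and the ordering {n-1, 0, 1, ..., n-2} is just one convenient choice that makes the difference visible; it is not claimed to be the worst case the harness would report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical variant of the counting reorder with flattening made optional.
// order is taken by value so the caller's copy survives.
inline std::size_t count_hops(std::vector<std::uint32_t> order, bool flatten)
{
    std::vector<int> v(order.size());
    std::iota(v.begin(), v.end(), 0);
    std::size_t hops = 0;
    for (std::size_t i = 0; i < v.size(); i++)
    {
        std::size_t next = order[i];
        while (next < i)
        {
            next = order[next];
            hops++;
        }
        std::swap(v[i], v[next]);
        if (flatten) order[i] = static_cast<std::uint32_t>(next);
    }
    return hops;
}

// Hypothetical helper: builds the ordering {n-1, 0, 1, ..., n-2}.
inline std::vector<std::uint32_t> shifted_order(std::size_t n)
{
    std::vector<std::uint32_t> order(n);
    for (std::size_t i = 0; i < n; i++)
        order[i] = static_cast<std::uint32_t>((i + n - 1) % n);
    return order;
}
```

For this ordering, step i without flattening walks the chain i-1, i-2, ..., 0, n-1, which costs i hops, so the total is n(n-1)/2. With flattening, every step after the first costs exactly one hop, for a total of n-1. For n = 8 that is 28 hops versus 7.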

Answer 2

Score: 2

I misunderstood your question on my first attempt. So I deleted that answer and am resubmitting.

Basically, if you need to avoid a copy of the array, that also implies you can't use a hash-table (map) solution for the continuous re-ordering. With the approach below, reordering in place becomes an O(N²) runtime algorithm with O(1) storage beyond what was already allocated.

void reorder(std::vector<uint32_t>& items, std::vector<uint32_t>& order)
{

    // n-squared without additional storage
    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        if (order[i] == i)
        {
            continue;
        }

        // keep track of the displacement that's about to be done
        uint32_t displacedValue = items[i];
        uint32_t availableIndex = order[i];

        // swap
        items[i] = items[availableIndex];
        items[availableIndex] = displacedValue;

        // scan ahead in orders array to account for the swap
        for (size_t j = i + 1; j < items.size(); j++)
        {
            if (order[j] == i)
            {
                order[j] = availableIndex;
            }
        }
    }
}

Note that the above code will permute and corrupt the `order` table.

Proof of concept using your example:

int main() {
    std::vector<uint32_t> order = { 0,2,5,6,9,10,1,3,4,7,8,11 };
    std::vector<uint32_t> example = { 0,1,2,3,4,5,6,7,8,9,10,11 };

    reorder(example, order);

    // show the final sort order as applied
    for (size_t i = 0; i < example.size(); i++) {
        std::cout << example[i] << " ";
    }
    std::cout << std::endl;
    return 0;
}

The above prints: 0 2 5 6 9 10 1 3 4 7 8 11

Alternate solution

If you can afford the storage cost of making a copy of order (without copying example), the code above can be converted into a solution that is O(N) in both runtime and storage.

void reorder2(std::vector<uint32_t>& items, std::vector<uint32_t>& order) {

    // create a reverse lookup table on order
    std::unordered_map<uint32_t, uint32_t> reverseLookup; // reverse of order table
    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        reverseLookup[order[i]] = i;
    }

    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        if (order[i] == i)
        {
            continue;
        }

        // keep track of the displacement that's about to be done
        uint32_t displacedValue = items[i];
        uint32_t availableIndex = order[i];

        // swap
        items[i] = items[availableIndex];
        items[availableIndex] = displacedValue;

        // account for the swap
        uint32_t j = reverseLookup[i];
        order[j] = availableIndex;
        reverseLookup[availableIndex] = j;
    }
}
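
Since the question notes that example need not hold uint32_t, the same idea can be sketched generically. The version below is an assumption-laden variation, not the answer's own code: reorder2_generic is a hypothetical name, elements are exchanged with std::swap instead of through a uint32_t temporary, and a plain std::vector stands in for the std::unordered_map, which works here only because the keys are exactly 0 to n-1. It still uses O(N) extra storage and still overwrites order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical generic variant of reorder2 (vector-based inverse table).
template <typename T>
void reorder2_generic(std::vector<T> &items, std::vector<std::uint32_t> &order)
{
    assert(items.size() == order.size());
    const std::size_t n = items.size();

    // inverse[k] = index j such that order[j] == k
    std::vector<std::uint32_t> inverse(n);
    for (std::uint32_t j = 0; j < n; j++) inverse[order[j]] = j;

    for (std::uint32_t i = 0; i < n; i++)
    {
        const std::uint32_t source = order[i];
        if (source == i) continue;           // already in place

        std::swap(items[i], items[source]);

        // The element that used to sit at i now lives at source;
        // redirect the one position that still wants it.
        const std::uint32_t j = inverse[i];
        order[j] = source;
        inverse[source] = j;
    }
}
```

Called with the order from the question and example = {300,12,21,34,47,15,61,57,82,94,1,2}, this leaves example as {300,21,15,61,94,1,12,34,47,57,82,2}.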

huangapple
  • Published on August 9, 2023 at 10:35:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76864224-2.html