从已排序的数组中**原地**移除重复项,无需为第二个数组分配空间。

huangapple go评论63阅读模式
英文:

Removing duplicates from a sorted array IN-PLACE without allocating space for a second array

问题

我面临一个令人困惑的问题。我试图将一个带有重复项的未排序数组转换为一个没有重复项的排序数组。我使用选择排序算法完成了第一部分:

public static void SelectionSort(int [] list) {
    int i = 0;
    int j = 0;

    int indexSmallest = 0;
    int temp = 0;

    for (i = 0; i < list.length - 1; i++) {
        indexSmallest = i;
        for (j = i + 1; j < list.length; j++) {
            if (list[j] < list[indexSmallest]) {
                indexSmallest = j;
            }
        }

        temp = list[i];
        list[i] = list[indexSmallest];
        list[indexSmallest] = temp;
    }
}

排序不是问题的关键——我很难从数组中移除重复项。我脑中的解决方案是创建一个辅助数组,遍历输入数组以检查元素是否存在于辅助数组中,如果不存在,则将其添加到辅助数组中。然而,我陷入困境,因为我不能创建另一个数组。那么,怎么办呢?如果我不能创建一个数组来进行交叉参考并检查是否有唯一的值,我该如何解决这个问题呢?我不能使用任何内置的Java函数。

英文:

I'm tasked with a perplexing problem. I'm attempting to turn an unsorted array with duplicates into a sorted array without duplicates. I used the selection sort to accomplish the first part:

public static void SelectionSort(int [] list) {
        int i = 0;
        int j = 0;

        int indexSmallest = 0;
        int temp = 0;

        for (i = 0; i &lt; list.length - 1; i++) {
            indexSmallest = i;
            for (j = i + 1; j &lt; list.length; j++) {
                if (list[j] &lt; list[indexSmallest]) {
                    indexSmallest = j;
                }
            }

            temp = list[i];
            list[i] = list[indexSmallest];
            list[indexSmallest] = temp;
        }

    }

The sorting isn't the issue - I'm having a hard time removing the duplicates from the array. The solution that I have in my head is to create a secondary array, and iterate through the input to check if that element exists in the second array, if not, add it to the array. I'm stuck because I'm not able to create another array. So, what gives? How do I solve this problem if I can't create an array to cross-reference and check if I have unique values? I'm not able to use any built-in Java functions.

答案1

得分: 1

以下是翻译好的部分:

不需要第二个数组……用这种方式思考一下……

最简单的方法是将其转换为集合/哈希集,然后在数组中对其进行排序。

但如果禁止使用集合,唯一的可能性就是将重复项放在末尾,将它们删掉,然后对剩下的部分进行排序。

在这个数组中 [1,8,1,2,3,5,3],你需要移除重复的元素……好的……那么如果我们这样做呢……我们将把这个数组“分割”成“已排序”、“未排序且重复的”以及“重复的”部分。现在我们将使用两个指针遍历数组。一个在第一个元素(称为 i),一个在最后一个元素(称为 j)……现在我们会执行 while i < j,并且每当我们找到一个重复项时就进行交换。这样,你将在 i 之前得到所有不重复的内容,而在 i 之后得到所有重复的内容……现在你将对索引 0 到索引 i 的数组进行排序,然后你应该会得到排序后的数组,然后你只需要去掉重复的部分……

当然,这将需要时间复杂度能够处理 O(n*logn) / O(n^2)……

还有一种方法可以在 O(n) 内完成,可以这样做……你将使用两个指针……

一个指向当前已排序数组,其中没有重复项,另一个指向尚未交换的整数所在的位置……(你需要记住找到的最大数)

更具体地说:

[1,2,2,3,3,4,5]
i = 0, j = 1
- 没问题
i = 1, j = 2
- 重复项……所以……
跳到重复位置
i = 2, j = 3(array[3] != 2,所以我们将交换)
当前数组 -&gt; [1,2,3,2,3,4,5]
                      ^ ^  
                      i j
i = 3, j = 4 
- highest_number > 3 不成立(2 < 3),所以跳过
i = 3, j = 5
- highest_number > 3 不成立(3 < 3),所以跳过
i = 3, j = 6
- 交换
……等等

最终你应该得到类似这样的结果
[1,2,3,4,5,2,3]
           ^ ^
           i j
现在你可以在索引 i 处截断数组,这样你将得到 `[1,2,3,4,5,
[1,2,2,3,3,4,5]
i = 0, j = 1
- 没问题
i = 1, j = 2
- 重复项……所以……
跳到重复位置
i = 2, j = 3(array[3] != 2,所以我们将交换)
当前数组 -&gt; [1,2,3,2,3,4,5]
                      ^ ^  
                      i j
i = 3, j = 4 
- highest_number > 3 不成立(2 < 3),所以跳过
i = 3, j = 5
- highest_number > 3 不成立(3 < 3),所以跳过
i = 3, j = 6
- 交换
……等等

最终你应该得到类似这样的结果
[1,2,3,4,5,2,3]
           ^ ^
           i j
现在你可以在索引 i 处截断数组,这样你将得到 `[1,2,3,4,5,\0]`(用 C 语法表示)……基本上就是 `[1,2,3,4,5]`
]`(用 C 语法表示)……基本上就是 `[1,2,3,4,5]`
英文:

There is no need of second array ... think about it this way ...

The easiest way is to convert it to set / hashset and then sort it in an array.

But if the sets are forbidden, the only possibility is to put the duplicates at the end, cut them out and then sort the rest.

[1,8,1,2,3,5,3] in this array, you need to remove elements that are duplicates ... okay ... so what if we did something like this ... we will "split" this array into "sorted", "unsorted and duplicates" and "duplicates". Now what we will do is that we will go through the array using 2 pointers. One at the first element (lets call it &quot;i&quot;) and at the last element (lets call it &quot;j&quot;) ... now we will go while i &lt; j and we will swap everytime, when we will find a duplicate. This way, you will get everything not duplicate before &quot;i&quot; and everything that is dupicate after &quot;i&quot; ... now you will sort the array from index 0 do index i and you should have sorted array and you will just cut out the duplicates ...

ofc., this will require the time complexity, to be able to handle O(n*logn) / O(n^2) ...

There is a way how to do it in a O(n), and that can be done by that .. you will use 2 pointers ...

one will be pointing at current sorted array, where you have no duplicates and toher will be pointing to a place, where are yet unswapped integers ... (you need to remember the highest number found)

to be more specific:

[1,2,2,3,3,4,5]
i = 0, j = 1
- fine
i = 1, j = 2
- duplicate ... soo ..
jumping to duplicate position
i = 2, j = 3 (array[3] != 2, so we will swap)
current array -&gt; [1,2,3,2,3,4,5]
                      ^ ^  
                      i j
i = 3, j = 4 
- highest_number &gt; 3 is not true (2 &lt; 3), so skipping
i = 3, j = 5
- highest_number &gt; 3 is not true (3 &lt; 3), so skipping
i = 3, j = 6
- swapping
... etc

and you should end up with something like this
[1,2,3,4,5,2,3]
           ^ ^
           i j
now you can cut the array at i, so you will get `[1,2,3,4,5,
[1,2,2,3,3,4,5]
i = 0, j = 1
- fine
i = 1, j = 2
- duplicate ... soo ..
jumping to duplicate position
i = 2, j = 3 (array[3] != 2, so we will swap)
current array -&gt; [1,2,3,2,3,4,5]
^ ^  
i j
i = 3, j = 4 
- highest_number &gt; 3 is not true (2 &lt; 3), so skipping
i = 3, j = 5
- highest_number &gt; 3 is not true (3 &lt; 3), so skipping
i = 3, j = 6
- swapping
... etc
and you should end up with something like this
[1,2,3,4,5,2,3]
^ ^
i j
now you can cut the array at i, so you will get `[1,2,3,4,5,\0]` (in C syntax) ... so basically `[1,2,3,4,5]`
</details>
]` (in C syntax) ... so basically `[1,2,3,4,5]` </details>

huangapple
  • 本文由 发表于 2020年10月11日 10:12:54
  • 转载请务必保留本文链接:https://go.coder-hub.com/64300030.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定