ArrayList相对于LinkedList的优势以及它们之间的性能比较。

huangapple go评论83阅读模式
英文:

Benefits of ArrayList over LinkedList and performance comparation between them

问题

最近,我一直在研究Java中的链表(LinkedLists)。经过一些研究,我发现使用LinkedList相比ArrayList的主要优势在于,当在具有大量数据的列表中的中间位置添加/删除元素时,它的速度更快。那么,使用ArrayList的好处是什么,而不总是使用LinkedList呢?

为了进行比较,我编写了这个简单的程序来比较LinkedList和ArrayList的运行速度。该程序的概念是创建一个长度为 n 的整数数组,并使用从Integer.MIN_VALUEInteger.MAX_VALUE范围内的随机整数填充它。

然后,它创建另一个数组,该数组是第一个数组元素的 x%。例如,如果第一个数组是[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],如果 x 设置为 50%,那么第二个数组可能是[2, 4, 5, 9, 10]。这将导致一个包含原始数组开头、中间和结尾的随机元素的数组。

创建这些数组期间的运行时间不计算在内。之后,程序创建一个LinkedList,将第一个数组中的所有元素添加到此列表中,并从该列表中删除第二个数组的元素。我重复这个过程 y 次,并使用System.nanoTime()计算平均经过时间。

我知道我的代码远非最优化,但由于LinkedList和ArrayList的代码完全相同,只是创建的列表类型不同,并且在初始化时没有预先设置列表的大小,结果不应受影响,对吗?例如,如果我的糟糕代码使其运行速度变慢了 400%,由于两者使用相同的代码,理论上速度变慢了 400%,结果不应受到影响。

我找到的结果是,平均而言,ArrayList比LinkedList快360%,至少在使用包含 10 万个整数的数组并运行 10 次,同时删除 45% 的元素时是如此。

根据我阅读的这个这个这个比较ArrayList和LinkedList性能的StackOverflow帖子,这是因为LinkedList分配了更多的内存(24 字节对比 ArrayList 的 4 字节)每个节点。尽管从理论上讲,当在数组元素的中间位置添加/删除元素时,LinkedList更快,平均只需 O(n/4) 步,而List需要 O(n/2) 步,但在现实中,至少在我的测试和上述链接的帖子中进行的测试中,这似乎并不成立。如果是这种情况,那么问题就变成了:使用LinkedList而不是ArrayList的好处是什么?只是为了能够使用list.iterator()吗?

以下是我得到的确切结果以及我用于获取这些结果的代码:

使用LinkedList运行 10 次,数组长度为 100000,删除 45% 的元素,花费了 53079.970300 毫秒
LinkedList的平均运行时间为 5307.997030 毫秒
使用普通List运行 10 次,数组长度为 100000,删除 45% 的元素,花费了 14344.833900 毫秒
普通List的平均运行时间为 1434.483390 毫秒
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class LinkedLists {
    // ...(此处省略了代码的其余部分,只返回翻译的结果)
}
英文:

Recently I have been looking LinkedLists in Java. After some research, I found out that the main benefit of using LinkedList over ArrayList is the fact that it's faster when removing/adding elements mid array in a list with a large amount of data. Therefore, what would be the benefit of using an ArrayList and not always a LinkedList?

For the sake of comparison, I wrote this simple program to compare the run speed between LinkedList and ArrayList. The concept of the program is that it creates an int array of length n and populates it with random integers from Integer.MIN_VALUE to Integer.MAX_VALUE.

Then, it creates another array that is x% of the elements of the first array. For instance, if the first array was [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], the second array, if x was set to 50%, may be [2, 4, 5, 9, 10]. This will result in an array with random elements from the start, middle, and end of the original array.

The run time elapsed during the creation of these arrays is not counted. Afterward, the program creates a LinkedList, add all elements from the first array into this list, and removes the elements of the second array from this list. I repeat this process y times and calculate the average elapsed time using System.nanoTime().

I know my code is far from optimized, but since it is the exact same code for both LinkedList and ArrayList, only changing the type of list that is being created and not pre-setting the size of the list on initialization, the results should not be affected, correct? For example, if my bad code is making it run 400% slower, since both use the same code, the results should not be affected whatsoever, since both are, theoretically, 400% slower.

The results I found are that ArrayLists are, in average, 360% faster than LinkedLists, at least when using a 100 thousand integers array and running it 10 times while removing 45% of the elements.

From what I understood by reading this, this, and this StackOverflow posts that compare the performance of ArrayList and LinkedList, that happens because LinkedLists allocates more memory (24 bytes vs 4 bytes on ArrayLists) per node. Although, theoretically, LinkedLists are faster when adding/removing elements mid array elements, needing only O(n/4) steps on average, while Lists need O(n/2), in reality - at least in my tests and in the tests performed in the posts linked above - that does not seem to be true. If that is the case, then the question inverts itself: what would be the benefit of using LinkedLists and not ArrayList? Only to be able to use list.iterator()?

Here is the exact results I got and the code I used to get them:

It took 53079,970300 ms with LinkedList to run 10 times with an array of 100000 integers removing 45 percent of the elements
The average run time for LinkedList was 5307,997030 ms
It took 14344,833900 ms with normal List to run 10 times with an array of 100000 integers removing 45 percent of the elements
The average run time for normal List was 1434,483390 ms
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class LinkedLists {
    public static void main(String[] args) {

        int   timesToRun          = 10;
        int   percentageToRemove  = 45;
        int[] arr                 = getRandomArray(100_000);
        int[] integersToRemoveArr = getIntegersToRemoveArr(percentageToRemove, arr);

        long elapsedTime = 0L;
        long startTime;
        long endTime;

        //LinkedLists
        startTime = System.nanoTime();

        for (int i = 0; i < timesToRun; i++) {
            linkedList(arr, integersToRemoveArr);
        }

        endTime = System.nanoTime();
        elapsedTime += endTime - startTime;
        System.out.printf("It took %f ms with LinkedList to run %d times with an array of %d integers removing %d percent of the elements\n", (endTime - startTime) * 0.000001, timesToRun, arr.length, percentageToRemove);
        System.out.printf("The average run time for LinkedList was %f ms\n", elapsedTime/timesToRun * 0.000001);

        //Force GC - again, not sure I need this, just to be safe
        Runtime.getRuntime().gc();

        //Normal Lists
        elapsedTime = 0L;
        startTime = System.nanoTime();

        for (int i = 0; i < timesToRun; i++) {
            normalList(arr, integersToRemoveArr);
        }

        endTime = System.nanoTime();
        elapsedTime += endTime - startTime;

        System.out.printf("It took %f ms with normal List to run %d times with an array of %d integers removing %d percent of the elements\n", (endTime - startTime) * 0.000001, timesToRun, arr.length, percentageToRemove);
        System.out.printf("The average run time for normal List was %f ms\n", elapsedTime/timesToRun * 0.000001);


    }

    /**
     * Get an array wit X percent of the integers of the providade array
     */
    public static int[] getIntegersToRemoveArr(int percentageToRemove, int[] originalArray) {

        if (percentageToRemove > 100 || percentageToRemove < 0) {
            return new int[]{};
        }

        List<Integer> indexesToRemove = new ArrayList<>();
        while (indexesToRemove.size() < (originalArray.length * (percentageToRemove)/100F)) {
            int x = ThreadLocalRandom.current().nextInt(0, originalArray.length-1);
            indexesToRemove.add(originalArray[x]);
        }

        int[] arr = new int[indexesToRemove.size()];

        for (int i = 0; i < arr.length; i++) {
            arr[i] = indexesToRemove.get(i);
        }

        Arrays.sort(arr);

        return arr;

    }

    public static int[] getRandomArray(int length) {
        if (length < 1) {
            return new int[]{};
        }

        int[] arr = new int[length];

        for (int i = 0; i < length; i++) {
            arr[i] = ThreadLocalRandom.current().nextInt(Integer.MIN_VALUE, Integer.MAX_VALUE);
        }

        return arr;
    }

    /**
     * Create a LinkedList from the provided array and remove from the list the elements in integersToRemove
     */
    public static void linkedList(int[] arr, int[] integersToRemove) {
        LinkedList<Integer> list = new LinkedList<>();

        for (int i : arr) {
            list.add(i);

    }
        for (int i : integersToRemove) {
            list.remove((Integer) i);
        }

        //Nulling the element to allow GC to remove it - i'm not sure this is needed since I'm not using
        //this var anywhere, but just to be safe
        list = null;

    }

    /**
     * Create a LinkedList from the provided array and remove from the list the elements in integersToRemove
     */
    public static void normalList(int[] arr, int[] integersToRemove) {
        List<Integer> list = new ArrayList<>();

        for (int i : arr) {
            list.add(i);
        }

        for (int i : integersToRemove) {
            list.remove((Integer) i);
        }

        //Nulling the element to allow GC to remove it - i'm not sure this is needed since I'm not using
        //this var anywhere, but just to be safe
        list = null;

    }
}

答案1

得分: 1

好的,以下是翻译好的内容:

优势在于具有O(1)的添加操作。即使平均而言,ArrayList更快,但当你调用add()并且需要扩展容量时,速度可能会非常慢。对于对延迟要求很高的应用,这可能很重要。另外,由于它是双向链表,与大型ArrayList相比,前置项目的速度要快得多,这是您没有测试的。

英文:

The benefit is that you have O(1) add. Even though on average ArrayList is faster, when you add() and need to expand the capacity it can be very slow. For latency critical applications this might be important. Also since it's a doubly linked list, prepending items is way faster than large ArrayList, which you did not test.

huangapple
  • 本文由 发表于 2020年9月8日 22:46:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/63796345.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定