英文:
Why index accessing is faster than link access in Java?
问题
我正在比较ArrayList和LinkedList上contains方法在一个从0到1000的数字序列上的性能。
对于这两个列表,我执行了contains(400)操作。ArrayList的性能始终比LinkedList高3倍。
比较是使用JMH进行的。
ArrayList花费了329,642纳秒
LinkedList花费了945,881纳秒
如果这两个列表的contains方法都具有O(n)的性能,为什么LinkedList差3倍?
比较类:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)
public class Comparation {
@State(Scope.Thread)
public static class MyState {
private List<Integer> arrayList = new ArrayList<>();
private List<Integer> linkedList = new LinkedList<>();
private int iterations = 1000;
@Setup(Level.Trial)
public void setUp() {
for (int i = 0; i < iterations; i++) {
arrayList.add(i);
linkedList.add(i);
}
}
}
@Benchmark
public boolean testLinkedListContains(MyState state) {
return state.linkedList.contains(400);
}
@Benchmark
public boolean testArrayListContains(MyState state) {
return state.arrayList.contains(400);
}
}
结果:
Benchmark Mode Cnt Score Error Units
Comparation.testArrayListContains avgt 20 329,642 ± 20,709 ns/op
Comparation.testLinkedListContains avgt 20 945,881 ± 43,621 ns/op
英文:
I'm comparing the perfomance of contains method in ArrayList and LinkedList on a sequence of digits from 0 to 1000.
For both lists, I do a contains(400). ArrayList performance is always 3 times higher than LinkedList.
Comparison is made using JMH.
The ArrayList took 329,642 nanoseconds
The LinkedList took 945,881 nanoseconds
If both lists have O(n) performance of contains method, why is the LinkedList 3 times worse?
Comparation class:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)
public class Comparation {
@State(Scope.Thread)
public static class MyState {
private List<Integer> arrayList = new ArrayList<>();
private List<Integer> linkedList = new LinkedList<>();
private int iterations = 1000;
@Setup(Level.Trial)
public void setUp() {
for (int i = 0; i < iterations; i++) {
arrayList.add(i);
linkedList.add(i);
}
}
}
@Benchmark
public boolean testLinkedListContains(MyState state) {
return state.linkedList.contains(400);
}
@Benchmark
public boolean testArrayListContains(MyState state) {
return state.arrayList.contains(400);
}
}
Result:
Benchmark Mode Cnt Score Error Units
Comparation.testArrayListContains avgt 20 329,642 ± 20,709 ns/op
Comparation.testLinkedListContains avgt 20 945,881 ± 43,621 ns/op
答案1
得分: 2
在ArrayList中,数据由一个数组支持,其元素在内存中是连续排列的。因此,增加迭代器的速度非常快 -> 只需转到内存中的下一个地址。
在LinkedList中,情况不一定如此,内存不一定是连续的,元素可能随机放置在内存中,因此从一个元素转到另一个元素会稍慢一些。
英文:
In ArrayList, data is backed by an array whose elements are laid out contiguously in memory. So it is very fast to increment the iterator -> Just go to the next address in memory.
In LinkedList, this is not necessarily the case, the memory needn't be contiguous the elements may be randomly placed in memory, so going from one element to other is a bit slower.
答案2
得分: 2
如果两个列表的contains
方法都具有O(n)的性能,为什么LinkedList的性能要差三倍?
这并不违反O(n)复杂性的规则,将O(n)视为描述具有线性时间的算法的方式。与您的想法相反,这并不意味着所有这些算法的实现具有相同的速度。它意味着它们所花费的时间是它们输入的线性函数,这里ArrayList
快三倍,但如果您增加List
中的元素数量,您会发现ArrayList
仍然快三倍,而遍历它们的时间与List
中的元素数量呈线性函数增长。因此,如果您说ArrayList
需要2n来执行该功能,而LinkedList
需要10000n来执行该功能,那么您可以说它们都在O(n)内运行。
在这里,您确切地得出了LikedList
性能差三倍的结论,这意味着它们具有相同的复杂性O(n)。
但是,如果通过增加输入的数量,您发现它不再差三倍,而是变得差五倍或差30倍,并且这个数字根据输入的数量(即n)而增长,那么它们就不具有相同的复杂性O(n)。
英文:
Your question: If both lists have O(n) performance of contains method, why is the LinkedList 3 times worse?
It is not against the rule of O(n) complexity, consider O(n) as something that describes an algorithm with linear time. Contrary to what you believe It does not mean all implementations of those algorithm have the same speed. It means time that they take is a linear function of their inputs, here ArrayList
is three times faster, but if you increase the number of element in List
you can see ArrayList
is still three time faster, and the time to iterate both of them grows as a liner function of the number of elements in List
. So if you say Arraylist
takes 2n to execute that functionality and LinkedList
takes 10000n to execute that functionality, you can say they both run in O(n).
Here you exactly came to this conclusion that LikedList
is three times worse, it means they have the same complexity O(n).
But If by increasing the number of input you find that it is not 3 times worse any more and it is getting 5 times worse or 30 times worse and this number grows based on the number of input(which is n), they don't have the same complexity O(n).
答案3
得分: 1
非常简单:不同的数据结构具有不同的性能权衡。
LinkedList 是一个双向链表实现。在添加和删除方面,其性能优于 ArrayList,但在获取和设置方法上性能较差。
换句话说,如果您的动机是持续修改列表,那么 LinkedList 可能是一个更好的选择。但如果只是创建和遍历列表,ArrayList“胜出”。
随着更多元素被添加到 ArrayList 中,其大小会动态增加。可以通过使用 get 和 set 方法直接访问其元素,因为 ArrayList 本质上是一个数组。
Vector 和 ArrayList 需要在添加更多元素时分配空间。Vector 每次都会将其数组大小加倍,而 ArrayList 每次增长其大小的50%。
然而,LinkedList 也实现了 Queue 接口,添加了比 ArrayList 和 Vector 更多的方法,如 offer()、peek()、poll() 等。
注意:ArrayList 的默认初始容量相当小。最好的习惯是使用更高的初始容量构造 ArrayList,以避免重新调整大小的成本。
在这种情况下,数组速度更快,但灵活性较低。
英文:
Very simply: different data structures have different performance trade-offs.
> https://dzone.com/articles/arraylist-vs-linkedlist-vs
>
> LinkedList is implemented as a double linked list. Its performance on
> add and remove is better than Arraylist, but worse on get and set
> methods.
In other words, if your motivation was to continuously modify the list, LinkedList might be a better choice. But to simply create and traverse the list, ArrayList "wins".
To continue:
> As more elements are added to ArrayList, its size is increased
> dynamically. It's elements can be accessed directly by using the get
> and set methods, since ArrayList is essentially an array.
> ...
>
> Vector and ArrayList require space as more elements are added. Vector
> each time doubles its array size, while ArrayList grow 50% of its size
> each time.
> ...
>
> LinkedList, however, also implements Queue interface which adds more
> methods than ArrayList and Vector, such as offer(), peek(), poll(),
> etc.
> ...
>
> Note: The default initial capacity of an ArrayList is pretty
> small. It is a good habit to construct the ArrayList with a higher
> initial capacity. This can avoid the resizing cost.
In this case, an array is faster .. but less flexible.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论