Why is a linear search through a heap data structure faster than tree traversal?
Question
In the following code, `indexOfSlow` is around 17 times slower than `indexOf` in my test cases, although the former does fewer comparisons than the latter.
Why is that?
(My hunch is that the superior performance of `indexOf` is due to sequential array access, which has e.g. caching benefits - but I'd like to know for sure.)
Performance numbers from my machine:
| elementCount | indexOf (msec) | indexOfSlow (msec) | slow/fast |
|--------------|----------------|--------------------|-----------|
| 10000        | 41             | 852                | 21        |
| 20000        | 232            | 3387               | 15        |
| 30000        | 375            | 6642               | 18        |
| 40000        | 710            | 12699              | 18        |
| 50000        | 1197           | 19182              | 16        |
| average      |                |                    | 17.6      |
N.B. Java's `PriorityQueue.indexOf()` is similar to my `indexOf`, so it does not try to walk the tree and stop early like my `indexOfSlow` does.
```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class Heap<T extends Comparable<T>> {

    public static void main(String[] args) {
        int elementCount = 50_000;
        final List<Integer> elements = new ArrayList<>(elementCount);
        for (int i = 0; i < elementCount; i++)
            elements.add(i);
        for (int j = 0; j < 3; j++) {
            final var heap = new Heap<Integer>();
            for (int i : elements)
                heap.add(i);
            assert heap.peek() == 0;
            final long nanoBefore = System.nanoTime();
            for (int i = elementCount; i > 0; i--)
                heap.indexOf(i - 1);
                // heap.indexOfSlow(i - 1);
            final long nanoAfter = System.nanoTime();
            if (j > 0) // discard first round as warmup
                System.out.println("Time taken: " + (nanoAfter - nanoBefore) / 1_000_000 + " msec");
        }
    }

    private final ArrayList<T> list = new ArrayList<>();

    public T peek() {
        return list.isEmpty() ? null : list.get(0);
    }

    private void siftUp(int i, T value) {
        while (i > 0) {
            int parentI = (i - 1) / 2;
            final T parentValue = list.get(parentI);
            if (parentValue.compareTo(value) <= 0)
                return;
            list.set(parentI, value);
            list.set(i, parentValue);
            i = parentI;
        }
    }

    public void add(T value) {
        list.add(value);
        siftUp(list.size() - 1, value);
    }

    public int indexOf(T value) {
        final int size = list.size();
        if (size > 0) {
            for (int i = 0; i < list.size(); i++) {
                if (value.compareTo(list.get(i)) == 0)
                    return i;
            }
        }
        return -1;
    }

    public int indexOfSlow(T value) {
        final int size = list.size();
        if (size > 0) {
            Queue<Integer> childrenToVisit = new LinkedList<>();
            childrenToVisit.add(0);
            while (!childrenToVisit.isEmpty()) {
                int i = childrenToVisit.poll();
                final int cmp = list.get(i).compareTo(value);
                if (cmp == 0)
                    return i;
                if (cmp > 0)
                    continue; // min-heap: all descendants of this node are even larger, prune the subtree
                int rightChildIdx = (i + 1) * 2;
                int leftChildIdx = rightChildIdx - 1;
                if (leftChildIdx < size)
                    childrenToVisit.add(leftChildIdx);
                if (rightChildIdx < size)
                    childrenToVisit.add(rightChildIdx);
            }
        }
        return -1;
    }
}
```
Answer 1
Score: 1
You are doing roughly as many writes as reads, if not more, and moreover you are doing them in a very inefficient way; it is possible to get some improvement if you replace the `LinkedList` with an `int[]`:
```java
final int size = list.size();
if (size <= 0) {
    return -1;
}
// array-backed FIFO queue of heap indices; each index is enqueued at most once
int[] queue = new int[size];
int last = 0;
for (int index = 0; index <= last; index++) {
    int node = queue[index]; // heap index currently being visited
    int cmp = list.get(node).compareTo(value);
    if (cmp == 0) {
        return node;
    }
    if (cmp > 0) {
        continue; // min-heap: all descendants are even larger, skip the subtree
    }
    int right = (node + 1) * 2;
    int left = right - 1;
    if (left < size) {
        queue[++last] = left;
        if (right < size) {
            queue[++last] = right;
        }
    }
}
return -1;
```
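An `int[]` of length `size` can never overflow here, because each heap index is enqueued at most once; the traversal also allocates a single array up front instead of one `LinkedList` node per visited element.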
Since in the worst case the `LinkedList` will contain `N/2` elements, such a replacement is reasonable; however, it still performs worse than a DFS approach:
```java
int indexOf(T value, int index, int size) {
    if (index >= size) {
        return -1;
    }
    int cmp = list.get(index).compareTo(value);
    if (cmp == 0) {
        return index;
    }
    int res = -1;
    if (cmp > 0) {
        return res;
    }
    int rIdx = (index + 1) * 2;
    int lIdx = rIdx - 1;
    if ((res = indexOf(value, rIdx, size)) >= 0) {
        return res;
    }
    return indexOf(value, lIdx, size);
}
```
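Presumably this is meant to be called as `indexOf(value, 0, list.size())`, so the recursion starts at the root and backs out of any subtree whose root already compares greater than the value being searched for.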
The idea of optimizing away some paths by storing information about them somewhere is not a great fit for this particular case - you may as well store the indexes in a `HashMap` and achieve `O(1)` time complexity for the search operation while using a comparable amount of memory.
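To make that last suggestion concrete, here is a minimal sketch of such an index-augmented heap, assuming all elements are distinct; the `IndexedHeap` name and the `indexByValue` field are illustrative, not part of the original code, and any removal or sift-down operation would have to keep the map in sync in the same way:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Sketch: keep a value -> index map in sync with the backing array,
// so lookups are O(1) at the cost of extra memory (distinct elements assumed).
class IndexedHeap<T extends Comparable<T>> {

    private final ArrayList<T> list = new ArrayList<>();
    private final Map<T, Integer> indexByValue = new HashMap<>();

    public void add(T value) {
        list.add(value);
        indexByValue.put(value, list.size() - 1);
        siftUp(list.size() - 1, value);
    }

    public int indexOf(T value) {
        // O(1) lookup instead of scanning the array or walking the tree
        return indexByValue.getOrDefault(value, -1);
    }

    public T peek() {
        return list.isEmpty() ? null : list.get(0);
    }

    private void siftUp(int i, T value) {
        while (i > 0) {
            int parentI = (i - 1) / 2;
            final T parentValue = list.get(parentI);
            if (parentValue.compareTo(value) <= 0)
                return;
            // swap in the array and record both new positions in the map
            list.set(parentI, value);
            list.set(i, parentValue);
            indexByValue.put(value, parentI);
            indexByValue.put(parentValue, i);
            i = parentI;
        }
    }
}
```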
Answer 2
Score: 0
First of all, the results obtained with

```java
final long nanoBefore = System.nanoTime();
//...
final long nanoAfter = System.nanoTime();
```

are not reliable at all. To get reliable results you need to use JMH (a minimal harness is sketched at the end of this answer).

Second, in `indexOfSlow()` you use a `LinkedList`, which is highly inefficient.

Third, in `indexOf()` you don't allocate any memory, iterating over the same list again and again. In `indexOfSlow()` you allocate a new `LinkedList` for each method invocation, which obviously costs time and puts pressure on the GC.

Fourth, in spite of the fact that in theory linear search has O(n) complexity, in practice your example searches over an `ArrayList`, which is array-based and extremely fast when you access elements by index (even if you need to iterate over the whole list). In `indexOfSlow()`, on the contrary, you rely on continuous `poll()`/`add()` operations. Imagine you are looking for `49733`: for the values `1`, `2`, `3` etc. you execute both a `poll()` and an `add()` operation, so before returning the value you are looking for, your `childrenToVisit` list is modified 99734 (!!!) times. This algorithm is therefore implemented extremely inefficiently, and that is why you see such a dramatic difference in the numbers.
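For what it's worth, a minimal JMH harness for this comparison could look roughly like the sketch below; the class name, parameter values, and iteration counts are illustrative assumptions, not something from the question, and it would be run via JMH's runner rather than a hand-rolled timing loop:

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Hypothetical JMH benchmark comparing the two lookup strategies on the Heap from the question.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
public class HeapSearchBenchmark {

    @Param({"10000", "50000"})
    int elementCount;

    Heap<Integer> heap;

    @Setup(Level.Trial)
    public void setUp() {
        heap = new Heap<>();
        for (int i = 0; i < elementCount; i++)
            heap.add(i);
    }

    @Benchmark
    public int linearIndexOf() {
        int sum = 0;
        for (int i = elementCount - 1; i >= 0; i--)
            sum += heap.indexOf(i); // return the accumulated result so the JIT cannot optimize the calls away
        return sum;
    }

    @Benchmark
    public int treeIndexOfSlow() {
        int sum = 0;
        for (int i = elementCount - 1; i >= 0; i--)
            sum += heap.indexOfSlow(i);
        return sum;
    }
}
```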