为什么对排序数组进行求和比对未排序数组进行求和更快?

huangapple go评论76阅读模式
英文:

Why is summing a sorted array faster than an unsorted one?

问题

I have an ArrayList with numbers from 10 MM to 20 MM

 List<Long> l = new ArrayList<>();
        for (long i = 10_000_000; i < 20_000_000; i++) {
            l.add(i);
        }

Why is the sum() function faster on a sorted array than on an unsorted?

public class Main {

    public static void main(String[] args) {
        List<Long> l = new ArrayList<>();
        for (long i = 10_000_000; i < 20_000_000; i++) {
            l add(i);
        }

        System.out.println(sum(l));

        Collections.shuffle(l);
        System.out.println(sum(l));

        Collections.sort(l);
        System.out.println(sum(l));
    }

    static long sum(List<Long> l) {
        long started = System.nanoTime();
        long result = 0;
        for (long v : l) {
            result += v;
        }
        System.out.println(" duration: " + (System.nanoTime() - started) / 1_000_000 + "ms");
        return result;
    }

}

I execute the sum() function three times

  • First time, on a sorted array
  • Second time on unsorted
  • And the third time again on sorted

On my computer, the execution time was

  1. 85ms

  2. 250ms

  3. 97ms

Why is the sum() function faster on a sorted array than on an unsorted?

英文:

I have an ArrayList with numbers from 10 MM to 20 MM

 List<Long> l = new ArrayList<>();
        for (long i = 10_000_000; i < 20_000_000; i++) {
            l.add(i);
        }

Why is the sum() function faster on a sorted array than on an unsorted?

public class Main {

    public static void main(String[] args) {
        List<Long> l = new ArrayList<>();
        for (long i = 10_000_000; i < 20_000_000; i++) {
            l.add(i);
        }

        System.out.println(sum(l));

        Collections.shuffle(l);
        System.out.println(sum(l));

        Collections.sort(l);
        System.out.println(sum(l));
    }

    static long sum(List<Long> l) {
        long started = System.nanoTime();
        long result = 0;
        for (long v : l) {
            result += v;
        }
        System.out.println(" duration: " + (System.nanoTime() - started) / 1_000_000 + "ms");
        return result;
    }

}

I execute the sum() function three times

  • First time, on a sorted array
  • Second time on unsorted
  • And the third time again on sorted

On my computer, the execution time was

  1. 85ms

  2. 250ms

  3. 97ms

Why is the sum() function faster on a sorted array than on an unsorted?

答案1

得分: 5

不仅仅是排序的问题,而是内存访问的顺序。

我将简要描述一些实验证明这一点:

  • 尝试使用long[]而不是ArrayList<Long>。然后,没有单独分配的Long对象,只有一个包含数据的大数组。这使得sum的内存访问顺序与数组的内容无关。我没有执行这个实验,因为编写代码需要更多工作。

  • 尝试使用随机顺序的值初始化列表。我尝试过(还将列表稍微缩短以节省时间),得到了以下结果:

    持续时间:31毫秒
    -6262510617230624864
    持续时间:77毫秒
    -6262510617230624864
    持续时间:160毫秒
    -6262510617230624864
    

    即最快的顺序与创建Long对象的顺序相同,与它们的值无关。

  • 在洗牌后按顺序重新创建Long对象,例如:

      for (int i = 0; i < l.size(); i++)
      	l.set(i, new Long(l.get(i)));
    

    这会再次使洗牌列表的sum操作变得快速(随后对列表进行排序会改变内存访问模式并再次使sum变慢)。

实际上,并没有保证连续分配的Long对象将按照相同的顺序分配到内存中,但当分配了足够多的对象并且同时没有发生太多干扰时,这是一个相当常见的情况。

英文:

It's not about being sorted per-se, it's about the order in which memory is accessed.

I will briefly describe a few experiments to confirm that:

  • Try using long[] instead of ArrayList<Long>. Then there are no individually allocated Long objects, just one big array that directly contains the data. That makes the memory access order of sum independent of the contents of the array. I did not perform this experiment since it would take a bit more work to write the code.

  • Try initializing the list with randomly-ordered values. I tried that (I also made the list a bit shorter to save time) and got this result:

    duration: 31ms
    -6262510617230624864
    duration: 77ms
    -6262510617230624864
    duration: 160ms
    -6262510617230624864
    

    ie the fastest order is the same as the order in which the Long objects were created, independently of their values.

  • Recreate the Long objects in-order after shuffling, for example:

      for (int i = 0; i < l.size(); i++)
      	l.set(i, new Long(l.get(i)));
    

    That makes the sum of the shuffled list fast again (subsequently sorting the list un-orders the memory access pattern and makes sum slow again).

There is actually no guarantee that Long objects that are allocated one after the other will be allocated in the same order in memory, but that is a pretty common thing to happen, when a sufficiently large number of them is allocated and not much disturbance happens meanwhile.

huangapple
  • 本文由 发表于 2023年3月7日 03:12:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/75654911.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定