Scala List sort vs Python list sort. Execution time comparison

huangapple go评论67阅读模式
英文:

Scala List sort vs Python list sort. Execution time comparison

问题

Scala List sort vs Python list sort. Execution time comparison.

我正在学习Scala。
我有兴趣比较Scala的List排序和Python的列表排序。
令我惊讶的是,在Python中对包含1,000,000个整数的列表进行排序比在Scala中快5倍。
据我所知,Scala比Python快。
谁能解释Python在这方面具有如此大的优势的原因?

Scala代码:

import scala.util.Random

@main
def Main(args: String*): Unit =
  def cmillis = System.currentTimeMillis()
  val n = 1_000_000

  val xs = List.fill(n)(Random.nextInt(n))

  var currentMillis = cmillis
  xs.sorted
  println("Sorted in " + (cmillis - currentMillis) + "ms")

Python代码:

from time import time
from random import randint

n = 1_000_000

xs = [randint(0, n) for _ in range(n)]

start = time()
sorted(xs)
print(f"Sorted in {(time() - start)* 1000} ms")
英文:

Scala List sort vs Python list sort. Execution time comparison.

I'm learning Scala right now.
I was interested to compare Scala's List sort with Python's list sort.
To my surprise, sorting a list of 1_000_000 integers works 5 times faster in Python than in Scala.
As far as I know, Scala is faster than python.
Who can explain the reason for such a large advantage of Python over Scala?

Scala code:

import scala.util.Random

@main
def Main(args: String*): Unit =
  def cmillis = System.currentTimeMillis()
  val n = 1_000_000

  val xs = List.fill(n)(Random.nextInt(n))

  var currentMillis = cmillis
  xs.sorted
  println("Sorted in " + (cmillis - currentMillis) + "ms")

Python code:

from time import time
from random import randint

n = 1_000_000

xs = [randint(0, n) for _ in range(n)]

start = time()
sorted(xs)
print(f"Sorted in {(time() - start)* 1000} ms")

答案1

得分: 2

Python是地球上最快的语言,如果你调用C代码的话。;-)

开玩笑的话,这里的差异可以解释为你在Scala中使用了List,这是一个单链表。虽然它是一个非常简单的数据结构,但它的局部性很差(列表的节点可以位于堆上的任何位置,导致频繁的缓存未命中),更重要的是其结构使得需要索引的操作(包括排序)相当低效。

你可以在Scala标准库中查看数据结构的比较这里

为了进行公平比较,你应该选择具有类似后端结构的数据结构,并使用可比较的排序方法。如果Python列表由可变数组支持,与不可变链表进行比较将总是让可变数组获胜。如果是这种情况,你可以在Scala中使用ArraySeq,而不是sorted(它会创建一个新的结构),你可以使用sortInPlace

关于你的基准测试还有一些注意事项:

  • 不应该使用System.currentTimeMillis(),因为它不能保证是单调的,对于基准测试,System.nanoTime()可以确保你测量的是经过的时间。
  • 你应该运行多次基准测试,以确保你有一些具有统计学意义的结果。在JVM上,你可以使用JMH执行微基准测试,这会考虑到这一点,并且还允许你运行几次以“热身”JVM(这可以将你的字节码中的热点动态编译成本地代码以加快执行速度)。我不确定Python生态系统中是否有类似的工具,但我确信会有的。

请注意,根据Python中sorted的具体实现,你可能仍然会看到它在执行时间上胜过Scala(回到我最初的玩笑,如果它调用C代码,很难击败它)。一般来说,我期望Scala在执行时间上优于Python代码,但这当然取决于你正在测试什么。


编辑1

一位评论者正确指出,如果确实是通过索引进行排序的话,5倍的因子似乎过于快了。我查看了代码,Seq(包括List)似乎通过将自己复制到一个Array中进行排序(代码),这可以解释为什么这很慢,但不像通过索引和交换对不可变链表进行天真排序那么慢。

编辑2

出于纯粹的好奇,我尝试在本地运行了这个天真的基准测试。你的Scala代码在我的笔记本电脑上运行了约650毫秒,而Python代码则花费了约250毫秒。只需将List更改为ArraySeq,将sorted更改为sortInPlace,Scala版本运行时间约为200毫秒。虽然这不具备统计学意义,但这表明这些更改似乎产生了预期的影响。你可能希望使用更科学的方法进行进一步的基准测试,以获得更好的结果。

英文:

Python is the fastest language on Earth, if you call C code. Scala List sort vs Python list sort. Execution time comparison

Jokes aside, the difference here can be explained by the fact that you are using List in Scala, which is a singly-linked list. While it's a very simple data structure, it has poor locality (the nodes of the list can be anywhere on the heap, leading to frequent cache misses) and more importantly its structure makes operations requiring indexing (including sorting) quite inefficient.

You can see a comparison of data structure in the Scala standard library here.

To make a fair comparison, you should select data structures that have a similar backing structure and use a sorting approach that is also comparable. If Python lists are backed by mutable arrays, comparing with immutable linked lists will always have mutable arrays win. If that's the case, you can use ArraySeq in Scala, and instead of sorted (which creates a new structure) you can use sortInPlace.

A couple more notes on your benchmark:

  • System.currentTimeMillis() shouldn't be used because it's not guaranteed to be monotonic, for benchmarks System.nanoTime() makes sure you are measuring elapsed time
  • You should run a benchmark many times to make sure you have some statistically relevant result. On the JVM you can use JMH to perform microbenchmarks that take this in consideration and also allows you to make a few runs to "warm up" the JVM (which can dynamically compile hot spots in your bytecode to native code for faster execution). I'm not sure about similar tools in the Python ecosystem, but I'm sure there's one.

Note that depending on the specific implementation of sorted in Python, you might still see it outperforming Scala (going back to my initial joke, if it calls C code, it's hard to beat it). I would generally expect Scala to have an advantage over Python code in terms of execution time, but this of course depends on what you are testing.


EDIT 1

A commenter correctly pointed out that a 5x factor seems too fast if indeed sorting on Seqs is done with a naive approach using indexing. I looked into the code and Seqs (including Lists) seem to perform the sorting by copying themselves into an Array and sorting that (code), which would explain why this is slow but not as slow as naively sorting an immutable linked list by indexing and swapping.

EDIT 2

Just out of pure curiosity, I tried to run the naive benchmark locally. Your Scala code ran in ~650ms on my laptop and the Python code took ~250ms. Just changing the List into an ArraySeq and sorted into sortInPlace makes the Scala version run in ~200ms. While this is not statistically relevant, it's an indication that those changes seem to have the expected impact. You might want to perform further benchmarks using a more scientific approach to get better results.

huangapple
  • 本文由 发表于 2023年6月5日 03:47:26
  • 转载请务必保留本文链接:https://go.coder-hub.com/76402142.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定