Why is my sorting algorithm slower with a NumPy array than with a list?


Question

I noticed that you were trying to optimize your bitonic sort algorithm by using a NumPy array instead of a list, but it ended up being slower. One potential issue could be related to the way you converted the input data to a NumPy array.

In the part where you read the data for the NumPy array, you use np.array(line[1:-2].split(', ')).astype('int32'), which first builds an array of strings and then converts it; for large inputs this can be less efficient than working with a plain list of integers. NumPy arrays pay off mainly when you perform vectorized operations, so if your code touches elements one at a time instead of using NumPy's vectorized capabilities, it can easily end up slower.

To potentially improve the performance with NumPy, you can try the following:

with open('data.txt') as f:
    line = f.readline()
    # build the int32 array straight from an iterator of ints, skipping the intermediate string array
    a = np.fromiter(map(int, line[1:-2].split(', ')), dtype=np.int32)

This code uses np.fromiter to directly create a NumPy array from the iterable of integers, which may be more efficient for your use case. However, the overall performance improvement may still depend on various factors, including the size of your data and the specific operations you perform with the array.
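
If you want to check whether this parsing change actually matters on your machine, a minimal benchmark along the following lines can tell you. The synthetic line built here is only a stand-in for the contents of data.txt, and the timings will vary with your data:

import timeit
import numpy as np

# synthetic stand-in for the single line read from data.txt
line = ', '.join(map(str, range(1_000_000)))

t_astype = timeit.timeit(
    lambda: np.array(line.split(', ')).astype('int32'), number=5)
t_fromiter = timeit.timeit(
    lambda: np.fromiter(map(int, line.split(', ')), dtype=np.int32), number=5)

print(f"np.array(...).astype('int32'): {t_astype:.2f} s for 5 runs")
print(f"np.fromiter(..., dtype=int32): {t_fromiter:.2f} s for 5 runs")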

English:

I was trying to sort a list with sequential bitonic sort and wanted to make it faster by sorting a numpy array instead of a list, but it only became slower. What did I do wrong?

Here is the sorting algo:

from datetime import datetime
import numpy as np


def compAndSwap(a, i, j, dire):
    # swap a[i] and a[j] if they are out of order for the requested direction
    # (dire == 1 means ascending, dire == 0 means descending)
    if (dire == 1 and a[i] > a[j]) or (dire == 0 and a[i] < a[j]):
        a[i], a[j] = a[j], a[i]


def bitonicMerge(a, low, cnt, dire):
    # merge a bitonic sequence of length cnt starting at index low
    if cnt > 1:
        k = cnt // 2
        for i in range(low, low + k):
            compAndSwap(a, i, i + k, dire)
        bitonicMerge(a, low, k, dire)
        bitonicMerge(a, low + k, k, dire)


def bitonicSort(a, low, cnt, dire):
    # sort one half ascending and the other descending, then merge them
    if cnt > 1:
        k = cnt // 2
        bitonicSort(a, low, k, 1)
        bitonicSort(a, low + k, k, 0)
        bitonicMerge(a, low, cnt, dire)


def sort_(a, N, up):
    # N (the number of elements) must be a power of two
    bitonicSort(a, 0, N, up)

And here is the part that runs this algo on a list:

with open('data.txt') as f:
    line = f.readline()
    a = line[1:-2].split(', ')
    a = list(map(int, a))

n = len(a)
up = 1

time1 = datetime.now()
sort_(a, n, up)
time2 = datetime.now()

print("\nCurrent Time =", time2 - time1)

And here is the same for a numpy array:


with open('data.txt') as f:
    line = f.readline()
    a = np.array(line[1:-2].split(', ')).astype('int32')

n = a.size
up = 1

time1 = datetime.now()
sort_(a, n, up)
time2 = datetime.now()

print("\nCurrent Time =", time2 - time1)

What did I miss?

Answer 1

Score: 4


You got numpy wrong. Numpy is actually a scientific computing package that is best suited for performing vectorized operations, such as array multiplications. It's not optimized for accessing individual items in an array one at a time.

If you try this simple test:

data = np.random.randint(0, 10000, 1000000)
a = data.tolist()

then measure the time taken by a simple loop that does no additional work:

%timeit for i in range(data.shape[0]): _ = data[i] # 145 ms ± 50.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Do the same for the list:

%timeit for i in range(len(a)): _ = a[i] # 85.1 ms ± 28.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You will see that accessing an item in a numpy array takes more time than accessing one in a plain list.
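
(Note that %timeit is an IPython magic. A rough plain-Python equivalent of the same comparison, using the standard timeit module, could look like the sketch below; the exact numbers will differ from those quoted above.)

import timeit
import numpy as np

data = np.random.randint(0, 10000, 1_000_000)
a = data.tolist()

def scan_array():
    # every data[i] creates a fresh NumPy scalar object
    for i in range(data.shape[0]):
        _ = data[i]

def scan_list():
    # a[i] just returns an existing Python int
    for i in range(len(a)):
        _ = a[i]

print("numpy array:", timeit.timeit(scan_array, number=10), "s for 10 loops")
print("list:       ", timeit.timeit(scan_list, number=10), "s for 10 loops")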

However, utilizing the pre-optimized functions from the numpy library can lead to a significant difference in performance.

%timeit np.sort(data, kind='mergesort') # 100 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

whereas if you use sorted on a list:

%timeit sorted(a) # 449 ms ± 69.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This comparison is not entirely accurate, since sorted uses Timsort, which is a modified version of mergesort. Nevertheless, the difference in performance is still substantial.
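
To make "vectorized operations" concrete for the code in the question: the inner compare-and-swap loop of bitonicMerge can be replaced by operations on whole slices when a is a NumPy array. The function below is only a sketch of that idea, not part of the answer above; like bitonic sort itself, it assumes the length being merged is a power of two.

import numpy as np

def bitonicMergeVec(a, low, cnt, dire):
    # same merge step as bitonicMerge, but the per-element loop is replaced
    # by operations on whole slices (views into a, so writes propagate back)
    if cnt > 1:
        k = cnt // 2
        left = a[low:low + k]
        right = a[low + k:low + cnt]
        lo_vals = np.minimum(left, right)
        hi_vals = np.maximum(left, right)
        if dire == 1:              # ascending: smaller values go to the left half
            left[:], right[:] = lo_vals, hi_vals
        else:                      # descending: larger values go to the left half
            left[:], right[:] = hi_vals, lo_vals
        bitonicMergeVec(a, low, k, dire)
        bitonicMergeVec(a, low + k, k, dire)

Dropping this in place of bitonicMerge keeps the recursion in Python but moves the element-wise comparisons into NumPy, which is where the array representation actually pays off; for a plain sort, np.sort as shown above remains the simplest and fastest option.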
