Why is my sorting algorithm slower with a NumPy array than with a list?


Problem

I noticed that you were trying to speed up your bitonic sort by using a NumPy array instead of a list, but it ended up being slower. One potential issue is the way you convert the input data to a NumPy array.

In the part where you read the data for the NumPy array, you use np.array(line[1:-2].split(', ')).astype('int32'). This first builds an array of strings and then casts it to integers, which can be less efficient for large inputs than building the integer array directly. NumPy arrays pay off mainly when you perform vectorized operations; if you are not taking advantage of NumPy's vectorized capabilities, element-by-element code can easily end up slower than the same code on plain lists.
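
To see what that expression actually does, here is a tiny illustration with placeholder values (the sample numbers are not from the question's data):

import numpy as np

parts = ['12', '7', '30']       # stand-in for line[1:-2].split(', ')
tmp = np.array(parts)           # this is a unicode string array, dtype '<U2'
arr = tmp.astype('int32')       # a second pass parses the strings into int32
print(tmp.dtype, arr.dtype)     # <U2 int32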

To potentially improve the performance with NumPy, you can try the following:

import numpy as np

with open('data.txt') as f:
    line = f.readline()
    # Build the int32 array straight from an iterator of ints, with no intermediate string array.
    a = np.fromiter(map(int, line[1:-2].split(', ')), dtype=np.int32)

This code uses np.fromiter to directly create a NumPy array from the iterable of integers, which may be more efficient for your use case. However, the overall performance improvement may still depend on various factors, including the size of your data and the specific operations you perform with the array.
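
If you want to check how much the conversion itself costs before changing anything else, a small timing comparison along these lines can help (a sketch that assumes the same data.txt layout as in the question; the relative numbers depend on the size of the file):

import timeit
import numpy as np

with open('data.txt') as f:
    parts = f.readline()[1:-2].split(', ')

t_astype = timeit.timeit(lambda: np.array(parts).astype('int32'), number=10)
t_fromiter = timeit.timeit(lambda: np.fromiter(map(int, parts), dtype=np.int32), number=10)
print('astype:  ', t_astype)
print('fromiter:', t_fromiter)

Note that the conversion is a one-off cost, so this alone will not change how long the sort itself takes.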

Original question:

I was trying to sort a list with sequential bitonic sort and wanted to make it faster by sorting a numpy array instead of a list, but it only became slower. What did I do wrong?

Here is the sorting algorithm:

from datetime import datetime
import numpy as np


def compAndSwap(a, i, j, dire):
    # Swap a[i] and a[j] if they are out of order for the requested direction
    # (dire == 1 sorts ascending, dire == 0 sorts descending).
    if (dire == 1 and a[i] > a[j]) or (dire == 0 and a[i] < a[j]):
        a[i], a[j] = a[j], a[i]


def bitonicMerge(a, low, cnt, dire):
    # Merge a bitonic run of length cnt starting at index low into a
    # monotonic run in direction dire.
    if cnt > 1:
        k = cnt // 2
        for i in range(low, low + k):
            compAndSwap(a, i, i + k, dire)
        bitonicMerge(a, low, k, dire)
        bitonicMerge(a, low + k, k, dire)


def bitonicSort(a, low, cnt, dire):
    # Recursively build a bitonic run (ascending first half, descending
    # second half), then merge it in direction dire.
    if cnt > 1:
        k = cnt // 2
        bitonicSort(a, low, k, 1)
        bitonicSort(a, low + k, k, 0)
        bitonicMerge(a, low, cnt, dire)


def sort_(a, N, up):
    # Sort the first N elements of a in place; up == 1 means ascending.
    bitonicSort(a, 0, N, up)

And here is the part where I run this algorithm on a list:

with open('data.txt') as f:
    line = f.readline()
    # Drop the enclosing characters (e.g. '[' and ']\n') and split into number strings
    a = line[1:-2].split(', ')
    a = list(map(int, a))


n = len(a)

up = 1

time1 = datetime.now()

sort_(a, n, up)

time2 = datetime.now()


print("\nCurrent Time =", time2-time1)

And here is the same for a NumPy array:


with open('data.txt') as f:
    line = f.readline()
    a = np.array(line[1:-2].split(', ')).astype('int32')


n = a.size
up = 1

time1 = datetime.now()

sort_(a, n, up)

time2 = datetime.now()

print("\nCurrent Time =", time2-time1)

What did I miss?

Answer 1

Score: 4


You got NumPy wrong. NumPy is a scientific computing package that is best suited to vectorized operations, such as whole-array arithmetic; it is not optimized for accessing individual items of an array one at a time, which is exactly what compAndSwap does on every call.

If you try this simple test:

import numpy as np

data = np.random.randint(0, 10000, 1000000)  # one million random integers as a NumPy array
a = data.tolist()                            # the same values as a plain Python list

then measure the time taken by a simple loop that does nothing but read each element:

%timeit for i in range(data.shape[0]): _ = data[i] # 145 ms ± 50.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Do the same for the list:

%timeit for i in range(len(a)): _ = a[i] # 85.1 ms ± 28.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You will see that accessing items of a NumPy array one at a time takes more time than doing the same with a list.
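
The flip side is that work expressed as a single whole-array operation runs in optimized C code. As a rough contrast, still using the data array and the list a defined above (a sketch; the exact timings depend on your machine):

%timeit data * 2            # one vectorized pass over the whole array
%timeit [x * 2 for x in a]  # the same arithmetic done item by item in Python

The first spends almost no time in the Python interpreter, while the second pays the per-item overhead a million times.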

The same goes for NumPy's pre-optimized functions: using them instead of Python-level loops makes a significant difference in performance.

%timeit np.sort(data, kind='mergesort') # 100 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

while if you use sorted on the list:

%timeit sorted(a) # 449 ms ± 69.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This comparison is not entirely fair, since sorted uses Timsort, a hybrid of merge sort and insertion sort, rather than a plain mergesort. Nevertheless, the difference in performance is still substantial.
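
If you want to keep the bitonic sort itself rather than switch to np.sort, the way to benefit from NumPy is to replace the Python-level compare-and-swap loop with whole-slice operations. The following is only a sketch of that idea (the function name bitonicMergeVec is mine, and it assumes the input length is a power of two, as the original code already does):

import numpy as np

def bitonicMergeVec(a, low, cnt, dire):
    # Vectorized variant of bitonicMerge: compare the two halves of the
    # bitonic run as whole slices and swap all out-of-order pairs at once.
    if cnt > 1:
        k = cnt // 2
        left = a[low:low + k]           # views into a, so assignments update a in place
        right = a[low + k:low + cnt]
        mask = (left > right) if dire == 1 else (left < right)
        tmp = left[mask]                # boolean indexing returns a copy of the selected values
        left[mask] = right[mask]
        right[mask] = tmp
        bitonicMergeVec(a, low, k, dire)
        bitonicMergeVec(a, low + k, k, dire)

bitonicSort would call this in place of bitonicMerge; the recursive structure is unchanged, but each merge level now performs a handful of array operations instead of one Python function call per pair of elements.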
