2023年2月26日 20:39:31go评论95阅读模式

英文:

Unexpected timing result after micro-optimization

问题

I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap my head around. I would be very thankful for your suggestions.

Following the code:

def findSmallest(arr):
    smallest = arr[0]
    smallest_indx = 0
    for i in range(1, len(arr)):
        if arr[i] &lt; smallest:
            smallest = arr[i]
            smallest_indx = i
    return smallest_indx
def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr
def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(arr.pop(smallest))
    return newArr
def selectionSort3(arr):
    ap = arr.pop
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(ap(smallest))
    return newArr
import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]
if __name__ == &#39;__main__&#39;:
    import timeit
    print(timeit.timeit(&quot;selectionSort1(test1)&quot;, setup=&quot;from __main__ import test1, selectionSort1&quot;))
    print(timeit.timeit(&quot;selectionSort2(test2)&quot;, setup=&quot;from __main__ import test2, selectionSort2&quot;))
    print(timeit.timeit(&quot;selectionSort3(test3)&quot;, setup=&quot;from __main__ import test3, selectionSort3&quot;))

On my computer:

3.8686506970000005 #selectionSort1
3.961112386        #selectionSort2
4.0788594190000005 #selectionSort3

The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both lists out of the loop-scope, it should give me the best result. As you've seen from the given results, this isn't the case, and I would be very happy with any help. Thank you in advance

Note: For sure this type of optimization would be relevant for very big lists

英文:

I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap around my head. I would be very thankful for your suggestions.

Following the code :

def findSmallest(arr):
    smallest = arr[0]
    smallest_indx = 0
    for i in range(1, len(arr)):
        if arr[i] &lt; smallest:
            smallest = arr[i]
            smallest_indx = i
    return smallest_indx
def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr
def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(arr.pop(smallest))
    return newArr
def selectionSort3(arr):
    ap = arr.pop
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(ap(smallest))
    return newArr
import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]
if __name__ == &#39;__main__&#39;:
    import timeit
    print(timeit.timeit(&quot;selectionSort1(test1)&quot;, setup=&quot;from __main__ import test1, selectionSort1&quot;))
    print(timeit.timeit(&quot;selectionSort2(test2)&quot;, setup=&quot;from __main__ import test2, selectionSort2&quot;))
    print(timeit.timeit(&quot;selectionSort3(test3)&quot;, setup=&quot;from __main__ import test3, selectionSort3&quot;))

On my computer :

3.8686506970000005 #selectionSort1
3.961112386        #selectionSort2
4.0788594190000005 #selectionSort3

The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both list out of the loop-scope should give me back the best result. As you've seen from the given results this isn't the case and will be very happy with any help. Thank you in advance

Note: For sure this type of optimization would be relevant for very big lists

答案1

得分: 1

让我们深入研究一下。

对于性能分析，我真的很喜欢cProfile。

现在在比较这些方法之前，我们可以得到其中一个的报告：

if __name__ == '__main__':
    import cProfile
    cProfile.run('selectionSort1(test1)')

然后可以看到：

40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    3.408    3.408 <string>:1(<module>)
10000    3.390    0.000    3.391    0.000 playground.py:1(findSmallest)
1    0.010    0.010    3.408    3.408 playground.py:12(selectionSort1)
1    0.000    0.000    3.408    3.408 {built-in method builtins.exec}
10001    0.001    0.000    0.001    0.000 {built-in method builtins.len}
10000    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000    0.006    0.000    0.006    0.000 {method 'pop' of 'list' objects}

绝大多数运行时间都花在了findSmallest函数上。可以运行所有三种方法来进行检查，可以看到在所有三种方法中都是如此。因此，说它使这些方法之间的差异微不足道将是一种轻描淡写。

所以让我们进一步减少实验以获得更好的隔离，并在此过程中扩大规模：

def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        newArr.append(0)
    return newArr
def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        na(0)
    return newArr
if __name__ == '__main__':
    import random as r
    import cProfile
    test = r.sample(range(0, 10000000000), 10000000)
    test1 = test[:]
    test2 = test[:]
    cProfile.run('selectionSort1(test1)')
    cProfile.run('selectionSort2(test2)')

现在：

10000005 function calls in 2.421 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.021    0.021    2.421    2.421 <string>:1(<module>)
    1    1.377    1.377    2.400    2.400 playground.py:82(selectionSort1)
    1    0.000    0.000    2.421    2.421 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
10000000    1.023    0.000    1.023    0.000 {method 'append' of 'list' objects}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000005 function calls in 2.130 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.021    0.021    2.130    2.130 <string>:1(<module>)
    1    1.089    1.089    2.109    2.109 playground.py:88(selectionSort2)
    1    0.000    0.000    2.130    2.130 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
10000000    1.020    0.000    1.020    0.000 {method 'append' of 'list' objects}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

微小的优化可以清楚地看到。

最后一点 - 在分析和尝试提高性能时，请考虑大O符号。一般来说，在对象属性中搜索方法的复杂度是O(1)，而从列表中弹出指定元素的复杂度通常是O(n)，其中n是列表的长度。

无论如何，我相信还有更好的方法来探究这种优化的影响程度，但我认为这是一个不错的开始。祝好运！

英文:

Let's dive into that.

For performance analysis I really like cProfile.

Now before comparing the methods, we can get a report on one of them:

if __name__ == &#39;__main__&#39;:
    import cProfile
    cProfile.run(&#39;selectionSort1(test1)&#39;)

And see that:

40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    3.408    3.408 &lt;string&gt;:1(&lt;module&gt;)
10000    3.390    0.000    3.391    0.000 playground.py:1(findSmallest)
    1    0.010    0.010    3.408    3.408 playground.py:12(selectionSort1)
    1    0.000    0.000    3.408    3.408 {built-in method builtins.exec}
10001    0.001    0.000    0.001    0.000 {built-in method builtins.len}
10000    0.001    0.000    0.001    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
    1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}
10000    0.006    0.000    0.006    0.000 {method &#39;pop&#39; of &#39;list&#39; objects}

The vast majority of running time is spent in findSmallest. One can run all three methods for sanity, and see that this is the case in all three of them. So saying that it makes the difference between the methods insignificant would be an understatement.

So let's further reduce the experiment to get better isolation, and grow in scale while we're at it:

def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        newArr.append(0)
    return newArr
def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        na(0)
    return newArr
if __name__ == &#39;__main__&#39;:
    import random as r
    import cProfile
    test = r.sample(range(0, 10000000000), 10000000)
    test1 = test[:]
    test2 = test[:]
    cProfile.run(&#39;selectionSort1(test1)&#39;)
    cProfile.run(&#39;selectionSort2(test2)&#39;)

And now:

         10000005 function calls in 2.421 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.021    0.021    2.421    2.421 &lt;string&gt;:1(&lt;module&gt;)
        1    1.377    1.377    2.400    2.400 playground.py:82(selectionSort1)
        1    0.000    0.000    2.421    2.421 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
 10000000    1.023    0.000    1.023    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
        1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}
         10000005 function calls in 2.130 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.021    0.021    2.130    2.130 &lt;string&gt;:1(&lt;module&gt;)
        1    1.089    1.089    2.109    2.109 playground.py:88(selectionSort2)
        1    0.000    0.000    2.130    2.130 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
 10000000    1.020    0.000    1.020    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
        1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}

The micro optimization can be seen clearly.

One last note - consider big O notation when analyzing and trying to improve performance. Generally speaking the complexity of searching a method in an object's attributes is O(1), while even popping a specified element from a list is generally O(n) where n is the length of the list.

Any way, I believe there are even better ways to explore the magnitude of impact of this particular optimization, but I think this is a good start. Good luck!

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

在微优化后出现意外的时序结果

问题

答案1

“Unwanted result web scrapping” 可以翻译为 “不需要的结果网络抓取”。

最少步骤到达1

unbound method init() error with python2 but not with python3

如何更高效地下载WHL文件？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。