在微优化后出现意外的时序结果

huangapple go评论69阅读模式
英文:

Unexpected timing result after micro-optimization

问题

I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap my head around. I would be very thankful for your suggestions.

Following the code:

def findSmallest(arr):
    smallest = arr[0]
    smallest_indx = 0

    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_indx = i
    return smallest_indx

def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr

def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(arr.pop(smallest))
    return newArr

def selectionSort3(arr):
    ap = arr.pop
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(ap(smallest))
    return newArr

import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("selectionSort1(test1)", setup="from __main__ import test1, selectionSort1"))
    print(timeit.timeit("selectionSort2(test2)", setup="from __main__ import test2, selectionSort2"))
    print(timeit.timeit("selectionSort3(test3)", setup="from __main__ import test3, selectionSort3"))

On my computer:

3.8686506970000005 #selectionSort1
3.961112386        #selectionSort2
4.0788594190000005 #selectionSort3

The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both lists out of the loop-scope, it should give me the best result. As you've seen from the given results, this isn't the case, and I would be very happy with any help. Thank you in advance 在微优化后出现意外的时序结果

Note: For sure this type of optimization would be relevant for very big lists

英文:

I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap around my head. I would be very thankful for your suggestions.

Following the code :

def findSmallest(arr):
    smallest = arr[0]
    smallest_indx = 0

    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_indx = i
    return smallest_indx


def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr

def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(arr.pop(smallest))
    return newArr

def selectionSort3(arr):
    ap = arr.pop
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        na(ap(smallest))
    return newArr


import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("selectionSort1(test1)", setup="from __main__ import test1, selectionSort1"))
    print(timeit.timeit("selectionSort2(test2)", setup="from __main__ import test2, selectionSort2"))
    print(timeit.timeit("selectionSort3(test3)", setup="from __main__ import test3, selectionSort3"))

On my computer :

3.8686506970000005 #selectionSort1
3.961112386        #selectionSort2
4.0788594190000005 #selectionSort3

The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both list out of the loop-scope should give me back the best result. As you've seen from the given results this isn't the case and will be very happy with any help. Thank you in advance 在微优化后出现意外的时序结果

Note: For sure this type of optimization would be relevant for very big lists

答案1

得分: 1

让我们深入研究一下。

对于性能分析,我真的很喜欢cProfile

现在在比较这些方法之前,我们可以得到其中一个的报告:

if __name__ == '__main__':
    import cProfile

    cProfile.run('selectionSort1(test1)')

然后可以看到:

40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    3.408    3.408 <string>:1(<module>)
10000    3.390    0.000    3.391    0.000 playground.py:1(findSmallest)
1    0.010    0.010    3.408    3.408 playground.py:12(selectionSort1)
1    0.000    0.000    3.408    3.408 {built-in method builtins.exec}
10001    0.001    0.000    0.001    0.000 {built-in method builtins.len}
10000    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000    0.006    0.000    0.006    0.000 {method 'pop' of 'list' objects}

绝大多数运行时间都花在了findSmallest函数上。可以运行所有三种方法来进行检查,可以看到在所有三种方法中都是如此。因此,说它使这些方法之间的差异微不足道将是一种轻描淡写。

所以让我们进一步减少实验以获得更好的隔离,并在此过程中扩大规模:

def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        newArr.append(0)
    return newArr

def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        na(0)
    return newArr

if __name__ == '__main__':
    import random as r
    import cProfile
    test = r.sample(range(0, 10000000000), 10000000)
    test1 = test[:]
    test2 = test[:]

    cProfile.run('selectionSort1(test1)')
    cProfile.run('selectionSort2(test2)')

现在:

10000005 function calls in 2.421 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.021    0.021    2.421    2.421 <string>:1(<module>)
    1    1.377    1.377    2.400    2.400 playground.py:82(selectionSort1)
    1    0.000    0.000    2.421    2.421 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
10000000    1.023    0.000    1.023    0.000 {method 'append' of 'list' objects}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

10000005 function calls in 2.130 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.021    0.021    2.130    2.130 <string>:1(<module>)
    1    1.089    1.089    2.109    2.109 playground.py:88(selectionSort2)
    1    0.000    0.000    2.130    2.130 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
10000000    1.020    0.000    1.020    0.000 {method 'append' of 'list' objects}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

微小的优化可以清楚地看到。

最后一点 - 在分析和尝试提高性能时,请考虑大O符号。一般来说,在对象属性中搜索方法的复杂度是O(1),而从列表中弹出指定元素的复杂度通常是O(n),其中n是列表的长度。

无论如何,我相信还有更好的方法来探究这种优化的影响程度,但我认为这是一个不错的开始。祝好运!

英文:

Let's dive into that.

For performance analysis I really like cProfile.

Now before comparing the methods, we can get a report on one of them:

if __name__ == &#39;__main__&#39;:
    import cProfile

    cProfile.run(&#39;selectionSort1(test1)&#39;)

And see that:

40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    3.408    3.408 &lt;string&gt;:1(&lt;module&gt;)
10000    3.390    0.000    3.391    0.000 playground.py:1(findSmallest)
    1    0.010    0.010    3.408    3.408 playground.py:12(selectionSort1)
    1    0.000    0.000    3.408    3.408 {built-in method builtins.exec}
10001    0.001    0.000    0.001    0.000 {built-in method builtins.len}
10000    0.001    0.000    0.001    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
    1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}
10000    0.006    0.000    0.006    0.000 {method &#39;pop&#39; of &#39;list&#39; objects}

The vast majority of running time is spent in findSmallest. One can run all three methods for sanity, and see that this is the case in all three of them. So saying that it makes the difference between the methods insignificant would be an understatement.

So let's further reduce the experiment to get better isolation, and grow in scale while we're at it:

def selectionSort1(arr):
    newArr = []
    for i in range(len(arr)):
        newArr.append(0)
    return newArr

def selectionSort2(arr):
    newArr = []
    na = newArr.append
    for i in range(len(arr)):
        na(0)
    return newArr

if __name__ == &#39;__main__&#39;:
    import random as r
    import cProfile
    test = r.sample(range(0, 10000000000), 10000000)
    test1 = test[:]
    test2 = test[:]


    cProfile.run(&#39;selectionSort1(test1)&#39;)
    cProfile.run(&#39;selectionSort2(test2)&#39;)

And now:

         10000005 function calls in 2.421 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.021    0.021    2.421    2.421 &lt;string&gt;:1(&lt;module&gt;)
        1    1.377    1.377    2.400    2.400 playground.py:82(selectionSort1)
        1    0.000    0.000    2.421    2.421 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
 10000000    1.023    0.000    1.023    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
        1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}


         10000005 function calls in 2.130 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.021    0.021    2.130    2.130 &lt;string&gt;:1(&lt;module&gt;)
        1    1.089    1.089    2.109    2.109 playground.py:88(selectionSort2)
        1    0.000    0.000    2.130    2.130 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
 10000000    1.020    0.000    1.020    0.000 {method &#39;append&#39; of &#39;list&#39; objects}
        1    0.000    0.000    0.000    0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}

The micro optimization can be seen clearly.

One last note - consider big O notation when analyzing and trying to improve performance. Generally speaking the complexity of searching a method in an object's attributes is O(1), while even popping a specified element from a list is generally O(n) where n is the length of the list.

Any way, I believe there are even better ways to explore the magnitude of impact of this particular optimization, but I think this is a good start. Good luck!

huangapple
  • 本文由 发表于 2023年2月26日 20:39:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/75572028.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定