英文:
Unexpected timing result after micro-optimization
问题
I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap my head around. I would be very thankful for your suggestions.
Following the code:
def findSmallest(arr):
smallest = arr[0]
smallest_indx = 0
for i in range(1, len(arr)):
if arr[i] < smallest:
smallest = arr[i]
smallest_indx = i
return smallest_indx
def selectionSort1(arr):
newArr = []
for i in range(len(arr)):
smallest = findSmallest(arr)
newArr.append(arr.pop(smallest))
return newArr
def selectionSort2(arr):
newArr = []
na = newArr.append
for i in range(len(arr)):
smallest = findSmallest(arr)
na(arr.pop(smallest))
return newArr
def selectionSort3(arr):
ap = arr.pop
newArr = []
na = newArr.append
for i in range(len(arr)):
smallest = findSmallest(arr)
na(ap(smallest))
return newArr
import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]
if __name__ == '__main__':
import timeit
print(timeit.timeit("selectionSort1(test1)", setup="from __main__ import test1, selectionSort1"))
print(timeit.timeit("selectionSort2(test2)", setup="from __main__ import test2, selectionSort2"))
print(timeit.timeit("selectionSort3(test3)", setup="from __main__ import test3, selectionSort3"))
On my computer:
3.8686506970000005 #selectionSort1
3.961112386 #selectionSort2
4.0788594190000005 #selectionSort3
The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both lists out of the loop-scope, it should give me the best result. As you've seen from the given results, this isn't the case, and I would be very happy with any help. Thank you in advance
Note: For sure this type of optimization would be relevant for very big lists
英文:
I've been doing some experiments with some micro-optimizations and got an unexpected timing result, which I couldn't wrap around my head. I would be very thankful for your suggestions.
Following the code :
def findSmallest(arr):
smallest = arr[0]
smallest_indx = 0
for i in range(1, len(arr)):
if arr[i] < smallest:
smallest = arr[i]
smallest_indx = i
return smallest_indx
def selectionSort1(arr):
newArr = []
for i in range(len(arr)):
smallest = findSmallest(arr)
newArr.append(arr.pop(smallest))
return newArr
def selectionSort2(arr):
newArr = []
na = newArr.append
for i in range(len(arr)):
smallest = findSmallest(arr)
na(arr.pop(smallest))
return newArr
def selectionSort3(arr):
ap = arr.pop
newArr = []
na = newArr.append
for i in range(len(arr)):
smallest = findSmallest(arr)
na(ap(smallest))
return newArr
import random as r
test = r.sample(range(0,1000000),10000)
test1 = test[:]
test2 = test[:]
test3 = test[:]
if __name__ == '__main__':
import timeit
print(timeit.timeit("selectionSort1(test1)", setup="from __main__ import test1, selectionSort1"))
print(timeit.timeit("selectionSort2(test2)", setup="from __main__ import test2, selectionSort2"))
print(timeit.timeit("selectionSort3(test3)", setup="from __main__ import test3, selectionSort3"))
On my computer :
3.8686506970000005 #selectionSort1
3.961112386 #selectionSort2
4.0788594190000005 #selectionSort3
The point is that I'd expected that, when I'm isolating the attributes search (newArr.append and arr.pop) for both list out of the loop-scope should give me back the best result. As you've seen from the given results this isn't the case and will be very happy with any help. Thank you in advance
Note: For sure this type of optimization would be relevant for very big lists
答案1
得分: 1
让我们深入研究一下。
对于性能分析,我真的很喜欢cProfile。
现在在比较这些方法之前,我们可以得到其中一个的报告:
if __name__ == '__main__':
import cProfile
cProfile.run('selectionSort1(test1)')
然后可以看到:
40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 3.408 3.408 <string>:1(<module>)
10000 3.390 0.000 3.391 0.000 playground.py:1(findSmallest)
1 0.010 0.010 3.408 3.408 playground.py:12(selectionSort1)
1 0.000 0.000 3.408 3.408 {built-in method builtins.exec}
10001 0.001 0.000 0.001 0.000 {built-in method builtins.len}
10000 0.001 0.000 0.001 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000 0.006 0.000 0.006 0.000 {method 'pop' of 'list' objects}
绝大多数运行时间都花在了findSmallest
函数上。可以运行所有三种方法来进行检查,可以看到在所有三种方法中都是如此。因此,说它使这些方法之间的差异微不足道将是一种轻描淡写。
所以让我们进一步减少实验以获得更好的隔离,并在此过程中扩大规模:
def selectionSort1(arr):
newArr = []
for i in range(len(arr)):
newArr.append(0)
return newArr
def selectionSort2(arr):
newArr = []
na = newArr.append
for i in range(len(arr)):
na(0)
return newArr
if __name__ == '__main__':
import random as r
import cProfile
test = r.sample(range(0, 10000000000), 10000000)
test1 = test[:]
test2 = test[:]
cProfile.run('selectionSort1(test1)')
cProfile.run('selectionSort2(test2)')
现在:
10000005 function calls in 2.421 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.021 0.021 2.421 2.421 <string>:1(<module>)
1 1.377 1.377 2.400 2.400 playground.py:82(selectionSort1)
1 0.000 0.000 2.421 2.421 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.len}
10000000 1.023 0.000 1.023 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000005 function calls in 2.130 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.021 0.021 2.130 2.130 <string>:1(<module>)
1 1.089 1.089 2.109 2.109 playground.py:88(selectionSort2)
1 0.000 0.000 2.130 2.130 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.len}
10000000 1.020 0.000 1.020 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
微小的优化可以清楚地看到。
最后一点 - 在分析和尝试提高性能时,请考虑大O符号。一般来说,在对象属性中搜索方法的复杂度是O(1),而从列表中弹出指定元素的复杂度通常是O(n),其中n是列表的长度。
无论如何,我相信还有更好的方法来探究这种优化的影响程度,但我认为这是一个不错的开始。祝好运!
英文:
Let's dive into that.
For performance analysis I really like cProfile.
Now before comparing the methods, we can get a report on one of them:
if __name__ == '__main__':
import cProfile
cProfile.run('selectionSort1(test1)')
And see that:
40005 function calls in 3.408 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 3.408 3.408 <string>:1(<module>)
10000 3.390 0.000 3.391 0.000 playground.py:1(findSmallest)
1 0.010 0.010 3.408 3.408 playground.py:12(selectionSort1)
1 0.000 0.000 3.408 3.408 {built-in method builtins.exec}
10001 0.001 0.000 0.001 0.000 {built-in method builtins.len}
10000 0.001 0.000 0.001 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000 0.006 0.000 0.006 0.000 {method 'pop' of 'list' objects}
The vast majority of running time is spent in findSmallest
. One can run all three methods for sanity, and see that this is the case in all three of them. So saying that it makes the difference between the methods insignificant would be an understatement.
So let's further reduce the experiment to get better isolation, and grow in scale while we're at it:
def selectionSort1(arr):
newArr = []
for i in range(len(arr)):
newArr.append(0)
return newArr
def selectionSort2(arr):
newArr = []
na = newArr.append
for i in range(len(arr)):
na(0)
return newArr
if __name__ == '__main__':
import random as r
import cProfile
test = r.sample(range(0, 10000000000), 10000000)
test1 = test[:]
test2 = test[:]
cProfile.run('selectionSort1(test1)')
cProfile.run('selectionSort2(test2)')
And now:
10000005 function calls in 2.421 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.021 0.021 2.421 2.421 <string>:1(<module>)
1 1.377 1.377 2.400 2.400 playground.py:82(selectionSort1)
1 0.000 0.000 2.421 2.421 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.len}
10000000 1.023 0.000 1.023 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
10000005 function calls in 2.130 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.021 0.021 2.130 2.130 <string>:1(<module>)
1 1.089 1.089 2.109 2.109 playground.py:88(selectionSort2)
1 0.000 0.000 2.130 2.130 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.len}
10000000 1.020 0.000 1.020 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
The micro optimization can be seen clearly.
One last note - consider big O notation when analyzing and trying to improve performance. Generally speaking the complexity of searching a method in an object's attributes is O(1), while even popping a specified element from a list is generally O(n) where n is the length of the list.
Any way, I believe there are even better ways to explore the magnitude of impact of this particular optimization, but I think this is a good start. Good luck!
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论