Does Python detect code snippets that are useless? (dead code elimination)

Question

If I code something like this:

for _ in range(100_000_000):
    a = 10
    b = 20
    c = a + b
    d = c * 2     
    e = d / 2
    del a, b, c, d, e
print("Hello World")

Will the Python compiler realize that this is useless and that there is no need to do anything? I've heard that gcc can figure this out, but Python can't.

My tests confirm that, but I'm looking for confirmation, since I've stumbled across posts like this one (https://bugs.python.org/issue1346214) that make me wonder whether it has actually been implemented. That post concerns a very old version of Python and I'm using 3.11.1, but if they were already talking about it back then, surely it would have been implemented by now if they were ever going to?

On my laptop, the Python code takes 0.07 seconds to run versus 0 seconds for just the Hello World, which is why I think there is no dead code elimination.

Answer 1

Score: 2

CPython does not eliminate the useless code, and you can prove it to yourself: compare the disassembly of slow with that of fast and you'll see that it really does perform all those useless computations. You can also verify that they are actually executed by timing both functions with timeit. This only applies to CPython, though; other implementations of Python may optimize this away (as you can see below with Numba's Python subset).

import dis
import timeit

def slow():
    for _ in range(100_000_000):
        a = 10
        b = 20
        c = a + b
        d = c * 2
        e = d / 2
        del a, b, c, d, e
    print("Hello World")

def fast():
    print("Hello World")

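# Note: number=1000 means 1000 full runs of the 100_000_000-iteration loop; lower it for a quick check.
print(timeit.timeit('slow()', number=1000, globals=globals()))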
dis.dis(slow)
print("*****")
print(timeit.timeit('fast()', number=5, globals=globals()))
dis.dis(fast)

Yields:

39.3209762
  3           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (100000000)
              4 CALL_FUNCTION            1
              6 GET_ITER
        >>    8 FOR_ITER                46 (to 56)
             10 STORE_FAST               0 (_)

  4          12 LOAD_CONST               2 (10)
             14 STORE_FAST               1 (a)

  5          16 LOAD_CONST               3 (20)
             18 STORE_FAST               2 (b)

  6          20 LOAD_FAST                1 (a)
             22 LOAD_FAST                2 (b)
             24 BINARY_ADD
             26 STORE_FAST               3 (c)

  7          28 LOAD_FAST                3 (c)
             30 LOAD_CONST               4 (2)
             32 BINARY_MULTIPLY
             34 STORE_FAST               4 (d)

  8          36 LOAD_FAST                4 (d)
             38 LOAD_CONST               4 (2)
             40 BINARY_TRUE_DIVIDE
             42 STORE_FAST               5 (e)

  9          44 DELETE_FAST              1 (a)
             46 DELETE_FAST              2 (b)
             48 DELETE_FAST              3 (c)
             50 DELETE_FAST              4 (d)
             52 DELETE_FAST              5 (e)
             54 JUMP_ABSOLUTE            8

 10     >>   56 LOAD_GLOBAL              1 (print)
             58 LOAD_CONST               5 ('Hello World')
             60 CALL_FUNCTION            1
             62 POP_TOP
             64 LOAD_CONST               0 (None)
             66 RETURN_VALUE
*****
8.899999997424857e-06
 13           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello World')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
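
The same check can be done programmatically instead of reading the disassembly by eye. This is only a minimal sketch and not part of the original answer: the helper arithmetic_opcodes and the two small functions are hypothetical names, and it assumes CPython, where the arithmetic instructions are named BINARY_ADD, BINARY_OP and so on depending on the version. It also contrasts the loop body with a pure literal expression, which CPython's compiler does fold at compile time; that is constant folding, not dead code elimination.

```python
import dis


def through_variables():
    # Same shape as the loop body in slow(): values pass through locals.
    a = 10
    b = 20
    c = a + b
    return c


def literal_expression():
    # A pure literal expression, which CPython's compiler can fold.
    return 10 + 20


def arithmetic_opcodes(func):
    """Return the names of the BINARY_* instructions left in func's bytecode."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("BINARY_")
    ]


# The exact opcode name depends on the CPython version, so none is hard-coded.
print("through variables: ", arithmetic_opcodes(through_variables))
print("literal expression:", arithmetic_opcodes(literal_expression))
```

On the CPython versions I'm aware of, the first list is non-empty (the addition survives in the bytecode) and the second is empty (10 + 20 was folded into a single constant).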

If you do want really fast, optimized Python, look at Numba:

import numba as nb
import timeit

@nb.jit
def slow():
    for _ in range(100_000_000):
        a = 10
        b = 20
        c = a + b
        d = c * 2
        e = d / 2
    print("Hello World")

@nb.jit
def fast():
    print("Hello World")

print(timeit.timeit('slow()', number=5, globals=globals()))
print(timeit.timeit('fast()', number=5, globals=globals()))
print(slow.inspect_llvm()[tuple()])
print("****")
print(fast.inspect_llvm()[tuple()])

Both functions will yield very similar LLVM code. Numba-compiled code comes with a number of limitations, though, which are out of scope for this question.
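
Coming back to plain CPython, a scaled-down timing check makes the cost of the dead loop visible without waiting on 100_000_000 iterations. This is a rough sketch under my own assumptions: dead_loop and best_time are hypothetical helpers, the iteration counts are arbitrary, and the absolute numbers will differ from machine to machine.

```python
import timeit


def dead_loop(n):
    # Same useless body as slow(), with the iteration count as a parameter.
    for _ in range(n):
        a = 10
        b = 20
        c = a + b
        d = c * 2
        e = d / 2
        del a, b, c, d, e


def best_time(n, repeat=5):
    """Best wall-clock time (in seconds) over `repeat` single calls to dead_loop(n)."""
    return min(timeit.repeat(lambda: dead_loop(n), number=1, repeat=repeat))


small = best_time(2_000)
large = best_time(200_000)
print(f"n=2_000:   {small:.6f} s")
print(f"n=200_000: {large:.6f} s")
```

If the loop were eliminated as dead code, both timings would be about the same; in CPython the larger count should take noticeably longer.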

huangapple
  • Posted on June 19, 2023 at 01:41:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76501826.html