How does Python resolve variable conflicts in list comprehensions with 'for x in x' syntax?
Question
I'm learning Python, having come from Java/C++, and have reached list comprehensions.
There I found this example:

    x = 'ABC'
    codes = [ord(x) for x in x]
    print(x, codes)

And `ABC [65, 66, 67]` is printed.
Why is the list comprehension okay with both variables having the same name? How does it know which one to send to ord()? Is this good practice or would it be preferred to use a different variable for the iterator?
# Answer 1
**Score**: 3
<details>
<summary>English:</summary>
The current implementation (as of Python 3.11) basically turns the list comprehension into a function to put the two different `x`s into different scopes, equivalent to

    x = 'ABC'

    def _tmp(x_itr):
        rv = []
        for x in x_itr:  # this x is local to _tmp
            rv.append(ord(x))
        return rv

    codes = _tmp(iter(x))  # this x is the outer 'ABC'

[PEP 709](https://peps.python.org/pep-0709/) outlines how the temporary function will be "inlined" to improve performance; this change is slated for Python 3.12. The comprehension's iteration variable stays isolated from the enclosing scope either way.
---
You can use the `dis` module to see how CPython currently implements your example (via `python3 -m dis tmp.py`, with comments added by me).

      0           0 RESUME                   0

    # Define the outer x
      1           2 LOAD_CONST               0 ('ABC')
                  4 STORE_NAME               0 (x)

    # Create _tmp (see below)
      2           6 LOAD_CONST               1 (<code object <listcomp> at 0x11049b3c0, file "tmp.py", line 2>)
                  8 MAKE_FUNCTION            0

    # Call codes = _tmp(iter(x))
                 10 LOAD_NAME                0 (x)
                 12 GET_ITER
                 14 PRECALL                  0
                 18 CALL                     0
                 28 STORE_NAME               1 (codes)

    # Print the results.
    [...]

    # Actual body of _tmp
    Disassembly of <code object <listcomp> at 0x11049b3c0, file "tmp.py", line 2>:
      2           0 RESUME                   0

    # rv = []
                  2 BUILD_LIST               0
                  4 LOAD_FAST                0 (.0)

    # for x in <argument>
            >>    6 FOR_ITER                17 (to 42)
                  8 STORE_FAST               1 (x)

    # rv.append(ord(x))
                 10 LOAD_GLOBAL              1 (NULL + ord)
                 22 LOAD_FAST                1 (x)
                 24 PRECALL                  1
                 28 CALL                     1
                 38 LIST_APPEND              2
                 40 JUMP_BACKWARD           18 (to 6)

    # return rv
            >>   42 RETURN_VALUE

(If you aren't familiar with the stack machine used by CPython, arguments are passed to a function by putting the function on the stack, followed by its arguments. `CALL` replaces the function and its arguments with the return value of the function, and `RETURN_VALUE` returns whatever is on the top of the stack.)
The `x` referring to `'ABC'` is only used by the code in the first part of the dump. The `x` used to iterate over that string is only used inside the `<listcomp>` code object from which the temporary function is created.
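
If you'd rather check this from Python than read a dump, here is a minimal sketch. It compiles the two-line example and looks for a nested `<listcomp>` code object among the module's constants. The names `SOURCE`, `nested_names`, and `has_listcomp_function` are mine, and the expectation that the result differs between 3.11 and a version with PEP 709 inlining is an assumption based on the description above, not something the code depends on.

```python
import types

SOURCE = """\
x = 'ABC'
codes = [ord(x) for x in x]
"""

# Compile the example as a module, the same way `python3 -m dis tmp.py` would see it.
module_code = compile(SOURCE, "tmp.py", "exec")

# Collect the names of any code objects stored in the module's constants.
nested_names = []
for const in module_code.co_consts:
    if isinstance(const, types.CodeType):
        nested_names.append(const.co_name)

# True when the comprehension was compiled into its own hidden function.
has_listcomp_function = "<listcomp>" in nested_names

# Run the compiled module in a fresh namespace and inspect the results.
namespace = {}
exec(module_code, namespace)

print("separate <listcomp> code object:", has_listcomp_function)
print(namespace["x"], namespace["codes"])
```

Whatever the first line prints on your interpreter, the second line should show that the outer `x` is still `'ABC'` after the comprehension has run.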
</details>
# Answer 2
**Score**: 0
> Why is the list comprehension okay with both variables having the same
> name?
Because the comprehension behaves like a function with its own scope. If instead of a comprehension you had written a classic for loop, as you intuited, you'd have problems.
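
To make the "problems" concrete, here is a small sketch that runs the question's example both ways. The variable names (`codes_from_comprehension`, `x_after_loop`, and so on) are only illustrative.

```python
x = 'ABC'

# Comprehension: the loop variable x lives in the comprehension's own scope.
codes_from_comprehension = [ord(x) for x in x]
x_after_comprehension = x  # the outer x is untouched

# Classic loop: the iterable 'ABC' is evaluated once, then the SAME name x
# is rebound to each character in turn.
codes_from_loop = []
for x in x:
    codes_from_loop.append(ord(x))
x_after_loop = x  # x now holds the last character

print(codes_from_comprehension, repr(x_after_comprehension))  # [65, 66, 67] 'ABC'
print(codes_from_loop, repr(x_after_loop))                    # [65, 66, 67] 'C'
```

Both versions build the same list, but only the plain loop overwrites the original string.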
> How does it know which one to send to ord()?
The `x` inside the list comprehension is local to the comprehension and does not exist in the enclosing scope once the comprehension is done. Check out this example:

    >>> x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'x' is not defined
    >>> [x for x in range(10)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'x' is not defined

And compare it to this:

    >>> mylist = []
    >>> mylist
    []
    >>> for x in range(10):
    ...     mylist.append(x)
    ...
    >>> mylist
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> x
    9

Python doesn't "know" which value to send: it looks up the name `x` in the current scope and sends whatever object that name is bound to. Inside the comprehension, that is the loop variable. If you rebind the name, the new value is used and the previous one is "forgotten".
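
One detail worth seeing in action: the iterable of the first `for` clause is evaluated in the enclosing scope, while everything else in the comprehension sees the comprehension's own `x`. The sketch below uses a nested comprehension to show both lookups at once; the data and the name `flattened` are made up for illustration.

```python
x = ['ab', 'cd']

# First "in x": evaluated outside the comprehension, so it is the list.
# Second "in x": evaluated inside, so it is the current string ('ab', then 'cd').
flattened = [y for x in x for y in x]

print(flattened)  # ['a', 'b', 'c', 'd']
print(x)          # ['ab', 'cd'] (the outer name is unchanged)
```

It works, but having to reason about which `x` is which is a good argument for using distinct names.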
> Is this good practice or would it be preferred to use a different variable for the iterator?
Definitely not a good practice. Use a different name, and feel free to make it as explicit as necessary (it doesn't always need to be). E.g.:

    [n for n in range(10)]

Here `n` is perfectly sufficient. You can even use `_` if you prefer, to show that the variable has no importance (for instance, when it isn't used in the expression).
But if you have the following dict:

    my_dict = {"L": "John", "M": "Paul", "S": "Ringo", "H": "Georges"}

then explicit names are better:

    [name.upper() + " " + initial for initial, name in my_dict.items()]

is clearer than

    [x.upper() + " " + y for y, x in my_dict.items()]

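
As a quick check that the two spellings really are interchangeable, here is the dict example as a runnable snippet. The result names `explicit` and `terse` are just labels I picked.

```python
my_dict = {"L": "John", "M": "Paul", "S": "Ringo", "H": "Georges"}

# Descriptive names: the unpacking order (key, value) is obvious at a glance.
explicit = [name.upper() + " " + initial for initial, name in my_dict.items()]

# Same logic with placeholder names: correct, but the reader has to work out
# that y is the key and x is the value.
terse = [x.upper() + " " + y for y, x in my_dict.items()]

print(explicit)           # ['JOHN L', 'PAUL M', 'RINGO S', 'GEORGES H']
print(explicit == terse)  # True
```

The output is identical, so the choice is purely about readability.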