# Filling memory, what's happening

## Question
The simple statement, `1..1GB`, fills memory and appears to be hung on a 32 GiB system. Pressing <kbd>Ctrl-C</kbd> does nothing. What is happening inside the interpreter? If it is allowed to run long enough, will the correct result be produced?
```powershell
PS C:\> (Get-CimInstance -ClassName Win32_ComputerSystem).TotalPhysicalMemory / 1MB
32629.1484375
PS C:\> $PSVersionTable.PSVersion.ToString()
7.3.4
PS C:\> 1..1GB
```
## Answer 1 (score: 4)
`..`, PowerShell's range operator:

- **lazily enumerates** the values between the (inclusive) range endpoints (in the case at hand, `1`, `2`, `3`, ..., `1073741824`)

- when the results are **captured**, such as by enclosing the range operation in `(...)`, assigning it to a variable, or making it participate in a larger expression, PowerShell creates an `[object[]]` array for the enumerated values (verify with `(1..10).GetType()`).
Given the sheer number of values, enumerating `1GB` == `1,073,741,824` (1+ billion)<sup>[1]</sup> values with `1..1GB` is very slow - but does eventually return (unless you run out of memory) in PowerShell (Core) 7+, whereas in _Windows PowerShell_ it fails right away; as zett42 notes:
- > in Windows PowerShell (64-bit), `$array = 1..1GB` fails with an `OutOfMemoryException` (`Array dimensions exceeded supported range.`). The actual limit is even lower (`$array = 1..300MB` still fails, while `$array = 1..200MB` succeeds).

- The reason is that .NET Framework, the legacy, Windows-only runtime underlying Windows PowerShell, has a limit not just on the _element count_ of an array (close to 2 GB, i.e. 2+ billion elements per dimension), but also on the overall _byte size_ of an array dimension, which is 2 GB as well.
- Therefore, given that each element of an `[object[]]` array is the size of a pointer (reference) and given that in 64-bit processes each pointer requires 8 bytes (verify with `[IntPtr]::Size`), the actual limit is near `2GB / 8`; `256MB - 8`, to be precise. That is, `$array = 1..(256MB - 8)` works, but anything with more values fails.

- However, the limit is higher - and the same for both PowerShell editions - if you _stream_ the range operation instead of causing an array to be created - see the next section.
- .NET (Core), the modern, cross-platform runtime underlying PowerShell (Core) 7+, doesn't have this byte-size limit any longer, but is still limited in terms of _element count_ to near 2 GB, i.e. 2+ billion elements; zett42 has worked out the exact limit: `2GB - 57`, i.e. `2,147,483,591` elements per dimension (`[int]::MaxValue - 56`; verify with `[object[]]::new(2GB - 57).Count` - anything higher, such as `- 56`, fails with `Array dimensions exceeded supported range.`)

  - As an aside: in _multidimensional_ arrays an additional constraint is that the _combined_ element count, across all dimensions, must not exceed 4+ billion elements (`4,294,967,182` == `4GB - 114`, to be precise). Use of such arrays is rare in PowerShell, however, and the more commonly used _jagged_ arrays (arrays whose elements are other arrays) are not subject to this limit.
Note that if you use a range operation as _pipeline input_, without enclosing it in `(...)`, the lazy enumeration behavior becomes apparent - no array is created, and the enumerated values are _streamed_:
```powershell
# Prints 1, 2, ..., 10 virtually *instantly*, because the values
# are *streamed* to Select-Object
1..1GB | Select-Object -First 10
```
If you use this form:

- you are not subject to the _array_ size and element-count constraints, so the above works in _Windows PowerShell_ too.

- however, you are subject to a limitation built into the range operator itself: the count of values to enumerate must fit into an `[int]` (`System.Int32`), which limits you to `[int]::MaxValue + 1`, i.e. `2GB` values in the range; that is, with `1` as the start point, the highest end point is `[int]::MaxValue` (`(2GB-1)`).
Note that `..` is _unusual_ among PowerShell operators, in that it combines non-pipeline and pipeline behaviors: in isolation - when you neither capture it nor send it to another command - it is _implicitly_ captured in an `[object[]]` array, and only after its creation does the resulting array's to-host (to-console) output start; that is, _streaming_ does not happen unless you use the operation as input to an _explicit_ pipeline.
- Compare the output behavior of `1..50MB` to that of `1..50MB | Write-Output`: the former creates a noticeable delay before output starts, due to a (large) array getting created first, whereas the latter _streams_ its values and therefore starts producing output instantly.
The situationally streaming behavior also surfaces in the context of the `foreach` and `switch` statements:

- The following commands work in both PowerShell editions, due to `..` exhibiting streaming behavior there too:

  ```powershell
  # Outputs 1, received in a *streaming* fashion, then exits.
  # Subject only to the 2GB-1 limit.
  foreach ($i in 1..(2GB-1)) { $i; break }
  # Ditto
  switch (1..(2GB-1)) { default { $_; break } }
  ```
<sup>[1] As mclayton points out, the `GB` suffix is the (binary) gigabyte multiplier (amounting to a factor of `[Math]::Pow(2, 30)` == `1,073,741,824`), one of several binary multipliers that PowerShell supports in number literals.</sup>