TArray Result not always initially () within for loop?

Question

Result in Test is NOT always initially ().

I found https://stackoverflow.com/questions/5314918/do-i-need-to-setlength-a-dynamic-array-on-initialization/5315254#5315254, but I do not fully understand that answer.

More importantly, what is the best way to "break" the Result/A connection (I need the loop)? Is there a way to force the compiler to "properly" initialize Result? Or should I manually add Result := nil as the first line in Test?

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := 2;
end;

procedure TForm1.Button3Click(Sender: TObject);
var
  A: TArray<Integer>; // Does not help whether A is local or global
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A); // At Test breakpoint:
                  // * FIRST loop: Result is ()
                  // * NEXT loops: Result is (2, 0, 0)
                  //               modifying Result changes A (immediately)
  A := Test(A);   // Result is again ()
end;

Answer 1

Score: 1

The referenced question is about fields inside a class: they are all zero-initialized, and managed types are properly finalized during instance destruction.

Your code is about calling a function with a managed return type within a loop. A local variable of a managed type is initialized only once, at the beginning of the routine. Under the hood, the compiler treats a managed return type as a var parameter. So after the first call, it passes what looks like A to Test twice: once as the A parameter and once for Result.
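To make the hidden var parameter more concrete, here is a minimal sketch (not the compiler's actual output) that writes the managed result as an explicit var parameter. The names MakeArrayAsProc and Res are hypothetical and only serve the illustration.

```delphi
program HiddenResultParam;

{$APPTYPE CONSOLE}

// Hypothetical rewrite of "function MakeArray(I): TArray<Integer>":
// the result is supplied by the caller as a var parameter, so it
// arrives holding whatever the caller's variable held before the call.
procedure MakeArrayAsProc(I: Integer; var Res: TArray<Integer>);
begin
  Writeln('Length on entry: ', Length(Res));
  SetLength(Res, 3);
  Res[0] := I;
end;

procedure Main;
var
  Res: TArray<Integer>; // managed local: set to nil once, at routine entry
  I: Integer;
begin
  for I := 1 to 3 do
    MakeArrayAsProc(I, Res); // Res is not reset between iterations
end;

begin
  Main;
end.
```

In this sketch the first iteration reports a length of 0 and the later ones report 3, which mirrors what the breakpoint in Test shows.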

But your assessment that modifying Result also affects A (the parameter) is not correct, which we can prove by changing the code a bit:

function Test(var A: TArray<Integer>; I: Integer): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := I;
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A, I);
                    
  A := Test(A, 0);
end;

When you single-step through Test, you will see that changing Result[0] does not change A. SetLength creates a copy because the compiler introduced a second variable, which it uses temporarily for passing Result; after the call to Test it assigns that variable to A (the local variable). You can see this in the disassembly view, which will look similar to the following for the line in the loop (I use $O+ to make the code a little denser than it would be without optimization):

Project1.dpr.21: A := Test(A, I);
0041A3BD 8D4DF8           lea ecx,[ebp-$08]
0041A3C0 8D45FC           lea eax,[ebp-$04]
0041A3C3 8BD3             mov edx,ebx
0041A3C5 E8B2FFFFFF       call Test
0041A3CA 8B55F8           mov edx,[ebp-$08]
0041A3CD 8D45FC           lea eax,[ebp-$04]
0041A3D0 8B0DC8244000     mov ecx,[$004024c8]
0041A3D6 E855E7FEFF       call @DynArrayAsg
0041A3DB 43               inc ebx

Knowing that the default calling convention passes the first three parameters in eax, edx, and ecx, we can tell that eax is the A parameter, edx is I, and ecx is Result (the aforementioned Result var parameter always comes last). We see that the code uses two different stack locations: [ebp-$04], which is the A variable, and [ebp-$08], which is the compiler-introduced variable. After the call, the compiler inserted an additional call to System._DynArrayAsg, which assigns the compiler-introduced temp variable for Result to A.
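The sequence in the disassembly can be approximated in source form. The following is only a sketch under stated assumptions: TestProc, Res, and Temp are hypothetical names, with Temp standing in for the compiler-introduced variable at [ebp-$08] and the plain assignment standing in for the call to @DynArrayAsg.

```delphi
program HiddenTempSketch;

{$APPTYPE CONSOLE}

// Test rewritten with an explicit Res parameter in place of Result.
procedure TestProc(var A: TArray<Integer>; I: Integer;
  var Res: TArray<Integer>);
begin
  // On later calls Res shares its data with A (reference count > 1),
  // so SetLength gives Res its own unique copy first.
  SetLength(Res, 3);
  Res[0] := I; // writes to the copy only
end;

procedure Main;
var
  A, Temp: TArray<Integer>; // Temp plays the role of the hidden variable
  I: Integer;
begin
  for I := 1 to 3 do
  begin
    TestProc(A, I, Temp);
    if Length(A) > 0 then
      Writeln('A[0] before assignment: ', A[0]);
    A := Temp; // the step the compiler performs after the call
    Writeln('A[0] after assignment:  ', A[0]);
  end;
end;

begin
  Main;
end.
```

In this emulation, A[0] keeps the previous iteration's value until the assignment after the call, which matches what single-stepping through Test shows.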

[Screenshot: the second call to Test]

Answer 2

Score: 0

While I hesitate to call this for-loop compiler optimization a bug, it is certainly unhelpful when modifying array elements directly:

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  if Length(Result) > 0 then // Breakpoint
    Result[1] := 66; // A modified!
  SetLength(Result, 3);
  Result[0] := Result[0] + 1; // A not modified
  Exit;
  A[9] := 666; // Force linker not to eliminate A
end;

After investigation, I conclude that operations that affect the entire array (e.g. SetLength, Copy, or some other function that returns TArray<Integer>) will, unsurprisingly, "break" the Result/A identity created by the for loop.
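The difference between element writes and whole-array operations can be reproduced without any function call. This is a small sketch of ordinary dynamic array reference semantics; the variable names A and B are chosen only for illustration.

```delphi
program SharedArrayRef;

{$APPTYPE CONSOLE}

procedure Main;
var
  A, B: TArray<Integer>;
begin
  SetLength(A, 3);          // A = (0, 0, 0)
  B := A;                   // B and A now reference the same data
  B[1] := 66;               // element write: no copy is made
  Writeln('A[1] = ', A[1]); // shows 66, the write is visible through A

  SetLength(B, Length(B));  // SetLength makes B a unique copy
  B[0] := 1;
  Writeln('A[0] = ', A[0]); // shows 0, A is no longer affected
end;

begin
  Main;
end.
```

Dynamic arrays are reference-counted but not copy-on-write for element access, so only operations such as SetLength or Copy separate the two variables.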

It would appear that the safest approach (as per the answer linked in the original post) is to put Result := nil; as the first line in Test.

If there are no further suggestions, I will eventually accept this as the answer.

NOTE:
As an added bonus, starting with Result := nil prevents the array from being copied by SetLength. This is obvious, but for an array of 100000 elements being looped 100000 times, for example, this small modification yields a ~40% faster execution time.
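A minimal sketch of that approach, based on the Test function from the question with the reset added as the first statement. Main is a hypothetical wrapper used here in place of the button handler.

```delphi
program ResetResult;

{$APPTYPE CONSOLE}

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  Result := nil;              // drop whatever the caller's variable held
  SetLength(Result, 3);       // allocates a fresh, zero-filled array
  Result[0] := Result[0] + 1; // always starts from 0 now
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A);
  Writeln('A[0] = ', A[0]); // shows 1 on every run
end;

begin
  Main;
end.
```

Without the Result := nil line, SetLength would copy the previous contents, so Result[0] would carry over from one iteration to the next. With the reset, every call starts from an empty array and no copy of the old data is needed.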

huangapple
  • Posted on April 10, 2023, 22:46:16
  • Please retain this link when reposting: https://go.coder-hub.com/75978092.html