Perl: sorting an array creates new elements if its elements are being referenced

Question

I have an array of strings and would like to record its original order in another array by storing references to its elements. When I then sort the array of strings, the array ends up holding new elements rather than the original ones.

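```perl
my @item = qw/zz xx/;
my @order;

# Record the original order by storing a reference to each element.
foreach (@item) {
    push @order, \$_;
}

# Print each value followed by the address of the scalar that holds it.
print $item[0] . " " . \$item[0] . "\n";
print $item[1] . " " . \$item[1] . "\n";

@item = sort {$a cmp $b} @item;

print $item[0] . " " . \$item[0] . "\n";
print $item[1] . " " . \$item[1] . "\n";
```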

**Result:**

```none
zz SCALAR(0x5618ffd27220)
xx SCALAR(0x5618ffd273e8)
xx SCALAR(0x5618ffd46668)
zz SCALAR(0x5618ffd46698)
```

If I comment out the block that takes the references, the output matches the expected result. Can anyone explain how referencing the elements affects the result of "sort"?

**Expected Result:**

```none
zz SCALAR(0x5618ffd27220)
xx SCALAR(0x5618ffd273e8)
xx SCALAR(0x5618ffd273e8)
zz SCALAR(0x5618ffd27220)
```

Answer 1

Score: 4

Assigning a scalar to an element of an array makes a copy of the scalar.

For example,

```perl
use v5.14;
my $x = 123;
say \$x;
my @a = $x;
say \$a[0];
```

outputs something like

```none
SCALAR(0x55dd65bf0e18)
SCALAR(0x55dd65bc14b8)
```

So the result you get is the result you should expect.


But what about the following?

```perl
my @c = qw( l k j );
say for \( @c );
@c = sort @c;
say for \( @c );
```

It outputs something like

```none
SCALAR(0x55c396e0d4b8)
SCALAR(0x55c396e0d4e8)
SCALAR(0x55c396e0d680)
SCALAR(0x55c396e0d680)
SCALAR(0x55c396e0d4e8)
SCALAR(0x55c396e0d4b8)
```

sort has an optimization for the case when an array is being sorted, and the results are assigned to the same array (e.g. @a = sort @a;). In these situations, it performs an in-place sort. It simply moves the C pointers around without creating any new scalars.

As long as nothing references the values of the array, it is safe to do so.


But what about the following?

```perl
my @c = qw( l k j );
my @r = \( @c );
say for \( @c );
@c = sort @c;
say for \( @c );
```

It outputs something like

```none
SCALAR(0x55abf592e4b8)
SCALAR(0x55abf592e4e8)
SCALAR(0x55abf592e680)
SCALAR(0x55abf595d660)
SCALAR(0x55abf595d648)
SCALAR(0x55abf595d6f0)
```

An optimization is only correct if it's transparent. An optimization that causes code to behave differently than unoptimized code is buggy. In other words, sort should work the same with or without the optimization. This means that sort needs to make copies of values referenced by something other than the array, even when the optimization is used.

It can get away without making these copies if nothing else references the array values, but that's not the case here.
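One observable consequence of this copying is that references taken before the sort keep pointing at the original scalars. The following is a minimal sketch of that, assuming Perl 5.26 or later; the variable names `@by_value`, `@original`, and `@sorted` are made up for illustration. It also shows one possible way to get a sorted view without replacing any element, by sorting the references rather than the array.

```perl
use strict;
use warnings;

# Assumption: Perl 5.26 or later, where sort copies values referenced elsewhere.
my @item  = qw/zz xx/;
my @order = \( @item );              # references to the original elements

# Sorting the references leaves @item untouched, so addresses are preserved.
my @by_value = sort { $$a cmp $$b } @order;

@item = sort { $a cmp $b } @item;    # @item now holds new scalars

my @original = map { $$_ } @order;     # original order, read through the references
my @sorted   = map { $$_ } @by_value;  # sorted order, same scalars as @order

print "original: @original\n";
print "via refs: @sorted\n";
print "sorted:   @item\n";
```

Here `@order` still yields the values in their original order, while `@by_value` holds the same references rearranged, so each one compares equal to one of the references in `@order`.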


Note that versions of Perl before 5.26 were buggy. These programs should output the same, but they don't:

```none
$ perl5.24t -le'@c = qw(l k j); $d = \$c[2]; @c = @e = sort @c; $$d = "X"; print @c'
jkl

$ perl5.24t -le'@c = qw(l k j); $d = \$c[2]; @c =      sort @c; $$d = "X"; print @c'
Xkl
```

This was fixed in 5.26. Since then, the sort is still done in-place, but a copy of every value with a refcount greater than one is made first.

```none
$ perl5.26t -le'@c = qw(l k j); $d = \$c[2]; @c = @e = sort @c; $$d = "X"; print @c'
jkl

$ perl5.26t -le'@c = qw(l k j); $d = \$c[2]; @c =      sort @c; $$d = "X"; print @c'
jkl
```
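The reference count mentioned above can be inspected from Perl with the core `B` module. This is only a rough sketch: `refcount` is a hypothetical helper written for this illustration, not something sort uses, and the absolute numbers it reports include the temporary references created by the call itself, so only the difference between the two readings is meaningful.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the reference count of the scalar behind $ref.
sub refcount {
    my ($ref) = @_;
    return B::svref_2object($ref)->REFCNT;
}

my @c = qw( l k j );

my $before = refcount( \$c[2] );   # only the array owns the element
my $d      = \$c[2];               # an extra reference held outside the array
my $after  = refcount( \$c[2] );   # the element is now referenced by $d as well

print "before: $before, after: $after\n";
```

An element whose count rises like this is one that something other than the array refers to, which is the situation the answer describes sort as handling by copying.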

  • Posted by huangapple on June 6, 2023, 14:13:03
  • Link to this article: https://go.coder-hub.com/76411868.html