File::Find::Rule - duplicated output
Question
My subroutine parses my hashes correctly and returns (basically) correct results.
The problem is that I get the results twice.
It is the "twice" part that I do not understand.
Sample data:
    $local_h{$local}{name} = "somefile.txt";
    $local_h{$local}{size} = 12345;

    sub already_here {
        foreach my $local (keys(%local_h)) {
            my $tmp = $local_h{$local}{name};
            my $FFR_rule = File::Find::Rule
                ->size($local_h{$local}{size})
                ->start( @volumes );
            while ( defined ( my $match = $FFR_rule->match ) ) {
                my ( $name, $path, $suffix ) = fileparse($match);
                if ( $name =~ /$local_h{$local}{name}/ ) {
                    say "\t$name $name has been matched by size and name to:\n\t $path$name\n";
                    # Matches can occur multiple times, to be dealt with later/elsewhere
                } else {
                    say "$match Matched by size only\n";
                    # Maybe this really is the location but got renamed locally.
                    # For now I will consider it an edge-case.
                }
            }
        }
    }
Output:
    somefile.txt has been matched by size and name to: a/path/to/somefile.txt
    somefile.txt has been matched by size and name to: some/other/path/to/somefile.txt
    34thx.foo Matched by size only
    somefile.txt has been matched by size and name to: a/path/to/somefile.txt
    somefile.txt has been matched by size and name to: some/other/path/to/somefile.txt
    34thx.foo Matched by size only
I am expecting to see 3 lines of output (4 if you count the blank line).
I am completely failing to see the source of the duplication.
Answer 1
Score: 2
There are two possibilities: either %local_h contains two entries with the same name and size, or you are scanning the same directory twice. The latter happens if @volumes lists the same directory twice, or lists both a directory and one of its ancestors.
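Both possibilities can be checked before any scanning happens. The following is only a rough sketch, not part of the original answer: prune_nested and duplicate_entries are hypothetical helper names, the paths are assumed to be Unix-style, and the comparison is purely lexical, so symlinks and relative paths that point to the same place are not detected.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: drop every directory that is identical to, or lies
# beneath, another directory in the list (lexical comparison only).
sub prune_nested {
    my @dirs = map { File::Spec->canonpath($_) } @_;
    my @kept;
    # Shorter paths first, so an ancestor is always seen before its descendants.
    for my $dir ( sort { length($a) <=> length($b) || $a cmp $b } @dirs ) {
        my $covered = 0;
        for my $k (@kept) {
            my $prefix = $k =~ m{/\z} ? $k : "$k/";   # "/" stays "/", "/a" becomes "/a/"
            if ( $dir eq $k || index( $dir, $prefix ) == 0 ) {
                $covered = 1;
                last;
            }
        }
        push @kept, $dir unless $covered;
    }
    return @kept;
}

# Hypothetical helper: return groups of keys (as array refs) whose entries
# share the same size and name. Takes a reference to a hash shaped like %local_h.
sub duplicate_entries {
    my ($local_h) = @_;
    my %seen;
    for my $key ( sort keys %$local_h ) {
        my $entry = $local_h->{$key};
        push @{ $seen{"$entry->{size}\0$entry->{name}"} }, $key;
    }
    return grep { @$_ > 1 } values %seen;
}
```

If duplicate_entries( \%local_h ) returns any group, the first possibility applies. If prune_nested( @volumes ) returns fewer directories than @volumes contains, the second one does.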
That said, the approach you are taking is awful.
Let's say there are 10 files in %local_h and 10 files on disk. You will traverse the tree 10 times, which means checking each of the 10 files 10 times, for a total of 100 calls to stat! For 20 and 20, that's 400 calls! There's no reason to do that. In those two scenarios, the following makes only 20 (instead of 100) and 40 (instead of 400) calls to stat, respectively:
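    use feature qw( say );
    use File::Basename qw( basename );
    use File::Find::Rule;

    # Index the wanted files by size, then by name, so each file on disk
    # needs only one lookup.
    my %interesting;
    for ( values( %local_h ) ) {
        ++$interesting{ $_->{ size } }{ $_->{ name } };
    }

    # Walk the volumes once and compare every file found against the index.
    my $ffr = File::Find::Rule->start( @volumes );
    while ( defined( my $qfn = $ffr->match ) ) {
        my $fn = basename( $qfn );
        my $size = -s $qfn;
        if ( my $r = $interesting{ $size } ) {
            if ( $r->{ $fn } ) {
                say "$fn size+name $qfn";
            } else {
                say "$fn size $qfn";
            }
        }
    }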