Extend Array in Hashes of Arrays

Question

I parse a string as HTML and extract tables from it.

The tables have two columns: the first holds a single value (the key), the second holds multiple values (the values).

I want to store the values in a hash of arrays.

use strict;
use warnings;

use Data::Dumper qw(Dumper);

my $html='
<p class="auto-cursor-target"><br /></p>
<table class="wrapped">
<colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" />
</colgroup>
<tbody>
<tr><th><p>Wikispace</p></th><th><p>right</p></th></tr>
<tr><td>mimi</td><td>right1</td></tr>
<tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr>
</tbody>
</table>
<p class="auto-cursor-target"><br /></p>
';

use HTML::TableExtract;
my $te = HTML::TableExtract->new( headers => [qw(Wikispace right)] );
$te->parse($html);

my %known;
foreach my $ts ($te->tables) {
   foreach my $row ($ts->rows) {
     print @$row[0], ":::", @$row[1], ":  ";
     foreach my $val (split(/,/,@$row[1])) {
             print $val, ";";
             if (! $known{@$row[0]}) {
               my @arr = ($val);
               @known{@$row[0]}=\@arr;
             } else {
                     # my @arr = \@known{@$row[0]};
                     #              push (@arr, $val);
                     #         print Dumper @arr;
                     push (@$known{@$row[0]}, $val);
             };
     }
     print "\n";
   }
 }

print Dumper \%known;

What am I doing wrong? What's wrong with the last push, and how would you do it differently?

Also, is there no way to assign an array directly to a hash (dictionary) instead of first having to generate an array and later linking its address?

Answer 1

Score: 2

The overall approach is fine, but there are many basic errors throughout. I'd suggest first working through some solid introductory material instead of struggling with the basic notions and syntax of the language.

Basic errors: $row is an array reference (often called an "arrayref" for short), so to extract an element you need $row->[0]. Those elements themselves are not arrayrefs, so you can't dereference them (@{ $row->[0] } is wrong). Also, the headers you specify are wrong: your document doesn't have such headers.

I don't fully understand the whole purpose, but here is your program cleaned up so that it works:

use strict;
use warnings;
use feature 'say';

use Data::Dumper qw(Dumper);

my $html = '<p class="auto-cursor-target"><br /></p><table class="wrapped"><colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" /></colgroup><tbody><tr><th><p>Wiki    space</p></th><th><p>right</p></th></tr><tr><td>mimi</td><td>right1</td></tr><tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr></tbody></table><p class="auto-cursor-target"><br /></p>';

use HTML::TableExtract;

my $te = HTML::TableExtract->new( headers => ['Wiki    space', 'right'] );
$te->parse($html);

my %known;
foreach my $ts ($te->tables) {
    #say "ts: $ts";
    foreach my $row ($ts->rows) {
        #say "row: @{$row}";
        foreach my $val ( split /,/, $row->[1] ) {
            print $val, ";";
            if (not $known{$row->[0]}) {
                $known{$row->[0]} = [ $val ];
            }
            else {
                push @{$known{$row->[0]}}, $val;
            };
        }
        say '';
    }
}

print Dumper \%known;

This prints:

right1;
right3;right2;
$VAR1 = {
          'mimi' => [
                      'right1'
                    ],
          'mama' => [
                      'right3',
                      'right2'
                    ]
        };
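
To make the arrayref points in this answer concrete, here is a minimal sketch that does not need HTML::TableExtract. The contents of $row are made up for illustration and only stand in for one element of what $ts->rows returns. It also touches on the second part of the question: an anonymous array built with [ ... ] can be stored in the hash directly, without a named intermediate array.

```perl
use strict;
use warnings;

# Hypothetical row, standing in for one element returned by $ts->rows.
my $row = [ 'mama', 'right3,right2' ];

# Element access through an array reference, using the arrow syntax.
my $key = $row->[0];

# Equivalent access to the same element, using the older block syntax.
my $same_key = ${$row}[0];

# [ ... ] creates an anonymous array and returns a reference to it,
# so the list from split can be stored under the key in one step.
my %known;
$known{$key} = [ split /,/, $row->[1] ];

print "$key: @{ $known{$key} }\n";   # prints "mama: right3 right2"
```

The hash value is always a reference to an array, never the array itself, which is why reading it back requires the @{ ... } dereference.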

Answer 2

Score: 2

See Perl Dereferencing Syntax. We see that

@$known{ ... }

is short for

@{ $known }{ ... }

But you don't have a scalar $known. You want

@{ $known{ ... } }

or (postfix dereference, available by default since Perl 5.24)

$known{ ... }->@*

This gives us

for my $val ( split /,/, $row->[1] ) {
   if ( !$known{ @$row[0] } ) {
      my @arr = $val;                       # Useless parens removed.
      @known{ @$row[0] } =  \@arr;
   } else {
      push @{ $known{ @$row[0] } }, $val;
   }                                        # Useless `;` removed.
}

But let's clean up your code.

  1. Using an array slice is discouraged when the slice is just one element.

    @{ $row }[0]    # Array slice (via reference), using infix syntax

    should be

    ${ $row }[0]    # Array index (via reference), using infix syntax

    Cleaner:

    $row->[0]       # Array index (via reference), using the postfix/arrow syntax

    The same applies to the hash slice @known{ ... } on the left of the assignment, which becomes $known{ ... }. This gives us

    for my $val ( split /,/, $row->[1] ) {
       if ( !$known{ $row->[0] } ) {
          my @arr = $val;
          $known{ $row->[0] } = \@arr;
       } else {
          push @{ $known{ $row->[0] } }, $val;
       }
    }

  2. my @a = ...; \@a can be shortened to [ ... ].

    This gives us

    for my $val ( split /,/, $row->[1] ) {
       if ( !$known{ $row->[0] } ) {
          $known{ $row->[0] } = [ $val ];
       } else {
          push @{ $known{ $row->[0] } }, $val;
       }
    }

  3. We don't need that if statement.

    for my $val ( split /,/, $row->[1] ) {
       $known{ $row->[0] } //= [];
       push @{ $known{ $row->[0] } }, $val;
    }

    We can even combine those two inner statements.

    for my $val ( split /,/, $row->[1] ) {
       push @{ $known{ $row->[0] } //= [] }, $val;
    }

    In fact, thanks to autovivification, @{ EXPR //= [] } can be written as @{ EXPR }. Perl will automatically create the array if needed.

    for my $val ( split /,/, $row->[1] ) {
       push @{ $known{ $row->[0] } }, $val;
    }

  4. You can push multiple values at once.

    That means your entire inner loop can be reduced to the following:

    push @{ $known{ $row->[0] } }, split /,/, $row->[1];

  5. Finally, if the first column is a key (i.e. unique values), then we don't need push at all.

    $known{ $row->[0] } = [ split /,/, $row->[1] ];
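
To see the autovivification step in isolation, here is a minimal sketch that assumes the rows are already available as plain arrayrefs. The @rows data is invented for illustration, and the key 'mama' is repeated on purpose to show what push does when the first column is not unique.

```perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

# Made-up rows in the shape [ key, comma-separated values ].
# 'mama' appears twice on purpose.
my @rows = (
    [ 'mimi', 'right1' ],
    [ 'mama', 'right3,right2' ],
    [ 'mama', 'right4' ],
);

my %known;
for my $row (@rows) {
    # On first use $known{ $row->[0] } does not exist. Dereferencing it
    # as an array for push makes Perl create the arrayref (autovivification).
    push @{ $known{ $row->[0] } }, split /,/, $row->[1];
}

$Data::Dumper::Sortkeys = 1;   # stable key order for display
print Dumper \%known;
```

With the plain assignment from step 5, the second 'mama' row would replace the first one instead of extending it, so that form is only safe when the first column really is unique.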
    

Answer 3

Score: 1

You get a compilation error on the line:

push (@$known{@$row[0]}, $val);

because you declared the variable as a hash (%known), but you are trying to access it as a scalar ($known).

Here is a simpler version of your code that runs without errors:

use strict;
use warnings;

use Data::Dumper qw(Dumper);

my $html = '<p class="auto-cursor-target"><br /></p><table class="wrapped"><colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" /></colgroup><tbody><tr><th><p>Wikispace</p></th><th><p>right</p></th></tr><tr><td>mimi</td><td>right1</td></tr><tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr></tbody></table><p class="auto-cursor-target"><br /></p>';

use HTML::TableExtract;
my $te = HTML::TableExtract->new( headers => [qw(Wikispace right)] );
$te->parse($html);

my %known;
foreach my $ts ($te->tables) {
    foreach my $row ($ts->rows) {
        my @vals = split(/,/, $row->[1]);
        $known{ $row->[0] } = [@vals];
    }
}
print Dumper(\%known);

Output:

$VAR1 = {
          'mama' => [
                      'right3',
                      'right2'
                    ],
          'mimi' => [
                      'right1'
                    ]
        };
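
Why Perl goes looking for a scalar $known becomes clearer once such a scalar exists. In the sketch below, $href is a hypothetical hash reference introduced only for illustration. @$href{ ... } is a hash slice through that reference, which is the same way Perl reads @$known{ ... } in the question.

```perl
use strict;
use warnings;

my %known = ( mimi => ['right1'], mama => ['right3'] );
my $href  = \%known;    # hypothetical scalar holding a reference to %known

# @$href{ ... } means @{ $href }{ ... }: a hash slice via the reference.
# It returns the values stored under the listed keys (here, two arrayrefs).
my @slice = @$href{ 'mimi', 'mama' };

# To reach the array stored under one key, dereference the hash element.
push @{ $known{mama} }, 'right2';

print scalar(@slice), " values in slice\n";
print "mama: @{ $known{mama} }\n";
```

The braces decide what gets dereferenced: @{ $href }{ ... } dereferences the scalar first and then slices, while @{ $known{mama} } looks up the element first and then dereferences it.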

huangapple
  • Posted on June 15, 2023, 23:49:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76483419.html