Extend Array in Hashes of Arrays

Question

I parse a string as HTML and extract tables from it.

The tables have two columns: the first holds a single value (the key), the second holds multiple comma-separated values.

I want to store the values in a hash of arrays.

    use strict;
    use warnings;
    use Data::Dumper qw(Dumper);
    my $html='
    <p class="auto-cursor-target"><br /></p>
    <table class="wrapped">
    <colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" />
    </colgroup>
    <tbody>
    <tr><th><p>Wikispace</p></th><th><p>right</p></th></tr>
    <tr><td>mimi</td><td>right1</td></tr>
    <tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr>
    </tbody>
    </table>
    <p class="auto-cursor-target"><br /></p>
    ';
    use HTML::TableExtract;
    my $te = HTML::TableExtract->new( headers => [qw(Wikispace right)] );
    $te->parse($html);
    my %known;
    foreach my $ts ($te->tables) {
        foreach my $row ($ts->rows) {
            print @$row[0], ":::", @$row[1], ": ";
            foreach my $val (split(/,/,@$row[1])) {
                print $val, ";";
                if (! $known{@$row[0]}) {
                    my @arr = ($val);
                    @known{@$row[0]}=\@arr;
                } else {
                    # my @arr = \@known{@$row[0]};
                    # push (@arr, $val);
                    # print Dumper @arr;
                    push (@$known{@$row[0]}, $val);
                };
            }
            print "\n";
        }
    }
    print Dumper \%known;

What am I doing wrong? What's wrong with the last push, and how would you do it differently?

Also, is there no way to assign an array directly to a hash (dictionary), instead of first having to create an array and later linking its address?

Answer 1

Score: 2

The overall approach is fine, but there are basic errors throughout. I'd suggest first working through some solid introductory material rather than struggling with the basic notions and syntax of the language.

Basic errors: $row is an array reference (often called an "arrayref" for short), so to extract an element you need $row->[0]. The elements themselves are not arrayrefs, so you can't dereference them (@{ $row->[0] } is wrong). Also, the headers you specify are wrong: your document doesn't have such headers.

I don't fully understand the whole purpose, but here is your program cleaned up so that it works:

    use strict;
    use warnings;
    use feature 'say';
    use Data::Dumper qw(Dumper);
    my $html='<p class="auto-cursor-target"><br /></p><table class="wrapped"><colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" /></colgroup><tbody><tr><th><p>Wiki space</p></th><th><p>right</p></th></tr><tr><td>mimi</td><td>right1</td></tr><tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr></tbody></table><p class="auto-cursor-target"><br /></p>';
    use HTML::TableExtract;
    my $te = HTML::TableExtract->new( headers => ['Wiki space', 'right'] );
    $te->parse($html);
    my %known;
    foreach my $ts ($te->tables) {
        #say "ts: $ts";
        foreach my $row ($ts->rows) {
            #say "row: @{$row}";
            foreach my $val ( split /,/, $row->[1] ) {
                print $val, ";";
                if (not $known{$row->[0]}) {
                    $known{$row->[0]} = [ $val ];
                }
                else {
                    push @{$known{$row->[0]}}, $val;
                };
            }
            say '';
        }
    }
    print Dumper \%known;

This prints

    right1;
    right3;right2;
    $VAR1 = {
              'mimi' => [
                          'right1'
                        ],
              'mama' => [
                          'right3',
                          'right2'
                        ]
            };
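
The point about element access through an arrayref can be checked in isolation. The following is a minimal sketch that uses only core Perl; the contents of $row are made up here to mimic what one table row might look like, and are not taken from HTML::TableExtract itself.

```perl
use strict;
use warnings;

# Illustrative stand-in for one extracted row: a reference to an array.
my $row = [ 'mama', 'right3,right2' ];

my $key   = $row->[0];       # element access with the arrow syntax
my $same  = ${$row}[0];      # the same element, written with block syntax
my @slice = @{$row}[0, 1];   # a slice: a list of several elements

print "$key\n";              # mama
print "$same\n";             # mama
print scalar(@slice), "\n";  # 2
```

The arrow form is usually the easiest to read, which is why the cleaned-up program uses it throughout.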

Answer 2

Score: 2

See Perl Dereferencing Syntax. We see that

    @$known{ ... }

is short for

    @{ $known }{ ... }

But you don't have a scalar $known. You want

    @{ $known{ ... } }

or

    $known{ ... }->@*

This gives us

    for my $val ( split /,/, $row->[1] ) {
        if ( !$known{ @$row[0] } ) {
            my @arr = $val;   # Useless parens removed.
            @known{ @$row[0] } = \@arr;
        } else {
            push @{ $known{ @$row[0] } }, $val;
        }   # Useless `;` removed.
    }
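
To make the difference between the two forms concrete, here is a small sketch in core Perl. The hash reference $ref and its contents are hypothetical and exist only to show what `@$ref{ ... }` means when such a scalar really is declared.

```perl
use strict;
use warnings;

my %known = ( mama => [ 'right3' ] );             # hash of array references
my $ref   = { mama => 'v1', mimi => 'v2' };       # hypothetical hash reference

# @$ref{ ... } binds as @{ $ref }{ ... }: a hash slice taken through $ref.
my @slice = @$ref{ 'mama', 'mimi' };

# @{ $known{ ... } } dereferences the array stored under a key of %known.
push @{ $known{mama} }, 'right2';

print "@slice\n";             # v1 v2
print "@{ $known{mama} }\n";  # right3 right2
```

In the question there is no scalar $known at all, so under `use strict` the first form cannot compile there.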

But let's clean up your code.

  1. Using an array slice is discouraged when the slice is just one element.

        @{ $row }[0]   # Array slice (via reference), using infix syntax

    should be

        ${ $row }[0]   # Array index (via reference), using infix syntax

    Cleaner:

        $row->[0]      # Array index (via reference), using the postfix/arrow syntax

    This gives us

        for my $val ( split /,/, $row->[1] ) {
            if ( !$known{ $row->[0] } ) {
                my @arr = $val;
                $known{ $row->[0] } = \@arr;
            } else {
                push @{ $known{ $row->[0] } }, $val;
            }
        }

  2. my @a = ...; \@a can be shortened to [ ... ].

    This gives us

        for my $val ( split /,/, $row->[1] ) {
            if ( !$known{ $row->[0] } ) {
                $known{ $row->[0] } = [ $val ];
            } else {
                push @{ $known{ $row->[0] } }, $val;
            }
        }

  3. We don't need that if statement.

        for my $val ( split /,/, $row->[1] ) {
            $known{ $row->[0] } //= [];
            push @{ $known{ $row->[0] } }, $val;
        }

    We can even combine those two inner statements.

        for my $val ( split /,/, $row->[1] ) {
            push @{ $known{ $row->[0] } //= [] }, $val;
        }

    In fact, thanks to autovivification, @{ EXPR //= [] } can be written as @{ EXPR }. Perl will automatically create the array if needed.

        for my $val ( split /,/, $row->[1] ) {
            push @{ $known{ $row->[0] } }, $val;
        }

  4. You can push multiple values at once.

    That means your entire inner loop can be reduced to the following:

        push @{ $known{ $row->[0] } }, split /,/, $row->[1];

  5. Finally, if the first column is a key (i.e. unique values), then we don't need push at all.

        $known{ $row->[0] } = [ split /,/, $row->[1] ];
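
The autovivification step can be tried without HTML::TableExtract. The sketch below assumes the rows arrive as array references with the same shape as in the question; the @rows data is illustrative, not the module's actual output.

```perl
use strict;
use warnings;

# Illustrative rows: [ key, comma-separated values ].
my @rows = ( [ 'mimi', 'right1' ], [ 'mama', 'right3,right2' ] );

my %known;
for my $row (@rows) {
    # $known{ ... } does not exist on first use. Dereferencing it as an
    # array for push makes Perl create the anonymous array on the fly.
    push @{ $known{ $row->[0] } }, split /,/, $row->[1];
}

print join( ',', @{ $known{mimi} } ), "\n";   # right1
print join( ',', @{ $known{mama} } ), "\n";   # right3,right2
```

No explicit initialisation of the hash values is needed, which is what lets the if/else from the earlier steps disappear.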

Answer 3

Score: 1

You get a compile-time error on the line:

    push (@$known{@$row[0]}, $val);

because you declared the variable as a hash (%known), but @$known{ ... } tries to access it through a scalar ($known) that was never declared.

Here is a simpler version of your code that runs without errors:

    use strict;
    use warnings;
    use Data::Dumper qw(Dumper);
    my $html='<p class="auto-cursor-target"><br /></p><table class="wrapped"><colgroup><col style="width: 50.0px;" /><col style="width: 29.0px;" /></colgroup><tbody><tr><th><p>Wikispace</p></th><th><p>right</p></th></tr><tr><td>mimi</td><td>right1</td></tr><tr><td colspan="1">mama</td><td colspan="1">right3,right2</td></tr></tbody></table><p class="auto-cursor-target"><br /></p>';
    use HTML::TableExtract;
    my $te = HTML::TableExtract->new( headers => [qw(Wikispace right)] );
    $te->parse($html);
    my %known;
    foreach my $ts ($te->tables) {
        foreach my $row ($ts->rows) {
            my @vals = split(/,/, $row->[1]);
            $known{ $row->[0] } = [@vals];
        }
    }
    print Dumper(\%known);

Output:

    $VAR1 = {
              'mama' => [
                          'right3',
                          'right2'
                        ],
              'mimi' => [
                          'right1'
                        ]
            };
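
One caveat with plain assignment is worth seeing in code: it only gives the same result as push when every key occurs once. The sketch below uses hypothetical rows in which the key "mama" appears twice; the data and the names %assigned and %pushed are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical rows with a repeated key.
my @rows = ( [ 'mama', 'right3,right2' ], [ 'mama', 'right4' ] );

my ( %assigned, %pushed );
for my $row (@rows) {
    my @vals = split /,/, $row->[1];
    $assigned{ $row->[0] } = [@vals];        # replaces any earlier list
    push @{ $pushed{ $row->[0] } }, @vals;   # appends to the earlier list
}

print scalar( @{ $assigned{mama} } ), "\n";  # 1
print scalar( @{ $pushed{mama} } ), "\n";    # 3
```

If the first column is guaranteed to be unique, as in the sample table, the assignment form is the simpler choice; otherwise push keeps all values.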

huangapple
  • Posted on June 15, 2023 at 23:49:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76483419.html