Using Perl to sort an array using values that occur after a RegEx or symbol
Question
I have an array:
my @all = (
q{<side.effect signif="life.threat">myocardial infarction</side.effect>},
q{<side.effect signif="life.threat">hypersensitivity reactions</side.effect>},
q{<side.effect signif="life.threat">lactic acidosis</side.effect>},
q{<side.effect signif="most.freq">vomiting</side.effect>},
q{<side.effect signif="most.freq">diarrhea</side.effect>},
);
I want to sort the array on the values after the opening XML tags/attributes (">) to produce this output:
<side.effect signif="most.freq">diarrhea</side.effect>
<side.effect signif="life.threat">hypersensitivity reactions</side.effect>
<side.effect signif="life.threat">lactic acidosis</side.effect>
<side.effect signif="life.threat">myocardial infarction</side.effect>
<side.effect signif="most.freq">vomiting</side.effect>
I cannot convert it to a hash keyed on the tag, because the repeated tags would collapse into a single entry.
I tried this but it doesn't sort them:
my @sorted_all = sort {
my ($aa, $bb) = map { (split)[1] } $a, $b;
$bb <=> $aa;
} @all;
Answer 1
Score: 5
Using [Sort::Key](https://metacpan.org/pod/Sort::Key):
use strict;
use warnings;
use feature qw(say);
use Sort::Key qw(keysort);
my @all = (
q{<side.effect signif="life.threat">myocardial infarctio</side.effect>},
q{<side.effect signif="life.threat">hypersensitivity reations</side.effect>},
q{<side.effect signif="life.threat">lactic acidosis</sid.effect>},
q{<side.effect signif="most.freq">vomiting</side.effect>},
q{<side.effect signif="most.freq">diarrhea</side.effect>},
);
my @sorted = keysort { ( /">(.+?)<\// )[0] } @all;
say for @sorted;
The library uses a [Schwartzian transform](https://en.wikipedia.org/wiki/Schwartzian_transform) when needed: it builds the sort key for every item once, up front, instead of recomputing it at each pairwise comparison. I copied the input as given, typos and all.
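For comparison, the same decorate-sort-undecorate idea can be written by hand in core Perl, without Sort::Key. The following is only a minimal sketch, not the library's actual implementation. It pairs each string with the key extracted by the regex above, sorts on that key, and then discards the key. The helper name `text_of` is hypothetical, the input uses the spelling from the question, and falling back to an empty key when the regex does not match is an assumption.

```perl
use strict;
use warnings;

my @all = (
    q{<side.effect signif="life.threat">myocardial infarction</side.effect>},
    q{<side.effect signif="life.threat">hypersensitivity reactions</side.effect>},
    q{<side.effect signif="life.threat">lactic acidosis</side.effect>},
    q{<side.effect signif="most.freq">vomiting</side.effect>},
    q{<side.effect signif="most.freq">diarrhea</side.effect>},
);

# Hypothetical helper: return the text between the opening tag and the
# closing tag, or an empty string if the line does not match (assumption).
sub text_of {
    my ($item) = @_;
    my ($text) = $item =~ /">(.+?)<\//;
    return $text // '';
}

my @sorted =
    map  { $_->[1] }               # 3. undecorate: keep the original string
    sort { $a->[0] cmp $b->[0] }   # 2. compare the precomputed keys as strings
    map  { [ text_of($_), $_ ] }   # 1. decorate: pair each item with its key
    @all;

print "$_\n" for @sorted;
```

Note that the comparison uses `cmp`, which compares strings. The numeric operator `<=>` would treat all of these keys as 0 and so could not order them.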
Using a regex to parse an XML tag relies on this very specific input format. If the format is going to vary, then please use a proper XML parser, like [XML::LibXML](https://metacpan.org/pod/XML::LibXML). For example:
use XML::LibXML;
my $parser = XML::LibXML->new;
my @sorted = keysort {
$parser -> parse_string($_)
-> findnodes('side.effect') -> [0]
-> textContent
} @all;
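For this code see [XML::LibXML::Parser](https://metacpan.org/dist/XML-LibXML/view/lib/XML/LibXML/Parser.pod) and [XML::LibXML::Node](https://metacpan.org/release/SHLOMIF/XML-LibXML-2.0128/view/lib/XML/LibXML/Node.pod). The library comes with a lot more documentation; see the top-level document linked at first mention.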
_For this to work I had to correct the typo `sid.effect` in one node so that the input is valid XML._