Perl: force Spreadsheet::Read to use Text::CSV_XS

Question

I have an 8MB CSV file. Using Spreadsheet::Read it takes 10 seconds to read:

my $book = ReadData ( 'file.csv' );
my @rows = Spreadsheet::Read::rows($book->[1]); # first sheet
foreach my $i (2 .. scalar @rows) { # ignore first header row
    my $first = $rows[$i-1][1];
    #...
}

Using Text::CSV_XS, it takes 1 second:

open my $fh, "<:encoding(utf8)", 'file.csv' or die $!;
my $csv = Text::CSV_XS->new ({ diag_verbose=>1, auto_diag=>1, binary=>1, sep_char=>";" });
$csv->getline($fh); # Ignore Header
while (my $row = $csv->getline ($fh)) {
    my $first = $row->[1];
    #...
}
close ($fh);

Can I force Spreadsheet::Read to use Text::CSV_XS and expect similar performance? I tried:

  1. Specifying a parser:
my $book = Spreadsheet::Read->new (
    'file.csv',
    sep => ';',
    parser => 'csv',
    );
  2. Setting the parser environment variable:
$ENV{SPREADSHEET_READ_CSV} = 'Text::CSV_XS';

Output of Spreadsheet::Read->parsers() is:

$VAR1 = {
          'ext' => 'csv',
          'def' => '',
          'mod' => 'Text::CSV',
          'min' => '1.17',
          'vsn' => '-'
        };
$VAR2 = {
          'ext' => 'csv',
          'def' => '',
          'mod' => 'Text::CSV_PP',
          'min' => '1.17',
          'vsn' => '-'
        };
$VAR3 = {
          'vsn' => '1.50',
          'min' => '0.71',
          'ext' => 'csv',
          'mod' => 'Text::CSV_XS',
          'def' => '*'
        };
$VAR4 = {
          'min' => '0.01',
          'vsn' => '0.87',
          'def' => '*',
          'mod' => 'Spreadsheet::Read',
          'ext' => 'sc'
        };
$VAR5 = {
          'vsn' => '0.65',
          'min' => '0.34',
          'ext' => 'xls',
          'mod' => 'Spreadsheet::ParseExcel',
          'def' => '*'
        };
$VAR6 = {
          'min' => '0.24',
          'vsn' => '0.27',
          'ext' => 'xlsm',
          'def' => '*',
          'mod' => 'Spreadsheet::ParseXLSX'
        };
$VAR7 = {
          'min' => '0.24',
          'vsn' => '0.27',
          'def' => '*',
          'mod' => 'Spreadsheet::ParseXLSX',
          'ext' => 'xlsx'
        };
$VAR8 = {
          'min' => '0.13',
          'vsn' => '-',
          'ext' => 'xlsx',
          'def' => '',
          'mod' => 'Spreadsheet::XLSX'
        };
$VAR9 = {
          'vsn' => undef,
          'min' => '',
          'ext' => 'zzz2',
          'mod' => 'Z20::Just::For::Testing',
          'def' => '*'
        };

Also:

$ perl -MSpreadsheet::Read -E'say Spreadsheet::Read::parses( "csv" )'
Text::CSV_XS
$ perl -MText::CSV_XS -E'say Text::CSV_XS->VERSION'
1.50
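
For anyone who wants to reproduce the comparison without the original 8MB file, here is a minimal timing sketch. It assumes a generated sample file with made-up data; the `timed` helper and the `%rows` hash are hypothetical names used only for illustration. Each approach is skipped if its module is not installed, and the absolute timings on a file this small will not match the figures quoted above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes ();

# Write a small semicolon-separated sample file (made-up data).
my ($out, $path) = tempfile(SUFFIX => '.csv', UNLINK => 1);
print {$out} "id;name\n";
print {$out} "$_;row$_\n" for 1 .. 1000;
close $out or die "close: $!";

# Run a code ref once; return (elapsed seconds, its scalar result).
sub timed {
    my ($code) = @_;
    my $t0     = Time::HiRes::time();
    my $result = $code->();
    return (Time::HiRes::time() - $t0, $result);
}

my %rows;    # data rows seen by each approach, header excluded

if (eval { require Text::CSV_XS; 1 }) {
    my ($secs, $n) = timed(sub {
        open my $fh, '<:encoding(utf8)', $path or die "open: $!";
        my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1, sep_char => ';' });
        $csv->getline($fh);                  # skip the header row
        my $count = 0;
        $count++ while $csv->getline($fh);   # stream one row at a time
        close $fh;
        return $count;
    });
    $rows{'Text::CSV_XS'} = $n;
    printf "%-18s %5d rows  %.4f s\n", 'Text::CSV_XS', $n, $secs;
}

if (eval { require Spreadsheet::Read; 1 } && Spreadsheet::Read::parses('csv')) {
    my ($secs, $n) = timed(sub {
        my $book = Spreadsheet::Read::ReadData($path, sep => ';')
            or die "ReadData failed";
        my @all = Spreadsheet::Read::rows($book->[1]);   # whole sheet in memory
        return @all - 1;                                  # drop the header row
    });
    $rows{'Spreadsheet::Read'} = $n;
    printf "%-18s %5d rows  %.4f s\n", 'Spreadsheet::Read', $n, $secs;
}
```

Raising the row count in the generator makes the gap between the two approaches easier to see.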

Answer 1

Score: 0

You asked if you could force Spreadsheet::Read to use Text::CSV_XS.

But you also said the output from the following is Text::CSV_XS.

perl -Mv5.14 -MSpreadsheet::Read -e'say Spreadsheet::Read::parses( "csv" )'

This demonstrates that Text::CSV_XS is being used.
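
The same check can be made from inside a script rather than a one-liner. This is only a sketch: `csv_backend` is a hypothetical helper name, and the only Spreadsheet::Read call it relies on is the `parses` function already shown in the question. The module is loaded lazily so the script still runs where it is not installed.

```perl
use strict;
use warnings;
use feature 'say';

# Return the module Spreadsheet::Read reports for CSV, or undef when
# Spreadsheet::Read (or a CSV parser for it) is not available.
sub csv_backend {
    eval { require Spreadsheet::Read; 1 } or return;
    return Spreadsheet::Read::parses("csv") || undef;
}

my $backend = csv_backend();
say defined $backend
    ? "CSV backend: $backend"
    : "No CSV backend available to Spreadsheet::Read";
```

If this reports Text::CSV_XS, the choice of parser is not what separates the two timings. A more likely cause is that Spreadsheet::Read builds the whole sheet in memory before returning, while the Text::CSV_XS loop handles one row at a time.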

huangapple
  • Posted on 2023-05-26 11:37:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76337503.html