PHP: validate the lines of a "text" file while extracting some stats at the same time?

Question

I have a file (from a POST request) that I would like to validate against some constraints:

  • All lines must be composed of ASCII printable characters only.
  • There must be at least one XYZ record (a line that starts with "@XYZ ", including the trailing space).
  • There must be at most 999999 XYZ records.

For that purpose I made a generic function that reads a file in chunks and passes each line to a callback for validation:

    /*
     * Iterates over each line of the file, passing them to the callback function for validation.
     * When the callback function returns false, or when there is an error,
     * the validation process ends.
     *
     * @param string $filename The name of the file to validate.
     * @param callable $callback The callback function to use for validating each line.
     * @param string $line_delimiter The line-ending delimiter (default is "\n").
     * @param integer $buffer_size The maximum number of bytes to read from the file at a time (default is 8192).
     *
     * @return Returns true when $callback returned true for each line, false if not, and NULL on error.
     *
     * @warning When $buffer_size is not large enough to contain a whole line, $callback will validate chunks of lines.
     */
    function validate_file_lines($filename, $callback, $line_delimiter = "\n", $buffer_size = 8192)
    {
        $handle = fopen($filename, 'rb');
        $is_valid = (false === $handle ? null : true);
        $remainder = '';
        while ( $is_valid && !feof($handle) )
        {
            $buffer = fread($handle, $buffer_size);
            if ( false === $buffer )
            {
                $is_valid = null;
            }
            else
            {
                $lines_array = explode($line_delimiter, $buffer);
                $lines_array_key_last = count($lines_array) - 1;
                $lines_array[0] = $remainder . $lines_array[0];
                if ( $lines_array_key_last !== 0 )
                {
                    $remainder = $lines_array[$lines_array_key_last];
                    unset($lines_array[$lines_array_key_last]);
                }
                foreach ( $lines_array as $line )
                {
                    $is_valid = $callback($line);
                    if ( ! $is_valid )
                        break;
                }
            }
        }
        @fclose($handle);
        return $is_valid;
    }

Now, using it, I'm trying to validate a file, for example:

    HEAD good
    @XYZ 1
    @XYZ 1
    %END
    HEAD better
    @XYZ 2 2
    %END

with this code:

    $xyz_count = 0;
    $xyz_min = 1;
    $xyz_max = 999999;
    $is_valid_line = function($line) use(&$xyz_count, $xyz_max) {
        $is_valid = true;
        if ( ctype_print($line) )
        {
            if ( substr($line, 0, 6) === '@XYZ ' )
            {
                ++$xyz_count;
                $is_valid = $xyz_count <= $xyz_max;
            }
        }
        else if ( '' !== @$line[0] )
        {
            $is_valid = false;
        }
        return $is_valid;
    };
    var_dump(
        validate_file_lines('file.txt', $is_valid_line) && $xyz_count >= $xyz_min
    );

The current output is:

    bool(false)

While I'm expecting:

    bool(true)

What am I doing wrong?


ASIDE

Does the SPL provide any class for iterating over file lines?

Answer 1

Score: 1

Your substr() length needs to be 5 characters, not 6: '@XYZ ' is five characters long, so a six-character substring can never equal it. You can use fgets() to read line by line. Here's a barebones solution that might work. Also, your mode can just be r.

Also, you might add debug printing to show where errors are happening.

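    <?php
    $fh = fopen($filename, 'r');
    $valid = ($fh !== false);
    $xyz_count = 0;
    $xyz_max = 999999;
    // Compare against false explicitly so that a line containing only "0" does not end the loop.
    while ($valid && ($line = fgets($fh)) !== false) {
        // fgets() keeps the line ending, which ctype_print() would reject.
        $line = rtrim($line, "\r\n");
        // Empty lines are allowed, as in the question's callback.
        if ($line !== '' && !ctype_print($line)) $valid = false;
        if (substr($line, 0, 5) === '@XYZ ') $xyz_count++;
        if ($xyz_count > $xyz_max) $valid = false;
        // if (!$valid) echo "LINE (fail): {$line}";
    }
    if ($xyz_count === 0) $valid = false;
    if ($fh !== false) fclose($fh);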
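
Regarding the aside in the question: the SPL does ship a class for iterating over file lines, SplFileObject. The following is only a minimal sketch of how the same validation could look with it. The function name count_xyz_records() and its return convention (the record count, or null when the file is invalid) are assumptions made for this illustration and are not part of the original post.

```php
<?php
// Hypothetical helper (not from the original post): validates a file line by line
// with SplFileObject and returns the number of "@XYZ " records, or null when the
// file breaks one of the constraints from the question.
function count_xyz_records(string $filename, int $xyz_max = 999999): ?int
{
    $file = new SplFileObject($filename, 'r');
    // DROP_NEW_LINE removes the line ending, so ctype_print() only sees the content.
    $file->setFlags(SplFileObject::DROP_NEW_LINE);

    $xyz_count = 0;
    foreach ($file as $line) {
        // Empty lines are tolerated, mirroring the question's callback.
        if ($line === '' || $line === false) {
            continue;
        }
        // Every non-empty line must consist of printable ASCII characters only.
        if (!ctype_print($line)) {
            return null;
        }
        // '@XYZ ' is 5 characters long, so compare exactly 5 characters.
        if (strncmp($line, '@XYZ ', 5) === 0 && ++$xyz_count > $xyz_max) {
            return null;
        }
    }
    // At least one record is required.
    return $xyz_count >= 1 ? $xyz_count : null;
}
```

Calling var_dump(count_xyz_records('file.txt') !== null) would then play the role of the question's final var_dump(). Returning the count instead of a plain boolean keeps the extracted statistic available to the caller without a by-reference variable.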

huangapple
  • Published on July 3, 2023 at 03:13:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76600430.html