PHP: validate the lines of a "text" file while extracting some stats at the same time?
Question
I have a file (from a POST request) that I would like to validate against some constraints:
- All lines must be composed of ASCII printable characters only.
- There must be at least one XYZ record (a line that starts with @XYZ).
- There must be at most 999999 XYZ records.
For that purpose I made a generic function that reads a file in chunks and passes each line to a callback for validation:
/*
 * Iterates over each line of the file, passing them to the callback function for validation.
 * When the callback function returns false, or when there is an error,
 * the validation process ends.
 *
 * @param string $filename The name of the file to validate.
 * @param callable $callback The callback function to use for validating each line.
 * @param string $line_delimiter The line-ending delimiter (default is "\n").
 * @param integer $buffer_size The maximum number of bytes to read from the file at a time (default is 8192).
 *
 * @return Returns true when $callback returned true for each line, false if not, and NULL on error.
 *
 * @warning When $buffer_size is not large enough to contain a whole line, $callback will validate chunks of lines.
 */
function validate_file_lines($filename, $callback, $line_delimiter = "\n", $buffer_size = 8192)
{
    $handle = fopen($filename, 'rb');
    $is_valid = (false === $handle ? null : true);
    $remainder = '';
    while ( $is_valid && !feof($handle) )
    {
        $buffer = fread($handle, $buffer_size);
        if ( false === $buffer )
        {
            $is_valid = null;
        }
        else
        {
            $lines_array = explode($line_delimiter, $buffer);
            $lines_array_key_last = count($lines_array) - 1;
            $lines_array[0] = $remainder . $lines_array[0];
            if ( $lines_array_key_last !== 0 )
            {
                $remainder = $lines_array[$lines_array_key_last];
                unset($lines_array[$lines_array_key_last]);
            }
            foreach ( $lines_array as $line )
            {
                $is_valid = $callback($line);
                if ( ! $is_valid )
                    break;
            }
        }
    }
    @fclose($handle);
    return $is_valid;
}
Now, using it, I'm trying to validate a file, for example:
HEAD good
@XYZ 1
@XYZ 1
%END
HEAD better
@XYZ 2 2
%END
$xyz_count = 0;
$xyz_min = 1;
$xyz_max = 999999;
$is_valid_line = function($line) use(&$xyz_count, $xyz_max) {
    $is_valid = true;
    if ( ctype_print($line) )
    {
        if ( substr($line, 0, 6) === '@XYZ ' )
        {
            ++$xyz_count;
            $is_valid = $xyz_count <= $xyz_max;
        }
    }
    else if ( '' !== @$line[0] )
    {
        $is_valid = false;
    }
    return $is_valid;
};
var_dump(
    validate_file_lines('file.txt', $is_valid_line) && $xyz_count >= $xyz_min
);
The current output is:
bool(false)
While I'm expecting:
bool(true)
What am I doing wrong?
ASIDE
Does the SPL provide any class for iterating over file lines?
Answer 1
Score: 1
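Your substr() needs to take 5 chars, not 6: '@XYZ ' is 5 characters long, so on any longer line substr($line, 0, 6) returns 6 characters and the comparison never matches. That leaves $xyz_count at 0, which is why the final check fails. You can use fgets() to read by line, and your mode should just be r. You might also add debug printing to show where errors are happening. Here's a barebones sketch: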
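<?php
$filename = 'file.txt';
$xyz_max = 999999;
$fh = fopen($filename, 'r');
$valid = (false !== $fh);
$xyz_count = 0;
// Compare against false so a line containing only "0" does not end the loop
while ($valid && false !== ($line = fgets($fh))) {
    // fgets() keeps the line ending, which ctype_print() would reject
    $line = rtrim($line, "\r\n");
    if ('' !== $line && !ctype_print($line)) $valid = false;
    if (substr($line, 0, 5) === '@XYZ ') $xyz_count++;
    if ($xyz_count > $xyz_max) $valid = false;
    // if (!$valid) echo "LINE (fail): {$line}";
}
if ($xyz_count === 0) $valid = false;
if ($fh) fclose($fh);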
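On the aside: SPL does provide SplFileObject, which can be iterated line by line with foreach. Below is a minimal sketch of the same validation built on it, not a definitive implementation. The function name validate_xyz_file and its parameters are made up for illustration, and the sketch assumes that empty lines are acceptable, as the callback in the question treats them.

```php
<?php
/**
 * Hypothetical helper (name and signature are assumptions, not from the question).
 * Returns true when every non-empty line is printable ASCII and the number of
 * "@XYZ " records lies between $xyz_min and $xyz_max.
 * SplFileObject throws a RuntimeException if the file cannot be opened.
 */
function validate_xyz_file(string $filename, int $xyz_min = 1, int $xyz_max = 999999): bool
{
    $file = new SplFileObject($filename, 'r');
    // DROP_NEW_LINE strips the line ending so ctype_print() only sees the content.
    $file->setFlags(SplFileObject::DROP_NEW_LINE);

    $xyz_count = 0;
    foreach ($file as $line) {
        // Skip empty lines, including the empty read after the last newline.
        if (!is_string($line) || $line === '') {
            continue;
        }
        if (!ctype_print($line)) {
            return false;
        }
        // '@XYZ ' is 5 characters long, so compare exactly 5 characters.
        if (strncmp($line, '@XYZ ', 5) === 0 && ++$xyz_count > $xyz_max) {
            return false;
        }
    }
    return $xyz_count >= $xyz_min;
}
```

With the sample file from the question, var_dump(validate_xyz_file('file.txt')); should report true, since all lines are printable and three lines start with "@XYZ ".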