PHP count and separate text from text file
Question
Let's say the text file contains the following:
<Amanda> Hi there, how are you?
<Jack> Hi, im fine
.
.
.
.
<Jack> see you later
I want to count the words each user has said. The output should look like this, for example:
Amanda: 50
Jack: 40
First, I don't want to count the <Amanda> or <Jack> tags themselves. Next, I want to count every word each of them said and store the totals in variables for Amanda and Jack.
This is what I have tried so far:
$usercount1 = 0;
$usercount2 = 0;
//Opens a file in read mode
$file = fopen("logfile.txt", "r");
//Gets each line till end of file is reached
while (($line = fgets($file)) !== false) {
    //Splits each line into words
    $words = explode(" ", $line);
    $words = explode("<Amanda>", $line);
    //Counts each word
    $usercount1 = $usercount1 + count($words);
}
while (($line = fgets($file)) !== false) {
    //Splits each line into words
    $words = explode(" ", $line);
    //Counts each word
    $usercount2 = $usercount2 + count($words);
}
Answer 1
Score: 2
As I understand the question, this is one possible solution.
// Input
$input = "<Amanda> Hi there, how are you?\n<Jack> Hi, im fine \n <Jack> see you later";
// Initialize counters
$amandaCount = 0;
$jackCount = 0;
// Split input by lines
$lines = explode("\n", $input);
// Loop over lines
foreach ($lines as $line) {
    // Remove user tags
    $cleanLine = preg_replace("/<.+?>/", "", $line);
    // Split line into words
    $words = str_word_count($cleanLine, 1);
    // Count words per user
    if (strpos($line, "<Amanda>") !== false) {
        $amandaCount += count($words);
    } elseif (strpos($line, "<Jack>") !== false) {
        $jackCount += count($words);
    }
}
// Output
echo "Amanda: $amandaCount\n";
echo "Jack: $jackCount\n";
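The question reads from logfile.txt rather than from a string, so the same idea can be sketched against a file. This is only an illustration under some assumptions: the helper names countWordsPerUser() and readLines() are hypothetical, every line is assumed to start with a <Name> tag, and "word" means whatever str_word_count() treats as a word.

```php
<?php
// Hypothetical helper: counts words per speaker for lines shaped like "<Name> text".
function countWordsPerUser(iterable $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        // Capture the name inside <...> and the text after it; lines without a tag are skipped.
        if (preg_match('/^\s*<([^>]+)>(.*)$/', rtrim($line, "\r\n"), $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + str_word_count($m[2]);
        }
    }
    return $counts;
}

// Hypothetical helper: yields the lines of a file one at a time.
function readLines(string $path): Generator
{
    $file = fopen($path, 'r');
    if ($file === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($file)) !== false) {
            yield $line;
        }
    } finally {
        // Close the handle even if the caller stops iterating early.
        fclose($file);
    }
}

// Usage, assuming logfile.txt exists in the working directory:
// foreach (countWordsPerUser(readLines('logfile.txt')) as $name => $count) {
//     echo "$name: $count\n";
// }
```

Keeping the totals in one array keyed by name avoids a separate variable and a separate loop per user, and the file is read only once.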
Answer 2
Score: 1
I would take a more general approach so that you can analyze all users, and use a blacklist to exclude the ones you don't want counted.
- First, go through all the lines and match the username and the text.
- Then rebuild the data structure by iterating over the matches and adding up the word counts, skipping blacklisted users.
The blacklist stores the names as array keys because looking up a key is faster than searching through values.
$input = <<<'_TEXT'
<Amanda> Hi there, how are you?
<Jack> Hi, im fine
<Jack> see you later
<John> Hello World, my friends!
<Daniel> Foo!
_TEXT;
preg_match_all('/^<([^>]+)>(.*?)$/m', $input, $matches);
$blacklist = ['Amanda' => 1, 'Jack' => 1];
$words = [];
foreach ($matches[2] as $index => $match) {
    $user = $matches[1][$index];
    if (isset($blacklist[$user])) {
        continue;
    }
    $words[$user] = ($words[$user] ?? 0) + str_word_count($match);
}
print_r($words);
Output:
Array
(
    [John] => 4
    [Daniel] => 1
)
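The point about keys versus values can be illustrated with a small sketch. It assumes the names start out as a plain list (the $names variable is hypothetical) and uses array_flip() to turn them into keys. Because array_flip() assigns integer values, isset() also works for the first entry, whose value is 0 rather than null.

```php
<?php
// Names to skip; assumed here to start out as a plain list.
$names = ['Amanda', 'Jack'];

// array_flip() turns the values into keys: ['Amanda' => 0, 'Jack' => 1].
$blacklist = array_flip($names);

// isset() is a hash lookup on the key and does not scan the array.
var_dump(isset($blacklist['Jack']));   // bool(true)
var_dump(isset($blacklist['John']));   // bool(false)

// in_array() compares against each value in turn until it finds a match.
var_dump(in_array('Jack', $names, true));  // bool(true)
```

For a two-name blacklist the difference is negligible; it only starts to matter when the list grows large or the check runs for many lines.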
Answer 3
Score: 1
I would implement the blacklisted names in the regex to filter them out as early as possible.
A negative lookahead ensures that Amanda and Jack are excluded: (?!Amanda>|Jack>)
The m pattern modifier changes the meaning of ^ (the "start of string" anchor) to be the "start of a line" anchor.
Parentheses around the name subpattern create capture group 1 (accessible as element [1]). \K resets the start of the full-string match, so the space-delimited words substring is accessible via [0].
Use destructuring syntax in the foreach() for convenient variables.
Code:
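// $chat is assumed to hold the whole chat log as a single string
preg_match_all(
    '/^<((?!Amanda>|Jack>)[^>]+)> \K.+/m',
    $chat,
    $matches,
    PREG_SET_ORDER
);
$result = [];
// With PREG_SET_ORDER each match is [0 => words after the tag, 1 => name]
foreach ($matches as [$words, $name]) {
    $result[$name] = ($result[$name] ?? 0) + str_word_count($words);
}
var_export($result);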
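To see what the two regex features do in isolation, here is a small sketch on single lines. The sample strings and the name Amandine are made up for illustration; they are not part of the original data.

```php
<?php
// \K drops everything matched so far from element [0]; capture groups are kept.
preg_match('/^<([^>]+)> \K.+/', '<John> Hello World, my friends!', $m);
var_export($m[0]); // 'Hello World, my friends!'
echo "\n";
var_export($m[1]); // 'John'
echo "\n";

// The negative lookahead rejects exactly "Amanda>" and "Jack>" right after the "<".
$pattern = '/^<((?!Amanda>|Jack>)[^>]+)> /';
var_export(preg_match($pattern, '<Amanda> Hi there'));   // 0
echo "\n";
var_export(preg_match($pattern, '<Amandine> Hi there')); // 1
echo "\n";
```

Including the closing > in the lookahead is what keeps a longer name that merely starts with a blacklisted one from being excluded.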