April 17, 2023, 16:19:49

PHP count and separate text from text file

Question

Let's say the text file contains this:

<Amanda> Hi there, how are you?
<Jack> Hi, im fine
.
.
.
.
<Jack> see you later

I want to count the words each user has said. The output should look like this, for example:

Amanda: 50
Jack: 40

First, I don't want to count the <Amanda> or <Jack> tags themselves. Then I want to count every word each of them said and store the totals in the variables Amanda and Jack.

This is what I have done so far:

    $usercount1 = 0;
    $usercount2 = 0;

    //Opens a file in read mode
    $file = fopen("logfile.txt", "r");
    //Gets each line till end of file is reached
    while (($line = fgets($file)) !== false) {
        //Splits each line into words
        $words = explode(" ", $line);
        $words = explode("<Amanda>", $line);
        //Counts each word
        $usercount1 = $usercount1 + count($words);
    }

    while (($line = fgets($file)) !== false) {
        //Splits each line into words
        $words = explode(" ", $line);
        //Counts each word
        $usercount2 = $usercount2 + count($words);
    }

Answer 1

Score: 2

As I understand the question, this is one possible solution.

// Input
$input = "<Amanda> Hi there, how are you?\n<Jack> Hi, im fine \n <Jack> see you later";

// Initialize counters
$amandaCount = 0;
$jackCount = 0;

// Split input by lines
$lines = explode("\n", $input);

// Loop over lines
foreach ($lines as $line) {
  // Remove user tags
  $cleanLine = preg_replace("/<.+?>/", "", $line);

  // Split line into words
  $words = str_word_count($cleanLine, 1);

  // Count words per user
  if (strpos($line, "<Amanda>") !== false) {
    $amandaCount += count($words);
  } elseif (strpos($line, "<Jack>") !== false) {
    $jackCount += count($words);
  }
}

// Output
echo "Amanda: $amandaCount\n";
echo "Jack: $jackCount\n";
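The answer above hard-codes both the input string and the two user names. As a rough sketch of the same idea for arbitrary speakers, the helper below (countWordsPerUser is a hypothetical name, not part of the answer) takes any iterable of lines shaped like "<Name> text" and keeps one counter per name. It assumes every relevant line starts with a tag; lines without one are skipped.

```php
<?php
// Hypothetical helper (not from the answer): counts words per speaker
// for any iterable of chat lines shaped like "<Name> text".
function countWordsPerUser(iterable $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        // Capture the name inside <...> and the text that follows it.
        if (preg_match('/^\s*<([^>]+)>(.*)$/', rtrim($line, "\r\n"), $m) !== 1) {
            continue; // skip lines without a leading user tag
        }
        [, $user, $text] = $m;
        // The tag is not part of $text, so it is never counted as a word.
        $counts[$user] = ($counts[$user] ?? 0) + str_word_count($text);
    }
    return $counts;
}

// Sample lines taken from the question.
$lines = [
    '<Amanda> Hi there, how are you?',
    '<Jack> Hi, im fine ',
    '<Jack> see you later',
];
$counts = countWordsPerUser($lines);

foreach ($counts as $user => $count) {
    echo "$user: $count\n";
}
```

For a real log, the array could be replaced with file('logfile.txt', FILE_IGNORE_NEW_LINES), assuming the file is small enough to load into memory at once.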

Answer 2

Score: 1

I would take a more general approach. This way you can analyze all users and use a blacklist to exclude the ones you don't want.

First, go through all the lines and match the username and the text.
Then rebuild the data structure by iterating over the matches and counting the words, skipping any user on the blacklist.

The blacklist is keyed by username because looking up a key with isset() is faster than searching an array's values.

$input = <<<'_TEXT'
<Amanda> Hi there, how are you?
<Jack> Hi, im fine
<Jack> see you later
<John> Hello World, my friends!
<Daniel> Foo!
_TEXT;
preg_match_all('/^<([^>]+)>(.*?)$/m', $input, $matches);

$blacklist = ['Amanda' => 1, 'Jack' => 1];
$words = [];
foreach ($matches[2] as $index => $match) {
    $user = $matches[1][$index];
    if (isset($blacklist[$user])) {
        continue;
    }
    $words[$user] = ($words[$user] ?? 0) + str_word_count($match);
}
print_r($words);

Output:

Array
(
    [John] => 4
    [Daniel] => 1
)
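To illustrate the point about keys versus values, here is a small sketch that builds the key-based blacklist from a plain list of names and compares the two lookup styles. The variable names are illustrative only; array_fill_keys() is used here as one convenient way to get the ['Name' => ...] shape and is not something the answer itself uses.

```php
<?php
// Plain list of names to skip (same names as in the answer).
$excluded = ['Amanda', 'Jack'];

// Turn the list into a key-based blacklist: ['Amanda' => true, 'Jack' => true].
$blacklist = array_fill_keys($excluded, true);

// Key lookup: a single hash-table access.
$jackSkippedByKey = isset($blacklist['Jack']);

// Value lookup: in_array() scans the list element by element.
$jackSkippedByValue = in_array('Jack', $excluded, true);

// A name that is not on the blacklist.
$johnSkipped = isset($blacklist['John']);

var_dump($jackSkippedByKey, $jackSkippedByValue, $johnSkipped);
```

Both styles give the same answer. For a two-name blacklist the difference is negligible, so the key-based form only starts to matter as the list grows.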

Answer 3

Score: 1

I would implement the blacklisted names in the regex to filter them out as early as possible.

A negative lookahead, (?!Amanda>|Jack>), ensures that Amanda and Jack are excluded.

The m pattern modifier changes the meaning of ^ from the "start of string" anchor to a "start of line" anchor.

Parentheses around the name subpattern create capture group 1 (accessible as element [1]). \K restarts the full-string match, so the space-delimited words substring is accessible via [0].

Use destructuring syntax in the foreach() for convenient variables.

Code:

preg_match_all(
    '/^<((?!Amanda>|Jack>)[^>]+)> \K.+/m',
    $chat,
    $matches,
    PREG_SET_ORDER
);
$result = [];
foreach ($matches as [$words, $name]) {
    $result[$name] = ($result[$name] ?? 0) + str_word_count($words);
}
var_export($result);
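The snippet above relies on a $chat variable that it does not define. The following is a self-contained sketch of the same pattern; the sample $chat text is an assumption borrowed from the other answer's input, not from this one.

```php
<?php
// Assumed sample input; $chat is not defined in the answer's snippet.
$chat = <<<'_TEXT'
<Amanda> Hi there, how are you?
<Jack> Hi, im fine
<Jack> see you later
<John> Hello World, my friends!
<Daniel> Foo!
_TEXT;

preg_match_all(
    '/^<((?!Amanda>|Jack>)[^>]+)> \K.+/m',
    $chat,
    $matches,
    PREG_SET_ORDER
);

// With PREG_SET_ORDER, each entry is one match:
// [0] is the text after \K (the spoken words), [1] is the captured name.
$result = [];
foreach ($matches as [$words, $name]) {
    $result[$name] = ($result[$name] ?? 0) + str_word_count($words);
}
var_export($result);
```

With this input, only John and Daniel end up in $result, because lines starting with <Amanda> or <Jack> never match. Replacing the lookahead group with a plain alternation such as (Amanda|Jack) should, by the same mechanics, keep only those two names instead, which is closer to what the question asks for.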
