Counting occurrences of dynamic strings in a TXT file with PowerShell

Question

I have a TXT file containing 350,000 URL paths. I want to count the occurrences of each distinct string that appears in the paths. I don't know the strings in advance; I only know that each one is the second segment of its path.

The paths look like this:

images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg

So I split the path with $searchstring = $item.Split("/")[1]. Now I can iterate through all the lines in the TXT file and extract the string. But how can I produce a final document that looks like this:

STRINGA;2
STRINGB;2
STRINGC;1

How can I extract each string dynamically and then count its occurrences? I just need a hint; I can get the rest working from there.

Answer 1

Score: 1

Use the Group-Object command:

Get-Content path\to\urls.txt |
    Group-Object { $_.Split('/')[1] } -NoElement |
    Select-Object @{Name='String';Expression='Name'}, Count |
    Export-Csv path\to\output.csv -Delimiter ';' -NoTypeInformation
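
Note that Export-Csv writes a header row and, by default, wraps values in double quotes, so the file will not look exactly like the STRINGA;2 lines in the question. If the bare format matters, one alternative is to count by hand. The following is only a sketch: it uses the sample paths from the question as in-memory input (in practice the lines would come from Get-Content), and the variable names $lines, $counts and $result are made up for illustration.

```powershell
# Sample input copied from the question. For the real file this would be
# something like: $lines = Get-Content path\to\urls.txt  (placeholder path)
$lines = @(
    'images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
)

# An ordered dictionary keeps the keys in the order they were first seen.
$counts = [ordered]@{}

foreach ($line in $lines) {
    $parts = $line.Split('/')
    if ($parts.Count -lt 2) { continue }   # skip lines with no second segment

    $key = $parts[1]
    if ($counts.Contains($key)) {
        $counts[$key] += 1
    } else {
        $counts[$key] = 1
    }
}

# Build one "NAME;COUNT" line per distinct string, without header or quotes.
$result = foreach ($key in $counts.Keys) {
    '{0};{1}' -f $key, $counts[$key]
}

$result
```

To write the lines to a file instead of the console, $result can be piped to Set-Content with an output path of your choice. Keys in a dictionary created with [ordered]@{} are compared case-insensitively, so STRINGA and stringa would be counted together.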

Posted by huangapple on 2023-02-09 00:04:42
When reposting, please keep the link to this article: https://go.coder-hub.com/75388530.html