Counting occurrences of dynamic strings in a TXT file with PowerShell
Question
I have a TXT file containing 350,000 URL paths. I want to count how many times each distinct string occurs at a fixed position in the path. I don't know the strings in advance; I only know that each one is the second segment of the path.
The paths look like this:
images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
So I split each path with $searchstring = $item.Split("/")[1].
Now I can iterate over all the lines in the TXT file and extract the string.
But how can I produce a final document that looks like this:
STRINGA;2
STRINGB;2
STRINGC;1
How can I dynamically extract each string and then count its occurrences? I just need a hint; I can get the rest working from there.
Answer 1
Score: 1
Use the Group-Object command:
Get-Content path\to\urls.txt | Group-Object { $_.Split('/')[1] } -NoElement | Select-Object @{Name='String';Expression='Name'}, Count | Export-Csv path\to\output.csv -Delimiter ';' -NoTypeInformation
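Note that Export-Csv writes a header row and, by default, quotes each field, so the resulting file will not match the STRINGA;2 layout from the question character for character. If that exact layout matters, one option is to format the lines by hand. The following is a minimal sketch under assumptions: the variable names $lines, $counts and $result and the output path in the final comment are illustrative and do not come from the original post, and the sample data is copied from the question in place of the real file.

```powershell
# Sample input copied from the question; in practice this would be
# the result of Get-Content on the real file.
$lines = @(
    'images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
    'images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg'
)

# Group on the second path segment; -NoElement keeps only Name and Count
# and drops the grouped lines themselves.
$counts = $lines | Group-Object { $_.Split('/')[1] } -NoElement

# Build "NAME;COUNT" strings directly, so there is no header and no quoting.
$result = $counts | Sort-Object Name | ForEach-Object { '{0};{1}' -f $_.Name, $_.Count }

$result
# STRINGA;2
# STRINGB;2
# STRINGC;1

# To write the lines to a file (hypothetical path):
# $result | Set-Content path\to\output.txt
```

Sort-Object Name is only there to make the order of the output predictable; it can be dropped if the order does not matter.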