Why are some words printed twice when counting word frequency?
Question
I read words from a file and print the 30 most frequent ones, but some words are printed
twice, as you can see in the output below.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <iterator>
#include <fstream>
using namespace std;

int main(){
    fstream fs, output;
    fs.open("/Users/brah79/Downloads/skola/c++/inlämningsuppgifter/labb4/L4_wc/hitchhikersguide.txt");
    output.open("/Users/brah79/Downloads/skola/c++/inlämningsuppgifter/labb4/labb4/output.txt");
    if(!fs.is_open() || !output.is_open()){
        cout << "could not open file" << endl;
    }

    map <string, int> mp;
    string word;
    while(fs >> word){
        // strip every non-letter character from the word
        for(int i = 0; i < word.length(); i++){
            if(!isalpha(word[i])){
                word.erase(i--, 1);
            }
        }
        if(word.empty()){
            continue;
        }
        mp[word]++;
    }

    // copy into (count, word) pairs so the words can be sorted by count
    vector<pair<int, string>> v;
    v.reserve(mp.size());
    for (const auto& p : mp){
        v.emplace_back(p.second, p.first);
    }
    sort(v.rbegin(), v.rend());

    cout << "Theese are the 30 most frequent words: " << endl;
    for(int i = 0; i < 30; i++){
        cout << v[i].second << " : " << v[i].first << " times" << endl;
    }

    output << "Theese are the 30 most frequent words: " << endl;
    for(int i = 0; i < 30; i++){
        cout << v[i].second << " : " << v[i].first << " times" << endl;
    }
    return 0;
}
output:
the : 2230 times !!!
of : 1254 times
to : 1177 times
a : 1121 times
and : 1109 times
said : 680 times
it : 665 times
was : 605 times
in : 590 times
he : 546 times
that : 520 times
you : 495 times
I : 428 times
on : 349 times
Arthur : 332 times
his : 324 times
Ford : 314 times
The : 307 times !!!
at : 306 times
for : 284 times
is : 281 times
with : 273 times
had : 252 times
He : 242 times
this : 220 times
as : 207 times
Zaphod : 206 times
be : 188 times
all : 186 times
him : 182 times
"the" is printed twice. Also, "could not open file" is printed at the top even
though the file was opened and its content is stored in the map.
Answer 1
Score: 2
Because your program counts words in a case-sensitive manner.
In particular, "The" and "the" are treated as different words, so each one gets its own count. In your output, "the" appears 2230 times while "The" appears 307 times.