fread is reading the last part of my text file twice
Question
I have a text file I want to read into an std::vector. It's okay if the vector is a little too big, but it seems to be doing a very weird thing: It's copying the entire file, then copying a portion of the file near the end twice and appending it. (I think this might simply be garbage, but I don't know.)
So if the file looked like this (it's a pretty large txt file):
0kb         100kb       200kb       300kb
v           v           v           v
[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ]
The copy in memory looks like this:
0kb         100kb       200kb       300kb 302kb
v           v           v           v     v
[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ TUVW]
                                      ^^^^ this section is repeated at the end
I'm not entirely sure what's causing it. The code I wrote to perform this copy works as follows:
- I first use stat to get a size, in bytes, that can hold the file. This might end up being larger than needed due to how Windows handles line endings.
- I allocate my memory.
- Using fread(), I copy the file into the vector in one shot.
void copyFile(std::vector<char> & output, const char * filename) {
    output.clear();
    FILE * file = fopen(filename, "r");
    if (!file)
        return;
    {
        struct stat statBuffer;
        stat(filename, &statBuffer);
        output.resize(statBuffer.st_size + 1);
        fread(output.data(), 1, statBuffer.st_size, file);
        output[statBuffer.st_size] = 0; // make sure it's null terminated
    }
    fclose(file);
}
My theory is that fread() is reading past the end of the file and copying garbage? I am expecting fread() to read n bytes from the file, but perhaps that argument refers to n bytes outputted instead? These values would differ since it's reading 2 bytes for each newline, then outputting 1... But I can't find any information on this. Nor would I know how to handle that without breaking my read operation into a bunch of really tiny "getline()" commands. But maybe that's just necessary? Any help is appreciated.
Answer 1
Score: 1
You should always check the return values of I/O functions. One sufficient reason is to check for errors, but when fread might store fewer bytes than it reads (e.g., on Windows with files open in the default text mode), the return value is how you know how much was stored and thus how much of the buffer to use.
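As a minimal sketch of that advice, the question's function could be reworked along the following lines. The name copyFileChecked and the bool return value are illustrative choices, not part of the original post; the sketch assumes stat reports a size at least as large as the number of bytes fread will store, which is the situation described in the question.

```cpp
#include <cstdio>
#include <vector>
#include <sys/stat.h>

// Hypothetical variant of the question's copyFile (name and bool result are
// assumptions for illustration). It keeps only the bytes fread reports as
// stored, so stale data in the tail of the buffer is never exposed.
bool copyFileChecked(std::vector<char>& output, const char* filename) {
    output.clear();

    struct stat statBuffer;
    if (stat(filename, &statBuffer) != 0)
        return false;  // could not query the file size

    std::FILE* file = std::fopen(filename, "r");
    if (!file)
        return false;

    // In text mode the on-disk size is only an upper bound on what is stored.
    const std::size_t capacity = static_cast<std::size_t>(statBuffer.st_size);
    output.resize(capacity + 1);

    // fread returns the number of items actually stored in the buffer.
    const std::size_t stored = std::fread(output.data(), 1, capacity, file);
    const bool failed = std::ferror(file) != 0;
    std::fclose(file);

    if (failed) {
        output.clear();
        return false;
    }

    output.resize(stored + 1);  // drop the unused tail
    output[stored] = '\0';      // terminator sits right after the real data
    return true;
}
```

If newline translation is not wanted at all, opening the file with "rb" instead of "r" should make the stored byte count match the on-disk size, though checking the return value is still worthwhile.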
The apparent repetition of data at the end of the buffer is evidence of an implementation strategy of reading into the buffer in binary mode and then shifting characters back to hide the carriage returns. This isn’t significant to a correct program, but it makes sense that the standard library would make use of the provided buffer this way.
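The effect can be reproduced without any file I/O. The sketch below is only an illustration of the strategy described above, not the actual code of any standard library: collapseCrLfInPlace is a hypothetical function that removes the carriage return from each "\r\n" pair by shifting the following bytes back within the same buffer, and returns how many bytes are meaningful afterwards.

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: mimics a text-mode read that first fills the caller's
// buffer with raw bytes and then compacts "\r\n" into "\n" in place.
// Returns the count of meaningful bytes; anything past that index is stale.
std::size_t collapseCrLfInPlace(std::vector<char>& buffer) {
    std::size_t write = 0;
    for (std::size_t read = 0; read < buffer.size(); ++read) {
        const bool crBeforeLf = buffer[read] == '\r' &&
                                read + 1 < buffer.size() &&
                                buffer[read + 1] == '\n';
        if (crBeforeLf)
            continue;  // skip the carriage return
        buffer[write++] = buffer[read];  // shift the kept byte back
    }
    return write;
}
```

For a buffer holding "ab\r\ncd\r\nXYZ" (11 bytes), the function returns 9 and the buffer then holds "ab\ncd\nXYZ" followed by the leftover bytes "YZ". Those leftovers are the same kind of repeated tail shown in the question, and the returned count is what tells the caller to ignore them.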