Using C fails to read a file of 4-byte integers (Int32)
Question
My C code (built with VS 2015) fails to completely read files containing multiple 4-byte signed integers (int32), although a binary viewer program shows nothing wrong with the data in the files (Image 1). I have tried several methods of reading the data files, with similar results. My question is simply: what is incorrect in the example code below? If nothing is wrong with the code, what could be wrong with the data file?
I have provided a link to an example data file below, in case someone has the time and interest to examine it. In both code examples below, reading stops at integer number 78, whose value is 26 according to the binary viewer.
Example Code 1:
typedef signed __int32 INT32;
FILE *fp = NULL;
INT32 k;
int i = 0;
fp = fopen(myfilePath, "r");
while (!feof(fp))
{
    fread(&k, sizeof(INT32), 1, fp);
    printf("a[%d] = %d\n", i, k);
    i++;
}
fclose(fp);
Example Code 2:
typedef signed __int32 INT32;
FILE *fp = NULL;
long sz = 0;
INT32 k;
int i = 0;
fp = fopen(myfilePath, "r");
// find the size of the file
fseek(fp, 0L, SEEK_END);
sz = ftell(fp) / 4; // store the Int32 data count
rewind(fp);
for (i = 0; i < sz; i++)
{
    fread(&k, sizeof(INT32), 1, fp);
    printf("a[%d] = %d\n", i, k);
}
fclose(fp);
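Image 1: the binary viewer reads the entire file correctly and marks (in yellow) where the C code stops reading.
Link to example data file. Size: 3,572 bytes. Contains 893 Int32 values.
Thank you for your assistance!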
# Answer 1
**Score**: 1
<details>
<summary>English:</summary>
You have slightly misread the layout of your input file. Its first value is the number of 32-bit integers that follow, so you need only read that first integer to know how much storage to allocate for the remaining values.
Rather than using a `typedef` for a signed 32-bit integer, note that the standard C library provides the `stdint.h` header with all exact-width types, including the signed 32-bit type `int32_t`. The `inttypes.h` header provides the macros for printing and reading the exact-width types (e.g. `PRId32` for printing, where `d` can be replaced by `u` for unsigned, `x` for hexadecimal, `o` for octal, or `i` for integer, and `SCNd32` for use with `scanf`).
Therefore, all you really need to do is open the file for reading in binary mode, using `"rb"`. The `'b'` has no effect on POSIX systems, but on Windows it is essential: in text mode the Microsoft runtime translates CR-LF pairs and treats a byte with value 26 (0x1A, Ctrl+Z) as end of file, which matches your reads stopping at the value 26. Open the file and read the first 32-bit value into a variable. That tells you how many 32-bit values follow and leaves the file position indicator ready to read the remaining values, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main (int argc, char **argv) {
    int32_t *a = NULL, nint = 0; /* pointer and no. of int */
    /* use filename provided as 1st argument ("000002.dat" by default) */
    FILE *fp = fopen (argc > 1 ? argv[1] : "000002.dat", "rb");
    if (!fp) { /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    /* read no. of int32_t values from 1st value in file */
    if (fread (&nint, 1, sizeof nint, fp) != sizeof nint) {
        perror ("fread-nint");
        return 1;
    }
    ...
Now simply allocate storage for the remaining values, either with `malloc()` (recommended) or by declaring a *Variable Length Array*. With a VLA you should check the number of values first, so that you don't attempt to declare an array that exceeds your stack size, which is compiler/OS dependent. MS usually provides a 1M stack, which would hold about 200K integers, balanced against whatever other stack use you have; note, however, that the MSVC compiler does not support VLAs at all. A simple dynamic allocation with `malloc` eliminates the risk of a stack overflow.
...
    /* allocate/validate storage */
    if (!(a = malloc (nint * sizeof nint))) {
        perror ("malloc-a");
        return 1;
    }
...
All that remains is reading the rest of the values from your file with `fread`. The `fread` function reads a number of blocks of a given size and stores them at the address provided, so you simply read `nint` values of `sizeof nint` bytes each (or you could use `sizeof *a`; both have the same type). The return value is the number of complete blocks read from the file. You can read the remaining values with:
...
    /* read remaining values from file into a */
    if (fread (a, sizeof nint, (size_t)nint, fp) != (size_t)nint) {
        perror ("fread-a");
        return 1;
    }
    fclose (fp); /* close file */
...
(**note:** always VALIDATE that your allocation succeeds, and validate your read from the file by checking the return of `fread`.)
A complete example that confirms the number of 32-bit values read from the file could be:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
int main (int argc, char **argv) {
    int32_t *a = NULL, nint = 0; /* pointer and no. of int */
    /* use filename provided as 1st argument ("000002.dat" by default) */
    FILE *fp = fopen (argc > 1 ? argv[1] : "000002.dat", "rb");
    if (!fp) { /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    /* read no. of int32_t values from 1st value in file */
    if (fread (&nint, 1, sizeof nint, fp) != sizeof nint) {
        perror ("fread-nint");
        return 1;
    }
    /* reject a zero or negative count before using it as a size */
    if (nint < 1) {
        fputs ("error: invalid value count in file.\n", stderr);
        return 1;
    }
    /* allocate/validate storage */
    if (!(a = malloc (nint * sizeof nint))) {
        perror ("malloc-a");
        return 1;
    }
    /* read remaining values from file into a */
    if (fread (a, sizeof nint, (size_t)nint, fp) != (size_t)nint) {
        perror ("fread-a");
        return 1;
    }
    fclose (fp); /* close file */
    /* report number of integers read */
    printf ("%" PRId32 " int32_t read from file.\n", nint);
    free (a); /* release storage */
    return 0;
}
**Example Use/Output**
Using the file you provided the link to, and passing the filename to read as the first argument to the program (or reading from the file in the current directory by default), you would get:
$ ./bin/freadint32_t ../dat/000002.dat
892 int32_t read from file.
**Values In File**
If you output the values in the file by adding a simple loop, you would find:
16 25 22 11 17 20 19 23 22 16
17 22 25 25 18 22 24 17 15 18
25 14 14 29 16 14 23 23 21 20
28 24 17 22 18 21 22 24 27 16
16 21 22 30 28 18 23 20 15 23
20 19 22 22 23 20 18 20 28 22
21 22 20 30 21 17 24 22 21 18
19 20 20 25 22 20 30 26 25 33
21 15 23 22 19 17 17 20 21 21
27 35 27 19 21 22 19 13 18 18
12 20 25 22 24 21 20 26 22 24
30 22 18 22 20 16 18 23 22 24
23 17 22 22 17 23 22 16 24 25
20 18 18 25 24 23 22 17 23 26
22 16 17 25 27 24 23 26 23 20
24 17 10 23 22 13 20 16 16 22
18 23 25 20 28 24 21 26 22 24
22 24 25 19 26 28 21 18 21 25
24 19 20 21 19 20 19 19 18 29
...
25 23 18 19 25 23 19 23 22 18
22 19 16 15 13 25 26 23 26 20
23 16 14 23 20 23 22 24 26 19
20 18
All 892 values read.
Look things over and let me know if you have further questions.
</details>
# Answer 2
**Score**: 0
fopen()
http://www.cplusplus.com/reference/cstdio/fopen/
> **mode**
C string containing a file access mode. It can be:
**"r"** read: Open file for input operations. The file must exist.
**"w"** write: Create an empty file for output operations. If a file with the same name already exists, its contents are discarded and the file is treated as a new empty file.
**"a"** append: Open file for output at the end of a file. Output operations always write data at the end of the file, expanding it. Repositioning operations (fseek, fsetpos, rewind) are ignored. The file is created if it does not exist.
**"r+"** read/update: Open a file for update (both for input and output). The file must exist.
**"w+"** write/update: Create an empty file and open it for update (both for input and output). If a file with the same name already exists its contents are discarded and the file is treated as a new empty file.
**"a+"** append/update: Open a file for update (both for input and output) with all output operations writing data at the end of the file. Repositioning operations (fseek, fsetpos, rewind) affects the next input operations, but output operations move the position back to the end of file. The file is created if it does not exist.
With the mode specifiers above the file is open as a text file. In order to open a file as a binary file, a **"b"** character has to be included in the mode string. This additional **"b"** character can either be appended at the end of the string (thus making the following compound modes: **"rb", "wb", "ab", "r+b", "w+b", "a+b"**) or be inserted between the letter and the **"+"** sign for the mixed modes (**"rb+", "wb+", "ab+"**).
You are opening and reading your file in text mode. Change the following line
`fp = fopen(myfilePath, "r");`
to
`fp = fopen(myfilePath, "rb");`
to open and read the file content in binary mode.