Why isn't binary.Read() reading integers correctly?

Question

I'm trying to read a binary file in Go.

Essentially I have a struct like this:

type foo struct {
    A int16
    B int32
    C [32]byte
    // and so on...
}

and I'm reading from the file into the struct like this:

fi, err := os.Open(fname)
// error checking, defer close, etc.
var bar foo
binary.Read(fi, binary.LittleEndian, &bar)

Now, that should work, but I'm getting some weird results. For instance, when I read into the struct I should get this:

A: 7
B: 8105
C: // some string

but what I get is this:

A: 7
B: 531169280
C: // some correct string

This happens because when binary.Read() reads the file, after it reads []byte{7, 0} as int16(7) (the correct value for A), it comes across the slice []byte{0, 0, 169, 31} and tries to convert it into an int32. binary.Read()'s conversion does this:

uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24, where b is the byte slice.
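
To see how that expression produces both numbers, here is a minimal sketch that applies the same little-endian decoding (through binary.LittleEndian.Uint32) to the four bytes starting at offset 2 and at offset 4. The first six bytes come from the description above; the last two are assumed to be zero, and the helper name decodeAt is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeAt interprets the four bytes starting at off as a little-endian
// uint32, which is the same computation as the shift-and-or expression above.
func decodeAt(b []byte, off int) uint32 {
	return binary.LittleEndian.Uint32(b[off : off+4])
}

func main() {
	// First eight bytes of the file; bytes 6 and 7 are assumed to be zero.
	header := []byte{7, 0, 0, 0, 169, 31, 0, 0}

	fmt.Println(decodeAt(header, 2)) // bytes {0, 0, 169, 31} -> 531169280
	fmt.Println(decodeAt(header, 4)) // bytes {169, 31, 0, 0} -> 8105
}
```

Both values are decoded correctly for the bytes they are given; the only difference is the offset at which the four bytes of B are taken.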

But what really confuses me is doing the exact same thing in C works perfectly fine.

If I write this in C:

#include <fcntl.h>   // open, O_RDONLY
#include <stdint.h>  // int32_t
#include <unistd.h>  // read

int main()
{
    int fd;
    struct cool_struct {
        short int A;
        int32_t B;
        char C[32];
        // you get the picture...
    } foo;
    int sz = sizeof(struct cool_struct);
    const char* file_name = "/path/to/my/file";

    fd = open(file_name, O_RDONLY);
    // more code
    read(fd, &foo, sz);
    // print values
}

I get the correct results. Why is my C code getting this correct while my Go code isn't?

Answer 1

Score: 6

Assuming the first two characters of the string aren't '\000', what you've got there is an alignment problem: your C compiler puts two extra bytes of padding after the int16 so that the int32 is 4-byte aligned, whereas binary.Read decodes the struct fields back to back with no padding.
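
One way to check this explanation is to compare the struct's in-memory size with the number of bytes encoding/binary consumes for it. The sketch below assumes the struct from the question with the elided fields left out; sizes is a hypothetical helper, and the in-memory figures depend on the platform (4-byte alignment for int32 is typical).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// foo is the struct from the question, without the elided trailing fields.
type foo struct {
	A int16
	B int32
	C [32]byte
}

// sizes returns the in-memory size of foo, the in-memory offset of B,
// and the number of bytes encoding/binary reads or writes for a foo.
func sizes() (mem, offB uintptr, wire int) {
	var f foo
	return unsafe.Sizeof(f), unsafe.Offsetof(f.B), binary.Size(f)
}

func main() {
	mem, offB, wire := sizes()
	fmt.Println("in-memory size:", mem)
	fmt.Println("in-memory offset of B:", offB)
	fmt.Println("bytes consumed by binary.Read:", wire)
}
```

binary.Size reports 38 bytes (2 + 4 + 32), so binary.Read expects B immediately after A. The in-memory layout, like the C struct, typically places B at offset 4.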

The easiest fix is probably just to add a dummy (padding) int16 after 'A':

type foo struct {
    A     int16
    A_pad int16
    B     int32
    C     [32]byte
}

Or there may be a way to tell Go that the int32 needs to be "4-byte aligned".

If you know of one, please edit this answer or post a comment.
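
A related option, as far as the encoding/binary documentation goes, is a blank (_) field: binary.Read skips blank fields while still consuming their bytes, so they can stand in for padding. The following is a minimal sketch under that assumption. The names paddedFoo and readPadded, and the sample record with its made-up string, are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// paddedFoo mirrors the C struct layout; the blank field stands in for the
// two padding bytes the C compiler inserts after A.
type paddedFoo struct {
	A int16
	_ int16 // skipped by binary.Read, but its two bytes are still consumed
	B int32
	C [32]byte
}

// readPadded decodes one paddedFoo from raw little-endian bytes.
func readPadded(raw []byte) (paddedFoo, error) {
	var f paddedFoo
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &f)
	return f, err
}

func main() {
	// Sample record built for illustration: A=7, two padding bytes, B=8105,
	// then a 32-byte C field holding a made-up string.
	raw := make([]byte, 40)
	copy(raw, []byte{7, 0, 0, 0, 169, 31, 0, 0})
	copy(raw[8:], "example")

	f, err := readPadded(raw)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(f.A, f.B, string(bytes.TrimRight(f.C[:], "\x00")))
}
```

Compared with a named A_pad field, the blank field cannot be read or set by accident, which makes its role as padding explicit.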

Answer 2

Score: 1

Given:

0000000: 0700 0000 a91f 0000 7074 732f 3300 0000 ........pts/3...

Per the struct as declared, the fields are:
0700h is the short int field, little-endian = 7

0000a91fh is the int field, little-endian = the big number
...

Your struct needs a second short field to absorb the 0000h. Then:
0700h = 7
0000h = 0 in the new field
a91f0000h = 8105
....

This indicates (amongst other things) that the struct is missing the expected 2 bytes of padding between the short and the int fields.
Does the C code have #pragma pack?

huangapple
  • Posted on January 2, 2015, 18:20:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/27740574.html