How to read packed binary data in Go?

Question

I'm trying to figure out the best way to read a packed binary file in Go that was produced by Python like the following:

import struct
f = open('tst.bin', 'wb')
fmt = 'iih'  # please note this is packed binary: 4-byte int, 4-byte int, 2-byte int
f.write(struct.pack(fmt,4, 185765, 1020))
f.write(struct.pack(fmt,4, 185765, 1022))
f.close()

I have been tinkering with some of the examples I've seen on GitHub and a few other sources, but at first I couldn't get anything working correctly (the update below shows a working method). What is the idiomatic way to do this sort of thing in Go? This is one of several attempts:

UPDATE and WORKING

package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

func main() {
	fp, err := os.Open("tst.bin")

	if err != nil {
		panic(err)
	}

	defer fp.Close()

	lineBuf := make([]byte, 10) // 4 byte int, 4 byte int, 2 byte int per line

	for {
		_, err := fp.Read(lineBuf)

		if err == io.EOF {
			break
		}

		aVal := int32(binary.LittleEndian.Uint32(lineBuf[0:4])) // same as: int32(uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24)
		bVal := int32(binary.LittleEndian.Uint32(lineBuf[4:8]))
		cVal := int16(binary.LittleEndian.Uint16(lineBuf[8:10])) // same as: int16(uint32(b[0]) | uint32(b[1])<<8)
		fmt.Println(aVal, bVal, cVal)
	}
}

Answer 1

Score: 5

A portable and fairly easy way to handle the problem is Google's "Protocol Buffers". Though this comes too late since you already got it working, I put some effort into explaining and coding it, so I am posting it anyway.

You can find the code on https://github.com/mwmahlberg/ProtoBufDemo

You need to install Protocol Buffers for Python using your preferred method (pip, OS package management, source) and for Go.

The .proto file

The .proto file is rather simple for our example. I called it data.proto.

syntax = "proto2";
package main;

message Demo {
  required uint32 A = 1;
  required uint32 B = 2;

  // A shortcoming: there are no 16-bit ints.
  // The applications have to enforce the 16-bit range themselves.
  required uint32 C = 3;
}

Now you need to call protoc on the file and have it provide the code for both Python and Go:

protoc --go_out=. --python_out=. data.proto

which generates the files data_pb2.py and data.pb.go. Those files provide the language specific access to the protocol buffer data.

When using the code from GitHub, all you need to do is run

go generate

in the source directory.

The Python code

import data_pb2

def main():

    # We create an instance of the message type "Demo"...
    data = data_pb2.Demo()

    # ...and fill it with data
    data.A = 5
    data.B = 5
    data.C = 2015

    print("* Python writing to file")
    f = open('tst.bin', 'wb')

    # Note that "data.SerializeToString()" counterintuitively
    # returns binary data
    f.write(data.SerializeToString())
    f.close()

    f = open('tst.bin', 'rb')
    read = data_pb2.Demo()
    read.ParseFromString(f.read())
    f.close()

    print("* Python reading from file")
    print("\tDemo.A: %d, Demo.B: %d, Demo.C: %d" % (read.A, read.B, read.C))

if __name__ == '__main__':
    main()

We import the file generated by protoc and use it. Not much magic here.

The Go File

package main

//go:generate protoc --python_out=. data.proto
//go:generate protoc --go_out=. data.proto
import (
	"fmt"
	"os"

	"github.com/golang/protobuf/proto"
)

func main() {

	// Note that we do not handle any errors for the sake of brevity
	d := Demo{}
	f, _ := os.Open("tst.bin")
	fi, _ := f.Stat()

	// We create a buffer which is big enough to hold the entire message
	b := make([]byte, fi.Size())

	f.Read(b)

	proto.Unmarshal(b, &d)
	fmt.Println("* Go reading from file")

	// Note the explicit pointer dereference: the generated proto2 fields are pointers
	fmt.Printf("\tDemo.A: %d, Demo.B: %d, Demo.C: %d\n", *d.A, *d.B, *d.C)
}

Note that we do not need to import the generated code explicitly, since data.proto declares package main and data.pb.go therefore lives in the same package.

The result

After generating the required files and compiling the source, when you issue

$ python writer.py && ./ProtoBufDemo

the result is

* Python writing to file
* Python reading from file
    Demo.A: 5, Demo.B: 5, Demo.C: 2015
* Go reading from file
    Demo.A: 5, Demo.B: 5, Demo.C: 2015

Note that the Makefile in the repository offers a shortcut for generating the code, compiling the .go files and running both programs:

make run
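
The comment in data.proto above points out that there is no 16-bit integer type, so the applications have to enforce the range themselves. One way that check might look on the Go side is sketched below. The helper name toInt16 is made up for this illustration and is not part of the protobuf API; it only assumes the field arrives as a uint32, as in the Demo message.

```go
package main

import (
	"fmt"
	"math"
)

// toInt16 narrows a uint32 (as delivered by the Demo message) to int16.
// It is a hypothetical helper for this sketch: it rejects any value that
// would not fit instead of silently truncating it.
func toInt16(v uint32) (int16, error) {
	if v > math.MaxInt16 {
		return 0, fmt.Errorf("value %d does not fit in int16", v)
	}
	return int16(v), nil
}

func main() {
	for _, v := range []uint32{2015, 40000} {
		n, err := toInt16(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", n)
	}
}
```

In the reader above, this would be applied to *d.C after proto.Unmarshal, before the value is treated as a 16-bit quantity.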

Answer 2

Score: 4

The Python format string is iih, meaning two 32-bit signed integers and one 16-bit signed integer (see the docs). You can simply use your first example but change the struct to:

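package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// binData mirrors the Python format "iih".
// The fields must be exported so binary.Read can fill them.
type binData struct {
	A int32
	B int32
	C int16
}

func main() {
	fp, err := os.Open("tst.bin")

	if err != nil {
		panic(err)
	}

	defer fp.Close()

	for {
		thing := binData{}
		err := binary.Read(fp, binary.LittleEndian, &thing)

		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err) // e.g. a truncated final record
		}

		fmt.Println(thing.A, thing.B, thing.C)
	}
}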

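Note that the Python packing didn't specify the endianness explicitly, so struct.pack used the native byte order (and native alignment) of the machine that ran it. If you're sure that system was little-endian, this should work. Writing the format as '<iih' on the Python side would make the little-endian layout explicit.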

Edit: Added main() function to explain what I mean.

Edit 2: Capitalized struct fields so binary.Read could write into them.

Answer 3

Score: 1

As I mentioned in my post, I'm not sure this is THE idiomatic way to do this in Go but this is the solution that I came up with after a fair bit of tinkering and adapting several different examples. Note again that this unpacks 4 and 2 byte int into Go int32 and int16 respectively. Posting so that there is a valid answer in case someone comes looking. Hopefully someone will post a more idiomatic way of accomplishing this but for now, this works.

package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

func main() {
	fp, err := os.Open("tst.bin")

	if err != nil {
		panic(err)
	}

	defer fp.Close()

	lineBuf := make([]byte, 10) // 4 byte int, 4 byte int, 2 byte int per line

	for {
		// io.ReadFull either fills the whole buffer or returns an error,
		// whereas a plain Read may return fewer bytes than requested
		_, err := io.ReadFull(fp, lineBuf)

		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err) // e.g. io.ErrUnexpectedEOF for a truncated record
		}

		aVal := int32(binary.LittleEndian.Uint32(lineBuf[0:4])) // same as: int32(uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24)
		bVal := int32(binary.LittleEndian.Uint32(lineBuf[4:8]))
		cVal := int16(binary.LittleEndian.Uint16(lineBuf[8:10])) // same as: int16(uint32(b[0]) | uint32(b[1])<<8)
		fmt.Println(aVal, bVal, cVal)
	}
}

Answer 4

Score: 0

Try the binpacker library.

Example:

Example data:

buffer := new(bytes.Buffer)
packer := binpacker.NewPacker(buffer)
unpacker := binpacker.NewUnpacker(buffer)
packer.PushByte(0x01)
packer.PushUint16(math.MaxUint16)

Unpack:

var val1 byte
var val2 uint16
var err error
val1, err = unpacker.ShiftByte()
val2, err = unpacker.ShiftUint16()

Or:

var val1 byte
var val2 uint16
var err error
unpacker.FetchByte(&val1).FetchUint16(&val2)
unpacker.Error() // Make sure error is nil

huangapple
  • Published on 2015-12-04 07:43:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34078427.html