How to read packed binary data in Go?
I'm trying to figure out the best way to read a packed binary file in Go that was produced by Python like the following:
import struct
f = open('tst.bin', 'wb')
fmt = 'iih'  # please note this is packed binary: 4-byte int, 4-byte int, 2-byte int
f.write(struct.pack(fmt, 4, 185765, 1020))
f.write(struct.pack(fmt, 4, 185765, 1022))
f.close()
I have been tinkering with some of the examples I've seen on GitHub and a few other sources. Initially I couldn't get anything working correctly; the update below shows a working method. What is the idiomatic way to do this sort of thing in Go? This is one of several attempts.
Update: working code
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

func main() {
	fp, err := os.Open("tst.bin")
	if err != nil {
		panic(err)
	}
	defer fp.Close()

	lineBuf := make([]byte, 10) // 4-byte int, 4-byte int, 2-byte int per record
	for {
		// io.ReadFull fills the whole buffer, so a short read cannot leave stale bytes behind
		_, err := io.ReadFull(fp, lineBuf)
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err) // includes io.ErrUnexpectedEOF for a truncated record
		}
		aVal := int32(binary.LittleEndian.Uint32(lineBuf[0:4])) // same as: int32(uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24)
		bVal := int32(binary.LittleEndian.Uint32(lineBuf[4:8]))
		cVal := int16(binary.LittleEndian.Uint16(lineBuf[8:10])) // same as: int16(uint16(b[0]) | uint16(b[1])<<8)
		fmt.Println(aVal, bVal, cVal)
	}
}
Answer 1
Score: 5
A portable and fairly easy way to handle the problem is Google's Protocol Buffers. Though this is too late now since you got it working, I put some effort into explaining and coding it, so I am posting it anyway.
You can find the code at https://github.com/mwmahlberg/ProtoBufDemo
You need to install Protocol Buffers for Python using your preferred method (pip, OS package management, source), and likewise for Go.
The .proto file
The .proto file is rather simple for our example. I called it data.proto:
syntax = "proto2";
package main;

message Demo {
    required uint32 A = 1;
    required uint32 B = 2;
    // A shortcoming: there are no 16-bit ints.
    // The applications have to enforce the 16-bit range themselves.
    required uint32 C = 3;
}
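Since proto2 offers no 16-bit integer type, the range check mentioned in the comment has to live in application code. A minimal sketch of such a check in Go follows; the helper name toUint16 is hypothetical and is not part of the protobuf library or of the linked repository.

```go
package main

import (
	"fmt"
	"math"
)

// toUint16 is a hypothetical helper: it narrows a uint32 taken from a
// protobuf field to uint16 and returns an error if the value does not fit.
func toUint16(v uint32) (uint16, error) {
	if v > math.MaxUint16 {
		return 0, fmt.Errorf("value %d does not fit in 16 bits", v)
	}
	return uint16(v), nil
}

func main() {
	for _, v := range []uint32{2015, 70000} {
		c, err := toUint16(v)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", c)
	}
}
```

The same check would be applied on the writing side before assigning to the field, so that both programs agree on the valid range.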
Now you need to call protoc on the file and have it generate the code for both Python and Go:
protoc --go_out=. --python_out=. data.proto
which generates the files data_pb2.py and data.pb.go. Those files provide the language-specific access to the protocol buffer data.
When using the code from GitHub, all you need to do is to issue
go generate
in the source directory.
The Python code
import data_pb2

def main():
    # We create an instance of the message type "Demo"...
    data = data_pb2.Demo()
    # ...and fill it with data
    data.A = 5
    data.B = 5
    data.C = 2015

    print("* Python writing to file")
    # Note that "data.SerializeToString()" counterintuitively
    # returns binary data (bytes)
    with open('tst.bin', 'wb') as f:
        f.write(data.SerializeToString())

    with open('tst.bin', 'rb') as f:
        read = data_pb2.Demo()
        read.ParseFromString(f.read())

    print("* Python reading from file")
    print("\tDemo.A: %d, Demo.B: %d, Demo.C: %d" % (read.A, read.B, read.C))

if __name__ == '__main__':
    main()
We import the file generated by protoc and use it. Not much magic here.
The Go File
package main

//go:generate protoc --python_out=. data.proto
//go:generate protoc --go_out=. data.proto

import (
	"fmt"
	"os"

	"github.com/golang/protobuf/proto"
)

func main() {
	// Note that we do not handle any errors for the sake of brevity
	d := Demo{}
	f, _ := os.Open("tst.bin")
	defer f.Close()
	fi, _ := f.Stat()
	// We create a buffer which is big enough to hold the entire message
	b := make([]byte, fi.Size())
	f.Read(b)
	proto.Unmarshal(b, &d)
	fmt.Println("* Go reading from file")
	// Note the explicit pointer dereference: with proto2, the generated struct fields are pointers
	fmt.Printf("\tDemo.A: %d, Demo.B: %d, Demo.C: %d\n", *d.A, *d.B, *d.C)
}
Note that we do not need to import the generated code explicitly, because the package declared in data.proto is main.
The result
After generating the required files and compiling the source, when you issue
$ python writer.py && ./ProtoBufDemo
the result is
* Python writing to file
* Python reading from file
	Demo.A: 5, Demo.B: 5, Demo.C: 2015
* Go reading from file
	Demo.A: 5, Demo.B: 5, Demo.C: 2015
Note that the Makefile in the repository offers a shortcut for generating the code, compiling the .go files and running both programs:
make run
Answer 2
Score: 4
The Python format string is iih, meaning two 32-bit signed integers and one 16-bit signed integer (see the docs). You can simply use your first example but change the struct to:
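package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// binData mirrors the Python format "iih"; the fields must be exported
// so that binary.Read can fill them.
type binData struct {
	A int32
	B int32
	C int16
}

func main() {
	fp, err := os.Open("tst.bin")
	if err != nil {
		panic(err)
	}
	defer fp.Close()

	for {
		thing := binData{}
		err := binary.Read(fp, binary.LittleEndian, &thing)
		if err == io.EOF {
			break
		}
		if err != nil {
			// e.g. io.ErrUnexpectedEOF if the file ends in the middle of a record
			panic(err)
		}
		fmt.Println(thing.A, thing.B, thing.C)
	}
}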
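Note that the Python packing didn't specify the endianness explicitly, so struct used the native byte order and alignment of the machine that ran it. If you're sure that system was little-endian, this should work. Writing the format as '<iih' on the Python side would make the little-endian layout explicit.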
Edit: Added main() function to explain what I mean.
Edit 2: Capitalized struct fields so binary.Read could write into them.
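To illustrate the byte-order caveat, here is a small sketch that decodes the same ten bytes once as little-endian and once as big-endian. The byte slice is written out by hand to match what a little-endian machine would produce for the first record; the Record type and the decode helper are names made up for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Record mirrors the Python format "iih": two int32 values and one int16.
type Record struct {
	A int32
	B int32
	C int16
}

// decode reads one 10-byte record from b. It uses big-endian byte order
// when bigEndian is true and little-endian otherwise.
func decode(b []byte, bigEndian bool) (Record, error) {
	var order binary.ByteOrder = binary.LittleEndian
	if bigEndian {
		order = binary.BigEndian
	}
	var r Record
	err := binary.Read(bytes.NewReader(b), order, &r)
	return r, err
}

func main() {
	// Hand-written little-endian encoding of the values 4, 185765, 1020.
	raw := []byte{0x04, 0, 0, 0, 0xA5, 0xD5, 0x02, 0, 0xFC, 0x03}

	le, err := decode(raw, false)
	if err != nil {
		panic(err)
	}
	be, err := decode(raw, true)
	if err != nil {
		panic(err)
	}
	fmt.Println("little-endian:", le.A, le.B, le.C)
	fmt.Println("big-endian:   ", be.A, be.B, be.C)
}
```

Decoding with the wrong byte order does not fail; it silently yields different numbers, which is why it helps to pin the byte order on the writing side.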
Answer 3
Score: 1
As I mentioned in my post, I'm not sure this is THE idiomatic way to do this in Go, but this is the solution that I came up with after a fair bit of tinkering and adapting several different examples. Note again that this unpacks the 4-byte and 2-byte ints into Go int32 and int16 respectively. I'm posting it so that there is a valid answer in case someone comes looking. Hopefully someone will post a more idiomatic way of accomplishing this, but for now, this works.
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

func main() {
	fp, err := os.Open("tst.bin")
	if err != nil {
		panic(err)
	}
	defer fp.Close()

	lineBuf := make([]byte, 10) // 4-byte int, 4-byte int, 2-byte int per record
	for {
		// io.ReadFull fills the whole buffer, so a short read cannot leave stale bytes behind
		_, err := io.ReadFull(fp, lineBuf)
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err) // includes io.ErrUnexpectedEOF for a truncated record
		}
		aVal := int32(binary.LittleEndian.Uint32(lineBuf[0:4])) // same as: int32(uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24)
		bVal := int32(binary.LittleEndian.Uint32(lineBuf[4:8]))
		cVal := int16(binary.LittleEndian.Uint16(lineBuf[8:10])) // same as: int16(uint16(b[0]) | uint16(b[1])<<8)
		fmt.Println(aVal, bVal, cVal)
	}
}
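A variation on the fixed-buffer approach, as a sketch only: the loop can be moved into a function that collects the records and reports a truncated trailing record instead of silently decoding stale bytes. The names record, recordSize and parseRecords are made up for this example, and a bytes.Reader stands in for the opened file so the snippet runs without tst.bin.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const recordSize = 10 // 4-byte int + 4-byte int + 2-byte int

// record holds one decoded "iih" entry.
type record struct {
	A, B int32
	C    int16
}

// parseRecords decodes every complete little-endian record in data and
// returns an error if the input ends in the middle of a record.
func parseRecords(data []byte) ([]record, error) {
	r := bytes.NewReader(data) // stands in for the opened file
	buf := make([]byte, recordSize)
	var out []record
	for {
		_, err := io.ReadFull(r, buf)
		if err == io.EOF {
			return out, nil // clean end of input
		}
		if errors.Is(err, io.ErrUnexpectedEOF) {
			return out, fmt.Errorf("truncated record after %d complete records", len(out))
		}
		if err != nil {
			return out, err
		}
		out = append(out, record{
			A: int32(binary.LittleEndian.Uint32(buf[0:4])),
			B: int32(binary.LittleEndian.Uint32(buf[4:8])),
			C: int16(binary.LittleEndian.Uint16(buf[8:10])),
		})
	}
}

func main() {
	// Hand-written little-endian encoding of the two records from the question.
	data := []byte{
		0x04, 0, 0, 0, 0xA5, 0xD5, 0x02, 0, 0xFC, 0x03, // 4, 185765, 1020
		0x04, 0, 0, 0, 0xA5, 0xD5, 0x02, 0, 0xFE, 0x03, // 4, 185765, 1022
	}
	recs, err := parseRecords(data)
	if err != nil {
		panic(err)
	}
	for _, rec := range recs {
		fmt.Println(rec.A, rec.B, rec.C)
	}
}
```

io.ReadFull returns io.EOF only when no bytes at all were read, and io.ErrUnexpectedEOF when the input stops partway through a record, which is what lets the function tell the two cases apart.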
Answer 4
Score: 0
Try the binpacker library.
Example:
Example data:
buffer := new(bytes.Buffer)
packer := binpacker.NewPacker(buffer)
unpacker := binpacker.NewUnpacker(buffer)
packer.PushByte(0x01)
packer.PushUint16(math.MaxUint16)
Unpack:
var val1 byte
var val2 uint16
var err error
val1, err = unpacker.ShiftByte()
val2, err = unpacker.ShiftUint16()
Or:
var val1 byte
var val2 uint16
var err error
unpacker.FetchByte(&val1).FetchUint16(&val2)
unpacker.Error() // Make sure error is nil
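For comparison, roughly the same round trip (one byte followed by one uint16) can be sketched with only the standard library's encoding/binary package. This is not the binpacker API; roundTrip is a name made up for this example, and little-endian byte order is an arbitrary choice here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

// roundTrip packs a byte and a uint16 into a buffer and unpacks them again,
// checking every error along the way.
func roundTrip(b byte, u uint16) (byte, uint16, error) {
	buf := new(bytes.Buffer)
	if err := binary.Write(buf, binary.LittleEndian, b); err != nil {
		return 0, 0, err
	}
	if err := binary.Write(buf, binary.LittleEndian, u); err != nil {
		return 0, 0, err
	}

	var gotB byte
	var gotU uint16
	if err := binary.Read(buf, binary.LittleEndian, &gotB); err != nil {
		return 0, 0, err
	}
	if err := binary.Read(buf, binary.LittleEndian, &gotU); err != nil {
		return 0, 0, err
	}
	return gotB, gotU, nil
}

func main() {
	b, u, err := roundTrip(0x01, math.MaxUint16)
	if err != nil {
		panic(err)
	}
	fmt.Println(b, u)
}
```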