How to deserialize a non-standard size field?

Question

I have to deserialize some binary messages coming from another application. I would love to use restruct.io, but some fields in the message structure use a "non-standard" number of bits (5 bits, 3 bits, 10 bits, and so on).

Is there any way to handle this kind of struct? I have been searching for some time without success, so any help would be very welcome.

Thanks in advance.

I will try to give an example to clarify my question. Given the code:

package main

import (
	"encoding/binary"
	"fmt"

	restruct "gopkg.in/restruct.v1"
)

type MessageType uint8

const (
	MessageTypeOne MessageType = iota + 1
	MessageTypeTwo
	MessageTypeThree
)

// Message is the data to deserialize from the binary stream
type Message struct {
	Length     uint32      `struct:"uint32"` // message size in bytes (including length)
	Type       MessageType `struct:"uint8"`
	Version    uint8       `struct:"uint8:4"` // Just need 4 bits
	Subversion uint8       `struct:"uint8:2"` // just need 2 bits
	Optional   uint8       `struct:"uint8:1"` // just one bit --> '1' means next field is NOT present
	NodeName   string      ``
	ANumber    uint16      `struct:"uint16:10"` // just need 10 bits
}

// (length(4)+type(1)+(version(4bits)+Subversion(2bits)+Optional(1bit))) = 6 bytes
// need 32bit alignment
func main() {
	var inStream = []byte{0x08, // just 8 bytes needed
		0x01,       // message type = MessageTypeOne
		0x4a,       // Version=0100 Subversion=10 Optional=1 ANumber = 0 (MSB bit)
		0x00, 0x60, // ANumber(000 0000 011) Padding = 0 0000 for 32 bits alignment
	}
	var msg Message

	err := restruct.Unpack(inStream, binary.BigEndian, &msg)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
	// Expected:
	// msg.Length = 8
	// msg.Type = 1
	// msg.Version = 4
	// msg.Subversion = 2
	// msg.Optional = 1
	// msg.NodeName = ""
	// msg.ANumber = 3
}

I will receive inStream from a TCP connection and want to deserialize the binary data into a Message struct with the expected values.

Hope this clarifies my question.

Thanks again.

Answer 1

Score: 1

While there's probably no generic package to implement this custom struct packing, you can easily create your own method extracting just the bits required for each field.

func (m *Message) UnmarshalBinary(data []byte) error {
	m.Length = binary.BigEndian.Uint32(data[:4])

	if int(m.Length) > len(data) {
		return fmt.Errorf("not enough bytes")
	}

	m.Type = MessageType(data[4])

	m.Version = data[5] >> 4
	m.Subversion = data[5] >> 2 & 0x03
	m.Optional = data[5] >> 1 & 0x01

	// move the index for ANumber back if there's an optional string
	idx := 6
	if m.Optional == 0 {
		// remove the last two bytes for ANumber
		end := int(m.Length) - 2
		m.NodeName = string(data[6:end])
		idx = end
	}

	m.ANumber = uint16(data[idx]&0xc0)<<2 | uint16(data[idx]&0x3f<<2|data[idx+1]>>6)
	return nil
}

You can of course add more bounds checks to return errors rather than letting this panic when indexing out of bounds.

I modified your inStream slice slightly to match your definition, and you can see the example output here: https://play.golang.org/p/FoNoazluOF
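As an illustration of the bounds checks mentioned above, here is a minimal sketch of the same method that validates the input before indexing. It assumes the layout used in this answer: a 4-byte big-endian Length, a 1-byte Type, one packed flags byte whose lowest bit is unused, an optional NodeName, and ANumber in the top 10 bits of the last two bytes. The names headerLen and errShort and the sample bytes in main are made up for this example and are not from the original post.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

type MessageType uint8

type Message struct {
	Length     uint32
	Type       MessageType
	Version    uint8
	Subversion uint8
	Optional   uint8
	NodeName   string
	ANumber    uint16
}

// headerLen is Length (4 bytes) + Type (1 byte) + the packed flags byte.
// Hypothetical name, introduced only for this sketch.
const headerLen = 6

var errShort = errors.New("message too short")

// UnmarshalBinary decodes data, returning an error instead of panicking
// when the input is shorter than the layout requires.
func (m *Message) UnmarshalBinary(data []byte) error {
	if len(data) < headerLen {
		return errShort
	}
	m.Length = binary.BigEndian.Uint32(data[:4])
	// The message must hold the header plus the two bytes carrying ANumber,
	// and must not claim more bytes than were received.
	if m.Length < headerLen+2 || uint64(m.Length) > uint64(len(data)) {
		return fmt.Errorf("bad length %d for %d bytes of input", m.Length, len(data))
	}

	m.Type = MessageType(data[4])
	m.Version = data[5] >> 4
	m.Subversion = data[5] >> 2 & 0x03
	m.Optional = data[5] >> 1 & 0x01

	// Optional == 1 means NodeName is absent and ANumber follows the header.
	idx := headerLen
	m.NodeName = ""
	if m.Optional == 0 {
		end := int(m.Length) - 2 // the last two bytes carry ANumber
		m.NodeName = string(data[headerLen:end])
		idx = end
	}

	// 8 bits from the first byte, then the top 2 bits of the next one.
	m.ANumber = uint16(data[idx])<<2 | uint16(data[idx+1]>>6)
	return nil
}

func main() {
	// Assumed sample input: Length=8, Type=1, Version=4, Subversion=2,
	// Optional=1 (no NodeName), ANumber=3.
	in := []byte{0x00, 0x00, 0x00, 0x08, 0x01, 0x4a, 0x00, 0xc0}
	var msg Message
	if err := msg.UnmarshalBinary(in); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", msg)
}
```

The ANumber expression here is a shorter equivalent of the one above: it takes all 8 bits of the first byte and the top 2 bits of the second.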

Answer 2

Score: 0

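I have been working on a patch for restruct.io so that it can handle bitfields. It is not fully tested yet, but it seems to work.

I will try to send a pull request once it is tested.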

func (e *encoder) writeBits(f field, inBuf []byte) {

	var inputLength uint8 = uint8(len(inBuf))

	if f.BitSize == 0 {
		// Having problems with the complex64 type, so we assume we want to read everything
		//f.BitSize = uint8(f.Type.Bits())
		f.BitSize = 8 * inputLength
	}

	// destPos: Destination position ( in the result ) of the first bit in the first byte
	var destPos uint8 = 8 - e.bitCounter

	// originPos: Original position of the first bit in the first byte
	var originPos uint8 = f.BitSize % 8
	if originPos == 0 {
		originPos = 8
	}

	// numBytes: number of complete bytes to hold the result
	var numBytes uint8 = f.BitSize / 8

	// numBits: number of remaining bits in the first non-complete byte of the result
	var numBits uint8 = f.BitSize % 8

	// number of positions we have to shift the bytes to get the result
	var shift uint8
	if originPos > destPos {
		shift = originPos - destPos
	} else {
		shift = destPos - originPos
	}
	shift = shift % 8

	var inputInitialIdx uint8 = inputLength - numBytes
	if numBits > 0 {
		inputInitialIdx = inputInitialIdx - 1
	}

	if originPos < destPos {
		// shift left
		carry := func(idx uint8) uint8 {
			if (idx + 1) < inputLength {
				return (inBuf[idx+1] >> (8 - shift))
			}
			return 0x00

		}
		mask := func(idx uint8) uint8 {
			if idx == 0 {
				return (0x01 << destPos) - 1
			}
			return 0xFF
		}
		var idx uint8 = 0
		for inIdx := inputInitialIdx; inIdx < inputLength; inIdx++ {
			e.buf[idx] |= ((inBuf[inIdx] << shift) | carry(inIdx)) & mask(idx)
			idx++
		}

	} else {
		// originPos >= destPos => shift right
		var idx uint8 = 0
		// carry : is a little bit tricky in this case because of the first case
		// when idx == 0 and there is no carry at all
		carry := func(idx uint8) uint8 {
			if idx == 0 {
				return 0x00
			}
			return (inBuf[idx-1] << (8 - shift))
		}
		mask := func(idx uint8) uint8 {
			if idx == 0 {
				return (0x01 << destPos) - 1
			}
			return 0xFF
		}
		inIdx := inputInitialIdx
		for ; inIdx < inputLength; inIdx++ {
			//note: Should the mask be done BEFORE the OR with carry?
			e.buf[idx] |= ((inBuf[inIdx] >> shift) | carry(inIdx)) & mask(idx)

			idx++
		}
		if ((e.bitCounter + f.BitSize) % 8) > 0 {
			e.buf[idx] |= carry(inIdx)
		}
	}

	//now we should update buffer and bitCounter
	e.bitCounter = (e.bitCounter + f.BitSize) % 8

	// move the head to the next non-complete byte used
	headerUpdate := func() uint8 {
		if (e.bitCounter == 0) && ((f.BitSize % 8) != 0) {
			return (numBytes + 1)
		}
		return numBytes
	}

	e.buf = e.buf[headerUpdate():]

	return
}
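
The patch above covers the writing side inside restruct. For the reading side that the question asks about, the same bit-counter idea can be sketched as a small standalone reader. This is only an illustration and not part of restruct: bitReader, readBits, align, and the sample bytes are hypothetical, and the sketch assumes fields are packed most significant bit first with no padding between them.

```go
package main

import (
	"errors"
	"fmt"
)

// bitReader reads unsigned integers of arbitrary width, most significant
// bit first. Hypothetical helper for illustration, not a restruct type.
type bitReader struct {
	buf []byte
	pos int // index of the next unread bit, counted from the start of buf
}

var errOutOfBits = errors.New("bitReader: not enough bits")

// readBits returns the next n bits (0 <= n <= 64) as an unsigned integer.
func (r *bitReader) readBits(n int) (uint64, error) {
	if n < 0 || n > 64 || r.pos+n > len(r.buf)*8 {
		return 0, errOutOfBits
	}
	var v uint64
	for i := 0; i < n; i++ {
		b := r.buf[r.pos/8]
		bit := b >> (7 - uint(r.pos%8)) & 1 // pick one bit, MSB first
		v = v<<1 | uint64(bit)
		r.pos++
	}
	return v, nil
}

// align skips any padding bits up to the next byte boundary.
func (r *bitReader) align() {
	r.pos = (r.pos + 7) / 8 * 8
}

func main() {
	// Assumed sample: Version=4 (4 bits), Subversion=2 (2 bits),
	// Optional=1 (1 bit), ANumber=3 (10 bits), then 7 padding bits.
	r := &bitReader{buf: []byte{0x4a, 0x01, 0x80}}
	fields := []struct {
		name  string
		width int
	}{
		{"Version", 4}, {"Subversion", 2}, {"Optional", 1}, {"ANumber", 10},
	}
	for _, f := range fields {
		v, err := r.readBits(f.width)
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Printf("%s = %d\n", f.name, v)
	}
	r.align()
}
```

Reading one bit at a time keeps the sketch easy to check by hand; a production decoder would more likely shift and mask whole bytes, as the patch above does.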

huangapple
  • Posted on 2016-12-12 23:22:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41104049.html