Protobuf marshaling of repeated empty bytes in Go

Question

I have a simple protobuf message

//test.proto
syntax = "proto3";
package Bar;

option go_package = "/Bar";

message Foo {
	repeated bytes Foo = 1;
}

And the main program just takes an empty Foo message and marshals/unmarshals it:

package main

import (
	"fmt"

	"github.com/potuz/test/Bar"

	"google.golang.org/protobuf/proto"
)

func main() {
	msg := Bar.Foo{
		Foo: make([][]byte, 0),
	}
	buf, _ := proto.Marshal(&msg)

	newMsg := &Bar.Foo{}
	_ = proto.Unmarshal(buf, newMsg)

	if msg.Foo != nil {
		fmt.Println("msg.Foo is not nil")
	}

	if newMsg.Foo == nil {
		fmt.Println("newMsg.Foo is nil")
	}
}

The output of this program is

$ ./main
msg.Foo is not nil
newMsg.Foo is nil

Is there a clean way around this? I need to implement a gRPC server/client whose messages contain a repeated bytes field. When the server responds with a non-nil but empty field, Foo = [][]byte{}, the client receives nil, and I have to handle that case separately.

Answer 1

Score: 1

tl;dr: add a bool flag to the proto message to record whether the source slice was empty or nil.

message Foo {
    ByteSlice slice = 1;
}

message ByteSlice {
    bool is_empty = 1;
    repeated bytes payload = 2;
}


For the purposes of protobuf serialization, a nil slice and a zero-length slice are equivalent. On the wire, each element of a repeated bytes field is written as its own "Length-delimited" record, so a field with zero elements writes nothing at all. And in Go, both a nil slice and an empty slice have len(b) == 0.

The program below marshals three sample messages, and each one encodes to zero bytes:

proto

syntax = "proto3";

package test;

option go_package = ".;pb";

message Foo {
  repeated string a = 1;
}

message Bar {
  repeated bytes b = 1;
}

main

package main

import (
	"example.com/pb"
	"fmt"
	"google.golang.org/protobuf/proto"
)

func main() {
	foo := &pb.Foo{A: make([]string, 0)} // empty string slice
	foobytes, _ := proto.Marshal(foo)
	fmt.Printf("%v\n", foobytes)

	bar1 := &pb.Bar{B: nil} // nil 2D byte slice
	bar1bytes, _ := proto.Marshal(bar1)
	fmt.Printf("%v\n", bar1bytes)

	bar2 := &pb.Bar{B: make([][]byte, 0)} // empty 2D byte slice
	bar2bytes, _ := proto.Marshal(bar2)
	fmt.Printf("%v\n", bar2bytes)
}

output (all empty):

[]
[]
[]

When such a message is deserialized, the field (repeated or not) is absent from the wire data, so the struct field keeps its Go zero value, which for a slice is nil.

Protobuf knows nothing about how the slice was represented in the source language, whether it was allocated and empty or just nil. The difference is specific to Go, not to protobuf.
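
As a small illustration of that Go-specific difference, the sketch below compares a nil and an empty [][]byte and shows one way a receiver could normalize a nil field to an empty slice when it does not need to tell the two apart. The helper name orEmpty is hypothetical and is not part of any protobuf API.

```go
package main

import "fmt"

// orEmpty is a hypothetical helper: it returns a non-nil, zero-length slice
// when s is nil, and returns s unchanged otherwise.
func orEmpty(s [][]byte) [][]byte {
	if s == nil {
		return [][]byte{}
	}
	return s
}

func main() {
	var nilSlice [][]byte
	emptySlice := make([][]byte, 0)

	// Both have length zero; only the nil check tells them apart.
	fmt.Println(nilSlice == nil, len(nilSlice))
	fmt.Println(emptySlice == nil, len(emptySlice))

	// After normalization the receiver always sees a non-nil slice.
	fmt.Println(orEmpty(nilSlice) == nil)
}
```

This only helps when "nil" and "empty" can be treated as the same thing on the receiving side; it cannot recover which of the two the sender actually had.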

If you need to preserve this distinction, you can wrap the repeated field in a message that also carries a boolean flag:

message Foo {
    ByteSlice slice = 1;
}

message ByteSlice {
    bool is_empty = 1;
    repeated bytes payload = 2;
}

The payload will still unmarshal to a nil slice, but you'll be able to detect the empty state by checking the bool flag:

    newMsg := &Bar.Foo{}
    _ = proto.Unmarshal(buf, newMsg) // error ignored for brevity
    // The generated getters are nil-safe, so this works even if Slice is unset.
    if newMsg.GetSlice().GetIsEmpty() {
        // restore the original state...
        newMsg.Slice.Payload = make([][]byte, 0)
        // ...or do something else knowing that the slice was empty
    }

Remember that len(newMsg.GetSlice().GetPayload()) will be 0 either way.
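
The sender has to set the flag for this to work. A minimal sketch of both sides follows, assuming a wrapper message with a bool is_empty field and a repeated bytes payload field. The ByteSlice struct here is a hand-written stand-in for the generated type, and wrap and unwrap are hypothetical helper names.

```go
package main

import "fmt"

// ByteSlice is a hand-written stand-in for the struct that would be generated
// from a message with `bool is_empty = 1; repeated bytes payload = 2;`.
type ByteSlice struct {
	IsEmpty bool
	Payload [][]byte
}

// wrap (hypothetical) records on the sending side whether src was
// allocated but empty, as opposed to nil.
func wrap(src [][]byte) *ByteSlice {
	return &ByteSlice{IsEmpty: src != nil && len(src) == 0, Payload: src}
}

// unwrap (hypothetical) restores the nil/empty distinction on the receiving side.
func unwrap(bs *ByteSlice) [][]byte {
	if bs == nil {
		return nil
	}
	if bs.IsEmpty && len(bs.Payload) == 0 {
		return [][]byte{}
	}
	return bs.Payload
}

func main() {
	// Simulate a round trip: the payload comes back nil, the flag survives.
	fromEmpty := &ByteSlice{IsEmpty: wrap([][]byte{}).IsEmpty, Payload: nil}
	fromNil := &ByteSlice{IsEmpty: wrap(nil).IsEmpty, Payload: nil}

	fmt.Println(unwrap(fromEmpty) != nil) // the empty slice is restored
	fmt.Println(unwrap(fromNil) == nil)   // nil stays nil
}
```

Keeping the flag logic in two small helpers means the rest of the client and server code can keep working with plain [][]byte values.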


NOTE: if what you have is a [][]byte that contains empty byte slices, the outer slice has a non-zero length, so every element is encoded and the field marshals/unmarshals as expected:

	bar2 := &pb.Bar{
		B: [][]byte{
			make([]byte, 0), // or nil
			make([]byte, 0), // or nil
			make([]byte, 0), // or nil
		},
	}
	bar2bytes, _ := proto.Marshal(bar2)
	fmt.Printf("%v\n", bar2bytes)

output

[10 0 10 0 10 0]

huangapple
  • Published on 2021-11-13 09:59:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/69950874.html