protobuf marshaling of repeated empty bytes in golang
Question
I have a simple protobuf message:
    // test.proto
    syntax = "proto3";
    package Bar;
    option go_package = "/Bar";
    message Foo {
        repeated bytes Foo = 1;
    }
The main program just takes an empty Foo message and marshals/unmarshals it:
    package main
    import (
        "fmt"
        "github.com/potuz/test/Bar"
        "google.golang.org/protobuf/proto"
    )
    func main() {
        msg := Bar.Foo{
            Foo: make([][]byte, 0),
        }
        buf, _ := proto.Marshal(&msg)
        newMsg := &Bar.Foo{}
        _ = proto.Unmarshal(buf, newMsg)
        if msg.Foo != nil {
            fmt.Println("msg.Foo is not nil")
        }
        if newMsg.Foo == nil {
            fmt.Println("newMsg.Foo is nil")
        }
    }
The output of this program is:
    $ ./main
    msg.Foo is not nil
    newMsg.Foo is nil
Is there a clean way around this? I need to implement a gRPC server/client that exchanges a message containing a repeated bytes field. When the server responds with a non-nil but empty field, Foo = [][]byte{}, the client gets nil, and I need to handle this particular case separately.
Answer 1
Score: 1
tl;dr: add a bool flag to the proto message to record whether the source slice was empty or nil.
    message Foo {
        ByteSlice slice = 1;
    }
    message ByteSlice {
        bool is_empty = 1;
        repeated bytes payload = 2;
    }
<hr>
For the purposes of protobuf serialization, a nil slice and a zero-length slice mean the same thing. On the wire, a repeated bytes field is written as one length-delimited record per element, so a field with zero elements serializes to nothing. And in Go, both a nil slice and an empty slice have len(b) == 0.
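To make the wire-format point concrete, here is a minimal sketch of how a repeated bytes field could be encoded by hand. The function encodeRepeatedBytes is a hypothetical helper written for this illustration, not part of the protobuf library; it uses only the standard library (binary.AppendUvarint needs Go 1.19 or later).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeRepeatedBytes is a hypothetical, hand-rolled encoder for a
// `repeated bytes` field. Each element becomes one record:
// a tag varint, a length varint, then the raw bytes.
func encodeRepeatedBytes(fieldNum uint64, elems [][]byte) []byte {
	const wireTypeLen = 2 // the length-delimited wire type
	var out []byte
	for _, e := range elems {
		out = binary.AppendUvarint(out, fieldNum<<3|wireTypeLen)
		out = binary.AppendUvarint(out, uint64(len(e)))
		out = append(out, e...)
	}
	return out
}

func main() {
	// Zero elements means zero records, whether the slice is nil or allocated.
	fmt.Println(len(encodeRepeatedBytes(1, nil)))               // 0
	fmt.Println(len(encodeRepeatedBytes(1, make([][]byte, 0)))) // 0

	// A single empty element still produces a record: tag 10, length 0.
	fmt.Println(encodeRepeatedBytes(1, [][]byte{{}})) // [10 0]
}
```

Because the loop body never runs for a slice with no elements, the encoder has nowhere to record whether the slice was nil or merely empty.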
The program below marshals three sample messages, and each one serializes to zero bytes:
proto
    syntax = "proto3";
    package test;
    option go_package = ".;pb";
    message Foo {
        repeated string a = 1;
    }
    message Bar {
        repeated bytes b = 1;
    }
main
    package main
    import (
        "example.com/pb"
        "fmt"
        "google.golang.org/protobuf/proto"
    )
    func main() {
        foo := &pb.Foo{A: make([]string, 0)} // empty string slice
        foobytes, _ := proto.Marshal(foo)
        fmt.Printf("%v\n", foobytes)
        bar1 := &pb.Bar{B: nil} // nil 2D byte slice
        bar1bytes, _ := proto.Marshal(bar1)
        fmt.Printf("%v\n", bar1bytes)
        bar2 := &pb.Bar{B: make([][]byte, 0)} // empty 2D byte slice
        bar2bytes, _ := proto.Marshal(bar2)
        fmt.Printf("%v\n", bar2bytes)
    }
output (all empty):
[]
[]
[]
When the message is then deserialized into a struct, the bytes field (repeated or not) is missing from the wire data, so it is left at the Go zero value for a slice, which is nil.
Protobuf knows nothing of how the slice is represented in the source language, whether it was allocated and empty or just nil. The difference is specific to Go, not to protobuf.
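The receiving side can be sketched the same way. The decoder below is a simplified, hypothetical stand-in for what an unmarshaler does with a repeated field, assuming it appends one element per record it reads; it is not the library's actual code and handles only length-delimited records.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// decodeRepeatedBytes is a hypothetical decoder that appends one element
// per length-delimited record found in buf.
func decodeRepeatedBytes(buf []byte) ([][]byte, error) {
	var out [][]byte // stays nil unless at least one record is read
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 || tag&7 != 2 {
			return nil, errors.New("expected a length-delimited record")
		}
		buf = buf[n:]
		size, n := binary.Uvarint(buf)
		if n <= 0 || size > uint64(len(buf)-n) {
			return nil, errors.New("truncated record")
		}
		buf = buf[n:]
		// Copy the element so it does not alias the input buffer.
		out = append(out, append([]byte{}, buf[:size]...))
		buf = buf[size:]
	}
	return out, nil
}

func main() {
	empty, _ := decodeRepeatedBytes(nil)
	fmt.Println(empty == nil) // true: no records, so nothing was appended

	three, _ := decodeRepeatedBytes([]byte{10, 0, 10, 0, 10, 0})
	fmt.Println(len(three)) // 3
}
```

With zero bytes of input the loop never runs, so the result is the zero value of the slice, which matches the nil that the client in the question observes.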
If you need to preserve this distinction, you can add a boolean field to the message:
    message Foo {
        ByteSlice slice = 1;
    }
    message ByteSlice {
        bool is_empty = 1;
        repeated bytes payload = 2;
    }
The payload will still unmarshal to a nil slice, but you'll be able to detect the empty state by checking the bool flag:
    newMsg := &Bar.Foo{}
    _ = proto.Unmarshal(buf, newMsg) // error handling omitted for brevity
    // The generated getters are nil-safe, so this works even if Slice is unset.
    if newMsg.GetSlice().GetIsEmpty() {
        // restore the original state...
        newMsg.Slice.Payload = make([][]byte, 0)
        // ...or do something else knowing that the slice was empty
    }
Either way, remember that len(newMsg.GetSlice().GetPayload()) will be 0.
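The flag has to be set by the sender and interpreted by the receiver. The sketch below shows one way to do that, using a hand-written ByteSlice struct as a stand-in for the generated type and assuming payload is declared as repeated bytes so that Payload is a [][]byte. The helpers wrap, unwrap, and roundTrip are hypothetical names introduced here; roundTrip only simulates the wire's effect on an empty slice.

```go
package main

import "fmt"

// ByteSlice is a hand-written stand-in for the struct that would be
// generated from the ByteSlice message (an assumption for this sketch).
type ByteSlice struct {
	IsEmpty bool
	Payload [][]byte
}

// wrap records, on the sending side, whether src was allocated but empty.
func wrap(src [][]byte) *ByteSlice {
	return &ByteSlice{IsEmpty: src != nil && len(src) == 0, Payload: src}
}

// unwrap restores the empty, non-nil slice on the receiving side.
func unwrap(bs *ByteSlice) [][]byte {
	if bs == nil {
		return nil
	}
	if bs.IsEmpty && bs.Payload == nil {
		return [][]byte{}
	}
	return bs.Payload
}

// roundTrip simulates what marshal/unmarshal does to the payload:
// a slice with zero elements comes back as nil, the flag survives.
func roundTrip(bs *ByteSlice) *ByteSlice {
	out := &ByteSlice{IsEmpty: bs.IsEmpty}
	if len(bs.Payload) > 0 {
		out.Payload = bs.Payload
	}
	return out
}

func main() {
	fromEmpty := unwrap(roundTrip(wrap(make([][]byte, 0))))
	fmt.Println(fromEmpty != nil, len(fromEmpty)) // true 0

	fromNil := unwrap(roundTrip(wrap(nil)))
	fmt.Println(fromNil == nil) // true
}
```

Keeping the flag logic in a pair of small helpers means the rest of the client and server code can keep working with plain [][]byte values.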
<hr>
NOTE: if what you have is a [][]byte that contains empty byte slices, its length is not zero, so every element is written to the wire and the slice survives marshal/unmarshal as non-nil:
    bar2 := &pb.Bar{
        B: [][]byte{
            make([]byte, 0), // or nil
            make([]byte, 0), // or nil
            make([]byte, 0), // or nil
        },
    }
    bar2bytes, _ := proto.Marshal(bar2)
    fmt.Printf("%v\n", bar2bytes)
output:
    [10 0 10 0 10 0]