Protobuf serialization length differs when using different values for field

Question


My Protobuf message consists of three doubles:

syntax = "proto3";

message TestMessage{
  double input = 1;
  double output = 2;
  double info = 3;
}

When I set these values to

test.set_input(2.3456);
test.set_output(5.4321);
test.set_info(5.0);

the message serialized with test.SerializeToArray looks like this:

00000000  09 16 fb cb ee c9 c3 02  40 11 0a 68 22 6c 78 ba  |........@..h"lx.|
00000010  15 40 19                                          |.@.|
00000013

It could not be deserialized successfully by a Go program using the same Protobuf message. When I tried to read it from a C++ program, I got 0 for info, so the message seems to be corrupted.

When using test.SerializeToOstream I got this message, which both the Go and C++ programs deserialized successfully:

00000000  09 16 fb cb ee c9 c3 02  40 11 0a 68 22 6c 78 ba  |........@..h"lx.|
00000010  15 40 19 00 00 00 00 00  00 14 40               |.@........@|
0000001b

When setting the values to

test.set_input(2.3456);
test.set_output(5.4321);
test.set_info(5.5678);

the serialized messages produced by test.SerializeToArray and test.SerializeToOstream are identical:

00000000  09 16 fb cb ee c9 c3 02  40 11 0a 68 22 6c 78 ba  |........@..h"lx.|
00000010  15 40 19 da ac fa 5c 6d  45 16 40                 |.@....\mE.@|
0000001b

Both my Go and C++ programs could read them successfully.

What am I missing here? Why is SerializeToArray not working in the first case?

EDIT:
As it turns out, SerializeToString works fine, too.

Here is the code I used for the comparison:

file_a.open(FILEPATH_A);
file_b.open(FILEPATH_B);

test.set_input(2.3456);
test.set_output(5.4321);
test.set_info(5.0);

//serializeToArray
int size = test.ByteSize();
char *buffer = (char*) malloc(size);
test.SerializeToArray(buffer, size);
file_a << buffer;

//serializeToString
std::string buf;
test.SerializeToString(&buf);
file_b << buf;

file_a.close();
file_b.close();

Why does SerializeToArray not work as expected?

EDIT2:

When using file_b << buf.data() instead of file_b << buf, the data gets corrupted as well, but why?

Answer 1

Score: 2


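I think the mistake is treating binary data as character data and using character-data APIs. Many of those APIs stop at the first null byte (0x00), but that is a perfectly valid value in Protobuf's binary format. In your code, file_a << buffer treats buffer as a C string, and file_b << buf.data() in EDIT2 does the same, so both stop writing at the first zero byte.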

You need to avoid any such APIs and stick purely to binary-safe ones, for example file_a.write(buffer, size), which takes an explicit length.

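Since you indicate that size is 27, this all fits: the truncated file is 19 bytes (0x13) long and ends right before the first zero byte of the info field.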

Basically, the binary representation of 5.0 contains zero bytes, but you could easily have hit the same problem with other values sooner or later.

Posted by huangapple on June 28, 2017 at 23:08:31
When reposting, please keep the link to this article: https://go.coder-hub.com/44806362.html