Why is the Kafka Avro serializer faster than JSON/String?
Question
How is it possible that the Avro serializer is faster than the JSON (or String) serializers for objects/strings?
For example, if I am right, the String serializer calls javaObject.toString() and then converts the string to byte[]. Avro has the same JSON-like format, so does it create a similar string and then convert it to byte[]?
Is it the same if you send an already prepared String?
I expect the Avro serializer to be slightly better for objects and the same for strings. But everyone says Avro is much better for objects. I'm reading the Apache Kafka book by Todd Palino, but there is zero info about it.
Answer 1
Score: 1
> Avro has the same JSON-like format
It does not. Avro's binary encoding is not the same as its schema definition, which is written in JSON.
Avro is not converted to a string at any point, and the result is more compact than JSON (no quotes, colons, spaces, brackets, etc.). The Avro API uses a `BinaryEncoder` and a `ByteBuffer` object to build the `byte[]`.
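To make the size difference concrete, here is a rough sketch that does not use the Avro library at all. It hand-rolls the encoding rules from the Avro specification (zigzag variable-length integers, length-prefixed UTF-8 strings, record fields written back to back with no field names) and compares the result with a JSON string for the same data. The `AvroLikeSizeDemo` class, the `name`/`age` record, and the helper methods are made up for illustration; a real producer would go through Avro's own encoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustration only: a hand-written imitation of Avro's binary layout,
// not the Avro library and not what a Kafka serializer literally runs.
final class AvroLikeSizeDemo {

    // Zigzag + variable-length encoding, the scheme Avro uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80)); // 7 payload bits, continuation bit set
            zigzag >>>= 7;
        }
        out.write((int) zigzag); // last byte, continuation bit clear
    }

    // A string is its UTF-8 byte length (encoded as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Hypothetical record {name: string, age: int}: field values in schema
    // order, with no field names or delimiters in the payload.
    static byte[] encodeUser(String name, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeLong(out, age);
        return out.toByteArray();
    }

    // The same data as JSON text (naive: no escaping of special characters).
    static byte[] encodeUserAsJson(String name, int age) {
        String json = "{\"name\":\"" + name + "\",\"age\":" + age + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("binary: " + encodeUser("Alice", 30).length + " bytes");       // 7
        System.out.println("json:   " + encodeUserAsJson("Alice", 30).length + " bytes"); // 25
    }
}
```

The binary form carries only the values, because the field names and types live in the schema. That is also why an already prepared String gains little: a lone string is just a length prefix plus the same UTF-8 bytes.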
This post highlights some of the differences between JSON and other formats: https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html
Protobuf, FlatBuffers, Smile (binary JSON), Cap'n Proto, etc. are all more efficient to build/parse as `byte[]` than raw JSON (or any) strings.