golang/python zlib difference

Question

I'm debugging differences between Python's zlib and Golang's zlib. Why don't the following two programs produce the same output?

compress.go:

    package main

    import (
        "bytes"
        "compress/flate"
        "fmt"
    )

    func compress(source string) []byte {
        w, _ := flate.NewWriter(nil, 7)
        buf := new(bytes.Buffer)
        w.Reset(buf)
        w.Write([]byte(source))
        w.Close()
        return buf.Bytes()
    }

    func main() {
        example := "foo"
        compressed := compress(example)
        fmt.Println(compressed)
    }

compress.py:

    from __future__ import print_function
    import zlib

    def compress(source):
        # golang zlib strips header + checksum
        compressor = zlib.compressobj(7, zlib.DEFLATED, -15)
        compressor.compress(source)
        # python zlib defaults to Z_FINISH, but
        # https://golang.org/pkg/compress/flate/#Writer.Flush
        # says "Flush is equivalent to Z_SYNC_FLUSH"
        return compressor.flush(zlib.Z_SYNC_FLUSH)

    def main():
        example = u"foo"
        compressed = compress(example)
        print(list(bytearray(compressed)))

    if __name__ == "__main__":
        main()

Results

    $ go version
    go version go1.7.3 darwin/amd64
    $ go build compress.go
    $ ./compress
    [74 203 207 7 4 0 0 255 255]
    $ python --version
    Python 2.7.12
    $ python compress.py
    [74, 203, 207, 7, 0, 0, 0, 255, 255]

The Python version has 0 for the fifth byte, but the golang version has 4 -- what's causing the different output?

Answer 1

Score: 4

The output from the Python example isn't a "complete" stream; it's just flushing the buffer after compressing the first string. You can get the same output from the Go code by replacing Close() with Flush():

https://play.golang.org/p/BMcjTln-ej

    func compress(source string) []byte {
        buf := new(bytes.Buffer)
        w, _ := flate.NewWriter(buf, 7)
        w.Write([]byte(source))
        w.Flush()
        return buf.Bytes()
    }
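
To make the Close() versus Flush() difference concrete, here is a minimal Python 3 sketch that works only from the two byte sequences printed in the question. It assumes the standard zlib module; the helper name inflate_raw is invented for illustration and is not part of either library.

```python
import zlib

# Byte sequences copied from the question's output.
go_close = bytes([74, 203, 207, 7, 4, 0, 0, 255, 255])  # Go, after Close()
py_sync = bytes([74, 203, 207, 7, 0, 0, 0, 255, 255])   # Python, Z_SYNC_FLUSH

# Indices at which the two outputs differ.
diff = [i for i, (a, b) in enumerate(zip(go_close, py_sync)) if a != b]

# XOR of the fifth bytes isolates the bit(s) that changed.
changed_bits = go_close[4] ^ py_sync[4]


def inflate_raw(data):
    """Decode raw DEFLATE data; return (payload, reached_end_of_stream)."""
    d = zlib.decompressobj(-15)  # negative wbits: no zlib header or checksum
    return d.decompress(data), d.eof


go_payload, go_done = inflate_raw(go_close)
py_payload, py_done = inflate_raw(py_sync)

print(diff, bin(changed_bits))
print(go_payload, go_done)
print(py_payload, py_done)
```

Both sequences decode to the same payload and differ in a single bit of the fifth byte. Only the Go output is reported as having reached the end of the stream, which matches the point above: the trailing 0 0 255 255 is an empty stored block in both cases, and Close() marks that block as the final one while a sync flush does not.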

However, you are comparing output from zlib in Python, which uses DEFLATE internally and by default produces zlib-format output, with flate in Go, which is a plain DEFLATE implementation. I don't know if you can get the Python zlib library to output the raw, complete DEFLATE stream, but trying to get different libraries to produce byte-for-byte identical compressed data doesn't seem useful or maintainable. The output of compression libraries is only guaranteed to be compatible, not identical.
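
On the question of a raw, complete stream from Python: a negative wbits value selects raw DEFLATE, and calling flush() with no argument uses Z_FINISH, which terminates the stream. The following Python 3 sketch illustrates this and the "compatible, not identical" point. The helper name deflate_raw is hypothetical, and the exact bytes it produces may vary with the zlib build in use, so the sketch only checks round trips.

```python
import zlib


def deflate_raw(data, level=7):
    """Compress to a complete raw DEFLATE stream (no zlib header or checksum)."""
    c = zlib.compressobj(level, zlib.DEFLATED, -15)
    # flush() without an argument uses Z_FINISH, which ends the stream.
    return c.compress(data) + c.flush()


# Go's flate output after Close(), copied from the question.
go_close = bytes([74, 203, 207, 7, 4, 0, 0, 255, 255])

py_complete = deflate_raw(b"foo")

# Both streams are complete, so the one-shot decompressor accepts each of them.
go_roundtrip = zlib.decompress(go_close, -15)
py_roundtrip = zlib.decompress(py_complete, -15)

print(list(py_complete))
print(go_roundtrip == py_roundtrip == b"foo")
```

Comparing decompressed payloads, as done here, is the more robust cross-language check: two encoders are free to choose different block layouts for the same input.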

huangapple
  • Posted on December 1, 2016, 00:29:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/40893411.html