Why do unix-compress and Go's compress/lzw produce different files that the other decoder cannot read?
Question
I compressed a file in a terminal with compress file.txt and got, as expected, file.txt.Z.
When I pass that file to ioutil.ReadFile in Go,
buf0, err := ioutil.ReadFile("file.txt.Z")
I get the following error (the line above is line 116):
finder_test.go:116: lzw: invalid code
I found that Go accepts the file if I compress it with the compress/lzw package instead. I used code from a website (https://www.socketloop.com/references/golang-compress-lzw-newwriter-function-example) that does that, and modified only this line:
outputFile, err := os.Create("file.txt.lzw")
I changed .lzw to .Z, then used the resulting file.txt.Z in the Go code at the top, and it worked with no error.
Note: file.txt is 16.0 kB, the unix-compressed file.txt.Z is 7.8 kB, and the Go-compressed file.txt.Z is 8.2 kB.
To understand why this happened, I tried to run
uncompress.real file.txt.Z
and it did not work. I got
file.txt.Z: not in compressed format
I need to compress files with an LZW compressor (preferably unix-compress) and then feed the same compressed files to two different algorithms, one written in C and the other in Go, because I intend to compare their performance. The C program only accepts files compressed with unix-compress, and the Go program only accepts files compressed with Go's compress/lzw.
Can someone explain why that happened? Why are the two .Z files not equivalent? How can I overcome this?
Note: I am working on Ubuntu installed in VirtualBox on a Mac.
Answer 1
Score: 2
A .Z file contains more than LZW-compressed data: it also has a 3-byte header. The Go LZW code does not generate that header, because it is meant to compress data, not to produce a .Z file.
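To make the header concrete, here is a minimal Go sketch that checks whether some bytes start like a .Z file and decodes the flag byte. The layout it assumes (magic bytes 0x1f 0x9d, then one byte whose low five bits give the maximum code width and whose high bit marks block mode) is the commonly documented compress layout, not something stated in the answer. The names zHeader and parseZHeader are made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// zHeader holds the fields of the 3-byte header assumed for .Z files.
type zHeader struct {
	MaxBits   int  // maximum code width in bits (low 5 bits of the third byte)
	BlockMode bool // high bit of the third byte
}

// parseZHeader checks the magic bytes 0x1f 0x9d and decodes the flag byte.
func parseZHeader(b []byte) (zHeader, error) {
	if len(b) < 3 {
		return zHeader{}, errors.New("short input: need at least 3 bytes")
	}
	if b[0] != 0x1f || b[1] != 0x9d {
		return zHeader{}, errors.New("not in compressed format: bad magic bytes")
	}
	return zHeader{
		MaxBits:   int(b[2] & 0x1f),
		BlockMode: b[2]&0x80 != 0,
	}, nil
}

func main() {
	// Sample header: magic bytes plus 0x90 (block mode, 16-bit maximum).
	sample := []byte{0x1f, 0x9d, 0x90}
	// If a file name is given, inspect that file instead of the sample.
	if len(os.Args) > 1 {
		data, err := os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		sample = data
	}
	h, err := parseZHeader(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("max code bits: %d, block mode: %v\n", h.MaxBits, h.BlockMode)
}
```

Stripping these three bytes is probably not enough to make Go read the rest of the file: the compress/lzw documentation describes the GIF and PDF flavor of LZW with codes up to 12 bits, while compress typically writes codes up to 16 bits.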
Answer 2
Score: 1
Presumably you only want to test the performance of your two algorithms (or some third-party ones), not of the compression algorithms themselves. In that case, you could write a shell script that calls the compress command on the required files or directories, and then call this script from your C or Go program. This is one way to work around the problem, but it leaves open the other part of your question: how to use the compression libraries correctly.
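On the Go side, calling an external command might look like the sketch below. It skips the intermediate shell script and pipes data through the command directly with os/exec. The helper name runFilter is hypothetical, and the example assumes a Unix compress binary on PATH that reads stdin when no file is named and accepts -c (write to stdout) and -d (decompress).

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runFilter feeds input to an external command on stdin and returns
// whatever the command wrote to stdout.
func runFilter(input []byte, name string, args ...string) ([]byte, error) {
	cmd := exec.Command(name, args...)
	cmd.Stdin = bytes.NewReader(input)
	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("%s: %v: %s", name, err, stderr.String())
	}
	return stdout.Bytes(), nil
}

func main() {
	// Repetitive input, so that compress actually shrinks it; some builds
	// report a non-zero exit status when the output would be larger.
	original := bytes.Repeat([]byte("TOBEORNOT"), 200)

	// Assumption: compress is installed and supports -c and -d as described.
	packed, err := runFilter(original, "compress", "-c")
	if err != nil {
		fmt.Println("compress step failed:", err)
		return
	}
	restored, err := runFilter(packed, "compress", "-d", "-c")
	if err != nil {
		fmt.Println("decompress step failed:", err)
		return
	}
	fmt.Printf("%d bytes -> %d bytes, round trip ok: %v\n",
		len(original), len(packed), bytes.Equal(original, restored))
}
```

For a benchmark, keep the external decompression outside the timed region, so that process start-up cost does not distort the comparison between the C and Go programs.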
Answer 3
Score: 0
There is an ancient bug named "alignment bit groups" behind this question. I described it on Wikipedia under "Special output format"; please read it.
I've implemented a new library, lzws. It has all possible options:
--without-magic-header (-w) - disable magic header
--max-code-bit-length (-b) - set max code bit length (9-16)
--raw (-r) - disable block mode
--msb (-m) - enable most significant bit
--unaligned-bit-groups (-u) - enable unaligned bit groups
You can use these options in any combination; all combinations have been tested. I am sure you can find a combination suitable for the Go LZW implementation.
If you prefer Ruby, you can use the ruby-lzws binding.
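For comparison, the Go side exposes only two settings through lzw.NewWriter and lzw.NewReader: the bit order (lzw.LSB or lzw.MSB) and the literal width. The sketch below round-trips the same input in both orders and checks for the compress magic bytes in the output. The helper names roundTrip and hasZMagic are made up for this illustration, and which lzws option combination matches Go's output is not established here.

```go
package main

import (
	"bytes"
	"compress/lzw"
	"fmt"
	"io"
)

// roundTrip compresses data with compress/lzw in the chosen bit order
// (8-bit literals), then decompresses it with the same settings.
func roundTrip(data []byte, msb bool) (packed, restored []byte, err error) {
	order := lzw.LSB
	if msb {
		order = lzw.MSB
	}
	var buf bytes.Buffer
	w := lzw.NewWriter(&buf, order, 8)
	if _, err = w.Write(data); err != nil {
		return nil, nil, err
	}
	if err = w.Close(); err != nil {
		return nil, nil, err
	}
	packed = append([]byte(nil), buf.Bytes()...)

	r := lzw.NewReader(bytes.NewReader(packed), order, 8)
	defer r.Close()
	restored, err = io.ReadAll(r)
	return packed, restored, err
}

// hasZMagic reports whether b starts with the 0x1f 0x9d magic bytes
// that Unix compress writes.
func hasZMagic(b []byte) bool {
	return len(b) >= 2 && b[0] == 0x1f && b[1] == 0x9d
}

func main() {
	src := []byte("TOBEORNOTTOBEORTOBEORNOT")
	for _, msb := range []bool{false, true} {
		packed, restored, err := roundTrip(src, msb)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("msb=%v: %d bytes, magic header: %v, round trip ok: %v\n",
			msb, len(packed), hasZMagic(packed), bytes.Equal(src, restored))
	}
}
```

The two orders produce different byte streams for the same input, so whatever tool writes files for the Go program has to agree with it on bit order as well as on the header.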