Compressing data from a reader in Go
Problem
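The program deadlocks because compress calls io.Copy on the calling goroutine. io.Copy reads from data and writes to gw (a gzip.Writer), and it does not return until every byte has been written. gw wraps pw, the write end of the pipe returned by io.Pipe. That pipe is synchronous and unbuffered: each Write on pw blocks until a Read on pr has consumed the data.
Here nothing ever reads from pr. compress only returns pr after io.Copy has finished, so the write blocks forever. Since no other goroutine exists to unblock it, the runtime reports a deadlock.
To fix this, run the copy in a new goroutine inside compress and return pr immediately, so the caller can read the compressed data while it is being produced. The modified code: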
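func compress(data io.Reader) (io.Reader, error) {
	pr, pw := io.Pipe()
	gw := gzip.NewWriter(pw)
	go func() {
		// Copy in the background so the caller can read from pr concurrently.
		_, err := io.Copy(gw, data)
		// Close gw before pw: Close flushes buffered data and writes the gzip footer.
		if cerr := gw.Close(); err == nil {
			err = cerr
		}
		// Pass any error on to the reader; a nil error is seen there as io.EOF.
		pw.CloseWithError(err)
	}()
	return pr, nil
}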
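With this change, compress starts a new goroutine that copies the data and closes gw and pw once the copy has finished. compress itself returns pr immediately, and the caller reads the compressed data from pr.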
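To check that a goroutine-based compress produces a complete gzip stream, one option is to read it back through gzip.NewReader. The following is a minimal sketch, not the asker's exact program: roundTrip is a hypothetical helper introduced for illustration, the error return of compress is dropped for brevity, and io.ReadAll assumes Go 1.16 or later.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// compress returns a reader that yields the gzip-compressed contents of data.
func compress(data io.Reader) io.Reader {
	pr, pw := io.Pipe()
	go func() {
		gw := gzip.NewWriter(pw)
		_, err := io.Copy(gw, data)
		// Close gw first so the gzip footer reaches the pipe before it is closed.
		if cerr := gw.Close(); err == nil {
			err = cerr
		}
		pw.CloseWithError(err) // a nil err is reported to the reader as io.EOF
	}()
	return pr
}

// roundTrip (hypothetical helper) compresses s and then decompresses the result.
func roundTrip(s string) (string, error) {
	zr, err := gzip.NewReader(compress(strings.NewReader(s)))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	out, err := roundTrip("hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints "hello world"
}
```

The order of the two Close calls matters: if pw is closed before gw, the gzip footer cannot be written to the pipe and the reader sees a truncated stream.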
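As for the most efficient way to compress data from an io.Reader: the compress/gzip package you are already using is a reasonable choice, and streaming through a pipe avoids holding the whole input in memory. Note that compress/zlib is not a faster or tighter alternative: zlib.Writer uses the same DEFLATE algorithm as gzip.Writer and differs only in the header and trailer it writes. To trade speed against compression ratio, pass a level such as gzip.BestSpeed or gzip.BestCompression to gzip.NewWriterLevel, and benchmark on your own data.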
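One way to compare settings on your own data is to compress the same input at several levels and look at the resulting sizes. This is only a sketch: compressedSize is a hypothetical helper, the repeated "hello world " input is a stand-in for real data, and timing is left out.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// compressedSize (hypothetical helper) returns the gzip-compressed size of s
// at the given compression level.
func compressedSize(s string, level int) (int, error) {
	var buf bytes.Buffer
	gw, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return 0, err // e.g. the level is outside the range gzip accepts
	}
	if _, err := gw.Write([]byte(s)); err != nil {
		return 0, err
	}
	// Close flushes remaining data and writes the gzip footer.
	if err := gw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	input := strings.Repeat("hello world ", 1000) // placeholder data
	levels := []int{gzip.BestSpeed, gzip.DefaultCompression, gzip.BestCompression}
	for _, level := range levels {
		n, err := compressedSize(input, level)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("level %d: %d -> %d bytes\n", level, len(input), n)
	}
}
```

Highly repetitive input like this compresses well at every level, so the differences only become meaningful on representative data.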
Original question:
I have the following short program written in Go, which attempts to transparently compress the data in a reader (https://play.golang.org/p/SnvYT6it5r):
package main

import (
	"fmt"
	"io"
	"bytes"
	"compress/gzip"
)

func main() {
	data := bytes.NewReader([]byte("hello world"))
	compress(data)
}

func compress(data io.Reader) (io.Reader, error) {
	pr, pw := io.Pipe()
	gw := gzip.NewWriter(pw)
	n, err := io.Copy(gw, data)
	if err != nil {
		fmt.Printf("error: %s", err.Error())
	} else {
		fmt.Printf("%d bytes compressed", n)
	}
	return pr, err
}
When I run it, the program hangs:
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [semacquire]:
sync.runtime_notifyListWait(0x1043e6cc, 0x0)
/usr/local/go/src/runtime/sema.go:297 +0x140
sync.(*Cond).Wait(0x1043e6c4, 0x137118)
/usr/local/go/src/sync/cond.go:57 +0xc0
io.(*pipe).write(0x1043e680, 0x1045a055, 0xa, 0xa, 0x0, 0x0, 0x0, 0x101)
/usr/local/go/src/io/pipe.go:90 +0x1a0
io.(*PipeWriter).Write(0x1040c180, 0x1045a055, 0xa, 0xa, 0xe205ef63, 0x34c, 0x0, 0x0)
/usr/local/go/src/io/pipe.go:157 +0x40
compress/gzip.(*Writer).Write(0x1045a000, 0x1040a130, 0xb, 0x10, 0x2c380, 0x7654, 0x1059e0, 0x111480)
/usr/local/go/src/compress/gzip/gzip.go:168 +0x2e0
bytes.(*Reader).WriteTo(0x10440240, 0x190610, 0x1045a000, 0x0, 0xfef64000, 0x10440240, 0x1045a001, 0x190670)
/usr/local/go/src/bytes/reader.go:134 +0xe0
io.copyBuffer(0x190610, 0x1045a000, 0x1905d0, 0x10440240, 0x0, 0x0, 0x0, 0x106620, 0x1045a000, 0x0, ...)
/usr/local/go/src/io/io.go:380 +0x360
io.Copy(0x190610, 0x1045a000, 0x1905d0, 0x10440240, 0x10440240, 0x0, 0x1a47c0, 0x0)
/usr/local/go/src/io/io.go:360 +0x60
main.compress(0x1905d0, 0x10440240, 0x10440240, 0x1040c170, 0x1040a130, 0xb)
/tmp/sandbox403912545/main.go:19 +0x180
main.main()
/tmp/sandbox403912545/main.go:12 +0xe0
What is causing the deadlock, and what is the most efficient way to compress data from a reader?
Answer 1
Score: 3
You write to io.Pipe but you never read from it (in a parallel goroutine), hence the deadlock. Here is what the docs say:
>Reads and Writes on the pipe are matched one to one except when multiple Reads are needed to consume a single Write. That is, each Write to the PipeWriter blocks until it has satisfied one or more Reads from the PipeReader that fully consume the written data. The data is copied directly from the Write to the corresponding Read (or Reads); there is no internal buffering.
https://golang.org/pkg/io/#Pipe
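The behavior described in the docs can be observed with a small experiment: a single Write in one goroutine is consumed by several Reads through a smaller buffer in another. This is a sketch for illustration only; readInChunks is a hypothetical helper and assumes chunk is greater than zero.

```go
package main

import (
	"fmt"
	"io"
)

// readInChunks (hypothetical helper) sends msg through a pipe with a single
// Write and reads it back using a buffer of chunk bytes (chunk must be > 0).
// It returns the pieces in the order they were read.
func readInChunks(msg string, chunk int) []string {
	pr, pw := io.Pipe()
	go func() {
		// This Write blocks until the reads below have consumed all of msg.
		pw.Write([]byte(msg))
		pw.Close()
	}()

	var parts []string
	buf := make([]byte, chunk)
	for {
		n, err := pr.Read(buf)
		if n > 0 {
			parts = append(parts, string(buf[:n]))
		}
		if err != nil { // io.EOF once the writer has closed the pipe
			return parts
		}
	}
}

func main() {
	fmt.Printf("%q\n", readInChunks("hello world", 4))
}
```

Without the goroutine, the Write would block on the only goroutine in the program, which is the deadlock from the question.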