limitation on bytes.Buffer?
Question
I am trying to gzip a slice of bytes using the "compress/gzip" package. I write 45976 bytes to a bytes.Buffer, but when I try to uncompress the content using a gzip.Reader and its Read function, I find that not all of the content is recovered. Is there some limitation on bytes.Buffer, and is there a way to bypass or change it? Here is my code (edited):
func compress_and_uncompress() {
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    i, err := w.Write([]byte(long_string))
    if err != nil {
        log.Fatal(err)
    }
    w.Close()

    b2 := make([]byte, 80000)
    r, _ := gzip.NewReader(&buf)
    j, err := r.Read(b2)
    if err != nil {
        log.Fatal(err)
    }
    r.Close()

    fmt.Println("Wrote:", i, "Read:", j)
}
The output from testing (with a chosen string as long_string) is:
Wrote: 45976 Read: 32768
Answer 1
Score: 7
Continue reading to get the remaining 13208 bytes. The first read returns 32768 bytes, the second read returns 13208 bytes, and the third read returns zero bytes and EOF.
For example,
package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
    "io"
    "log"
)

func compress_and_uncompress() {
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    i, err := w.Write([]byte(long_string))
    if err != nil {
        log.Fatal(err)
    }
    w.Close()

    b2 := make([]byte, 80000)
    r, _ := gzip.NewReader(&buf)
    j := 0
    // Keep reading until Read returns zero bytes and io.EOF.
    for {
        n, err := r.Read(b2[:cap(b2)])
        b2 = b2[:n]
        j += n
        if err != nil {
            if err != io.EOF {
                log.Fatal(err)
            }
            if n == 0 {
                break
            }
        }
        fmt.Println(len(b2))
    }
    r.Close()

    fmt.Println("Wrote:", i, "Read:", j)
}

var long_string string

func main() {
    long_string = string(make([]byte, 45976))
    compress_and_uncompress()
}
Output:
32768
13208
Wrote: 45976 Read: 45976
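As a side note (an editor's addition, not part of the original answer): the manual byte counting above can also be delegated to io.Copy, which loops over Read internally until EOF. A minimal, runnable sketch of the same round trip:

package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
    "io"
    "log"
)

func main() {
    long_string := string(make([]byte, 45976))

    // Compress into an in-memory buffer, as in the answer above.
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    if _, err := w.Write([]byte(long_string)); err != nil {
        log.Fatal(err)
    }
    w.Close()

    r, err := gzip.NewReader(&buf)
    if err != nil {
        log.Fatal(err)
    }
    // io.Copy calls r.Read in a loop until io.EOF,
    // so the partial reads are handled for us.
    var out bytes.Buffer
    n, err := io.Copy(&out, r)
    if err != nil {
        log.Fatal(err)
    }
    r.Close()
    fmt.Println("Read:", n) // Read: 45976
}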
Answer 2
Score: 4
Use ioutil.ReadAll. The contract for io.Reader says it doesn't have to return all the data in a single call, and there is a good reason for it not to, to do with the sizes of internal buffers. ioutil.ReadAll reads from an io.Reader until EOF and returns everything that was read.
E.g. (untested):
import "io/ioutil"
func compress_and_uncompress() {
var buf bytes.Buffer
w := gzip.NewWriter(&buf)
i,err := w.Write([]byte(long_string))
if err!=nil {
log.Fatal(err)
}
w.Close()
r, _ := gzip.NewReader(&buf)
b2, err := ioutil.ReadAll(r)
if err!=nil {
log.Fatal(err)
}
r.Close()
fmt.Println("Wrote:", i, "Read:", len(b2))
}
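A small editorial note: since Go 1.16, ioutil.ReadAll is deprecated in favor of io.ReadAll, which behaves identically, so in newer code the read above can be written as:

// Go 1.16+ equivalent of ioutil.ReadAll(r); reads until EOF.
b2, err := io.ReadAll(r)
if err != nil {
    log.Fatal(err)
}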
Answer 3
Score: 1
If a read from the gzip.Reader does not return the whole expected slice, you can just keep re-reading until you have received all the data in the buffer.
Regarding your problem where the subsequent reads did not append to the end of the slice but instead landed at its beginning: the answer can be found in the implementation of gzip's Read function, which fills p[0:n], as seen for example at line
208 z.digest.Write(p[0:n])
In other words, every call to Read writes from the start of whatever slice it is given, so re-reading into the same slice without an offset overwrites the earlier data.
This can be solved in the following manner:
func compress_and_uncompress(long_string string) {
    // Writer
    var buf bytes.Buffer
    w := gzip.NewWriter(&buf)
    i, err := w.Write([]byte(long_string))
    if err != nil {
        log.Fatal(err)
    }
    w.Close()

    // Reader
    var j, k int
    b2 := make([]byte, 80000)
    r, _ := gzip.NewReader(&buf)
    for j = 0; ; j += k {
        k, err = r.Read(b2[j:]) // Add the offset here
        if err != nil {
            if err != io.EOF {
                log.Fatal(err)
            } else {
                break
            }
        }
    }
    r.Close()

    fmt.Println("Wrote:", i, "Read:", j)
}
The result will be:
Wrote: 45976 Read: 45976
Also, after testing with a string of 45976 characters, I can confirm that the output matches the input exactly, with the second part correctly appended after the first part.
Source for gzip.Read: http://golang.org/src/pkg/compress/gzip/gunzip.go?s=4633:4683#L189
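A related idiom worth mentioning (an editor's addition, not from the original answer): when the expected length is known up front, io.ReadFull does this offset bookkeeping itself. A minimal sketch, assuming r is the gzip.Reader from above and the input is known to be 45976 bytes:

// io.ReadFull keeps calling r.Read at an advancing offset
// until b2 is completely filled or an error occurs.
b2 := make([]byte, 45976)
if _, err := io.ReadFull(r, b2); err != nil {
    log.Fatal(err)
}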