What is a correct way of writing data in gzip format?
Question
My application produces a lot of text data. To reduce disk usage, I want to write it in gzip format.
Many goroutines call the WriteData() function concurrently.
However, gzip on Linux complains that the file is corrupted:
zcat ./2021-08-11-00.gz > /dev/null
gzip: ./2021-08-11-00.gz: invalid compressed data--format violated
It does not happen every time, but roughly every second or third file written is affected.
What is wrong with my code?
My DataWrite package looks like this:
package storage

import (
	"compress/gzip"
	"os"
	"sync"

	"github.com/rs/zerolog/log"
)

type Storage struct {
	handle *os.File
	writer *gzip.Writer
	lock   sync.Mutex
}

func (s *Storage) Init(filename string) error {
	file, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	s.handle = file
	s.writer = gzip.NewWriter(file)
	return nil
}

func (s *Storage) Shutdown() {
	if err := s.writer.Close(); err != nil {
		log.Warn().Err(err).Msg("Gzip writer close failed")
	}
	if err := s.handle.Close(); err != nil {
		log.Warn().Err(err).Msg("Gzip handle close failed")
	}
}

func (s *Storage) WriteData(data *MyStruct) error {
	s.lock.Lock()
	defer s.lock.Unlock()
	buffer := data.content
	_, err := s.writer.Write([]byte(buffer))
	if err != nil {
		log.Warn().Err(err).Msg("Gzip write failed")
		return err
	}
	if err := s.writer.Flush(); err != nil {
		return err
	}
	if err := s.handle.Sync(); err != nil {
		return err
	}
	return nil
}
Answer 1
Score: 1
You are not synchronizing Shutdown with WriteData. Shutdown has to take the same lock:
package storage

import (
	"compress/gzip"
	"os"
	"sync"

	"github.com/rs/zerolog/log"
)

type Storage struct {
	handle *os.File
	writer *gzip.Writer
	lock   sync.Mutex
}

func (s *Storage) Init(filename string) error {
	file, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	s.handle = file
	s.writer = gzip.NewWriter(file)
	return nil
}

func (s *Storage) Shutdown() {
	s.lock.Lock() // Here !! Hold the same lock that WriteData uses.
	defer s.lock.Unlock()
	if err := s.writer.Close(); err != nil {
		log.Warn().Err(err).Str("fileName", s.handle.Name()).Msg("Gzip writer close failed")
	}
	if err := s.handle.Close(); err != nil {
		log.Warn().Err(err).Str("fileName", s.handle.Name()).Msg("Gzip handle close failed")
	}
}

// MyStruct is the type from the question; its definition is not shown there.
func (s *Storage) WriteData(data *MyStruct) error {
	s.lock.Lock()
	defer s.lock.Unlock()
	buffer := data.content
	_, err := s.writer.Write([]byte(buffer))
	if err != nil {
		log.Warn().Err(err).Msg("Gzip write failed")
		return err
	}
	if err := s.writer.Flush(); err != nil {
		return err
	}
	if err := s.handle.Sync(); err != nil {
		return err
	}
	return nil
}
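As a rough illustration of the point above, here is a minimal, self-contained sketch in which every operation on the gzip writer, including Close, goes through one mutex. It is not the asker's code: `SafeGzip`, `NewSafeGzip`, and `writeConcurrently` are hypothetical names, the `closed` flag is an added assumption, and the sketch writes to an in-memory buffer instead of a file so that it can read the result back.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"sync"
)

// SafeGzip is a hypothetical wrapper (not from the question) that guards
// every operation on the gzip writer, including Close, with one mutex.
type SafeGzip struct {
	mu     sync.Mutex
	zw     *gzip.Writer
	closed bool
}

func NewSafeGzip(w io.Writer) *SafeGzip {
	return &SafeGzip{zw: gzip.NewWriter(w)}
}

// Write compresses p and flushes it; it fails once Close has been called.
func (s *SafeGzip) Write(p []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return errors.New("write after close")
	}
	if _, err := s.zw.Write(p); err != nil {
		return err
	}
	return s.zw.Flush()
}

// Close writes the gzip trailer while holding the same lock as Write.
func (s *SafeGzip) Close() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return nil
	}
	s.closed = true
	return s.zw.Close()
}

// writeConcurrently starts n goroutines that each write one 5-byte line,
// waits for all of them, closes the stream, and returns the decompressed size.
func writeConcurrently(n int) (int, error) {
	var buf bytes.Buffer
	s := NewSafeGzip(&buf)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = s.Write([]byte("line\n"))
		}()
	}
	wg.Wait() // make sure no writer is still running before closing
	if err := s.Close(); err != nil {
		return 0, err
	}
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return 0, err
	}
	out, err := io.ReadAll(zr)
	return len(out), err
}

func main() {
	size, err := writeConcurrently(50)
	fmt.Println(size, err)
}
```

Waiting for the writers before calling Close, as `wg.Wait()` does here, is a separate design choice from the lock: the lock keeps Close and Write from interleaving, while the wait ensures that no data arrives after the stream has been finished.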
Answer 2
Score: -2
Here is working code for gzip compression:
package main

import (
	"compress/gzip"
	"log"
	"os"
	"sync"
	"time"
)

type Storage struct {
	handle  *os.File
	writer  *gzip.Writer
	buffer  []byte
	lock    sync.Mutex
	Name    string
	Comment string
	ModTime time.Time
}

func (s *Storage) Init(filename string) {
	file, err := os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		log.Fatal(err)
	}
	s.handle = file
	s.writer = gzip.NewWriter(file)
	// These fields live on Storage only; they are not copied into the gzip header.
	s.Name = "a-new-hope.txt"
	s.Comment = "an epic space opera by George Lucas"
	s.ModTime = time.Date(1977, time.May, 25, 0, 0, 0, 0, time.UTC)
	s.buffer = []byte("Hello")
}

func (s *Storage) Shutdown() {
	if err := s.writer.Close(); err != nil {
		log.Fatal("Gzip writer close failed")
	}
	if err := s.handle.Close(); err != nil {
		log.Fatal("Gzip handle close failed")
	}
}

func (s *Storage) WriteData() error {
	s.lock.Lock()
	defer s.lock.Unlock()
	_, err := s.writer.Write(s.buffer)
	if err != nil {
		// Log without exiting so the error can be returned to the caller.
		log.Print("Gzip write failed")
		return err
	}
	if err := s.writer.Flush(); err != nil {
		return err
	}
	if err := s.handle.Sync(); err != nil {
		return err
	}
	return nil
}

func main() {
	s := Storage{}
	s.Init("sss.gzip")
	if err := s.WriteData(); err != nil {
		log.Fatal(err)
	}
	s.Shutdown()
}
EDIT
Modified to resemble the code in the question, with minor changes. WriteData takes its buffer from the Storage struct because MyStruct is not defined in the question's code.