Editing ZIP archive in place in Golang

Question

I'm writing an application that lets users upload anonymized data to an S3 bucket so they can try out our product without providing us with authentication data.

This is the struct that handles ZIP archives, which have already been verified to be valid:

type ZipWriter struct {
	buffer *bytes.Buffer
	writer *zip.Writer
}

func FromFile(file io.Reader) (*ZipWriter, error) {

	// First, read all the data from the file; if this fails then return an error
	data, err := ioutil.ReadAll(file)
	if err != nil {
		return nil, fmt.Errorf("Failed to read data from the ZIP archive")
	}

	// Next, put all the data into a buffer and then create a ZIP writer
	// from the buffer and return that writer
	buffer := bytes.NewBuffer(data)
	return &ZipWriter{
		buffer: buffer,
		writer: zip.NewWriter(buffer),
	}, nil
}

// WriteToStream writes the contents of the ZIP archive to the provided stream
func (writer *ZipWriter) WriteToStream(file io.Writer) error {

	// First, attempt to close the ZIP archive writer so that we can avoid
	// double writes to the underlying buffer; if an error occurs then return it
	if err := writer.writer.Close(); err != nil {
		return fmt.Errorf("Failed to close ZIP archive, error: %v", err)
	}

	// Next, write the underlying buffer to the provided stream; if this fails
	// then return an error
	if _, err := writer.buffer.WriteTo(file); err != nil {
		return fmt.Errorf("Failed to write the ZIP data to the stream, error: %v", err)
	}

	return nil
}

With ZipWriter, I load a ZIP file using the FromFile function and then write it to a byte buffer using the WriteToStream function. After that, I call the following function to upload the ZIP archive data to a presigned URL in S3:

// DoRequest does an HTTP request against an endpoint with a given URL, method and access token
func DoRequest(client *http.Client, method string, url string, code string, reader io.Reader) ([]byte, error) {

	// First, create the request with the method, URL, body and access token
	// We don't expect this to fail so ignore the error
	request, _ := http.NewRequest(method, url, reader)
	if !util.IsEmpty(code) {
		request.Header.Set(headers.Accept, echo.MIMEApplicationJSON)
		request.Header.Set(headers.Authorization, fmt.Sprintf("Bearer %s", code))
	} else {
		request.Header.Set(headers.ContentType, "application/zip")
	}

	// Next, do the request; if this fails then return an error
	resp, err := client.Do(request)
	if err != nil {
		return nil, fmt.Errorf("Failed to run the %s request against %s, error: %v", method, url, err)
	} else if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("Failed to run the %s request against %s, response: %v", method, url, resp)
	}

	// Now, read the body from the response; if this fails then return an error
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("Failed to read the body associated with the response, error: %v", err)
	}

	// Finally, return the body from the response
	return body, nil
}

So the whole operation looks roughly like this:

file, err := os.Open(location)
if err != nil {
	log.Fatalf("Unable to open ZIP archive located in %s, error: %v", location, err)
}

writer, err := lutils.FromFile(file)
if err != nil {
	log.Fatalf("File located in %s could not be read as a ZIP archive, error: %v", location, err)
}

buffer := new(bytes.Buffer)
if err := writer.WriteToStream(buffer); err != nil {
	log.Fatalf("Failed to write data to the ZIP archive, error: %v", err)
}

if body, err := DoRequest(new(http.Client), http.MethodPut, url, "", buffer); err != nil {
	log.Fatalf("Failed to upload the data to S3, response: %s, error: %v", string(body), err)
}

The problem I'm having is that, although the upload to S3 succeeds, no files are found when the ZIP archive is downloaded and extracted. While investigating this issue, I've come up with a number of possible failure points:

  1. FromFile does not create the ZIP archive from the file correctly, resulting in a corrupt archive file.
  2. WriteToStream corrupts the data when it writes the archive. This seems less likely as I've already tested this functionality with a bytes.Buffer as the reader. Unless an os.File produces a corrupt ZIP archive when the bytes.Buffer does not, I think this function probably works as expected.
  3. The data is corrupted when DoRequest writes it to S3. This seems unlikely as I've used this code for other data without issue. So, unless ZIP archives have a structure that needs to be treated differently from other file types, I don't see a problem here either.

After examining these possibilities in more depth, I think the issue might be in how I'm creating the ZIP writer from the archive file, but I'm not sure what the problem is.

Answer 1

Score: 0

The issue here was a bit of a red herring. As @CeriseLimón pointed out, calling NewWriter and then Close on a buffer that already holds a ZIP archive appends an empty archive to the end of the data. ZIP readers locate an archive's contents through the end-of-central-directory record at the end of the file, so they find the empty archive and report no files. In my use case, the solution was to open the file and write it directly to the stream, rather than attempting to read it as a ZIP archive.

file, err := os.Open(location)
if err != nil {
    log.Fatalf("Unable to open ZIP archive located in %s, error: %v", location, err)
}

if body, err := DoRequest(new(http.Client), http.MethodPut, url, "", file); err != nil {
    log.Fatalf("Failed to upload the data to S3, response: %s, error: %v", string(body), err)
}

huangapple
  • Posted on June 2, 2021 at 11:48:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/67798666.html