Golang io.copy twice on the request body

Question
I am building a blob storage system and I picked Go as the programming language.

I create a stream to do a multipart file upload from the client to the blob server. The stream works fine, but I want to make a SHA1 hash from the request body, which means I need to io.Copy the body twice:

- once for creating the hash
- once for streaming the body as multipart

The SHA1 gets created, but the multipart part streams 0 bytes after that.

Any idea how I can do this?

The client upload:

func (c *Client) Upload(h *UploadHandle) (*PutResult, error) {
    body, bodySize, err := h.Read()
    if err != nil {
        return nil, err
    }

    // Creating a sha1 hash from the bytes of body
    dropRef, err := drop.Sha1FromReader(body)
    if err != nil {
        return nil, err
    }

    bodyReader, bodyWriter := io.Pipe()
    writer := multipart.NewWriter(bodyWriter)
    errChan := make(chan error, 1)
    go func() {
        defer bodyWriter.Close()
        part, err := writer.CreateFormFile(dropRef, dropRef)
        if err != nil {
            errChan <- err
            return
        }
        if _, err := io.Copy(part, body); err != nil {
            errChan <- err
            return
        }
        if err = writer.Close(); err != nil {
            errChan <- err
        }
    }()

    req, err := http.NewRequest("POST", c.Server+"/drops/upload", bodyReader)
    req.Header.Add("Content-Type", writer.FormDataContentType())

    resp, err := c.Do(req)
    if err != nil {
        return nil, err
    }

    // ...
}
The SHA1 function:

func Sha1FromReader(src io.Reader) (string, error) {
    hash := sha1.New()
    _, err := io.Copy(hash, src)
    if err != nil {
        return "", err
    }
    return hex.EncodeToString(hash.Sum(nil)), nil
}
The upload handle:

func (h *UploadHandle) Read() (io.Reader, int64, error) {
    var b bytes.Buffer
    hw := &Hasher{&b, sha1.New()}
    n, err := io.Copy(hw, h.Contents)
    if err != nil {
        return nil, 0, err
    }
    return &b, n, nil
}

Answer 1
Score: 26
I would suggest using an io.TeeReader if you want to push all reads from the blob through the SHA1 hash concurrently:
bodyReader := io.TeeReader(body, hash)
Now as the bodyReader is consumed during upload, the hash is automatically updated.
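
For illustration, here is a minimal, self-contained sketch of that idea; the names and the io.Discard destination are placeholders for the real multipart part, not part of the original code:

package main

import (
    "crypto/sha1"
    "encoding/hex"
    "fmt"
    "io"
    "strings"
)

func main() {
    // body stands in for the buffered upload contents.
    body := strings.NewReader("example blob contents")

    hash := sha1.New()
    // Every byte read from bodyReader is also written into hash.
    bodyReader := io.TeeReader(body, hash)

    // Consume bodyReader exactly as the upload would (io.Copy into the
    // multipart part); io.Discard is only a stand-in destination here.
    if _, err := io.Copy(io.Discard, bodyReader); err != nil {
        panic(err)
    }

    // The hash is complete once the reader has been fully consumed.
    fmt.Println(hex.EncodeToString(hash.Sum(nil)))
}

One caveat for the question's Upload: the hash is also used as the form-file name in writer.CreateFormFile(dropRef, dropRef), so it is needed before the multipart part is created; in that case the body still has to be read (or buffered) once before the multipart stream is built.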

Answer 2
Score: 12
You can't do that directly, but you can write a wrapper that does the hashing during the io.Copy:

// This works for either a reader or a writer,
// but if you use both at the same time the hash will be wrong.
type Hasher struct {
    io.Writer
    io.Reader
    hash.Hash
    Size uint64
}

func (h *Hasher) Write(p []byte) (n int, err error) {
    n, err = h.Writer.Write(p)
    h.Hash.Write(p)
    h.Size += uint64(n)
    return
}

func (h *Hasher) Read(p []byte) (n int, err error) {
    n, err = h.Reader.Read(p)
    h.Hash.Write(p[:n]) // on error n will be 0, so this is still safe
    return
}

func (h *Hasher) Sum() string {
    return hex.EncodeToString(h.Hash.Sum(nil))
}

func (h *UploadHandle) Read() (io.Reader, string, int64, error) {
    var b bytes.Buffer
    hashedReader := &Hasher{Reader: h.Contents, Hash: sha1.New()}
    n, err := io.Copy(&b, hashedReader)
    if err != nil {
        return nil, "", 0, err
    }
    return &b, hashedReader.Sum(), n, nil
}

// Updated version based on @Dustin's comment, since I completely forgot io.TeeReader existed.
func (h *UploadHandle) Read() (io.Reader, string, int64, error) {
    var b bytes.Buffer
    hash := sha1.New()
    n, err := io.Copy(&b, io.TeeReader(h.Contents, hash))
    if err != nil {
        return nil, "", 0, err
    }
    return &b, hex.EncodeToString(hash.Sum(nil)), n, nil
}
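
As a rough sketch of how the caller changes with this Read signature (a hypothetical excerpt, not code from the answer; error handling as in the question):

body, dropRef, bodySize, err := h.Read()
if err != nil {
    return nil, err
}
// dropRef already holds the hex-encoded SHA1, so the separate
// drop.Sha1FromReader call is no longer needed, and body has not
// been consumed yet; stream it into the multipart part as before.
part, err := writer.CreateFormFile(dropRef, dropRef)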

Answer 3
Score: 2
You have two options.

The most direct way is to use io.MultiWriter.

But if you need the hash in order to produce the multipart output, then you will have to copy into a bytes.Buffer and then write the buffer back to each writer.
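
A minimal sketch of the io.MultiWriter variant; body and io.Discard are placeholders for the upload contents and the multipart part:

package main

import (
    "crypto/sha1"
    "encoding/hex"
    "fmt"
    "io"
    "strings"
)

func main() {
    // body stands in for the upload contents; io.Discard stands in for
    // the part returned by writer.CreateFormFile.
    body := strings.NewReader("example blob contents")
    hash := sha1.New()

    // A single copy feeds both destinations at the same time.
    if _, err := io.Copy(io.MultiWriter(io.Discard, hash), body); err != nil {
        panic(err)
    }

    fmt.Println(hex.EncodeToString(hash.Sum(nil)))
}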

Answer 4
Score: 1
We can convert the stream into a string and create a new reader from it as many times as we need. For example:

// readerStream is your io.Reader from the source
buf := new(bytes.Buffer)
buf.ReadFrom(readerStream)
rawBody := buf.String()

newReader1 := strings.NewReader(rawBody)
newReader2 := strings.NewReader(rawBody)
But it would be great if this could be avoided. I am not sure it is the best approach, but it worked for me.
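
A small variation on the same buffer-once idea skips the string conversion by wrapping the buffered bytes directly (a sketch; readerStream is again the source reader):

buf := new(bytes.Buffer)
if _, err := buf.ReadFrom(readerStream); err != nil {
    // handle the error
}
// Each bytes.Reader reads the same underlying bytes independently.
newReader1 := bytes.NewReader(buf.Bytes())
newReader2 := bytes.NewReader(buf.Bytes())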

Answer 5
Score: 0
You can use Request.GetBody (https://golang.org/pkg/net/http#Request.GetBody):

package main

import (
    "io"
    "net/http"
    "os"
    "strings"
)

func main() {
    read := strings.NewReader("north east south west")
    // http.NewRequest populates GetBody automatically when the body is a
    // *strings.Reader, *bytes.Reader, or *bytes.Buffer.
    req, e := http.NewRequest("GET", "https://stackoverflow.com", read)
    if e != nil {
        panic(e)
    }

    // one: read the body directly
    io.Copy(os.Stdout, req.Body)

    // two: obtain a fresh copy of the body
    body, e := req.GetBody()
    if e != nil {
        panic(e)
    }
    io.Copy(os.Stdout, body)
}