Golang multipart uploads with chunked `http.GET` and Goamz `multi.PutAll`
Question
I'm using the Goamz package and could use some help getting `bucket.Multi` to stream an HTTP GET response to S3.

I'll be downloading a 2+ GB file via chunked HTTP and I'd like to stream it directly into an S3 bucket. It appears that I need to wrap `resp.Body` with something so I can pass an implementation of `s3.ReaderAtSeeker` to `multi.PutAll`.
```go
// set up s3
auth, _ := aws.EnvAuth()
s3Con := s3.New(auth, aws.USEast)
bucket := s3Con.Bucket("bucket-name")

// make http request to URL
resp, err := http.Get(export_url)
if err != nil {
	fmt.Printf("Get error %v\n", err)
	return
}
defer resp.Body.Close()

// set up multi-part upload
multi, err := bucket.InitMulti(s3Path, "text/plain", s3.Private, s3.Options{})
if err != nil {
	fmt.Printf("InitMulti error %v\n", err)
	return
}

// Need a struct that implements s3.ReaderAtSeeker:
// type ReaderAtSeeker interface {
//	io.ReaderAt
//	io.ReadSeeker
// }
rs := // Question: what can I wrap `resp.Body` in?

parts, err := multi.PutAll(rs, 5120)
if err != nil {
	fmt.Printf("PutAll error %v\n", err)
	return
}

err = multi.Complete(parts)
if err != nil {
	fmt.Printf("Complete error %v\n", err)
	return
}
```
Currently I get the following (expected) error when trying to run my program:
```
./main.go:50: cannot use resp.Body (type io.ReadCloser) as type s3.ReaderAtSeeker in argument to multi.PutAll:
	io.ReadCloser does not implement s3.ReaderAtSeeker (missing ReadAt method)
```
Answer 1
Score: 1
<del>You haven't indicated which package you're using to access the S3 api but I'm assuming it's this one</del> https://github.com/mitchellh/goamz/.
Since your file is of significant size, a possible solution might be to use `multi.PutPart`. This will give you more control than `multi.PutAll`. Using `bytes.Reader` from the standard library, your approach would be:
- Get the `Content-Length` from the response header.
- Work out the number of parts needed from the `Content-Length` and the part size.
- Loop over the number of parts, reading a `[]byte` from `response.Body` into a `bytes.Reader` and calling `multi.PutPart`.
- Get the parts from `multi.ListParts`.
- Call `multi.Complete` with the parts.
I don't have access to S3, so I can't test my hypothesis, but the above could be worth exploring if you haven't already.
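A rough, untested sketch of those steps is below, assuming the goamz fork linked above, where `PutPart(n int, r io.ReadSeeker)` returns the uploaded `Part` (so collecting the parts as you go can stand in for `multi.ListParts`). It also simply reads until EOF instead of precomputing the part count, which works even when a chunked response has no `Content-Length`. The helper name and the 64 MB part size are made up for illustration:

```go
package main

import (
	"bytes"
	"io"

	"github.com/mitchellh/goamz/s3"
)

// uploadParts reads r in partSize chunks, uploads each chunk with
// multi.PutPart, and completes the upload once the reader is drained.
func uploadParts(multi *s3.Multi, r io.Reader, partSize int64) error {
	buf := make([]byte, partSize)
	var parts []s3.Part
	for partNum := 1; ; partNum++ {
		n, readErr := io.ReadFull(r, buf)
		if n > 0 {
			// bytes.NewReader provides the io.ReadSeeker that PutPart expects.
			part, err := multi.PutPart(partNum, bytes.NewReader(buf[:n]))
			if err != nil {
				return err
			}
			parts = append(parts, part)
		}
		if readErr == io.EOF || readErr == io.ErrUnexpectedEOF {
			break // body exhausted
		}
		if readErr != nil {
			return readErr
		}
	}
	return multi.Complete(parts)
}
```

In the question's code this would be called as `uploadParts(multi, resp.Body, 64<<20)` in place of `multi.PutAll`. Note that S3 requires every part except the last to be at least 5 MB, so the part size can't be arbitrarily small, and only one part is held in memory at a time.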
Answer 2
Score: 0
A simpler approach is to use http://github.com/minio/minio-go.
It implements `PutObject()`, which is a fully managed, self-contained operation for uploading large files. It also automatically does multipart uploads, in parallel, for more than 5 MB worth of data. If no pre-defined ContentLength is specified, it will keep uploading until it reaches EOF.
The following example shows how to do it when one doesn't have a pre-defined input length, just a streaming `io.Reader`. In this example I have used `os.Stdin` as an equivalent of your chunked input.
```go
package main

import (
	"log"
	"os"

	"github.com/minio/minio-go"
)

func main() {
	config := minio.Config{
		AccessKeyID:     "YOUR-ACCESS-KEY-HERE",
		SecretAccessKey: "YOUR-PASSWORD-HERE",
		Endpoint:        "https://s3.amazonaws.com",
	}
	s3Client, err := minio.New(config)
	if err != nil {
		log.Fatalln(err)
	}

	err = s3Client.PutObject("mybucket", "myobject", "application/octet-stream", 0, os.Stdin)
	if err != nil {
		log.Fatalln(err)
	}
}
```

```
$ echo "Hello my new-object" | go run stream-object.go
```