Is there a way to stream data to Amazon S3 files using aws-sdk-go that is similar to the Google Storage Write() method?

Question

We're currently transitioning from Google Storage to Amazon S3.

On Google Storage I've used https://godoc.org/cloud.google.com/go/storage#Writer.Write to write to files. It streams bytes into a file through the io.Writer interface and saves the file when Close() is called on the writer. That lets us stream data into a file all day long and finalize it at the end of the day without ever creating a local copy of the file.

I've gone through the aws-sdk-go s3 documentation on godoc and can't find a similar function that would let us stream data to a file without first creating it locally. All I've found are functions, such as PutObject(), that upload data from files that already exist locally.

So my question is: is there a way to stream data to Amazon S3 files using aws-sdk-go that is similar to the Google Storage Write() method?

Answer 1

Score: 5

The S3 HTTP API has no append-like write method; instead it uses multipart uploads. You upload the data in chunks, each tagged with an index number, and S3 stores them internally as separate files and concatenates them into a single object once the upload is completed. Every chunk except the last must be at least 5MB (this is also the default chunk size in s3manager, and it can be changed), and an upload can have at most 10,000 chunks (this can't be changed).
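
Because the chunk count is capped, the chunk size decides how much data one upload can hold: 10,000 chunks of 5MB come to roughly 48.8 GiB. The helper below is only a sketch of that arithmetic; partSizeFor and the constant names are made up for illustration and are not part of aws-sdk-go.

```go
package main

import "fmt"

const (
	minPartSize = 5 * 1024 * 1024 // S3 minimum for every part except the last
	maxParts    = 10000           // S3 limit on parts per multipart upload
)

// partSizeFor returns the smallest part size in bytes that fits
// expectedBytes into at most maxParts parts, never below minPartSize.
func partSizeFor(expectedBytes int64) int64 {
	size := (expectedBytes + maxParts - 1) / maxParts // ceiling division
	if size < minPartSize {
		return minPartSize
	}
	return size
}

func main() {
	fmt.Println(partSizeFor(1 << 30))   // 1 GiB of data: the 5MB minimum is enough
	fmt.Println(partSizeFor(100 << 30)) // 100 GiB of data: parts must be larger
}
```

If you expect to stream more than about 48 GiB in a day, pick the larger part size up front, since it cannot be raised for parts that have already been uploaded.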

Unfortunately, the aws-sdk-go API doesn't appear to provide a convenient io.Writer-style interface for working with chunks to get this streaming behaviour.

You would have to handle the chunks (called parts in aws-sdk-go) manually: call CreateMultipartUpload to start the transfer, create an UploadPartInput for each piece of data you want to send, and send it with UploadPart. Once the final chunk has been sent, close the transaction with CompleteMultipartUpload, passing the part numbers and ETags returned by the UploadPart calls.
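
The buffering side of that workflow could look roughly like the sketch below. It uses only the standard library: uploadFunc is a hypothetical hook (not an aws-sdk-go type) marking where a real UploadPart call would go, and partWriter, newPartWriter and collectParts are likewise names invented for this example. Starting and completing the multipart upload, ETag bookkeeping and retries are all left out.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// uploadFunc is a hypothetical hook, not part of aws-sdk-go. A real one
// would call UploadPart and remember the ETag returned for partNumber.
type uploadFunc func(partNumber int, body io.ReadSeeker) error

// partWriter is an io.WriteCloser that buffers writes and hands off one
// part each time partSize bytes have accumulated.
type partWriter struct {
	partSize int
	upload   uploadFunc
	buf      bytes.Buffer
	next     int // part number for the next upload; S3 numbers parts from 1
}

func newPartWriter(partSize int, upload uploadFunc) *partWriter {
	return &partWriter{partSize: partSize, upload: upload, next: 1}
}

// Write buffers p and flushes every full part. No retry is attempted: if
// upload fails, the bytes of that part are dropped and the error returned.
func (w *partWriter) Write(p []byte) (int, error) {
	w.buf.Write(p) // bytes.Buffer.Write never returns an error
	for w.buf.Len() >= w.partSize {
		if err := w.flush(w.partSize); err != nil {
			return len(p), err
		}
	}
	return len(p), nil
}

// Close sends whatever is left as the final, possibly shorter, part.
// A real implementation would then call CompleteMultipartUpload.
func (w *partWriter) Close() error {
	if w.buf.Len() == 0 {
		return nil
	}
	return w.flush(w.buf.Len())
}

// flush copies n buffered bytes into their own slice and uploads them.
func (w *partWriter) flush(n int) error {
	part := make([]byte, n)
	copy(part, w.buf.Next(n)) // Next's result is only valid until the next buffer call
	if err := w.upload(w.next, bytes.NewReader(part)); err != nil {
		return err
	}
	w.next++
	return nil
}

// collectParts streams chunks through a partWriter and returns the body of
// every part the hook received, in order.
func collectParts(partSize int, chunks ...string) ([]string, error) {
	var parts []string
	w := newPartWriter(partSize, func(_ int, body io.ReadSeeker) error {
		data, err := io.ReadAll(body)
		parts = append(parts, string(data))
		return err
	})
	for _, c := range chunks {
		if _, err := io.WriteString(w, c); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return parts, nil
}

func main() {
	// A tiny part size keeps the demo readable; real S3 parts need at least 5MB.
	parts, err := collectParts(4, "hello ", "world")
	if err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	for i, p := range parts {
		fmt.Printf("part %d: %q\n", i+1, p)
	}
}
```

Writes of any size go in, and the hook only ever sees full parts plus one final remainder, which matches S3's rule that only the last part may be smaller than the minimum.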

Regarding the question of how to stream directly from e.g. []byte data instead of a file: the Body field of the UploadPartInput struct is where you put the content you want to send to S3. Note that Body is of type io.ReadSeeker. This means you can create an io.ReadSeeker from e.g. your []byte content with something like bytes.NewReader([]byte) and set UploadPartInput.Body to that.
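
As a small standard-library illustration of that point: *bytes.Reader satisfies io.ReadSeeker, so the same in-memory body can be read, rewound and read again. partBody and readTwice are hypothetical helpers written for this example, and the S3 call itself is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// partBody wraps in-memory data in an io.ReadSeeker, the type that the
// Body field of UploadPartInput expects.
func partBody(data []byte) io.ReadSeeker {
	return bytes.NewReader(data)
}

// readTwice reads the body to the end, seeks back to the start and reads
// it again, returning both results.
func readTwice(body io.ReadSeeker) (string, string, error) {
	first, err := io.ReadAll(body)
	if err != nil {
		return "", "", err
	}
	if _, err := body.Seek(0, io.SeekStart); err != nil {
		return "", "", err
	}
	second, err := io.ReadAll(body)
	if err != nil {
		return "", "", err
	}
	return string(first), string(second), nil
}

func main() {
	first, second, err := readTwice(partBody([]byte("chunk of data")))
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(first == second) // the rewound body yields the same bytes
}
```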

The s3manager upload utility could be a good starting point for seeing how the multipart functions are used; it uses the multipart API to upload a single large file concurrently as smaller chunks.
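
Since the Body field of s3manager's UploadInput is an io.Reader rather than an io.ReadSeeker, one possible way to get a Write()-style interface is to connect an io.Pipe to it. The sketch below is an assumption about how that could be wired, not something taken from the SDK documentation: it uses only the standard library, and consume is a hypothetical stand-in for an uploader call that reads the pipe until EOF. streamThroughPipe is also a made-up name.

```go
package main

import (
	"fmt"
	"io"
)

// consume is a hypothetical stand-in for a call that reads a body until
// EOF, for example an uploader given the pipe reader as its Body.
func consume(r io.Reader) (int64, error) {
	return io.Copy(io.Discard, r)
}

// streamThroughPipe writes chunks to the write end of a pipe while a
// goroutine consumes the read end, and reports how many bytes arrived.
func streamThroughPipe(chunks ...string) (int64, error) {
	pr, pw := io.Pipe()

	type result struct {
		n   int64
		err error
	}
	done := make(chan result, 1)

	go func() {
		n, err := consume(pr)
		// Closing the read end unblocks the writer if the consumer stops early.
		pr.CloseWithError(err)
		done <- result{n, err}
	}()

	for _, c := range chunks {
		if _, err := io.WriteString(pw, c); err != nil {
			return 0, err
		}
	}
	// Closing the write end makes the reader see io.EOF, which ends the upload.
	if err := pw.Close(); err != nil {
		return 0, err
	}
	res := <-done
	return res.n, res.err
}

func main() {
	n, err := streamThroughPipe("a,b\n", "c,d\n")
	if err != nil {
		fmt.Println("stream failed:", err)
		return
	}
	fmt.Println("bytes consumed:", n)
}
```

The write end of the pipe plays the role of the Google Storage writer here: the producer calls Write for as long as it likes and Close when the file is finished. Whether this suits an all-day stream depends on how the consumer handles timeouts and failures, which this sketch does not cover.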

Keep in mind that you should set a lifecycle policy that removes unfinished multipart uploads. If you never send the final CompleteMultipartUpload, all the chunks uploaded so far stay in S3 and incur costs. The policy can be set through the AWS console or CLI, or programmatically with aws-sdk-go.

huangapple
  • Posted on May 22, 2017 at 00:44:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/44099341.html