GCS Uploads for chunking in Go SDK?

Question


I'm trying to upload large files using the GCS writer:

	bucketHandle := m.Client.Bucket(bucket)
	objectHandle := bucketHandle.Object(path)
	writer := objectHandle.NewWriter(context.Background())

Then, for chunks of size N, I call writer.Write(myBuffer). I'm seeing some out-of-memory exceptions on my cluster and wondering whether this operation is actually buffering the entire file into memory. What are the semantics of this operation? Am I misunderstanding something?

Answer 1

Score: 1
No, the writer does not buffer the entire file. The Write method returns the number of bytes written and any error encountered. The Go client uploads the data to GCS in chunks as it is written (the chunk size is controlled by the Writer's ChunkSize field), so client-side memory consumption is bounded by the chunk size plus your own buffer, rather than growing with the file. If you split the input into 5 MB blocks and call Write in a loop, memory use should stay on the order of those buffer sizes.

huangapple
  • Posted on 2023-02-10 03:41:29
  • Please retain this link when reposting: https://go.coder-hub.com/75403661.html