英文:
Copy S3 Object with MultiPartUpload
问题
我需要重命名AWS S3中的许多对象。对于小对象,以下代码段可以完美运行:
input := &s3.CopyObjectInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
}
_, err = svc.CopyObject(input)
if err != nil {
panic(errors.Wrap(err, "error copying object"))
}
但是对于较大的对象,我遇到了S3大小限制的问题。我了解到需要使用多部分上传来复制对象。以下是我尝试过的方法:
multiPartUpload, err := svc.CreateMultipartUpload(
&s3.CreateMultipartUploadInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix), // targetPrefix是新名称
},
)
if err != nil {
panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
resp, err := svc.UploadPartCopy(
&s3.UploadPartCopyInput{
UploadId: multiPartUpload.UploadId,
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
PartNumber: aws.Int64(1),
},
)
if err != nil {
panic(errors.Wrap(err, "error copying multipart object"))
}
log.Printf("copied: %v", resp)
但是,Golang SDK给出了以下错误:
InvalidRequest: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
我还尝试了以下方法,但是在这里没有列出任何部分:
multiPartUpload, err := svc.CreateMultipartUpload(
&s3.CreateMultipartUploadInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix), // targetPrefix是新名称
},
)
if err != nil {
panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
err = svc.ListPartsPages(
&s3.ListPartsInput{
Bucket: aws.String(bucket), // 必需
Key: obj.Key, // 必需
UploadId: multiPartUpload.UploadId, // 必需
},
// 遍历`CopySource`对象中的所有部分
func(parts *s3.ListPartsOutput, lastPage bool) bool {
log.Printf("parts:\n%v\n%v", parts, parts.Parts)
// parts.Parts是一个空切片
for _, part := range parts.Parts {
log.Printf("copying %v part %v", source, part.PartNumber)
resp, err := svc.UploadPartCopy(
&s3.UploadPartCopyInput{
UploadId: multiPartUpload.UploadId,
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
PartNumber: part.PartNumber,
},
)
if err != nil {
panic(errors.Wrap(err, "error copying object"))
}
log.Printf("copied: %v", resp)
}
return true
},
)
if err != nil {
panic(errors.Wrap(err, "something went wrong with ListPartsPages!"))
}
我做错了什么,还是我理解错了什么?
英文:
I need to rename a quite a bunch of objects in AWS S3. For small objects the following snippet works flawlessly:
input := &s3.CopyObjectInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
}
_, err = svc.CopyObject(input)
if err != nil {
panic(errors.Wrap(err, "error copying object"))
}
I am running into the S3 size limitation for larger objects. I understand I need to copy the object using a multi part upload. This is what I tried so far:
multiPartUpload, err := svc.CreateMultipartUpload(
&s3.CreateMultipartUploadInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix), // targetPrefix is the new name
},
)
if err != nil {
panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
resp, err := svc.UploadPartCopy(
&s3.UploadPartCopyInput{
UploadId: multiPartUpload.UploadId,
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
PartNumber: aws.Int64(1),
},
)
if err != nil {
panic(errors.Wrap(err, "error copying multipart object"))
}
log.Printf("copied: %v", resp)
The golang SDK bails out on me with:
InvalidRequest: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
I have also tried the following approach but I do not get any parts listed here:
multiPartUpload, err := svc.CreateMultipartUpload(
&s3.CreateMultipartUploadInput{
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix), // targetPrefix is the new name
},
)
if err != nil {
panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
err = svc.ListPartsPages(
&s3.ListPartsInput{
Bucket: aws.String(bucket), // Required
Key: obj.Key, // Required
UploadId: multiPartUpload.UploadId, // Required
},
// Iterate over all parts in the `CopySource` object
func(parts *s3.ListPartsOutput, lastPage bool) bool {
log.Printf("parts:\n%v\n%v", parts, parts.Parts)
// parts.Parts is an empty slice
for _, part := range parts.Parts {
log.Printf("copying %v part %v", source, part.PartNumber)
resp, err := svc.UploadPartCopy(
&s3.UploadPartCopyInput{
UploadId: multiPartUpload.UploadId,
Bucket: aws.String(bucket),
Key: aws.String(targetPrefix),
CopySource: aws.String(source),
PartNumber: part.PartNumber,
},
)
if err != nil {
panic(errors.Wrap(err, "error copying object"))
}
log.Printf("copied: %v", resp)
}
return true
},
)
if err != nil {
panic(errors.Wrap(err, "something went wrong with ListPartsPages!"))
}
What am I doing wrong or am I missunderstanding something?
答案1
得分: 2
我认为ListPartsPages
是错误的方向,因为它适用于“多部分上传”,这是一个与S3“对象”不同的实体。所以你正在列出刚刚创建的多部分上传中已经上传的部分。
你的第一个示例接近所需的内容,但你需要手动将原始文件分割成多个部分,每个部分的范围由UploadPartCopyInput
的CopySourceRange
指定。至少这是我从阅读文档中得出的结论。
英文:
I think that ListPartsPages
is the wrong direction because it works on "Multipart Uploads" which is a different entity than an an s3 "Object". So you're listing the already-uploaded parts to the multipart upload you just created.
Your first example is close to what's needed, but you need to manually split the original file into parts, with the range of each part specified by UploadPartCopyInput
's CopySourceRange
. At least that's my take from reading the documentation.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论