Reading last modified file name from Amazon S3 bucket in Go
Question
I need to read the file names in an Amazon S3 bucket using Go.
The bucket contains CSV files, mainly in two name formats:
1. uploaded/2022-03-21-18:31:06.608058.csv
2. overwritten/2022-03-22-18:31:06.608058.csv
I need to find the name of the most recently modified file with the uploaded prefix. (The bucket in question contains thousands of files.)
Any help is much appreciated.
Answer 1
Score: 1
As already mentioned in the comments:
You have to list all keys that start with the given prefix, then sort them in your code and pick the wanted object key.
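...
// Assumes the enclosing function returns (string, error).
objs := []types.Object{}
params := &s3.ListObjectsV2Input{
    Bucket: aws.String(s.bucket),
    Prefix: aws.String(prefix), // e.g. "uploaded/"
}
// Each page holds at most 1000 keys, so collect every page.
p := s3.NewListObjectsV2Paginator(svc, params)
for p.HasMorePages() {
    out, err := p.NextPage(ctx)
    if err != nil {
        return "", err
    }
    objs = append(objs, out.Contents...)
}
if l := len(objs); l > 0 {
    // Sort ascending by LastModified; the newest object ends up last.
    sort.Slice(objs, func(a, b int) bool {
        return objs[a].LastModified.Before(*objs[b].LastModified)
    })
    // Key is a *string, so dereference it safely.
    return aws.ToString(objs[l-1].Key), nil
}
return "", nil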
If you are trying to find the object key with the most recent date-time in the key's name (rather than by LastModified), then you can use the StartAfter parameter to avoid listing all available keys.
...
// Assumes the enclosing function returns (string, error).
lastModifiedKey := ""
params := &s3.ListObjectsV2Input{
    Bucket:     aws.String(s.bucket),
    Prefix:     aws.String(prefix), // e.g. "uploaded/"
    StartAfter: aws.String(previousSearchKey),
}
p := s3.NewListObjectsV2Paginator(svc, params)
for p.HasMorePages() {
    out, err := p.NextPage(ctx)
    if err != nil {
        return "", err
    }
    // ListObjectsV2 returns keys in ascending order and your name format is sortable,
    // so the last key of the last non-empty page is the most recent one.
    if n := len(out.Contents); n > 0 {
        lastModifiedKey = aws.ToString(out.Contents[n-1].Key)
    }
}
return lastModifiedKey, nil
Note that the StartAfter key does not need to be an existing key in the bucket.
It may be the result of a previous search, or a guessed value that you can assert sorts before the wanted result (and preferably close to it).