Reading last modified file name from Amazon S3 bucket in GO

Question

I need to read the filenames in an Amazon S3 bucket using Go.
The bucket mainly contains CSV files with two types of name formats.

1. uploaded/2022-03-21-18:31:06.608058.csv
2. overwritten/2022-03-22-18:31:06.608058.csv

I need to find the name of the most recently modified file with the uploaded prefix. (The bucket in question contains thousands of files.)

Any help is much appreciated.

Answer 1

Score: 1

As already mentioned in the comments:

You have to list all keys that start with the given prefix, then sort them in your code and pick the wanted object key.

    ...

    objs := []types.Object{}
    params := &s3.ListObjectsV2Input{
        Bucket: aws.String(s.bucket),
        Prefix: aws.String(prefix), // uploaded
    }
    p := s3.NewListObjectsV2Paginator(svc, params)

    // Collect every object under the prefix, one page at a time.
    for p.HasMorePages() {
        out, err := p.NextPage(ctx)
        if err != nil {
            return "", err
        }
        objs = append(objs, out.Contents...)
    }

    if l := len(objs); l > 0 {
        // Sort ascending by LastModified so the newest object ends up last.
        sort.Slice(objs, func(a, b int) bool {
            return objs[a].LastModified.Before(*objs[b].LastModified)
        })

        // Key is a *string in the SDK; aws.ToString dereferences it safely.
        return aws.ToString(objs[l-1].Key), nil
    }

    return "", nil

If you are trying to find the object key with the most recent DateTime (within the key's name), then you can use the StartAfter parameter to avoid listing all available keys.

    ...

    lastModifiedKey := ""

    params := &s3.ListObjectsV2Input{
        Bucket:     aws.String(s.bucket),
        Prefix:     aws.String(prefix), // uploaded
        StartAfter: aws.String(previousSearchKey),
    }
    p := s3.NewListObjectsV2Paginator(svc, params)

    for p.HasMorePages() {
        out, err := p.NextPage(ctx)
        if err != nil {
            return "", err
        }
        // ListObjectsV2 returns keys in ascending order and your name format is sortable,
        // so the last key of the last non-empty page is the most recent one.
        if n := len(out.Contents); n > 0 {
            lastModifiedKey = aws.ToString(out.Contents[n-1].Key)
        }
    }

    return lastModifiedKey, nil
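
The approach above relies on the key names being sortable: because the timestamp in each name is fixed-width and ordered from year down to microseconds, plain string order matches chronological order. A small sketch to illustrate this, using the key format from the question (`latestByName` is an illustrative helper, not an SDK function):

```go
package main

import (
	"fmt"
	"sort"
)

// latestByName returns the lexicographically greatest key, which is the
// newest one when every name embeds a fixed-width timestamp.
// It returns "" for an empty slice.
func latestByName(keys []string) string {
	if len(keys) == 0 {
		return ""
	}
	sorted := append([]string(nil), keys...) // copy, leave the input untouched
	sort.Strings(sorted)
	return sorted[len(sorted)-1]
}

func main() {
	keys := []string{
		"uploaded/2022-03-21-18:31:06.608058.csv",
		"uploaded/2022-03-22-09:00:00.000001.csv",
		"uploaded/2022-03-21-23:59:59.999999.csv",
	}
	fmt.Println(latestByName(keys))
}
```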

Note that the StartAfter key does not have to be an existing key in the bucket.

It may be the result of a previous search, or a guessed value that you know sorts before the wanted result (and preferably close to it).

huangapple
  • Posted on March 22, 2022, 02:57:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/71562766.html