
Get the size of a folder in Amazon S3 using Go SDK 2

Question


I know there are no folders in Amazon S3, but we can emulate them by using "/" in the key name.
Given that, is it possible to calculate the size of a folder using the AWS SDK for Go v2? Or do I have to retrieve all the objects in the folder and add up their sizes one by one?

Answer 1

Score: 5


Given that example and the Object type documentation, it is possible to compute the total size of the objects stored in a bucket:

    package main

    import (
    	"context"
    	"flag"
    	"fmt"
    	"log"

    	"github.com/aws/aws-sdk-go-v2/aws"
    	"github.com/aws/aws-sdk-go-v2/config"
    	"github.com/aws/aws-sdk-go-v2/service/s3"
    )

    var (
    	bucketName      string
    	objectPrefix    string
    	objectDelimiter string
    	maxKeys         int
    )

    func init() {
    	flag.StringVar(&bucketName, "bucket", "", "The `name` of the S3 bucket to list objects from.")
    	flag.StringVar(&objectPrefix, "prefix", "", "The optional `object prefix` of the S3 Object keys to list.")
    	flag.StringVar(&objectDelimiter, "delimiter", "",
    		"The optional `object key delimiter` used by S3 List objects to group object keys.")
    	flag.IntVar(&maxKeys, "max-keys", 0,
    		"The maximum number of `keys per page` to retrieve at once.")
    }

    // Lists all objects in a bucket using pagination and adds up their sizes.
    func main() {
    	flag.Parse()
    	if len(bucketName) == 0 {
    		flag.PrintDefaults()
    		log.Fatalf("invalid parameters, bucket name required")
    	}

    	// Load the SDK's configuration from environment and shared config, and
    	// create the client with this.
    	cfg, err := config.LoadDefaultConfig(context.TODO())
    	if err != nil {
    		log.Fatalf("failed to load SDK configuration, %v", err)
    	}
    	client := s3.NewFromConfig(cfg)

    	// Set the parameters based on the CLI flag inputs.
    	params := &s3.ListObjectsV2Input{
    		Bucket: &bucketName,
    	}
    	if len(objectPrefix) != 0 {
    		params.Prefix = &objectPrefix
    	}
    	// With a delimiter, keys grouped under CommonPrefixes are not returned
    	// in Contents, so leave it empty to count everything under the prefix.
    	if len(objectDelimiter) != 0 {
    		params.Delimiter = &objectDelimiter
    	}

    	// Create the Paginator for the ListObjectsV2 operation.
    	p := s3.NewListObjectsV2Paginator(client, params, func(o *s3.ListObjectsV2PaginatorOptions) {
    		if v := int32(maxKeys); v != 0 {
    			o.Limit = v
    		}
    	})

    	// Iterate through the S3 object pages, adding up the size of each
    	// object returned.
    	var i int
    	var total int64
    	for p.HasMorePages() {
    		i++
    		// NextPage takes a new context for each page retrieval. This is where
    		// you could add timeouts or deadlines.
    		page, err := p.NextPage(context.TODO())
    		if err != nil {
    			log.Fatalf("failed to get page %v, %v", i, err)
    		}
    		for _, obj := range page.Contents {
    			// fmt.Println("Object:", *obj.Key)
    			// In current releases of the SDK, Size is a *int64 and
    			// aws.ToInt64 treats nil as 0. In older releases Size was a
    			// plain int64 and could be added directly (total += obj.Size).
    			total += aws.ToInt64(obj.Size)
    		}
    	}
    	fmt.Println("total", total)
    }

Then, if I am reading the s3.ListObjectsV2Input documentation correctly, you can set the Prefix member of the s3.ListObjectsV2Input instance to select a specific folder. The example already does this when you pass the flag -prefix=...
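One detail to watch when choosing the prefix: S3 compares it against the key as a plain string, so a prefix without a trailing "/" also matches sibling keys that merely start with the same characters. The sketch below illustrates this locally, without calling S3. The `object` type and the `folderPrefix` and `sizeUnder` helpers are hypothetical names made up for this illustration, not part of the SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// object is a hypothetical stand-in for one entry of a listing page
// (a key plus its size in bytes). It is not an SDK type.
type object struct {
	Key  string
	Size int64
}

// folderPrefix turns a folder name into a prefix that matches only keys
// inside that folder. The empty string is returned unchanged and means
// "the whole bucket".
func folderPrefix(folder string) string {
	if folder == "" || strings.HasSuffix(folder, "/") {
		return folder
	}
	return folder + "/"
}

// sizeUnder sums the sizes of the objects whose key starts with prefix,
// mimicking locally what the Prefix parameter selects on the server.
func sizeUnder(objects []object, prefix string) int64 {
	var total int64
	for _, o := range objects {
		if strings.HasPrefix(o.Key, prefix) {
			total += o.Size
		}
	}
	return total
}

func main() {
	// Made-up sample keys and sizes.
	objects := []object{
		{"photos/2021/a.jpg", 100},
		{"photos/b.jpg", 50},
		{"photos-old/c.jpg", 7},
	}
	fmt.Println(sizeUnder(objects, "photos"))               // 157: also counts photos-old/
	fmt.Println(sizeUnder(objects, folderPrefix("photos"))) // 150: only keys under photos/
}
```

With the program from the answer, this amounts to passing -prefix=photos/ rather than -prefix=photos.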

Answer 2

Score: 1


I'm not sure it is the easiest way, but you can iterate over the list of objects you are interested in (https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) and aggregate the sizes locally.
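As a rough sketch of what "aggregating locally" could look like once the keys and sizes have been collected from the listing, the following groups the sizes by the first path segment below a prefix. It uses only the standard library; the `object` type and the `sizeByFolder` function are hypothetical names for this illustration, and the sample data is made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// object is a hypothetical stand-in for one listed object
// (key plus size in bytes). It is not an SDK type.
type object struct {
	Key  string
	Size int64
}

// sizeByFolder groups object sizes by the first path segment below prefix.
// Objects that sit directly under prefix are accounted to ".".
func sizeByFolder(objects []object, prefix string) map[string]int64 {
	totals := make(map[string]int64)
	for _, o := range objects {
		if !strings.HasPrefix(o.Key, prefix) {
			continue // outside the folder of interest
		}
		rest := strings.TrimPrefix(o.Key, prefix)
		folder := "."
		if i := strings.Index(rest, "/"); i >= 0 {
			folder = rest[:i+1] // keep the trailing "/" to mark a folder
		}
		totals[folder] += o.Size
	}
	return totals
}

func main() {
	// Made-up sample keys and sizes.
	objects := []object{
		{"photos/2021/a.jpg", 100},
		{"photos/2021/b.jpg", 20},
		{"photos/2022/c.jpg", 5},
		{"photos/readme.txt", 1},
		{"docs/spec.pdf", 900},
	}
	totals := sizeByFolder(objects, "photos/")

	// Sort the folder names so the output order is stable.
	names := make([]string, 0, len(totals))
	for name := range totals {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-8s %d bytes\n", name, totals[name])
	}
}
```

Summing the values of the returned map gives the size of the whole folder, which is the figure the question asks for.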

Answer 3

Score: 1


Another way:

  • Enable Amazon S3 Storage Lens with advanced metrics and prefix aggregation for your bucket, and export the metrics as CSV to a bucket.
  • Read the CSV data from the exported file in that bucket.

Note: the metrics are exported every 24 hours.

huangapple
  • Posted on September 19, 2021 at 16:21:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/69241548.html