Get the size of a folder in Amazon S3 using Go SDK 2

Question

I know Amazon S3 has no real folders, but we can emulate them by using "/" in the key name.
Given that, is it possible to calculate the size of a folder with the AWS SDK for Go v2? Or do I have to retrieve all the objects in the folder and add up their sizes one by one?

Answer 1

Score: 5

Starting from the SDK's ListObjectsV2 pagination example and the documentation of the Object type, it is possible to compute the total size of the objects stored in a bucket:

package main

import (
	"context"
	"flag"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

var (
	bucketName      string
	objectPrefix    string
	objectDelimiter string
	maxKeys         int
)

func init() {
	flag.StringVar(&bucketName, "bucket", "", "The `name` of the S3 bucket to list objects from.")
	flag.StringVar(&objectPrefix, "prefix", "", "The optional `object prefix` of the S3 Object keys to list.")
	flag.StringVar(&objectDelimiter, "delimiter", "",
		"The optional `object key delimiter` used by S3 List objects to group object keys.")
	flag.IntVar(&maxKeys, "max-keys", 0,
		"The maximum number of `keys per page` to retrieve at once.")
}

// Sums the sizes of all objects in a bucket (or under a prefix) using pagination.
func main() {
	flag.Parse()
	if len(bucketName) == 0 {
		flag.PrintDefaults()
		log.Fatalf("invalid parameters, bucket name required")
	}

	// Load the SDK's configuration from environment and shared config, and
	// create the client with this.
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		log.Fatalf("failed to load SDK configuration, %v", err)
	}

	client := s3.NewFromConfig(cfg)

	// Set the parameters based on the CLI flag inputs.
	params := &s3.ListObjectsV2Input{
		Bucket: &bucketName,
	}
	if len(objectPrefix) != 0 {
		params.Prefix = &objectPrefix
	}
	// Note: with a delimiter set, keys in nested "subfolders" are rolled up
	// into CommonPrefixes instead of Contents, so they are not counted below.
	if len(objectDelimiter) != 0 {
		params.Delimiter = &objectDelimiter
	}

	// Create the Paginator for the ListObjectsV2 operation.
	p := s3.NewListObjectsV2Paginator(client, params, func(o *s3.ListObjectsV2PaginatorOptions) {
		if v := int32(maxKeys); v != 0 {
			o.Limit = v
		}
	})

	// Iterate through the S3 object pages, adding up the size of each object returned.
	var i int
	var total int64
	log.Println("Objects:")
	for p.HasMorePages() {
		i++

		// NextPage takes a new context for each page retrieval. This is where
		// you could add timeouts or deadlines.
		page, err := p.NextPage(context.TODO())
		if err != nil {
			log.Fatalf("failed to get page %v, %v", i, err)
		}

		for _, obj := range page.Contents {
			// fmt.Println("Object:", aws.ToString(obj.Key))
			// Size is a *int64 in recent releases of the SDK; on older
			// releases where it is a plain int64, use obj.Size directly.
			total += aws.ToInt64(obj.Size)
		}
	}
	fmt.Println("total", total)
}

Then, if I read the s3.ListObjectsV2Input documentation correctly, you can set the Prefix member of the s3.ListObjectsV2Input instance to select a specific folder. The example already does this when you pass in the flag -prefix=...
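
One detail worth watching when using a prefix as a "folder": S3 matches prefixes as plain strings, so a prefix of photos also matches keys such as photos-old/a.jpg. A minimal sketch of a helper that appends the trailing "/" is shown below; folderPrefix is a hypothetical name and is not part of the AWS SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// folderPrefix turns a folder name into a key prefix that only matches keys
// inside that folder. Hypothetical helper, not part of the AWS SDK.
func folderPrefix(folder string) string {
	if folder == "" {
		return "" // an empty prefix selects the whole bucket
	}
	if !strings.HasSuffix(folder, "/") {
		folder += "/"
	}
	return folder
}

func main() {
	for _, f := range []string{"photos", "photos/", "photos/2021", ""} {
		fmt.Printf("%q -> %q\n", f, folderPrefix(f))
	}
}
```

The result can be passed as the value of the -prefix flag, or assigned to the Prefix member of the input struct.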

Answer 2

Score: 1

I'm not sure this is the easiest way, but you can iterate over the list of objects you are interested in (https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) and aggregate the sizes locally.

Answer 3

Score: 1

Another way:

  • Enable S3 Storage Lens advanced metrics with prefix aggregation for your bucket, and export the metrics as CSV to a bucket
  • Read the CSV data from the exported file in that bucket

Note: the metrics are exported every 24 hours.

Posted by huangapple on September 19, 2021, 16:21:29