Delete objects in S3 using wildcard matching
Question
I have the following working code to delete an object from Amazon S3:
params := &s3.DeleteObjectInput{
    Bucket: aws.String("Bucketname"),
    Key:    aws.String("ObjectKey"),
}
// DeleteObject (singular) is the call that takes a *s3.DeleteObjectInput.
s3Conn.DeleteObject(params)
But what I want to do is delete all files under a folder using the wildcard *. I know Amazon S3 doesn't treat "x/y/file.jpg" as a file in a folder y inside x, but what I want to achieve is to specify "x/y*" and delete all objects that share that prefix. I tried the Amazon multi-object delete:
params := &s3.DeleteObjectsInput{
    Bucket: aws.String("BucketName"),
    Delete: &s3.Delete{
        Objects: []*s3.ObjectIdentifier{
            {
                Key: aws.String("x/y/.*"),
            },
        },
    },
}
result, err := s3Conn.DeleteObjects(params)
I know that in PHP this can be done easily with s3->delete_all_objects, as per this answer. Is the same possible in Go?
Answer 1
Score: 3
Unfortunately, the goamz package doesn't have a method similar to the PHP library's delete_all_objects.
However, the source code for the PHP delete_all_objects is available here (toggle source view): http://docs.aws.amazon.com/AWSSDKforPHP/latest/#m=AmazonS3/delete_all_objects
Here are the important lines of code:
public function delete_all_objects($bucket, $pcre = self::PCRE_ALL)
{
    // Collect all matches
    $list = $this->get_object_list($bucket, array('pcre' => $pcre));
    // As long as we have at least one match...
    if (count($list) > 0)
    {
        $objects = array();
        foreach ($list as $object)
        {
            $objects[] = array('key' => $object);
        }
        $batch = new CFBatchRequest();
        $batch->use_credentials($this->credentials);
        foreach (array_chunk($objects, 1000) as $object_set)
        {
            $this->batch($batch)->delete_objects($bucket, array(
                'objects' => $object_set
            ));
        }
        $responses = $this->batch($batch)->send();
    }
}
As you can see, the PHP code first makes an HTTP request against the bucket to get all files matching PCRE_ALL, which is defined elsewhere as const PCRE_ALL = '/.*/i';.
You can only delete 1000 files at once, so delete_all_objects then splits the matches into batches and deletes 1000 files at a time.
You have to build the same functionality in your Go program, as the goamz package doesn't support this yet. Luckily it should only take a few lines of code, and you have the PHP library as a guide.
It might be worth submitting a pull request to the goamz package once you're done!
Answer 2
Score: 1
Using the mc tool you can do:
mc rm -r --force https://BucketName.s3.amazonaws.com/x/y
This deletes all objects with the prefix "x/y".
You can achieve the same with Go using minio-go like this:
package main

import (
    "log"

    "github.com/minio/minio-go"
)

func main() {
    // Note: minio.Config belongs to an early minio-go API; newer releases construct the client differently.
    config := minio.Config{
        AccessKeyID:     "YOUR-ACCESS-KEY-HERE",
        SecretAccessKey: "YOUR-PASSWORD-HERE",
        Endpoint:        "https://s3.amazonaws.com",
    }
    // Find your S3 endpoint here: http://docs.aws.amazon.com/general/latest/gr/rande.html
    s3Client, err := minio.New(config)
    if err != nil {
        log.Fatalln(err)
    }
    isRecursive := true
    for object := range s3Client.ListObjects("BucketName", "x/y", isRecursive) {
        if object.Err != nil {
            log.Fatalln(object.Err)
        }
        err := s3Client.RemoveObject("BucketName", object.Key)
        if err != nil {
            // Log and move on to the next object; log.Fatalln here would exit before the continue.
            log.Println(err)
            continue
        }
        log.Println("Removed : " + object.Key)
    }
}
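One thing to keep in mind with prefix-based deletion: an S3 prefix is a plain string comparison against the start of the key, not a folder lookup. The small sketch below (standard library only; matchPrefix is a hypothetical helper, not a minio-go function) shows that the prefix "x/y" also selects keys under "x/yz/", while "x/y/" with a trailing slash does not.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPrefix returns the keys that start with prefix, which is how
// a prefix-based listing selects objects.
func matchPrefix(keys []string, prefix string) []string {
	var out []string
	for _, k := range keys {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{"x/y/a.jpg", "x/y/sub/b.jpg", "x/yz/c.jpg", "x/z/d.jpg"}
	fmt.Println(matchPrefix(keys, "x/y"))  // also picks up x/yz/c.jpg
	fmt.Println(matchPrefix(keys, "x/y/")) // only keys under x/y/
}
```

If the intent is "everything inside folder y", passing the prefix with a trailing slash is usually the safer choice.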
Answer 3
Score: 1
Since this question was asked, the AWS SDK for Go has gained some new methods in its S3 manager (s3manager) to handle this task (in response to @Itachi's PR).
See the GitHub record: https://github.com/aws/aws-sdk-go/issues/448#issuecomment-309078450
Here is their example in v1: https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/go/s3/DeleteObjects/DeleteObjects.go#L36
To get "wildcard matching" on paths inside the bucket, add the Prefix param to the example's ListObjectsInput call, as shown here:
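// svc is the S3 client and bucket a *string, as set up in the linked example.
iter := s3manager.NewDeleteListIterator(svc, &s3.ListObjectsInput{
    Bucket: bucket,
    Prefix: aws.String("somePathString"),
})
// The iterator is then consumed by a batch deleter.
err := s3manager.NewBatchDeleteWithClient(svc).Delete(aws.BackgroundContext(), iter)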
Answer 4
Score: 0
A bit late to the game, but since I had the same problem, I created a small package that you can copy into your code base and import as needed.
import (
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3iface"
)

// ListKeysInPrefix returns the keys under prefix (a single page of results).
func ListKeysInPrefix(s s3iface.S3API, bucket, prefix string) ([]string, error) {
    res, err := s.ListObjectsV2(&s3.ListObjectsV2Input{
        Bucket: aws.String(bucket),
        Prefix: aws.String(prefix),
    })
    if err != nil {
        return []string{}, err
    }
    var keys []string
    for _, obj := range res.Contents {
        keys = append(keys, *obj.Key)
    }
    return keys, nil
}

// createDeleteObjectsInput wraps the keys in the structure DeleteObjects expects.
func createDeleteObjectsInput(keys []string) *s3.Delete {
    rm := []*s3.ObjectIdentifier{}
    for _, key := range keys {
        rm = append(rm, &s3.ObjectIdentifier{Key: aws.String(key)})
    }
    return &s3.Delete{Objects: rm, Quiet: aws.Bool(false)}
}

// DeletePrefix deletes the objects listed under prefix.
func DeletePrefix(s s3iface.S3API, bucket, prefix string) error {
    keys, err := ListKeysInPrefix(s, bucket, prefix)
    if err != nil {
        return err
    }
    if len(keys) == 0 {
        return nil // nothing to delete
    }
    _, err = s.DeleteObjects(&s3.DeleteObjectsInput{
        Bucket: aws.String(bucket),
        Delete: createDeleteObjectsInput(keys),
    })
    return err
}
So, if you have a bucket called "somebucket" containing s3://somebucket/foo/some-prefixed-folder/bar/test.txt and want to delete everything from some-prefixed-folder onwards, usage would be as follows. Note that the prefix is matched from the start of the key, so it has to include the leading "foo/":
func main() {
    // create your s3 client here
    // client := ....
    err := DeletePrefix(client, "somebucket", "foo/some-prefixed-folder/")
    if err != nil {
        panic(err)
    }
}
Because ListObjectsV2 returns at most 1000 keys per call, this implementation deletes at most 1000 entries under the given prefix. The listing is paginated, though, so it is a matter of repeating the list-and-delete step with the continuation token until the response is no longer truncated.
Answer 5
Score: 0
I was able to delete objects in an S3 bucket using a wildcard from the CLI:
aws s3 rm s3://<xyz bucket name>/2023/ --recursive --exclude '*' --include 'A*.csv'
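For anyone wanting a similar include filter in Go, one possible approach is to list keys under a prefix and match the remainder of each key against a wildcard pattern. The sketch below is only an approximation of that idea using the standard library: wildcardToRegexp and selectKeys are hypothetical helpers, and the CLI's exact matching rules are not reproduced here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wildcardToRegexp converts a shell-style pattern into an anchored regular
// expression: '*' matches any run of characters and '?' matches one character.
func wildcardToRegexp(pattern string) *regexp.Regexp {
	quoted := regexp.QuoteMeta(pattern)
	quoted = strings.ReplaceAll(quoted, `\*`, `.*`)
	quoted = strings.ReplaceAll(quoted, `\?`, `.`)
	return regexp.MustCompile(`(?s)^` + quoted + `$`)
}

// selectKeys returns the keys under prefix whose remainder matches pattern.
func selectKeys(keys []string, prefix, pattern string) []string {
	re := wildcardToRegexp(pattern)
	var out []string
	for _, k := range keys {
		if !strings.HasPrefix(k, prefix) {
			continue
		}
		if re.MatchString(strings.TrimPrefix(k, prefix)) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{
		"2023/A1.csv",
		"2023/A-report.csv",
		"2023/B1.csv",
		"2023/A1.txt",
		"2022/A1.csv",
	}
	// Select keys under 2023/ whose remainder matches A*.csv.
	fmt.Println(selectKeys(keys, "2023/", "A*.csv"))
}
```

The selected keys could then be handed to whatever batch delete call your S3 client provides.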