How to copy a file from an S3 bucket to a folder in the same S3 bucket?

Question

I'm trying to perform a simple operation within a Python Lambda that just copies a file, on upload, to a specific folder within that same S3 bucket, but I'm running into issues. If I hardcode the file name, this works, but if I try to pass in the src_key of the file, then I get an infinite loop on the folder I'm wanting to transfer the file(s) to.

Here is my Lambda code for this thus far.

import os
import logging
import boto3


DST_BUCKET = os.environ.get('DST_BUCKET')
REGION = os.environ.get('REGION')

# LOGGER is used in the handler below but was never defined in the original snippet
LOGGER = logging.getLogger()
LOGGER.setLevel(logging.INFO)

s3 = boto3.resource('s3', region_name=REGION)

def handler(event, context):
    LOGGER.info(f'Event structure: {event}')
    LOGGER.info(f'DST_BUCKET: {DST_BUCKET}')

    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        src_key = record['s3']['object']['key']

        # Copy the uploaded object to the destination bucket under the same key
        copy_source = {
            'Bucket': src_bucket,
            'Key': src_key
        }
        bucket = s3.Bucket(DST_BUCKET)
        bucket.copy(copy_source, src_key)

        try:
            #Check file
            s3.Object(DST_BUCKET, src_key).load()
            LOGGER.info(f"File {src_key} uploaded to Bucket {DST_BUCKET}")

            # had to hardcode bucket name here
            src_bucket2 = s3.Bucket('send-bucket01')

            # this is key bit with src_key (NOTE: This is pre-created folder in s3 bucket)
            src_bucket2.copy(copy_source, 'fileProcessed/' + src_key)

        except Exception as e:
            return {"error": str(e)}

    return {'status': 'ok'}

The key line is....

src_bucket2.copy(copy_source, 'fileProcessed/' + src_key)

I was thinking this would work, but this just continually copies the folder name and file over and over again into the fileProcessed folder.

Again, if I hardcode the actual file this works (e.g., 'fileProcessed/myfile.pdf'); however, I want to make this dynamic so that any uploaded file is put in the fileProcessed folder.

Note: I've got a requirement to copy these files from the root path of the S3 bucket, not from a pre-created folder for which I could simply set a folder prefix in the event configuration.
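
For reference, here is a minimal sketch of the ObjectCreated notification shape the handler iterates over. The field names follow the documented S3 event structure, but the bucket and key values below are illustrative only:

# Hypothetical S3 ObjectCreated notification, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "send-bucket01"},           # record['s3']['bucket']['name']
                "object": {"key": "fileProcessed/myfile.pdf"}  # record['s3']['object']['key']
            }
        }
    ]
}

# On a second invocation the key already carries the fileProcessed/ prefix,
# so 'fileProcessed/' + src_key nests the folder name one level deeper each time.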

Answer 1

Score: 2

It would appear that your AWS Lambda function is configured to be triggered by the creation of an object in Amazon S3. Therefore, the reason it "continually copies the folder name and file over and over again" is that the copied object is itself triggering the Lambda function again.

To prevent this happening, ensure that the Event configuration in the S3 bucket only triggers on a specific path (folder) rather than triggering on the bucket as a whole. For example, you might have an incoming/ path.

Then, when the Lambda function copies it to the fileProcessed/ path, it will not trigger the event again (because the event only triggers on the incoming/ path).
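
As a rough sketch of how that prefix-scoped trigger could be set up programmatically (the bucket name, Lambda ARN, and function name below are placeholders, and the same filter can also be configured in the console), boto3's put_bucket_notification_configuration accepts a key-prefix filter:

import boto3

s3_client = boto3.client('s3')

# Sketch only: bucket name and Lambda ARN are placeholders.
s3_client.put_bucket_notification_configuration(
    Bucket='send-bucket01',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:copy-handler',
                'Events': ['s3:ObjectCreated:*'],
                # Only keys under incoming/ invoke the function, so writes
                # to fileProcessed/ do not re-trigger it.
                'Filter': {
                    'Key': {
                        'FilterRules': [
                            {'Name': 'prefix', 'Value': 'incoming/'}
                        ]
                    }
                }
            }
        ]
    }
)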

Answer 2

Score: -1

Props to @AnonCoward in the comments above for answering my use case. If anyone else is figuring this out, the following worked for me:

src_bucket2.copy(copy_source, 'fileProcessed/' + src_key.split('/')[-1])

As stated above, I've got a requirement to copy these files from the root path of the source S3 bucket, not from a pre-created folder. If you can copy files from a given folder, then just set the folder path in the event configuration of the S3 bucket, which is simple.

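Putting the accepted fix together with the re-trigger concern from Answer 1, a minimal sketch of the handler's copy step might look like the following. The startswith guard and the PROCESSED_PREFIX constant are my own additions, assumed to be useful only when the trigger really does have to stay on the whole bucket:

import boto3

s3 = boto3.resource('s3')
PROCESSED_PREFIX = 'fileProcessed/'

def handler(event, context):
    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        src_key = record['s3']['object']['key']

        # Guard: ignore objects already under the processed prefix so a
        # bucket-wide trigger cannot loop on its own output.
        if src_key.startswith(PROCESSED_PREFIX):
            continue

        # Keep only the file name, dropping any leading folder path.
        filename = src_key.split('/')[-1]

        copy_source = {'Bucket': src_bucket, 'Key': src_key}
        s3.Bucket(src_bucket).copy(copy_source, PROCESSED_PREFIX + filename)

    return {'status': 'ok'}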
