How to copy file from s3 bucket to a folder in same s3 bucket?
Question
I'm trying a simple operation within a Python Lambda that just copies a file, on upload, to a specific folder within that same S3 bucket, but I'm running into issues. If I hardcode the file this works, but if I try to pass in the src_key of the file, then I get an infinite loop on the folder I'm wanting to transfer the file(s) to.
Here is my Lambda code for this thus far.
import os
import logging

import boto3

DST_BUCKET = os.environ.get('DST_BUCKET')
REGION = os.environ.get('REGION')

LOGGER = logging.getLogger()
LOGGER.setLevel(logging.INFO)

s3 = boto3.resource('s3', region_name=REGION)

def handler(event, context):
    LOGGER.info(f'Event structure: {event}')
    LOGGER.info(f'DST_BUCKET: {DST_BUCKET}')

    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        src_key = record['s3']['object']['key']

        copy_source = {
            'Bucket': src_bucket,
            'Key': src_key
        }
        bucket = s3.Bucket(DST_BUCKET)
        bucket.copy(copy_source, src_key)

        try:
            # Check that the file landed in the destination bucket
            s3.Object(DST_BUCKET, src_key).load()
            LOGGER.info(f"File {src_key} uploaded to Bucket {DST_BUCKET}")

            # had to hardcode the bucket name here
            src_bucket2 = s3.Bucket('send-bucket01')

            # this is the key bit with src_key (NOTE: fileProcessed/ is a pre-created folder in the s3 bucket)
            src_bucket2.copy(copy_source, 'fileProcessed/' + src_key)
        except Exception as e:
            return {"error": str(e)}

    return {'status': 'ok'}
The key line is....
src_bucket2.copy(copy_source, 'fileProcessed/' + src_key)
I was thinking this would work, but this just continually copies the folder name and file over and over again in the fileProcessed folder.
Again, if I hardcode the actual file this works (e.g., 'fileProcessed/myfile.pdf'); however, I want to make this dynamic so that any uploaded file is put in the fileProcessed folder.
Note: I've got a requirement to copy these files from the root path of the S3 bucket, not from a pre-created folder, for which I could have set the folder prefix in the copy event configuration.
Answer 1
Score: 2
It would appear that your AWS Lambda function is configured to be triggered by the creation of an object in Amazon S3. Therefore, the reason why it "continually copies the folder name and file over and over again" is because the copied object is again triggering the Lambda function.
To prevent this happening, ensure that the Event configuration in the S3 bucket only triggers on a specific path (folder) rather than triggering on the bucket as a whole. For example, you might have an incoming/ path.
Then, when the Lambda function copies it to the fileProcessed/ path, it will not trigger the event again (because the event only triggers on the incoming/ path).
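For reference, here is a minimal sketch of how such a prefix-scoped trigger could be set up programmatically with boto3. The bucket name matches the one in the question, but the Lambda ARN and the incoming/ prefix are illustrative assumptions; the same filter can also be configured in the S3 console or in CloudFormation.

import boto3

s3_client = boto3.client('s3')

# Illustrative values: the Lambda ARN below is a placeholder, not a real function.
BUCKET = 'send-bucket01'
LAMBDA_ARN = 'arn:aws:lambda:us-east-1:123456789012:function:copy-to-processed'

# Only fire the Lambda for objects created under the incoming/ prefix,
# so copies written to fileProcessed/ do not re-trigger the function.
s3_client.put_bucket_notification_configuration(
    Bucket=BUCKET,
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': LAMBDA_ARN,
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {
                    'Key': {
                        'FilterRules': [
                            {'Name': 'prefix', 'Value': 'incoming/'}
                        ]
                    }
                }
            }
        ]
    }
)

Note that S3 also needs permission to invoke the function (e.g. via lambda add-permission) before this notification configuration takes effect.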
Answer 2
Score: -1
Props to @AnonCoward in the comments above for answering my use case. If anyone else is figuring this out, the following worked for me:
src_bucket2.copy(copy_source, 'fileProcessed/' + src_key.split('/')[-1])
As stated above, I've got a requirement to copy these files from the root path of the source S3 bucket, not from a pre-created folder. If you can copy files from a given folder, then just set the folder path in the event configuration of the S3 bucket, which is simple.
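For completeness, a minimal sketch of what the handler's copy step might look like with that fix. The prefix guard and the URL-decoding of the event key are additions for illustration (useful if the trigger is still bucket-wide), not part of the original code.

import os
from urllib.parse import unquote_plus

import boto3

s3 = boto3.resource('s3', region_name=os.environ.get('REGION'))

def handler(event, context):
    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        # S3 event keys are URL-encoded (spaces arrive as '+'),
        # so decode before using the value as a real object key.
        src_key = unquote_plus(record['s3']['object']['key'])

        # Skip objects already in the destination folder to avoid
        # re-processing them if the trigger fires for the whole bucket.
        if src_key.startswith('fileProcessed/'):
            continue

        copy_source = {'Bucket': src_bucket, 'Key': src_key}

        # Keep only the file name so the destination key does not
        # nest the source path under fileProcessed/ again.
        dst_key = 'fileProcessed/' + src_key.split('/')[-1]
        s3.Bucket(src_bucket).copy(copy_source, dst_key)

    return {'status': 'ok'}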