How to handle a dynamic file naming convention while copying an AWS S3 file using boto3 (Python)

Question

I am new to the AWS world and started exploring it recently.
After running an Athena query, I am trying to copy the query result file it generates to another S3 location.
The problem I am running into is this:
I am building file_name dynamically, using the query ID that Athena generated and appending the .csv file extension.
This raises the exception:

botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the CopyObject operation: The specified key does not exist.

If I hardcode the file name, e.g. file_name = '30795514-8b0b-4b17-8764-495b25d74100.csv' (inside single quotes ''), my code works fine and the copy completes.
Please help me understand how to build the source and destination file names dynamically.

import boto3

# Session and clients built with explicit credentials
session = boto3.Session(
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY,
    region_name=AWS_REGION
)
s3 = session.client('s3')
athena_client = boto3.client(
    "athena",
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY,
    region_name=AWS_REGION
)

def main():
    query = "select * from test_table"
    # Start the Athena query; results are written to RESULT_OUTPUT_LOCATION
    response = athena_client.start_query_execution(
        QueryString=query,
        ResultConfiguration={"OutputLocation": RESULT_OUTPUT_LOCATION}
    )
    queryId = response['QueryExecutionId']
    src_bucket = 'smg-datalake-prod-athena-query-results'
    dst_bucket = 'smg-datalake-prod-athena-query-results'
    # Athena names the result file <QueryExecutionId>.csv
    file_name = queryId + ".csv"
    copy_object(src_bucket, dst_bucket, file_name)

def copy_object(src_bucket, dst_bucket, file_name):
    src_key = f'python-athena/{file_name}'
    dst_key = 'python-athena/cosmo/rss/v2/newsletter/kloka_latest.csv'
    # Copy the object to the destination bucket/key
    s3.copy_object(Bucket=dst_bucket, CopySource={'Bucket': src_bucket, 'Key': src_key}, Key=dst_key)

Note: the variables AWS_ACCESS_KEY, AWS_SECRET_KEY, AWS_REGION and RESULT_OUTPUT_LOCATION used in the code need to be set according to your AWS configuration.
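For completeness, a minimal placeholder configuration is sketched below; every value is illustrative (the bucket and prefix are inferred from the question's code, the credentials are dummies you must replace).

# Illustrative placeholders only -- substitute your own credentials, region and output prefix.
AWS_ACCESS_KEY = "YOUR_ACCESS_KEY_ID"
AWS_SECRET_KEY = "YOUR_SECRET_ACCESS_KEY"
AWS_REGION = "eu-west-1"
RESULT_OUTPUT_LOCATION = "s3://smg-datalake-prod-athena-query-results/python-athena/"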


Answer 1

Score: 1

After executing the Athena query, I just added a short sleep before trying to move the file to the other location, and it started to work.
The query was completing so fast that my code was trying to copy the result file before it was actually present in the query results bucket.
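As a rough sketch of that idea (not part of the original answer): instead of a fixed sleep, the copy can wait until Athena reports the query as finished by polling get_query_execution. The helper name, timeout and poll interval below are illustrative only, and it assumes the athena_client defined in the question's code.

import time

def wait_for_query(query_id, timeout_seconds=60, poll_seconds=2):
    # Poll Athena until the query reaches a terminal state, rather than guessing a sleep duration.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        response = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = response['QueryExecution']['Status']['State']
        if state == 'SUCCEEDED':
            return  # the <QueryExecutionId>.csv object should now exist in the results bucket
        if state in ('FAILED', 'CANCELLED'):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Athena query {query_id} did not finish within {timeout_seconds} seconds")

With that helper, main() would call wait_for_query(queryId) right before copy_object(src_bucket, dst_bucket, file_name), so CopyObject only runs once the result CSV has been written.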

huangapple
  • Posted on 2023-02-16 13:43:35
  • Please keep this link when reposting: https://go.coder-hub.com/75468267.html