Get scanned data with boto3 on Athena

Question

I use Boto3 to perform Athena queries.
My code looks like this:

import time

import boto3

athena_client = boto3.client('athena')

# start the query 
query_execution = athena_client.start_query_execution(
    QueryString=sql_query,
    ResultConfiguration={ 'OutputLocation': 's3://my_path'}
)

# Get the id of the query
query_execution_id = query_execution['QueryExecutionId']

query_status = None
while query_status != 'SUCCEEDED':
    time.sleep(1)
    query_status = athena_client.get_query_execution(QueryExecutionId=query_execution_id)['QueryExecution']['Status']['State']
    if query_status not in ['QUEUED', 'RUNNING', 'SUCCEEDED']:
        raise Exception(f"""
        Athena query with the query execution ID {query_execution_id} failed or was cancelled.
        status: {query_status}
        """)

query_result = athena_client.get_query_results(QueryExecutionId=query_execution_id)

When I run a query in the Athena query editor in the AWS console, I get metadata about the query I performed. I would like to get the "Data scanned" field.

When I look at the response I get (the variable query_result in my code), I have a field called ResponseMetadata but it does not contain the scanned data value. Is there a way to get it with boto3?

Answer 1

Score: 1

The Amazon Athena get_query_runtime_statistics() command:

> Returns query execution runtime statistics related to a single execution of a query if you have access to the workgroup in which the query ran.

There is a field called InputBytes, which is defined as:

> The number of bytes read to execute the query.
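As a minimal sketch of how this could be wired into the code from the question: the helper names `input_bytes` and `data_scanned_bytes` below are hypothetical, and the second helper rests on the assumption that the `Statistics.DataScannedInBytes` field returned by `get_query_execution()` is the value closest to the console's "Data scanned" figure. Both helpers take the client as a parameter and return `None` when the field is missing, which may be the case if the query has not finished yet.

```python
def input_bytes(athena_client, query_execution_id):
    """Bytes read by the query, from get_query_runtime_statistics().

    Hypothetical helper; returns None if the field is not present.
    """
    response = athena_client.get_query_runtime_statistics(
        QueryExecutionId=query_execution_id
    )
    rows = response.get("QueryRuntimeStatistics", {}).get("Rows", {})
    return rows.get("InputBytes")


def data_scanned_bytes(athena_client, query_execution_id):
    """Bytes scanned, from the Statistics block of get_query_execution().

    Hypothetical helper; returns None if the field is not present.
    """
    response = athena_client.get_query_execution(
        QueryExecutionId=query_execution_id
    )
    statistics = response.get("QueryExecution", {}).get("Statistics", {})
    return statistics.get("DataScannedInBytes")


# Usage, once the polling loop from the question has seen 'SUCCEEDED':
#   scanned = data_scanned_bytes(athena_client, query_execution_id)
#   print(f"Data scanned: {scanned} bytes")
```

Since the question's polling loop already calls `get_query_execution()`, reading the statistics from that same response would avoid an extra API call; `get_query_runtime_statistics()` is the option to reach for when the per-stage breakdown is also of interest.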

huangapple
  • Posted on March 3, 2023, 23:29:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75628999.html