python athena calculate total number of output rows
Question
I need to get the total number of output rows returned by an Athena query.
status = 'RUNNING'
while status in ['QUEUED', 'RUNNING']:
    response_get_query_details = athena.get_query_execution(
        QueryExecutionId=query_execution_id
    )
    status = (
        response_get_query_details.get("QueryExecution", {})
        .get("Status", {})
        .get("State", "NA")
    )
    if status in ("FAILED", "CANCELLED", "NA"):
        raise Exception(f"Athena Query Failed: {failure_reason}")
    elif status == 'SUCCEEDED':
        query_stats = response_get_query_details['QueryExecution']['Statistics']
        total_rows = query_stats['OutputRows']  # 'OutputRows' is not available
        return total_rows
The statistics contain only DataScannedInBytes, with no row count:
{'EngineExecutionTimeInMillis': 9799, 'DataScannedInBytes': 1090182, 'TotalExecutionTimeInMillis': 9991, 'QueryQueueTimeInMillis': 164, 'QueryPlanningTimeInMillis': 8860, 'ServiceProcessingTimeInMillis': 28}
Is there a way to calculate the total number of rows from this?
Answer 1
Score: 1
The get_query_runtime_statistics() API call returns the number of rows returned by the query:
{
    'QueryRuntimeStatistics': {
        'Timeline': {
            'QueryQueueTimeInMillis': 123,
            'QueryPlanningTimeInMillis': 123,
            'EngineExecutionTimeInMillis': 123,
            'ServiceProcessingTimeInMillis': 123,
            'TotalExecutionTimeInMillis': 123
        },
        'Rows': {
            'InputRows': 123,
            'InputBytes': 123,
            'OutputBytes': 123,
            'OutputRows': 123  # <-- here
        },
        ...
    }
}
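To tie this back to the polling loop in the question, here is a minimal sketch of how the two calls might be combined. The function name wait_for_output_rows and the poll_seconds / max_polls parameters are illustrative assumptions, not part of boto3; only get_query_execution and get_query_runtime_statistics are real client methods. As far as I know, the non-timeline statistics can lag behind query completion, so the sketch returns None when the 'Rows' block is missing rather than failing.

```python
import time


def wait_for_output_rows(athena, query_execution_id, poll_seconds=1.0, max_polls=60):
    """Poll until the query finishes, then return OutputRows (or None if absent).

    `athena` is assumed to be a boto3 Athena client, e.g. boto3.client("athena").
    """
    for _ in range(max_polls):
        execution = athena.get_query_execution(QueryExecutionId=query_execution_id)
        status = execution.get("QueryExecution", {}).get("Status", {})
        state = status.get("State", "NA")
        if state not in ("QUEUED", "RUNNING"):
            break
        time.sleep(poll_seconds)
    else:
        # The loop ran out of attempts without reaching a terminal state.
        raise TimeoutError(f"Query {query_execution_id} did not finish in time")

    if state != "SUCCEEDED":
        reason = status.get("StateChangeReason", "unknown")
        raise RuntimeError(f"Athena query ended in state {state}: {reason}")

    # Row counts live in the runtime statistics, not in get_query_execution.
    stats = athena.get_query_runtime_statistics(QueryExecutionId=query_execution_id)
    rows = stats.get("QueryRuntimeStatistics", {}).get("Rows", {})
    return rows.get("OutputRows")
```

With a real client this would be called as wait_for_output_rows(boto3.client("athena"), query_execution_id). If it returns None right after the query succeeds, retrying the statistics call after a short delay may help.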