group by more than one column
Question
I want to select three columns from a table and group the results by two of them. For example, given columns x, y, z, I want results like (x1, y1): [z1, z9, z11, ...], (x2, y2): [z3, z12, z33, ...], ...
I tried the following (Athena) query:
SELECT region, family, tenancy, platform, size_factor, duration
FROM default.sell_durations
WHERE CAST(creation_date as timestamp) BETWEEN CAST(? as timestamp) AND CAST(? as timestamp)
group by region, family, tenancy, platform, size_factor
and I got the following error:
> Failed to Execute Athena query, status: FAILED, reason: SYNTAX_ERROR:
> line 1:56: 'duration' must be an aggregate expression or appear in
> GROUP BY clause
Answer 1
Score: 1
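Every selected column must either appear in the GROUP BY clause or be wrapped in an aggregate function; duration is neither, hence the error. Use an aggregate function. Based on the description, array_agg
seems to be the appropriate one here: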
SELECT region, family, tenancy, platform, size_factor, array_agg(duration)
FROM default.sell_durations
WHERE CAST(creation_date as timestamp) BETWEEN CAST(? as timestamp) AND CAST(? as timestamp)
group by region, family, tenancy, platform, size_factor
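To make the expected result shape concrete, here is a minimal, self-contained sketch of the same grouping idea. It is not Athena code: it assumes an in-memory SQLite table with hypothetical columns x, y, z and made-up rows, and it uses SQLite's group_concat as a stand-in for array_agg, which SQLite does not provide. The helper name group_z_by_xy is also just illustrative.

```python
import sqlite3


def group_z_by_xy(rows):
    """Return {(x, y): [z, ...]} by grouping on x and y in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (x TEXT, y TEXT, z INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        # Every selected column is either a GROUP BY key (x, y) or wrapped in
        # an aggregate (z). group_concat stands in for Athena's array_agg.
        cur = conn.execute("SELECT x, y, group_concat(z) FROM t GROUP BY x, y")
        # group_concat returns comma-separated text; convert it back to a
        # sorted list of integers so the result does not depend on row order.
        return {(x, y): sorted(int(v) for v in zs.split(",")) for x, y, zs in cur}
    finally:
        conn.close()


# Hypothetical sample rows, not taken from the sell_durations table.
sample = [("x1", "y1", 1), ("x1", "y1", 9), ("x2", "y2", 3), ("x1", "y1", 11)]
result = group_z_by_xy(sample)
print(result)
```

Each (x, y) pair maps to the list of its z values, which is the same shape that array_agg(duration) produces per group in the Athena query above.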