英文:
order of execution confusion
问题
The AVG(Amount) calculation occurs in the SELECT clause, not the GROUP BY.
Regarding your questions:
-
The
HAVING SUM(Amount)condition is applied after theGROUP BYandAVG(Amount)calculation in the result set. It filters the grouped results based on the sum ofAmount, so it operates on the aggregated data after theGROUP BYclause. -
The
HAVINGclause does come after theSELECTclause in terms of the order of the SQL statement, but it operates on the result set after theGROUP BYandSELECTclauses have been applied. Therefore, it can filter usingAVG(Amount)because that calculation has already been performed in theSELECTclause.
英文:
SELECT First_name,last_name,email,ROUND (AVG(Amount),2) AS moneyspent
FROM Customer INNER JOIN Payment
ON Customer.customer_id= Payment.customer_id
WHERE Staff_id=2
GROUP BY First_name,last_name,email
HAVING SUM(Amount)>100
ORDER BY Moneyspent DESC
| first_name | last_name | moneyspent | |
|---|---|---|---|
| Brittany | Riley | brittany.riley@sakilacustomer.org | 5.83 |
| Arnold | Havens | arnold.havens@sakilacustomer.org | 5.14 |
| Eleanor | Hunt | eleanor.hunt@sakilacustomer.org | 5.04 |
| Stacey | Montgomery | stacey.montgomery@sakilacustomer.org | 4.73 |
| Karl | Seal | karl.seal@sakilacustomer.org | 4.63 |
I am confused about the Order of execution. I understand we cannot filter based on an aggregated result like AVG(Amount) in WHERE because that hasn't been executed yet. Does AVG(Amount) happen at the GROUP BY or SELECT?
-
If it happens at
GROUP BY, there should be three columnsFirst_name,Last_name,emailandAVG(Amount)left. How canHAVING SUM(AMOUNT)still be executing? -
If it happens at
SELECT, then how come we can still filter usingAVG(Amount)atHAVING, asHAVINGcomes beforeSELECT?
答案1
得分: 1
在处理您的 GROUP BY 时,查询执行引擎还将计算每个 first_name, last_name, email 组合的 SUM(Amount)(以及 AVG(Amount)),因为这些被您的 SELECT 和 HAVING 所引用。某些RDBMS不允许HAVING引用SELECT子句中尚未存在的表达式,但我想PostgreSQL允许这样做。对于您的用途,您应该假设这些聚合在GROUP BY时计算(HAVING之后应用;但这实际上是一种'逻辑'顺序,因为如果引擎推断出这样做更便宜并且不会改变结果,它可以提前应用其中的部分)。
英文:
When processing your GROUP BY the query execution engine will also evaluate SUM(Amount) (as well as AVG(Amount)) for each first_name, last_name, email combination; because these are referred to by your SELECT and HAVING, respectively. Some RDBMSs don't allow HAVING referring to an expression not already on the SELECT clause, but I guess PostgreSql is doing us a favour and allowing it. For your purposes you should assume that those aggregations are calculated at the GROUP BY time (HAVING is applied afterwards; but this is really a 'logical ' order, because the engine can apply parts of it earlier if it deduces it would be cheaper and that it wouldn't change results).
答案2
得分: 1
GROUP BY 和 HAVING 子句可以被视为在考虑结果时在 SELECT 子句之前进行评估,但仍然可以在 SELECT 列定义中使用的任何聚合函数都可以访问基础的非聚合数据。
思考的顺序如下:
JOIN和WHERE子句GROUP BY;请注意,您还可以按输出查询中的列进行分组,例如GROUP BY 1, 2,但这是一种语法糖HAVINGSELECT
可以看到实际计算结果列发生在最后,因为如果您在 SELECT 中使用窗口函数来访问(例如)前一行,您只能访问已包含在 GROUP BY 中或使用聚合函数的数据,而被 HAVING 或 WHERE 子句移除的数据将被过滤掉。
英文:
The GROUP BY and HAVING clauses can be thought of as evaluated before the SELECT clauses for when considering what the result would be, but any aggregate functions used in SELECT column definitions still have access to the underlying disaggregated data.
The order to think about it is:
JOINs andWHEREclauseGROUP BY; note that however, you can also group by columns in the output query, e.g.GROUP BY 1, 2but this is syntatic sugarHAVINGSELECT
You can see that the actual calculation of the resulting columns occurs last because if you use a window function as part of the SELECT to access (for example) the previous row you can only access data that you have either included in GROUP BY or that uses an aggregate function, and data removed by the HAVING or WHERE clause is filtered out.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论