识别冗余的订单 by’s

huangapple go评论88阅读模式
英文:

Identifying redundant order by's

问题

在我的代码中,我已经注释掉了我认为是多余的排序子句的行。我已经检查了结果(行值),在有注释和没有注释的情况下,结果是相同的。我只是想知道是否存在第二个和第四个CTE中我已经注释掉的两个"order by"子句不是真正多余的情况。

  1. speed_dataset as (
  2. select uc_id, imei, points_geom, time_created, st_distance((points_geom::geography),lag(points_geom::geography) over (partition by imei order by time_created))
  3. / nullif(( EXTRACT(EPOCH FROM time_created) - EXTRACT(EPOCH FROM LAG(time_created) OVER(PARTITION BY imei ORDER BY time_created)))::FLOAT8,0) as speed
  4. from orig_dataset
  5. order by imei,time_created
  6. )
  7. ,
  8. subset_speed as (
  9. select uc_id, ROW_NUMBER() OVER (ORDER BY (time_created)) AS row_id, speed, imei,points_geom ,time_created
  10. from speed_dataset sd
  11. where speed < 0.1 or speed between 0.75 and 2
  12. --order by time_created
  13. )
  14. ,
  15. leading_speeds as (
  16. select *,lead (speed) over (partition by imei order by time_created) as lead_speed from subset_speed
  17. )
  18. ,
  19. subset_cr as (
  20. select * from leading_speeds
  21. where
  22. (
  23. (speed < 0.1 and lead_speed between 0.75 and 2)
  24. or
  25. (speed between 0.75 and 2 and lead_speed < 0.1)
  26. )
  27. --order by imei,time_created
  28. )
  29. ,
  30. clustering as(
  31. SELECT uc_id,row_id,imei, speed, points_geom ,time_created, ST_ClusterDBSCAN(st_transform(points_geom,24313),eps := 150, minPoints := 3)
  32. OVER(ORDER BY row_id) AS cluster_id FROM subset_cr
  33. )

希望这有所帮助。

英文:

In my code below, I have commented out the lines which I believe are redundant ordering clauses. I have checked the results (row values) with and without commenting, and the results are the same. I was just wondering if there is ANY scenario where the two order by's that I have commented out in the second and fourth CTE are not really redundant.

  1. speed_dataset as (
  2. select uc_id, imei, points_geom, time_created, st_distance((points_geom::geography),lag(points_geom::geography) over (partition by imei order by time_created))
  3. / nullif(( EXTRACT(EPOCH FROM time_created) - EXTRACT(EPOCH FROM LAG(time_created) OVER(PARTITION BY imei ORDER BY time_created)))::FLOAT8,0) as speed
  4. from orig_dataset
  5. order by imei,time_created
  6. )
  7. ,
  8. subset_speed as (
  9. select uc_id, ROW_NUMBER() OVER (ORDER BY (time_created)) AS row_id, speed, imei,points_geom ,time_created
  10. from speed_dataset sd
  11. where speed &lt; 0.1 or speed between 0.75 and 2
  12. --order by time_created
  13. )
  14. ,
  15. leading_speeds as (
  16. select *,lead (speed) over (partition by imei order by time_created) as lead_speed from subset_speed
  17. )
  18. ,
  19. subset_cr as (
  20. select * from leading_speeds
  21. where
  22. (
  23. (speed &lt; 0.1 and lead_speed between 0.75 and 2)
  24. or
  25. (speed between 0.75 and 2 and lead_speed &lt; 0.1)
  26. )
  27. --order by imei,time_created
  28. )
  29. ,
  30. clustering as(
  31. SELECT uc_id,row_id,imei, speed, points_geom ,time_created, ST_ClusterDBSCAN(st_transform(points_geom,24313),eps := 150, minPoints := 3)
  32. OVER(ORDER BY row_id) AS cluster_id FROM subset_cr
  33. )

答案1

得分: 1

你的直觉是正确的。通常情况下,除非你将其与 DISTINCT ON (..)LIMIT/FETCH FIRST ... ROWS ONLY 结合使用,否则在 CTE 或视图定义中 永远不应该 使用 ORDER BY

英文:

Your intuition is right. As a rule, you should never have an ORDER BY in a CTE or a view definition unless you use it in conjunction with DISTINCT ON (..) or LIMIT/FETCH FIRST ... ROWS ONLY.

huangapple
  • 本文由 发表于 2023年3月12日 18:43:48
  • 转载请务必保留本文链接:https://go.coder-hub.com/75712563.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定