PERCENTILE_CONT with aggregate functions

Question

I'm using SQL Server 2012 and trying to calculate some aggregate functions and percentiles on a data set with two columns (id and time). In my research I found some solutions, but they don't work for me (maybe because of my outdated SQL Server version).

| id | time_seconds |
| --:| ------------:|
| 1 | 120 |
| 2 | 10 |
| 3 | 50 |
| 4 | 80 |
| 5 | 60 |
| 6 | 42 |
| 7 | 96 |

I tried something like this:

    SELECT
        MIN(time_seconds) AS min,
        MAX(time_seconds) AS max,
        PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY time_seconds) OVER () AS '95 percentil'
    from table

If I run this query, SQL Server asks me to include time_seconds in the GROUP BY clause, but then the aggregate functions no longer work.

The output I'd like to get is something like this:

| min | max | 95 percentil |
| ---:| ---:| ------------:|
| 10 | 120 | 110 |

Thanks for your help!

Answer 1

Score: 3

Unfortunately, in SQL Server `PERCENTILE_CONT` works only as a window function, not as an aggregate function. This is a (rare) case where `select distinct` comes in handy:

    select distinct
        min(time_seconds) over () as min_time_seconds,
        max(time_seconds) over () as max_time_seconds,
        percentile_cont(0.95) within group (order by time_seconds) over () as percentil_95
    from mytable

Basically, this approaches the problem from the opposite direction: since we can't compute the percentile as an aggregate, we turn the other aggregate functions into window functions. The window computations yield the same values on every row, so we can then deduplicate the result set with `distinct`.

| min_time_seconds | max_time_seconds | percentil_95 |
| ----------------:| ----------------:| ------------:|
| 10 | 120 | 112.8 |

[fiddle](https://dbfiddle.uk/z_Yz7qLR) - this is SQL Server 2014, since db<>fiddle does not support 2012.
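
The 112.8 in the result above can be checked by hand. `PERCENTILE_CONT` interpolates linearly between the two ordered values that surround the zero-based fractional position p * (n - 1); with seven rows and p = 0.95 that position is 5.7, which falls between 96 and 120. The Python sketch below reproduces that arithmetic. The function name `percentile_cont` is only an illustrative helper here, not a library call.

```python
import math


def percentile_cont(values, p):
    """Linear-interpolation percentile over the sorted values (illustrative helper)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Zero-based fractional position of the requested percentile.
    pos = p * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    # Interpolate between the two neighbouring values.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Sample data from the question.
times = [120, 10, 50, 80, 60, 42, 96]
p95 = percentile_cont(times, 0.95)
print(min(times), max(times), round(p95, 1))
```

This also shows why the 110 guessed in the question is slightly off: 96 + 0.7 * (120 - 96) = 112.8.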

Answer 2

Score: 0

Try this:

    ;WITH cte AS (
        -- Number the rows by time_seconds and attach the total row count
        SELECT
            CAST(COUNT(*) OVER () AS float) AS total_rows,
            ROW_NUMBER() OVER (ORDER BY time_seconds) AS row_number,
            time_seconds
        FROM (
            SELECT 1 AS id, 120 AS time_seconds UNION ALL
            SELECT 2 AS id, 10 AS time_seconds UNION ALL
            SELECT 3 AS id, 50 AS time_seconds UNION ALL
            SELECT 4 AS id, 80 AS time_seconds UNION ALL
            SELECT 5 AS id, 60 AS time_seconds UNION ALL
            SELECT 6 AS id, 42 AS time_seconds UNION ALL
            SELECT 7 AS id, 96 AS time_seconds
        ) AS YourTable
    ),
    percentiles AS (
        -- Percentile rank (0 to 100) of each row and of the row that follows it
        SELECT
            time_seconds,
            100.0 * ((row_number - 1) / (total_rows - 1)) AS percentile,
            LEAD(time_seconds) OVER (ORDER BY row_number) AS next_time_seconds,
            100.0 * (row_number / (total_rows - 1)) AS next_percentile
        FROM cte
    )
    SELECT
        MIN(time_seconds) AS min_time_seconds,
        MAX(time_seconds) AS max_time_seconds,
        (
            -- Linear interpolation between the two rows that bracket the 95th percentile
            SELECT
                time_seconds + (next_time_seconds - time_seconds) * (95 - percentile) / (next_percentile - percentile)
            FROM percentiles
            WHERE percentile <= 95 AND next_percentile >= 95
        ) AS percentile_95
    FROM cte;
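
As a rough cross-check of the query above, the same steps (rank each row on a 0 to 100 scale, find the rows that bracket 95, interpolate) can be sketched in Python. The helper names `bracketing_pairs` and `interpolate` are hypothetical, and the sketch only looks at adjacent pairs of rows rather than reproducing the CTEs exactly.

```python
def bracketing_pairs(values, target):
    """Adjacent (value, next_value, pct, next_pct) pairs whose ranks bracket target."""
    ordered = sorted(values)
    n = len(ordered)
    pairs = []
    for i in range(n - 1):
        # Same scale as the query: first row is 0, last row is 100.
        pct = 100.0 * (i / (n - 1))
        next_pct = 100.0 * ((i + 1) / (n - 1))
        if pct <= target <= next_pct:
            pairs.append((ordered[i], ordered[i + 1], pct, next_pct))
    return pairs


def interpolate(pair, target):
    """Linear interpolation inside one bracketing pair."""
    value, next_value, pct, next_pct = pair
    return value + (next_value - value) * (target - pct) / (next_pct - pct)


# Sample data from the question.
times = [120, 10, 50, 80, 60, 42, 96]
pairs_95 = bracketing_pairs(times, 95)
print(pairs_95)
print(interpolate(pairs_95[0], 95))
# A target that lands exactly on a row is bracketed by two pairs.
print(len(bracketing_pairs(times, 50)))
```

For the sample data a single pair (96, 120) brackets 95, giving about 112.8. When the target lands exactly on a row (50 here), two pairs qualify. In the SQL version the `WHERE` clause would then match two rows, and a scalar subquery that returns more than one row fails in SQL Server, so that case probably needs a tie-break such as `TOP (1)`.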

huangapple
  • Posted on June 6, 2023, 15:48:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76412460.html