How can I do "double-aggregation" in T-SQL / SQL Server?

Question

I have a table which contains Issues and SubIssues in a customer service system. The data might look like this:

id | ticket_id | sub_ticket_id | response_time_in_mins | comment | cond1 | cond2 | cond3
1 | 1000 | NULL | NULL | "the overall ticket about how quickly you closed down my account" | 1 | 0 | 1
2 | 1000 | 1 | 12 | "send ack email" | 1 | 0 | 1
3 | 1000 | 2 | 30 | "look up user in all DBs and remove" | 1 | 1 | 1
4 | 1000 | 3 | 5 | "send finished email" | 1 | 0 | 0
5 | 1001 | NULL | NULL | "the overall ticket about do you have my email in your sys?" | 1 | 0 | 1
6 | 1001 | 1 | 2 | "send ack email" | 0 | 0 | 1
7 | 1001 | 2 | 10 | "look up and notify" | 1 | 1 | 0
...

I'd like to total the response times for each ticket_id, separately for each condition (counting only the rows where that condition is 1). For example, the resulting table might look like this (but assume N conditions):

ticket_id | total_resp_time_cond_1 | total_resp_time_cond_2 | total_resp_time_cond_3
1000 | 47 | 30 | 42
1001 | 10 | 10 | 2
...

Here it is in pseudo code if I could do Python+SQL:

foreach ticket_number in issues_and_subissues:
  foreach condition in cond1, cond2, cond3, ... condN:
    SELECT sum(response_time) FROM issues_and_subissues isst WHERE isst.ticket_id=ticket_number AND condition=1

Is there a T-SQL-only way to achieve this via JOINs, or would I have to use cursors? Are there any other options?

I searched on Stack Overflow and beyond, and all I could find was the cursor way of doing it. Thank you!


Answer 1

Score: 3

We can sum the condition flag multiplied by the response time. This answer assumes that the condition flags would only ever be 0 or 1. If not, then this answer would have to change.

SELECT
    ticket_id,
    -- Each flag is 0 or 1, so flag * time either keeps or zeroes the row's time.
    SUM(cond1 * response_time_in_mins) AS total_resp_time_cond_1,
    SUM(cond2 * response_time_in_mins) AS total_resp_time_cond_2,
    SUM(cond3 * response_time_in_mins) AS total_resp_time_cond_3
FROM yourTable
GROUP BY ticket_id
ORDER BY ticket_id;
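
To try the multiplication approach end to end, here is a minimal sketch that loads the question's sample rows into a table variable and runs the same query against it. The name `@issues` and the `TINYINT` type for the flags are assumptions made for illustration (the question does not give the column types), and the `comment` column is left out for brevity. If the flags are actually `BIT` columns, SQL Server does not allow multiplying them directly, so they would need a cast first, as noted in the final comment.

```sql
-- Assumed schema: flag columns are TINYINT holding only 0 or 1.
DECLARE @issues TABLE (
    id                    INT PRIMARY KEY,
    ticket_id             INT NOT NULL,
    sub_ticket_id         INT NULL,
    response_time_in_mins INT NULL,
    cond1                 TINYINT NOT NULL,
    cond2                 TINYINT NOT NULL,
    cond3                 TINYINT NOT NULL
);

-- Sample rows copied from the question (comment column omitted).
INSERT INTO @issues
    (id, ticket_id, sub_ticket_id, response_time_in_mins, cond1, cond2, cond3)
VALUES
    (1, 1000, NULL, NULL, 1, 0, 1),
    (2, 1000, 1,    12,   1, 0, 1),
    (3, 1000, 2,    30,   1, 1, 1),
    (4, 1000, 3,    5,    1, 0, 0),
    (5, 1001, NULL, NULL, 1, 0, 1),
    (6, 1001, 1,    2,    0, 0, 1),
    (7, 1001, 2,    10,   1, 1, 0);

-- One pass over the table, one SUM per condition.
SELECT
    ticket_id,
    SUM(cond1 * response_time_in_mins) AS total_resp_time_cond_1,
    SUM(cond2 * response_time_in_mins) AS total_resp_time_cond_2,
    SUM(cond3 * response_time_in_mins) AS total_resp_time_cond_3
FROM @issues
GROUP BY ticket_id
ORDER BY ticket_id;

-- With BIT flags, cast before multiplying, e.g.:
--   SUM(CAST(cond1 AS INT) * response_time_in_mins)
```

On the sample rows this should give 47, 30, 42 for ticket 1000 and 10, 10, 2 for ticket 1001, matching the table in the question. The parent rows have a NULL response time, so their products are NULL and `SUM` skips them.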


Answer 2

Score: 2

It's called conditional aggregation. You do it by putting a CASE expression inside an aggregate function:

SELECT ticket_id,
    SUM(CASE WHEN cond1 = 1 THEN response_time_in_mins ELSE 0 END) AS total_resp_time_cond_1,
    SUM(CASE WHEN cond2 = 1 THEN response_time_in_mins ELSE 0 END) AS total_resp_time_cond_2,
    SUM(CASE WHEN cond3 = 1 THEN response_time_in_mins ELSE 0 END) AS total_resp_time_cond_3
FROM [table]
GROUP BY ticket_id
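
As a runnable illustration of conditional aggregation, the sketch below loads the question's sample rows into a table variable and shows the `CASE` form next to the equivalent `IIF` shorthand (available from SQL Server 2012). The name `@issues` and the `BIT` type for the flags are assumptions for the sake of the example, and the `comment` column is omitted.

```sql
-- Assumed schema: flag columns are BIT; comparison with 1 works without a cast.
DECLARE @issues TABLE (
    id                    INT PRIMARY KEY,
    ticket_id             INT NOT NULL,
    sub_ticket_id         INT NULL,
    response_time_in_mins INT NULL,
    cond1                 BIT NOT NULL,
    cond2                 BIT NOT NULL,
    cond3                 BIT NOT NULL
);

-- Sample rows copied from the question (comment column omitted).
INSERT INTO @issues
    (id, ticket_id, sub_ticket_id, response_time_in_mins, cond1, cond2, cond3)
VALUES
    (1, 1000, NULL, NULL, 1, 0, 1),
    (2, 1000, 1,    12,   1, 0, 1),
    (3, 1000, 2,    30,   1, 1, 1),
    (4, 1000, 3,    5,    1, 0, 0),
    (5, 1001, NULL, NULL, 1, 0, 1),
    (6, 1001, 1,    2,    0, 0, 1),
    (7, 1001, 2,    10,   1, 1, 0);

SELECT
    ticket_id,
    -- CASE form: rows that fail the test contribute 0.
    SUM(CASE WHEN cond1 = 1 THEN response_time_in_mins ELSE 0 END) AS total_resp_time_cond_1,
    -- IIF is shorthand for the same two-branch CASE.
    SUM(IIF(cond2 = 1, response_time_in_mins, 0)) AS total_resp_time_cond_2,
    SUM(IIF(cond3 = 1, response_time_in_mins, 0)) AS total_resp_time_cond_3
FROM @issues
GROUP BY ticket_id
ORDER BY ticket_id;
```

On the sample rows this should again give 47, 30, 42 for ticket 1000 and 10, 10, 2 for ticket 1001. Keeping `ELSE 0` means a ticket with no rows satisfying a condition reports 0; dropping it would make that total NULL instead.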

Answer 3

Score: 0

If I understand you correctly, you want to create "conditions" out of thin air. This is usually how I do it:

SELECT
    ticket_id,
    SUM(CASE WHEN cond = 1 THEN response_time_in_mins END) AS total_resp_time_cond_1,
    SUM(CASE WHEN cond = 2 THEN response_time_in_mins END) AS total_resp_time_cond_2,
    SUM(CASE WHEN cond = 3 THEN response_time_in_mins END) AS total_resp_time_cond_3
FROM yourTable t
-- somefield / anotherfield are placeholders for whatever defines each condition.
CROSS APPLY (
    SELECT 1 AS cond
    WHERE t.somefield = 1
      AND t.anotherfield = 2
    UNION ALL
    SELECT 2 AS cond
    WHERE t.somefield = 2
      AND t.anotherfield = 3
    UNION ALL
    SELECT 3 AS cond
    WHERE t.somefield = 3
      AND t.anotherfield = 4
) c
GROUP BY ticket_id
ORDER BY ticket_id;

Alternative:

SELECT
    ticket_id,
    SUM(CASE WHEN t.somefield = 1 AND t.anotherfield = 2
             THEN response_time_in_mins END) AS total_resp_time_cond_1,
    SUM(CASE WHEN t.somefield = 2 AND t.anotherfield = 3
             THEN response_time_in_mins END) AS total_resp_time_cond_2,
    SUM(CASE WHEN t.somefield = 3 AND t.anotherfield = 4
             THEN response_time_in_mins END) AS total_resp_time_cond_3
FROM yourTable t
GROUP BY ticket_id
ORDER BY ticket_id;

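Both approaches have their pros and cons, depending on how you want to aggregate the data. Sometimes it is useful to produce a separate row for each condition, in which case the CROSS APPLY version can be a good fit.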
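
To make the "one row per condition" idea concrete, here is a minimal sketch that applies it to the flag columns from the question rather than to placeholder fields. It swaps the `UNION ALL` subquery for a `CROSS APPLY (VALUES ...)` row constructor, which is a variation on the answer's technique, not the answer's exact query. The name `@issues`, the `TINYINT` flag type, and the labels `cond_name` and `flag` are assumptions for illustration.

```sql
-- Assumed schema: flag columns are TINYINT holding only 0 or 1.
DECLARE @issues TABLE (
    id                    INT PRIMARY KEY,
    ticket_id             INT NOT NULL,
    sub_ticket_id         INT NULL,
    response_time_in_mins INT NULL,
    cond1                 TINYINT NOT NULL,
    cond2                 TINYINT NOT NULL,
    cond3                 TINYINT NOT NULL
);

-- Sample rows copied from the question (comment column omitted).
INSERT INTO @issues
    (id, ticket_id, sub_ticket_id, response_time_in_mins, cond1, cond2, cond3)
VALUES
    (1, 1000, NULL, NULL, 1, 0, 1),
    (2, 1000, 1,    12,   1, 0, 1),
    (3, 1000, 2,    30,   1, 1, 1),
    (4, 1000, 3,    5,    1, 0, 0),
    (5, 1001, NULL, NULL, 1, 0, 1),
    (6, 1001, 1,    2,    0, 0, 1),
    (7, 1001, 2,    10,   1, 1, 0);

-- Turn each source row into one row per condition, keep only the
-- conditions that are set, then total per (ticket, condition).
SELECT
    t.ticket_id,
    c.cond_name,
    SUM(t.response_time_in_mins) AS total_resp_time
FROM @issues AS t
CROSS APPLY (VALUES
    ('cond1', t.cond1),
    ('cond2', t.cond2),
    ('cond3', t.cond3)
) AS c (cond_name, flag)
WHERE c.flag = 1
GROUP BY t.ticket_id, c.cond_name
ORDER BY t.ticket_id, c.cond_name;
```

Adding an Nth condition here means adding one line to the `VALUES` list instead of another output column. The trade-off is that a ticket with no rows satisfying a given condition produces no row for it at all, whereas the column-per-condition queries would still report that ticket.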


huangapple
  • Posted on April 13, 2023 at 21:47:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76006226.html