Redshift SQL: How to only aggregate subsequent rows with a certain value using window functions?
Question
I have the following table:
| A | B | C |
|---|---|---|
| 1 | 0 | 12 |
| 2 | 0 | 13 |
| 3 | 1 | 5 |
| 4 | 1 | 1 |
| 5 | 1 | 2 |
| 6 | 0 | 22 |
| 7 | 0 | 20 |
| 8 | 1 | 1 |
| 9 | 1 | 10 |
| 10 | 0 | 11 |
| 11 | 0 | 12 |
I would like to transform the data so that consecutive rows (after ordering by A ascending) are aggregated together only if their value of B is 1. Each aggregated group should take MAX(A) and SUM(C). Applying this process to the table above should give the following result:
| new_A | new_C |
|---|---|
| 1 | 12 |
| 2 | 13 |
| 5 | 8 |
| 6 | 22 |
| 7 | 20 |
| 9 | 11 |
| 10 | 11 |
| 11 | 12 |
I would like to do this using only SQL, if possible.
I tried several window functions but could not get the result I am after.
Answer 1
Score: 0
We can achieve this by building a pseudo-sequence that assigns the same value to every island of consecutive 1 values in the B column, while every row with B = 0 gets a value of its own. We can then aggregate by this sequence to get the final result.
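<!-- language: sql -->
WITH cte AS (
    -- Fetch the previous row's B value (NULL for the first row)
    SELECT *, LAG(B) OVER (ORDER BY A) AS lag_B
    FROM yourTable
),
cte2 AS (
    -- Start a new group on every row, except when a 1 directly follows a 1.
    -- Redshift requires an explicit frame clause for an aggregate window
    -- function that uses ORDER BY.
    SELECT *, SUM(CASE WHEN B = 1 AND lag_B = 1 THEN 0 ELSE 1 END)
              OVER (ORDER BY A ROWS UNBOUNDED PRECEDING) AS seq
    FROM cte
)
-- Collapse each group into a single row
SELECT MAX(A) AS new_A, SUM(C) AS new_C
FROM cte2
GROUP BY seq
ORDER BY MAX(A);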
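To see how the pseudo-sequence behaves on the sample data, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in for Redshift here (an assumption made so the example is runnable locally; it needs SQLite 3.25 or later for window functions), and the helper name `run` and the constants are hypothetical. The explicit `ROWS UNBOUNDED PRECEDING` frame is included because Redshift asks for one; SQLite accepts it as well.

```python
import sqlite3

# Sample data from the question: (A, B, C)
ROWS = [
    (1, 0, 12), (2, 0, 13), (3, 1, 5), (4, 1, 1), (5, 1, 2), (6, 0, 22),
    (7, 0, 20), (8, 1, 1), (9, 1, 10), (10, 0, 11), (11, 0, 12),
]

# The two CTEs from the answer: previous B value, then a running group id.
CTES = """
WITH cte AS (
    SELECT *, LAG(B) OVER (ORDER BY A) AS lag_B
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(CASE WHEN B = 1 AND lag_B = 1 THEN 0 ELSE 1 END)
              OVER (ORDER BY A ROWS UNBOUNDED PRECEDING) AS seq
    FROM cte
)
"""

# Intermediate view: which group id each row receives.
SEQ_QUERY = CTES + "SELECT A, B, seq FROM cte2 ORDER BY A"

# Final aggregation, as in the answer.
FINAL_QUERY = CTES + """
SELECT MAX(A) AS new_A, SUM(C) AS new_C
FROM cte2
GROUP BY seq
ORDER BY MAX(A)
"""


def run(query, rows=ROWS):
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE yourTable (A INTEGER, B INTEGER, C INTEGER)")
        conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for a, b, seq in run(SEQ_QUERY):
        print(f"A={a:<2} B={b} seq={seq}")
    for new_a, new_c in run(FINAL_QUERY):
        print(new_a, new_c)
```

On the sample data the `seq` column comes out as 1, 2, 3, 3, 3, 4, 5, 6, 6, 7, 8: rows 3 to 5 and rows 8 to 9 share a group id, and every other row stands alone, which is what lets the final `GROUP BY seq` produce the table requested in the question.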