英文:
Missing records for a running sum using window functions
问题
以下是您要的翻译部分:
SELECT Account,
Date,
SUM(Amount) OVER (PARTITION BY Account ORDER BY Date) AS RunningTotal
FROM tab1;
但它不提供如上所示的数据。如何获得上述输出,以显示截止到最新日期不再存在的帐户的数据?
英文:
Account | date | Amount |
---|---|---|
X1 | 20230501 | 10 |
X2 | 20230501 | 11 |
X3 | 20230501 | 12 |
X1 | 20230502 | 20 |
X3 | 20230502 | 21 |
X1 | 20230503 | 30 |
X4 | 20230503 | 10 |
I want to retrieve a running total of "Amount" using windows function to get the following output:
Account | date | Running Total |
---|---|---|
X1 | 20230501 | 10 |
X2 | 20230501 | 11 |
X3 | 20230501 | 12 |
X1 | 20230502 | 30 |
X2 | 20230502 | 11 |
X3 | 20230502 | 33 |
X1 | 20230503 | 60 |
X2 | 20230503 | 11 |
X3 | 20230503 | 33 |
X4 | 20230503 | 10 |
I've tried the following SQL approach:
SELECT Account,
Date,
SUM(Amount) OVER (PARTITION BY Account ORDER BY Date) AS RunningTotal
FROM tab1;
But it does not provide data as above. How do I get output as above to show data for accounts which are no longer as of latest date?
答案1
得分: 3
以下是翻译好的部分:
一种解决此问题的可能方法是:
- 从您的表中选择所有可能的账户
- 生成一个在您表格日期范围内的日历表(介于最小日期和最大日期之间)
- 将所有可能的账户与所有可能的日期进行交叉连接,然后将其左连接到您的表格,并执行您的累积求和
WITH cte_accounts AS(
SELECT DISTINCT Account FROM tab
), cte_dates AS (
SELECT MIN(Date) AS date_val, MAX(Date) AS last_date_val FROM tab
UNION ALL
SELECT DATEADD(DAY, 1, date_val), last_date_VAL FROM cte_dates WHERE last_date_val > date_val
)
SELECT a.Account,
d.date_val,
SUM(tab.Amount) OVER (PARTITION BY a.Account ORDER BY d.date_val) AS RunningTotal
FROM cte_accounts a
CROSS JOIN cte_dates d
LEFT JOIN tab
ON tab.Account = a.Account
AND tab.date = d.date_val
ORDER BY d.date_val, a.Account
输出
Account | date_val | RunningTotal |
---|---|---|
X1 | 2023-05-01 | 10 |
X2 | 2023-05-01 | 11 |
X3 | 2023-05-01 | 12 |
X4 | 2023-05-01 | null |
X1 | 2023-05-02 | 30 |
X2 | 2023-05-02 | 11 |
X3 | 2023-05-02 | 33 |
X4 | 2023-05-02 | null |
X1 | 2023-05-03 | 60 |
X2 | 2023-05-03 | 11 |
X3 | 2023-05-03 | 33 |
X4 | 2023-05-03 | 10 |
在这里检查演示 链接。
最后的 ORDER BY
子句是不必要的,仅用于可视化目的。另外,如果您不希望有空值,可以在子查询中将它们筛选掉。
英文:
One possible way of addressing this problem is to:
- select all possible accounts from your table
- generate a calendar table in range (minimum_date, maximum_date) belonging to your table dates
- cross join all possible accounts with all possible dates, then left join this to your table, and fire your running sum
WITH cte_accounts AS(
SELECT DISTINCT Account FROM tab
), cte_dates AS (
SELECT MIN(Date) AS date_val, MAX(Date) AS last_date_val FROM tab
UNION ALL
SELECT DATEADD(DAY, 1, date_val), last_date_VAL FROM cte_dates WHERE last_date_val > date_val
)
SELECT a.Account,
d.date_val,
SUM(tab.Amount) OVER (PARTITION BY a.Account ORDER BY d.date_val) AS RunningTotal
FROM cte_accounts a
CROSS JOIN cte_dates d
LEFT JOIN tab
ON tab.Account = a.Account
AND tab.date = d.date_val
ORDER BY d.date_val, a.Account
Output
Account | date_val | RunningTotal |
---|---|---|
X1 | 2023-05-01 | 10 |
X2 | 2023-05-01 | 11 |
X3 | 2023-05-01 | 12 |
X4 | 2023-05-01 | null |
X1 | 2023-05-02 | 30 |
X2 | 2023-05-02 | 11 |
X3 | 2023-05-02 | 33 |
X4 | 2023-05-02 | null |
X1 | 2023-05-03 | 60 |
X2 | 2023-05-03 | 11 |
X3 | 2023-05-03 | 33 |
X4 | 2023-05-03 | 10 |
Check the demo here.
The last ORDER BY
clause is not necessary: it's just for visualization purposes. On a side note, if you do mind about having the null values, you can filter them out in a subquery.
答案2
得分: 1
以下是翻译好的代码部分:
with AccDates as (
select Account, min(date) minDate
from tab1
group by account
),
Dates as (select distinct date from tab1)
select a.Account, d.date, sum(t.Amount) over (partition by a.Account order by d.Date) RunningTotal
from AccDates a
left join Dates d on d.date >= a.minDate
left join tab1 t on t.Account = a.Account and t.date = d.date
order by d.date, a.account;
请注意,这只是翻译的代码部分,不包括其他内容。
英文:
Another example, similar to @lemon answer. Added checking for first date for Account. Without full calendar.
with AccDates as(
select Account, min(date) minDate
from tab1
group by account
)
,Dates as( select distinct date from tab1)
select a.Account
,d.date
,sum(t.Amount)over(partition by a.Account order by d.Date) RunningTotal
from AccDates a
left join Dates d on d.date>=a.minDate
left join tab1 t on t.Account=a.Account and t.date=d.date
order by d.date,a.account
;
Result
Account | date | RunningTotal |
---|---|---|
X1 | 2023-05-01 | 10 |
X2 | 2023-05-01 | 11 |
X3 | 2023-05-01 | 12 |
X1 | 2023-05-02 | 30 |
X2 | 2023-05-02 | 11 |
X3 | 2023-05-02 | 33 |
X1 | 2023-05-03 | 60 |
X2 | 2023-05-03 | 11 |
X3 | 2023-05-03 | 33 |
X4 | 2023-05-03 | 10 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论