Generate summary with counts of joiners, leavers and running count of users
Question
I have a table with a large number of events that have happened, like the following:
ts | event |
---|---|
2023-05-09 19:20:19 UTC | joined |
2023-01-16 09:34:02 UTC | joined |
2022-08-19 10:02:44 UTC | left |
2022-10-06 10:11:12 UTC | joined |
2021-10-06 10:11:13 UTC | left |
Each event involves either a "joiner" or a "leaver", with a TIMESTAMP "ts" associated. I need to create a report that gives joiners, leavers and total users for each month. The total users would be a running total of the number of users, like the following:
Year | Month | Joiners | Leavers | Total |
---|---|---|---|---|
2023 | 03 | 50 | 0 | 50 |
2023 | 04 | 5 | 1 | 54 |
2023 | 05 | 30 | 10 | 74 |
2023 | 06 | 10 | 5 | 79 |
I need to compute:
- "Joiners": the number of "joined" events that happened in the month,
- "Leavers": the number of "left" events that happened in the month,
- "Total": a running total of joined users minus left users (if 10 people joined and 2 people left, the actual total would be 8; if 10 more people joined the next month and 0 people left, the Total for that month needs to be 18).
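To make the arithmetic concrete, here is a small sketch in Python (not part of the original question) that recomputes the Total column of the example report from its Joiners and Leavers columns. The variable names are illustrative only.

```python
from itertools import accumulate

# Monthly (joiners, leavers) pairs copied from the example report above.
monthly = [(50, 0), (5, 1), (30, 10), (10, 5)]

# Net change per month, then a cumulative sum of those net changes.
net_changes = [joined - left for joined, left in monthly]
running_total = list(accumulate(net_changes))

print(running_total)  # [50, 54, 74, 79], matching the Total column
```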
Here's what I've tried so far:
SELECT EXTRACT(YEAR FROM ts) year,
EXTRACT(MONTH FROM ts) month,
event,
COUNT(event) AS total,
SUM(COUNT(event)) OVER(ORDER BY event) AS running_total
FROM `data.events`
GROUP BY year, month, event
ORDER BY year ASC, month ASC, event ASC
This is the closest I've come so far with a very basic understanding of SQL, but it's not what I was aiming for. I can't get the COUNT or SUM running total to work.
Any help that could point me in the right direction would be appreciated.
Answer 1
Score: 2
First generate the two aggregated values, "Joiners" and "Leavers", with conditional aggregation, one row per year and month. Then compute a running sum of Joiners - Leavers over those rows, ordered by year and month.
WITH cte AS (
  -- One row per (year, month) with separate counts per event type
  SELECT EXTRACT(YEAR FROM ts) AS year,
         EXTRACT(MONTH FROM ts) AS month,
         SUM(CASE WHEN event = 'joined' THEN 1 ELSE 0 END) AS Joiners,
         SUM(CASE WHEN event = 'left' THEN 1 ELSE 0 END) AS Leavers
  FROM `data.events`
  GROUP BY EXTRACT(YEAR FROM ts),
           EXTRACT(MONTH FROM ts)
)
-- Running total of the net change, accumulated in chronological order
SELECT cte.*,
       SUM(Joiners - Leavers) OVER (ORDER BY year, month) AS Total
FROM cte
ORDER BY year, month
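As a way to try the same technique without a BigQuery project, here is a minimal sketch using Python's built-in `sqlite3` module. It rests on several assumptions: SQLite stands in for BigQuery, `strftime` replaces `EXTRACT`, the table is simply called `events`, and the sample rows are made up for illustration. It needs a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for BigQuery here.
rows = [
    ("2023-03-02 10:00:00", "joined"),
    ("2023-03-15 12:30:00", "joined"),
    ("2023-03-20 09:00:00", "joined"),
    ("2023-04-01 08:00:00", "left"),
    ("2023-04-10 17:45:00", "joined"),
    ("2023-05-05 11:11:11", "left"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, event TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Conditional aggregation per month, then a running sum of the net change.
# strftime() is the SQLite substitute for BigQuery's EXTRACT().
query = """
WITH cte AS (
  SELECT CAST(strftime('%Y', ts) AS INTEGER) AS year,
         CAST(strftime('%m', ts) AS INTEGER) AS month,
         SUM(CASE WHEN event = 'joined' THEN 1 ELSE 0 END) AS joiners,
         SUM(CASE WHEN event = 'left' THEN 1 ELSE 0 END) AS leavers
  FROM events
  GROUP BY strftime('%Y', ts), strftime('%m', ts)
)
SELECT year, month, joiners, leavers,
       SUM(joiners - leavers) OVER (ORDER BY year, month) AS total
FROM cte
ORDER BY year, month
"""
report = conn.execute(query).fetchall()

for row in report:
    print(row)
```

With this sample data, April has one joiner and one leaver, so the total stays flat before dropping in May. The attempt in the question differs in two ways: its window is ordered by `event` rather than by year and month, and it sums the counts of both event types instead of subtracting leavers from joiners.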