Generate summary with counts of joiners, leavers and running count of users

Question

I have a table with a large number of events that have happened, like the following:

ts                       event
2023-05-09 19:20:19 UTC  joined
2023-01-16 09:34:02 UTC  joined
2022-08-19 10:02:44 UTC  left
2022-10-06 10:11:12 UTC  joined
2021-10-06 10:11:13 UTC  left

Each event involves either a "joiner" or a "leaver", with a TIMESTAMP "ts" associated. I need to create a report that gives joiners, leavers and total users for each month. The total users would be a running total of the number of users, like the following:

Year  Month  Joiners  Leavers  Total
2023  03     50       0        50
2023  04     5        1        54
2023  05     30       10       74
2023  06     10       5        79

I need to compute:

  • "Joiners": the number of "joined" events that happened in the month,
  • "Leavers": the number of "left" events that happened in the month,
  • "Total": a running total of joined users minus left users (if 10 people joined and 2 left, the total is 8; if 10 more join the next month and nobody leaves, the total for that month needs to be 18).
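
The running total described above can be checked with a few lines of Python. This is only an illustration of the arithmetic, using the four rows of the sample report; the variable names are made up for the example.

```python
from itertools import accumulate

# (year, month, joiners, leavers) taken from the sample report above.
monthly = [
    (2023, 3, 50, 0),
    (2023, 4, 5, 1),
    (2023, 5, 30, 10),
    (2023, 6, 10, 5),
]

# Net change per month is joiners minus leavers; the running total
# is the cumulative sum of those net changes in chronological order.
net_changes = [joiners - leavers for _, _, joiners, leavers in monthly]
running_totals = list(accumulate(net_changes))

for (year, month, joiners, leavers), total in zip(monthly, running_totals):
    print(f"{year} {month:02d} {joiners} {leavers} {total}")
```

The printed Total column matches the sample report (50, 54, 74, 79), which confirms that the figure wanted is a cumulative sum of joiners minus leavers, ordered by month.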

Here's what I've tried so far:

SELECT EXTRACT(YEAR FROM ts) year,
       EXTRACT(MONTH FROM ts) month,
       event,
       COUNT(event) AS total,
       SUM(COUNT(event)) OVER(ORDER BY event) AS running_total
FROM `data.events`
GROUP BY year, month, event
ORDER BY year ASC, month ASC, event ASC

This is the closest I've come so far with a very basic understanding of SQL, but it's not what I was aiming for: I can't get the COUNT or SUM running total to work.

Any help that could point me in the right direction would be appreciated.

Answer 1

Score: 2

First compute the two aggregated values, "Joiners" and "Leavers", with conditional aggregation, then take a running sum of Joiners - Leavers ordered by year and month.

WITH cte AS (
    -- One row per calendar month, with a separate count for each event type.
    SELECT EXTRACT(YEAR FROM ts) AS year,
           EXTRACT(MONTH FROM ts) AS month,
           SUM(CASE WHEN event = 'joined' THEN 1 ELSE 0 END) AS Joiners,
           SUM(CASE WHEN event = 'left'   THEN 1 ELSE 0 END) AS Leavers
    FROM `data.events`
    GROUP BY EXTRACT(YEAR FROM ts),
             EXTRACT(MONTH FROM ts)
)
-- Running total of the monthly net change, accumulated in chronological order.
SELECT cte.*,
       SUM(Joiners - Leavers) OVER (ORDER BY year, month) AS Total
FROM cte
ORDER BY year, month
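
To try the same approach without BigQuery access, the sketch below runs an adapted version of the query against an in-memory SQLite database from Python, loaded with the five sample rows from the question. Several things here are assumptions made for the illustration: SQLite has no EXTRACT, so strftime stands in for it; the timestamps are stored as text without the UTC suffix; and the table name events and the function name monthly_summary are invented for this example. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# The five sample rows from the question (UTC suffix dropped for SQLite).
SAMPLE_EVENTS = [
    ("2023-05-09 19:20:19", "joined"),
    ("2023-01-16 09:34:02", "joined"),
    ("2022-08-19 10:02:44", "left"),
    ("2022-10-06 10:11:12", "joined"),
    ("2021-10-06 10:11:13", "left"),
]

# SQLite adaptation of the answer's query: strftime replaces EXTRACT.
QUERY = """
WITH cte AS (
    SELECT CAST(strftime('%Y', ts) AS INTEGER) AS year,
           CAST(strftime('%m', ts) AS INTEGER) AS month,
           SUM(CASE WHEN event = 'joined' THEN 1 ELSE 0 END) AS Joiners,
           SUM(CASE WHEN event = 'left'   THEN 1 ELSE 0 END) AS Leavers
    FROM events
    GROUP BY strftime('%Y', ts), strftime('%m', ts)
)
SELECT year, month, Joiners, Leavers,
       SUM(Joiners - Leavers) OVER (ORDER BY year, month) AS Total
FROM cte
ORDER BY year, month
"""


def monthly_summary(events):
    """Load (ts, event) pairs into a temporary table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (ts TEXT, event TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?)", events)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


rows = monthly_summary(SAMPLE_EVENTS)
for row in rows:
    print(row)
```

With only these five rows the Total column dips below zero, because the earliest sample events are "left" events; on a full event history the running total would start from the first joiners.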

huangapple
  • Published on 2023-06-06 05:32:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76410133.html