SQL LAG() function with a WHERE condition
Question
I have a table like this:
WITH
my_table (user_id, date, event_name, value) AS (
values
(1,'2021-10-01', 'level', 5),
(1,'2021-10-03', 'purchase', null),
(1,'2021-10-05', 'purchase', null),
(2,'2021-10-02', 'level', 10),
(2,'2021-10-03', 'purchase', null)
)
SELECT *
FROM my_table
| user_id | date | event_name | value |
|---|---|---|---|
| 1 | 2021-10-01 | level | 5 |
| 1 | 2021-10-03 | purchase | |
| 1 | 2021-10-05 | purchase | |
| 2 | 2021-10-02 | level | 10 |
| 2 | 2021-10-03 | purchase | |
All I want is to add a column showing the last level achieved before each purchase, for every user_id.
The desired table should look like this:
| user_id | date | event_name | value | level_achieved |
|---|---|---|---|---|
| 1 | 2021-10-01 | level | 5 | |
| 1 | 2021-10-03 | purchase | | 5 |
| 1 | 2021-10-05 | purchase | | 5 |
| 2 | 2021-10-02 | level | 10 | |
| 2 | 2021-10-03 | purchase | | 10 |
Thanks for helping me!
Answer 1
Score: 3
Assuming that the level can't go down, you can just use max, which ignores NULLs:
SELECT *,
if(value is null, max(value) over(partition by user_id order by date)) level_achieved
FROM my_table;
Output:
| user_id | date | event_name | value | level_achieved |
|---|---|---|---|---|
| 1 | 2021-10-01 | level | 5 | NULL |
| 1 | 2021-10-03 | purchase | NULL | 5 |
| 1 | 2021-10-05 | purchase | NULL | 5 |
| 2 | 2021-10-02 | level | 10 | NULL |
| 2 | 2021-10-03 | purchase | NULL | 10 |
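To make the assumption behind the max approach concrete, here is a minimal sketch that runs the same window query against an in-memory SQLite database from Python. SQLite is only a stand-in for the original engine (an assumption), `CASE WHEN` replaces the two-argument `if()`, which SQLite does not have, and user 3 is a hypothetical extra user whose level drops from 8 to 3.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (user_id INT, date TEXT, event_name TEXT, value INT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        # Sample rows from the question.
        (1, "2021-10-01", "level", 5),
        (1, "2021-10-03", "purchase", None),
        (1, "2021-10-05", "purchase", None),
        (2, "2021-10-02", "level", 10),
        (2, "2021-10-03", "purchase", None),
        # Hypothetical user whose level goes down before a purchase.
        (3, "2021-10-01", "level", 8),
        (3, "2021-10-02", "level", 3),
        (3, "2021-10-04", "purchase", None),
    ],
)

# Running max per user, shown only on rows where value is NULL.
QUERY = """
SELECT user_id, date, event_name, value,
       CASE WHEN value IS NULL
            THEN max(value) OVER (PARTITION BY user_id ORDER BY date)
       END AS level_achieved
FROM my_table
ORDER BY user_id, date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For users 1 and 2 this matches the output table above. For the hypothetical user 3 the purchase row gets 8, the highest level seen so far, rather than 3, the most recent one, which is why the approach depends on levels never decreasing.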
Or use last_value with the IGNORE NULLS option, which does not depend on that assumption:
SELECT *,
if(value is null, last_value(value) IGNORE NULLS over(partition by user_id order by date)) level_achieved
FROM my_table;
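Not every engine accepts `IGNORE NULLS`. As a rough sketch of one common workaround (a swapped-in technique, not part of the answer above), a running `count(value)` can number the level events per user, and the single non-NULL value inside each numbered group is then the last level seen. The example runs on in-memory SQLite from Python as a stand-in engine; the `grouped` CTE, the `grp` column, and user 3 are hypothetical names and data added for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (user_id INT, date TEXT, event_name TEXT, value INT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        # Sample rows from the question.
        (1, "2021-10-01", "level", 5),
        (1, "2021-10-03", "purchase", None),
        (1, "2021-10-05", "purchase", None),
        (2, "2021-10-02", "level", 10),
        (2, "2021-10-03", "purchase", None),
        # Hypothetical user whose level goes down before a purchase.
        (3, "2021-10-01", "level", 8),
        (3, "2021-10-02", "level", 3),
        (3, "2021-10-04", "purchase", None),
    ],
)

# count(value) skips NULLs, so grp increases by one at every level event
# and stays constant across the purchases that follow it.
QUERY = """
WITH grouped AS (
    SELECT *,
           count(value) OVER (PARTITION BY user_id ORDER BY date) AS grp
    FROM my_table
)
SELECT user_id, date, event_name, value,
       CASE WHEN value IS NULL
            THEN max(value) OVER (PARTITION BY user_id, grp)
       END AS level_achieved
FROM grouped
ORDER BY user_id, date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each group holds at most one non-NULL value, so `max` simply picks it up; the purchase for the hypothetical user 3 gets 3 here, the most recent level, even though an earlier level was higher. The sketch assumes a level event and a purchase never share the same date for one user.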
Answer 2
Score: 1
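I would suggest building a subquery that turns each level event into a date range (in the sample table the level is stored in the value column):
SELECT user_id, value AS level, date AS date_start,
lead(date) OVER (PARTITION BY user_id ORDER BY date ASC) AS date_end
FROM my_table
WHERE value IS NOT NULL
This gives a structure like:
user_id, level, date_start, date_end
Next, join it back to the table. date_end is NULL for each user's most recent level, so that range has to be treated as open-ended:
my_table LEFT JOIN sub_query
ON my_table.user_id = sub_query.user_id
AND my_table.date >= sub_query.date_start
AND (sub_query.date_end IS NULL OR my_table.date < sub_query.date_end)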
With this you can easily get the level for that specific date range.
Good luck
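The date-range join described in this answer can be tried end to end with a small sketch on in-memory SQLite from Python (a stand-in engine, which is an assumption). Two details are additions for illustration rather than part of the answer as posted: a missing `date_end` is treated as an open-ended range so that purchases after a user's latest level still match, and a `CASE` keeps `level_achieved` empty on the level rows themselves. The CTE name `levels` and the aliases `t` and `l` are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (user_id INT, date TEXT, event_name TEXT, value INT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        # Sample rows from the question.
        (1, "2021-10-01", "level", 5),
        (1, "2021-10-03", "purchase", None),
        (1, "2021-10-05", "purchase", None),
        (2, "2021-10-02", "level", 10),
        (2, "2021-10-03", "purchase", None),
    ],
)

# levels: one row per level event with the date range it is valid for.
# ISO-formatted date strings compare correctly as plain text.
QUERY = """
WITH levels AS (
    SELECT user_id,
           value AS level,
           date AS date_start,
           lead(date) OVER (PARTITION BY user_id ORDER BY date) AS date_end
    FROM my_table
    WHERE value IS NOT NULL
)
SELECT t.user_id, t.date, t.event_name, t.value,
       CASE WHEN t.value IS NULL THEN l.level END AS level_achieved
FROM my_table AS t
LEFT JOIN levels AS l
       ON t.user_id = l.user_id
      AND t.date >= l.date_start
      AND (l.date_end IS NULL OR t.date < l.date_end)
ORDER BY t.user_id, t.date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The `LEFT JOIN` keeps purchases made before a user's first level event, leaving their `level_achieved` as NULL instead of dropping the row.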