英文:
Bigquery navigation function calculate id
问题
我是新手使用BigQuery,我在使用BigQuery的导航函数时遇到了问题。
我不明白这些函数如何工作。
我有一个具有以下结构的表:
事件ID | 创建时间 |
---|---|
1 | 2020-01-01 12:00:00 |
2 | 2020-01-01 12:01:00 |
3 | 2020-01-01 12:05:00 |
4 | 2020-01-01 12:06:00 |
我想计算会话ID,它在超过一分钟的不活动后更改。
会话ID | 创建时间 |
---|---|
1 | 2020-01-01 12:00:00 |
1 | 2020-01-01 12:01:00 |
2 | 2020-01-01 12:05:00 |
2 | 2020-01-01 12:06:00 |
这是我开始编写的SQL代码:
select event_id,
created_ts,
EXTRACT(minute FROM created_ts) as minute,
date_diff(date(created_ts),coalesce(date(lag(created_ts) over(partition by event_id order by event_id)), date(created_ts)),
) session_id
from table1
英文:
I'm new to bigquery and I'm having trouble using bigquery's navigation functions.
I don't understand how these functions work.
I have a table that has this structure:
Event_id | created_ts |
---|---|
1 | 2020-01-01 12:00:00 |
2 | 2020-01-01 12:01:00 |
3 | 2020-01-01 12:05:00 |
4 | 2020-01-01 12:06:00 |
And I would like to calculate the session id which changes after more than one minute of inactivity.
Session_id | created_ts |
---|---|
1 | 2020-01-01 12:00:00 |
1 | 2020-01-01 12:01:00 |
2 | 2020-01-01 12:05:00 |
2 | 2020-01-01 12:06:00 |
Here is the SQL code I started writing:
select event_id,
created_ts,
EXTRACT(minute FROM created_ts) as minute,
date_diff(date(created_ts),coalesce(date(lag(created_ts - minute) over(partition by event_id order by event_id)), date(created_ts)),
) session_id
from table1
答案1
得分: 2
这是另一种方法来做这个:
with table1 as (
select 1 as event_id, '2020-01-01 12:00:00' created_ts
union all select 2, '2020-01-01 12:01:00'
union all select 3, '2020-01-01 12:05:00'
union all select 4, '2020-01-01 12:06:00'
union all select 7, '2020-01-01 12:10:00'
union all select 8, '2020-01-01 12:11:00'
),
CTE AS (
SELECT *,
CASE WHEN COALESCE(TIMESTAMP_DIFF(DATETIME(created_ts), DATETIME(LAG(created_ts) OVER(order by event_id)), MINUTE), 0) > 1 THEN 1 ELSE 0 END as date_diff
FROM table1
)
SELECT created_ts, sum(date_diff) OVER (ORDER BY event_id)+1 as Session_id
FROM CTE
结果:
英文:
This is an other way to do it :
with table1 as (
select 1 as event_id, '2020-01-01 12:00:00' created_ts
union all select 2 , '2020-01-01 12:01:00'
union all select 3 , '2020-01-01 12:05:00'
union all select 4 , '2020-01-01 12:06:00'
union all select 7 , '2020-01-01 12:10:00'
union all select 8 , '2020-01-01 12:11:00'
),
CTE AS (
SELECT *,
CASE WHEN COALESCE(TIMESTAMP_DIFF(DATETIME(created_ts), DATETIME(LAG(created_ts) OVER(order by event_id)), MINUTE), 0) > 1 THEN 1 ELSE 0 END as date_diff
FROM table1
)
SELECT created_ts, sum(date_diff) OVER (ORDER BY event_id)+1 as Session_id
FROM CTE
Result :
答案2
得分: 0
你可以首先比较创建日期时间的分钟,并查看哪个大于1,然后筛选掉所有不需要的行。
WITH CTE AS (
SELECT event_id,
created_ts,
CASE WHEN TIMESTAMP_DIFF(COALESCE(DATE(LAG(created_ts, 1) OVER(PARTITION BY event_id ORDER BY event_id)), DATE(created_ts)), DATE(created_ts), MINUTE) > 1 THEN 1 ELSE 0 END AS bg
FROM table1)
SELECT event_id,
created_ts
FROM CTE
WHERE bg = 1
英文:
You can first compare the minutes of the created datetime and see which is bigger than 1, and then filter oiz all unwanted rows
WITH CTE AS (
select event_id,
created_ts,
CASE WHEN TIMESTAMP_DIFF( coalesce(date(lag(created_ts - minute) over(partition by event_id order by event_id)), date(created_ts)),date(created_ts), MINUTE) > 1 then 1 ELSE 0 END bg
from table1)
SELECT event_id,
created_ts
FROM CTE WHERE bg = 1
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论