BigQuery导航函数计算ID

huangapple go评论64阅读模式
英文:

Bigquery navigation function calculate id

问题

我是新手使用BigQuery,我在使用BigQuery的导航函数时遇到了问题。
我不明白这些函数如何工作。

我有一个具有以下结构的表:

事件ID 创建时间
1 2020-01-01 12:00:00
2 2020-01-01 12:01:00
3 2020-01-01 12:05:00
4 2020-01-01 12:06:00

我想计算会话ID,它在超过一分钟的不活动后更改。

会话ID 创建时间
1 2020-01-01 12:00:00
1 2020-01-01 12:01:00
2 2020-01-01 12:05:00
2 2020-01-01 12:06:00

这是我开始编写的SQL代码:

select event_id, 
    created_ts, 
    EXTRACT(minute FROM created_ts) as minute,
    date_diff(date(created_ts),coalesce(date(lag(created_ts) over(partition by event_id order by event_id)), date(created_ts)),
) session_id
from table1
英文:

I'm new to bigquery and I'm having trouble using bigquery's navigation functions.
I don't understand how these functions work.

I have a table that has this structure:

Event_id created_ts
1 2020-01-01 12:00:00
2 2020-01-01 12:01:00
3 2020-01-01 12:05:00
4 2020-01-01 12:06:00

And I would like to calculate the session id which changes after more than one minute of inactivity.

Session_id created_ts
1 2020-01-01 12:00:00
1 2020-01-01 12:01:00
2 2020-01-01 12:05:00
2 2020-01-01 12:06:00

Here is the SQL code I started writing:

select event_id, 
    created_ts, 
    EXTRACT(minute FROM created_ts) as minute,
    date_diff(date(created_ts),coalesce(date(lag(created_ts - minute) over(partition by event_id order by event_id)), date(created_ts)),
) session_id
from table1

答案1

得分: 2

这是另一种方法来做这个:

with table1 as (
  select 1 as event_id, '2020-01-01 12:00:00' created_ts
  union all select 2, '2020-01-01 12:01:00'
  union all select 3, '2020-01-01 12:05:00'
  union all select 4, '2020-01-01 12:06:00'
  union all select 7, '2020-01-01 12:10:00'
  union all select 8, '2020-01-01 12:11:00'
),
CTE AS (
  SELECT *,
        CASE WHEN COALESCE(TIMESTAMP_DIFF(DATETIME(created_ts), DATETIME(LAG(created_ts) OVER(order by event_id)), MINUTE), 0) > 1 THEN 1 ELSE 0 END as date_diff
  FROM table1
)
SELECT created_ts, sum(date_diff) OVER (ORDER BY event_id)+1 as Session_id
FROM CTE

结果:

BigQuery导航函数计算ID

英文:

This is an other way to do it :

with table1 as (
  select 1 as  event_id, 	'2020-01-01 12:00:00' created_ts
  union all select 2 , 	'2020-01-01 12:01:00' 
  union all select 3 , 	'2020-01-01 12:05:00' 
  union all select 4 , 	'2020-01-01 12:06:00'
  union all select 7 , 	'2020-01-01 12:10:00'
  union all select 8 , 	'2020-01-01 12:11:00'
),
CTE AS (
  SELECT *,
        CASE WHEN COALESCE(TIMESTAMP_DIFF(DATETIME(created_ts), DATETIME(LAG(created_ts) OVER(order by event_id)), MINUTE), 0) > 1 THEN 1 ELSE 0 END as date_diff
  FROM table1
)
SELECT created_ts, sum(date_diff) OVER (ORDER BY event_id)+1 as Session_id
FROM CTE

Result :

BigQuery导航函数计算ID

答案2

得分: 0

你可以首先比较创建日期时间的分钟,并查看哪个大于1,然后筛选掉所有不需要的行。

WITH CTE AS (
SELECT event_id,
       created_ts,
       CASE WHEN TIMESTAMP_DIFF(COALESCE(DATE(LAG(created_ts, 1) OVER(PARTITION BY event_id ORDER BY event_id)), DATE(created_ts)), DATE(created_ts), MINUTE) > 1 THEN 1 ELSE 0 END AS bg
FROM table1)
SELECT event_id,
       created_ts
FROM CTE
WHERE bg = 1
英文:

You can first compare the minutes of the created datetime and see which is bigger than 1, and then filter oiz all unwanted rows

WITH CTE AS (
select event_id, 
    created_ts, 
CASE WHEN TIMESTAMP_DIFF( coalesce(date(lag(created_ts - minute) over(partition by event_id order by event_id)), date(created_ts)),date(created_ts), MINUTE) > 1 then 1 ELSE 0 END bg
from table1)
SELECT event_id, 
    created_ts
FROM CTE WHERE bg = 1

huangapple
  • 本文由 发表于 2023年6月25日 22:37:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76550923.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定