BigQuery: counting entries between 30-minute timestamp intervals

# Question

I have a pre-created table `data`:

| id | timestamp_entry |
|---|---|
| 1 | "2023-01-01 04:11:24 UTC" |
| 2 | "2023-01-01 04:14:55 UTC" |
| ... | ... |
| 99999 | "2023-01-31 23:45:59 UTC" |

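where `timestamp_entry` has a uniform time zone, UTC, and all values fall within January 2023.

I want to create a 30-minute time skeleton and count how many entries in `timestamp_entry` fall into each interval.

I first created a subquery:
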
```sql
-- INTERVAL is a reserved keyword in BigQuery, so the alias is quoted with backticks.
WITH
intervals AS (
  SELECT `interval` AS start_time,
         TIMESTAMP_SUB(TIMESTAMP_ADD(`interval`, INTERVAL 30 MINUTE), INTERVAL 1 SECOND) AS end_time
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY("2023-01-01 00:00:00 UTC", "2023-01-31 23:59:59 UTC", INTERVAL 30 MINUTE)) AS `interval`
)
```

But ideally, I want my outcome to show:

| start_time | end_time | count |
|---|---|---|
| "2023-01-01 00:00:00 UTC" | "2023-01-01 00:29:59 UTC" | 0 |
| "2023-01-01 00:30:00 UTC" | "2023-01-01 00:59:59 UTC" | 0 |
| ... | ... | ... |
| "2023-01-31 23:00:00 UTC" | "2023-01-31 23:29:59 UTC" | 12 |
| "2023-01-31 23:30:00 UTC" | "2023-01-31 23:59:59 UTC" | 5 |

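where `count` shows how many `timestamp_entry` values from `data` fall into each interval.

I have tried using `RIGHT JOIN` with `BETWEEN`, but I can't join the two tables because there are no exact "matches".

Any insights are appreciated.
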
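# Answer 1

**Score**: 0

A range join on `start_time` and `end_time` (the equivalent of `BETWEEN`) expresses this directly. To keep empty intervals with a count of 0, put `intervals` on the preserved side of the join and count `timestamp_entry` rather than `*`. Note that BigQuery can reject an outer join whose `ON` clause has no equality condition.

```sql
-- Assumes the `intervals` CTE from the question precedes this SELECT.
SELECT start_time, end_time, COUNT(timestamp_entry) AS count
FROM intervals
LEFT JOIN data
  ON timestamp_entry >= start_time
 AND timestamp_entry <= end_time  -- end_time is inclusive (hh:29:59 or hh:59:59)
GROUP BY start_time, end_time
ORDER BY start_time
```
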
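
The question asks for zero-count intervals as well, so every row of the 30-minute skeleton has to survive the join. One way to get there with an equality join is to round each `timestamp_entry` down to the start of its slot first, then join on that value. The following is a minimal sketch under stated assumptions, not a definitive implementation: the three inline `data` rows are the sample rows from the question, and the names `slot_start`, `bucketed` and `n` are illustrative.

```sql
WITH data AS (
  -- Sample rows from the question; replace with the real table.
  SELECT 1 AS id, TIMESTAMP '2023-01-01 04:11:24 UTC' AS timestamp_entry UNION ALL
  SELECT 2, TIMESTAMP '2023-01-01 04:14:55 UTC' UNION ALL
  SELECT 99999, TIMESTAMP '2023-01-31 23:45:59 UTC'
),
intervals AS (
  -- One row per 30-minute slot in January 2023; end_time is inclusive.
  SELECT slot_start AS start_time,
         TIMESTAMP_ADD(slot_start, INTERVAL 1799 SECOND) AS end_time
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY(
         TIMESTAMP '2023-01-01 00:00:00 UTC',
         TIMESTAMP '2023-01-31 23:30:00 UTC',
         INTERVAL 30 MINUTE)) AS slot_start
),
bucketed AS (
  -- Round each entry down to the start of its 1800-second slot, then count per slot.
  SELECT TIMESTAMP_SECONDS(1800 * DIV(UNIX_SECONDS(timestamp_entry), 1800)) AS start_time,
         COUNT(*) AS n
  FROM data
  GROUP BY 1
)
SELECT i.start_time, i.end_time, IFNULL(b.n, 0) AS `count`
FROM intervals AS i
LEFT JOIN bucketed AS b
  ON i.start_time = b.start_time
ORDER BY i.start_time;
```

With the three sample rows, this should return one row per slot (31 days × 48 slots), with a count of 2 for the slot starting at 2023-01-01 04:00:00 UTC, 1 for the slot starting at 2023-01-31 23:30:00 UTC, and 0 everywhere else.
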
# Answer 2

**Score**: 0

You can consider the following as well:

```sql
WITH data AS (
  -- put your data here
)
SELECT TIMESTAMP_SECONDS(slot) AS start_time,
       TIMESTAMP_SECONDS(slot + 1800 - 1) AS end_time,
       COUNT(timestamp_entry) AS `count`
FROM UNNEST(GENERATE_ARRAY(1672531200, 1675207800, 1800)) AS slot
LEFT JOIN data ON DIV(slot, 1800) = DIV(UNIX_SECONDS(timestamp_entry), 1800)
GROUP BY 1, 2
ORDER BY 1;
```

Here `1672531200` is `UNIX_SECONDS("2023-01-01 00:00:00 UTC")`, `1675207800` is `UNIX_SECONDS("2023-01-31 23:30:00 UTC")`, and `1800` is the slot length in seconds (30 minutes).
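
The epoch constants in this answer can be re-derived instead of being copied by hand. The small check below uses only `UNIX_SECONDS` and `MOD` on timestamp literals; the column aliases `first_slot`, `last_slot` and `remainder` are illustrative names, not part of the answer.

```sql
SELECT
  UNIX_SECONDS(TIMESTAMP '2023-01-01 00:00:00 UTC') AS first_slot,  -- 1672531200
  UNIX_SECONDS(TIMESTAMP '2023-01-31 23:30:00 UTC') AS last_slot,   -- 1675207800
  -- 0: the first slot is a multiple of 1800, so DIV(slot, 1800) identifies each slot exactly
  MOD(UNIX_SECONDS(TIMESTAMP '2023-01-01 00:00:00 UTC'), 1800) AS remainder;
```

Because the first slot is a multiple of 1800 seconds, every generated `slot` value is too, which is what makes the `DIV(slot, 1800) = DIV(UNIX_SECONDS(timestamp_entry), 1800)` equality line up with the 30-minute boundaries.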