BigQuery: counting entries between 30-minute timestamp intervals

Question

I have a pre-created table `data`:

    id     timestamp_entry
    1      "2023-01-01 04:11:24 UTC"
    2      "2023-01-01 04:14:55 UTC"
    ...
    99999  "2023-01-31 23:45:59 UTC"

where every `timestamp_entry` is in the UTC time zone and falls within January 2023.

I want to create a 30-minute time skeleton and count how many `timestamp_entry` values fall into each interval.

I first created a subquery:

```sql
WITH
intervals AS (
  -- INTERVAL is a reserved keyword in BigQuery, so the alias is backtick-quoted
  SELECT `interval` AS start_time,
         TIMESTAMP_SUB(TIMESTAMP_ADD(`interval`, INTERVAL 30 MINUTE), INTERVAL 1 SECOND) AS end_time
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY("2023-01-01 00:00:00 UTC", "2023-01-31 23:59:59 UTC", INTERVAL 30 MINUTE)) AS `interval`
)
```

Ideally, I want my result to look like this:

    start_time                  end_time                    count
    "2023-01-01 00:00:00 UTC"   "2023-01-01 00:29:59 UTC"   0
    "2023-01-01 00:30:00 UTC"   "2023-01-01 00:59:59 UTC"   0
    ...
    "2023-01-31 23:00:00 UTC"   "2023-01-31 23:29:59 UTC"   12
    "2023-01-31 23:30:00 UTC"   "2023-01-31 23:59:59 UTC"   5

where `count` shows how many `timestamp_entry` values from `data` fall into each interval.

I have tried using `RIGHT JOIN` with `BETWEEN`, but I could not join the two tables because there are no exact "matches".

Any insights are appreciated.





# Answer 1
**Score**: 0

This works directly with a `JOIN` and a `BETWEEN`-style range condition:

```sql
-- Assumes the intervals CTE from the question is defined above.
SELECT start_time, end_time, COUNT(*) AS count
  FROM data
  LEFT JOIN intervals
    ON timestamp_entry >= start_time
   AND timestamp_entry <= end_time  -- end_time is the last second of the slot, so the bound is inclusive
 GROUP BY start_time, end_time
 ORDER BY start_time
```
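Because the query above starts from `data`, intervals that contain no entries do not appear in its result. If the zero-count rows from the question's desired output are needed, one option is to put `intervals` on the left and join on an exact key. The following is a minimal sketch, not the answerer's method: the two sample rows, the `bucketed` CTE, and the `slot_start` and `bucket_start` names are made up for illustration. It rounds each `timestamp_entry` down to its 30-minute boundary so the join condition becomes a plain equality.

```sql
WITH data AS (
  -- Hypothetical sample rows standing in for the real table
  SELECT 1 AS id, TIMESTAMP '2023-01-01 04:11:24 UTC' AS timestamp_entry UNION ALL
  SELECT 2, TIMESTAMP '2023-01-01 04:14:55 UTC'
),
intervals AS (
  -- One row per 30-minute slot in January 2023; end_time is the slot's last second
  SELECT slot_start AS start_time,
         TIMESTAMP_ADD(slot_start, INTERVAL 1799 SECOND) AS end_time
    FROM UNNEST(GENERATE_TIMESTAMP_ARRAY(
           TIMESTAMP '2023-01-01 00:00:00 UTC',
           TIMESTAMP '2023-01-31 23:30:00 UTC',
           INTERVAL 30 MINUTE)) AS slot_start
),
bucketed AS (
  -- Round each entry down to the start of its 30-minute slot (1800 seconds)
  SELECT TIMESTAMP_SECONDS(1800 * DIV(UNIX_SECONDS(timestamp_entry), 1800)) AS bucket_start
    FROM data
)
SELECT i.start_time,
       i.end_time,
       COUNT(b.bucket_start) AS `count`  -- counts only matched rows, so empty slots give 0
  FROM intervals AS i
  LEFT JOIN bucketed AS b
    ON b.bucket_start = i.start_time
 GROUP BY i.start_time, i.end_time
 ORDER BY i.start_time;
```

With the two sample rows, the slot starting at `2023-01-01 04:00:00 UTC` should show a count of 2 and every other slot 0. Counting `b.bucket_start` rather than `*` is what keeps unmatched slots at 0 instead of 1.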

# Answer 2
**Score**: 0

You can also consider the approach below. Here `slot` is the Unix time in seconds of each interval start, and the integer division by 1800 (30 minutes) gives both sides of the join an exact key to match on.

    WITH data AS (
      -- put your data here
    )
    SELECT TIMESTAMP_SECONDS(slot) start_time,
           TIMESTAMP_SECONDS(slot + 1800 - 1) end_time,
           COUNT(timestamp_entry) `count`
      FROM UNNEST(GENERATE_ARRAY(1672531200, 1675207800, 1800)) slot
      LEFT JOIN data ON DIV(slot, 1800) = DIV(UNIX_SECONDS(timestamp_entry), 1800)
     GROUP BY 1, 2;

  • 1672531200 -> UNIX_SECONDS("2023-01-01 00:00:00 UTC");
  • 1675207800 -> UNIX_SECONDS("2023-01-31 23:30:00 UTC");
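As a quick check of the two constants and of the bucketing rule, a query along these lines could be run on its own. It is only a sketch: the column aliases are illustrative, and the sample timestamp is the first row of the question's table.

```sql
SELECT
  -- Epoch seconds of the first and last 30-minute slot starts in January 2023
  UNIX_SECONDS(TIMESTAMP '2023-01-01 00:00:00 UTC') AS first_slot,  -- 1672531200
  UNIX_SECONDS(TIMESTAMP '2023-01-31 23:30:00 UTC') AS last_slot,   -- 1675207800
  -- Rounding a sample entry down to its slot start with the same DIV-by-1800 rule
  TIMESTAMP_SECONDS(
    1800 * DIV(UNIX_SECONDS(TIMESTAMP '2023-01-01 04:11:24 UTC'), 1800)
  ) AS sample_slot_start;                                           -- 2023-01-01 04:00:00 UTC
```

Since both bounds are exact multiples of 1800, every generated `slot` sits on a 30-minute boundary, which is why `DIV(slot, 1800)` lines up with the bucket number of the entries.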

huangapple
  • Posted on March 9, 2023, 20:41:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75684766.html