March 9, 2023 20:41:25

BigQuery: counting entries between 30-minute timestamp intervals

# Question

I have a pre-created table `data`:

id     timestamp_entry
1      "2023-01-01 04:11:24 UTC"
2      "2023-01-01 04:14:55 UTC"
...
99999  "2023-01-31 23:45:59 UTC"

where `timestamp_entry` is uniformly in the "UTC" time zone and falls within January 2023.
I want to create a 30-minute time skeleton and count how many entries in `timestamp_entry` fall into each interval.
I first created a subquery:
```sql
WITH
intervals AS (
  SELECT interval AS start_time,
         TIMESTAMP_SUB(TIMESTAMP_ADD(interval, INTERVAL 30 MINUTE), INTERVAL 1 SECOND) AS end_time
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY("2023-01-01 00:00:00 UTC", "2023-01-31 23:59:59 UTC", INTERVAL 30 MINUTE)) interval
)
```

Ideally, I want my result to show:

start_time                  end_time                    count
"2023-01-01 00:00:00 UTC"   "2023-01-01 00:29:59 UTC"   0
"2023-01-01 00:30:00 UTC"   "2023-01-01 00:59:59 UTC"   0
...
"2023-01-31 23:00:00 UTC"   "2023-01-31 23:29:59 UTC"   12
"2023-01-31 23:30:00 UTC"   "2023-01-31 23:59:59 UTC"   5

where `count` shows how many `timestamp_entry` values from `data` fall into each interval.

I have tried using `RIGHT JOIN` with `BETWEEN`, but I can't join the two tables because there are no exact "matches".

Any insights are appreciated.

# Answer 1
**Score**: 0

This works with a `JOIN` on a range condition, which is what `BETWEEN` expresses:
```sql
-- Start from intervals so that empty slots are kept and show a count of 0.
SELECT start_time, end_time, COUNT(timestamp_entry) AS count
  FROM intervals
  LEFT JOIN data
    ON timestamp_entry >= start_time
   AND timestamp_entry <= end_time  -- end_time is inclusive (hh:29:59 or hh:59:59)
 GROUP BY start_time, end_time
 ORDER BY start_time
```
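
The range join above can be combined with the question's `intervals` CTE into a single statement. The following is only a sketch under a few assumptions: the inline `data` CTE holds three hypothetical sample rows standing in for the real table, the alias `slot_start` and the helper column `next_start` are names made up here, and BigQuery is assumed to accept a `LEFT JOIN` whose condition is a pure range comparison. It joins on the half-open range `[start_time, next_start)` so that timestamps with fractional seconds after `hh:29:59` are still counted, and it uses `slot_start` rather than `interval` as the alias to avoid relying on a reserved word as an identifier.

```sql
WITH
data AS (
  -- Hypothetical sample rows; replace this CTE with the real `data` table.
  SELECT 1 AS id, TIMESTAMP "2023-01-01 04:11:24 UTC" AS timestamp_entry UNION ALL
  SELECT 2, TIMESTAMP "2023-01-01 04:14:55 UTC" UNION ALL
  SELECT 99999, TIMESTAMP "2023-01-31 23:45:59 UTC"
),
intervals AS (
  SELECT slot_start AS start_time,
         -- Inclusive end shown in the output, e.g. 00:29:59
         TIMESTAMP_SUB(TIMESTAMP_ADD(slot_start, INTERVAL 30 MINUTE), INTERVAL 1 SECOND) AS end_time,
         -- Exclusive upper bound used only in the join condition
         TIMESTAMP_ADD(slot_start, INTERVAL 30 MINUTE) AS next_start
  FROM UNNEST(GENERATE_TIMESTAMP_ARRAY(
         TIMESTAMP "2023-01-01 00:00:00 UTC",
         TIMESTAMP "2023-01-31 23:30:00 UTC",
         INTERVAL 30 MINUTE)) AS slot_start
)
SELECT start_time,
       end_time,
       -- COUNT(column) ignores the NULLs produced for slots with no match,
       -- so empty slots report 0 rather than 1.
       COUNT(timestamp_entry) AS `count`
FROM intervals
LEFT JOIN data
  ON timestamp_entry >= start_time
 AND timestamp_entry <  next_start
GROUP BY start_time, end_time
ORDER BY start_time;
```

With these three sample rows, the slot starting at `2023-01-01 04:00:00 UTC` should report 2, the slot starting at `2023-01-31 23:30:00 UTC` should report 1, and every other slot should report 0.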

# Answer 2
**Score**: 0

You can also consider the approach below:
```sql
WITH data AS (
  -- put your data here
)
SELECT TIMESTAMP_SECONDS(slot) start_time,
       TIMESTAMP_SECONDS(slot + 1800 - 1) end_time,
       COUNT(timestamp_entry) `count`
  FROM UNNEST(GENERATE_ARRAY(1672531200, 1675207800, 1800)) slot
  LEFT JOIN data ON DIV(slot, 1800) = DIV(UNIX_SECONDS(timestamp_entry), 1800)
 GROUP BY 1, 2;

-- 1672531200 -> UNIX_SECONDS("2023-01-01 00:00:00 UTC");
-- 1675207800 -> UNIX_SECONDS("2023-01-31 23:30:00 UTC");
```
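
The two epoch constants, and the way `DIV(..., 1800)` assigns a timestamp to its 30-minute slot, can be checked with a small query. This is a sketch for illustration only: the sample timestamp is the first row from the question, and the output column names (`first_slot`, `last_slot`, `slot_of_sample`) are made up here.

```sql
SELECT
  -- Start of the first slot in January 2023, as Unix seconds: 1672531200
  UNIX_SECONDS(TIMESTAMP "2023-01-01 00:00:00 UTC") AS first_slot,
  -- Start of the last slot in January 2023, as Unix seconds: 1675207800
  UNIX_SECONDS(TIMESTAMP "2023-01-31 23:30:00 UTC") AS last_slot,
  -- Integer division by 1800 drops the seconds past the half hour;
  -- multiplying back gives the slot start: 2023-01-01 04:00:00 UTC
  TIMESTAMP_SECONDS(
    DIV(UNIX_SECONDS(TIMESTAMP "2023-01-01 04:11:24 UTC"), 1800) * 1800
  ) AS slot_of_sample;
```

Because both sides of the join are reduced to the same integer slot number, the `LEFT JOIN` condition is a plain equality rather than a range comparison.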
