April 17, 2023, 22:05:07

Postgres query that ignores the first 2 entries of day per group

Question

I want to write a PostgreSQL SELECT query that ignores the first 2 entries of each day per queue, where a "day" is defined as starting at 18:00.

For example, if my data looks like this:

id	queue	timestamp (bigint)
1	q1	1673721000000000000 (2023-01-14 18:30:00) - first entry of day for q1
2	q1	1673728200000000000 (2023-01-14 20:30:00)
3	q1	1673760600000000000 (2023-01-15 05:30:00)
4	q1	1673806000000000000 (2023-01-15 18:10:00) - first entry of day for q1
5	q2	1673721000000000000 (2023-01-14 18:30:00) - first entry of day for q2
6	q2	1673728200000000000 (2023-01-14 20:30:00)
7	q2	1673760600000000000 (2023-01-15 05:30:00)
8	q2	1673802600000000000 (2023-01-15 17:10:00)
9	q2	1674067800000000000 (2023-01-18 18:50:00) - first entry of day for q2
10	q2	1674075000000000000 (2023-01-18 20:50:00)
11	q2	1674096600000000000 (2023-01-19 02:50:00)
12	q2	1674132600000000000 (2023-01-19 12:50:00)
13	q3	1673721000000000000 (2023-01-14 18:30:00) - first entry of day for q3
14	q3	1673728200000000000 (2023-01-14 20:30:00)

The query should return ids 3, 7, 8, 11, and 12:

id	queue	timestamp
3	q1	1673760600000000000
7	q2	1673760600000000000
8	q2	1673802600000000000
11	q2	1674096600000000000
12	q2	1674132600000000000

I've tried using row_number() over a partition, but I'm having trouble getting the filtering right.

-- Of course this doesn't work: it ignores only the first 2 rows of all time per queue, not per day starting at 18:00
SELECT *
FROM (
  SELECT row_number() OVER (PARTITION BY queue ORDER BY "timestamp") AS row_n,
         *
  FROM mytable
) results
WHERE results.row_n > 2

-- gives the time of day of the timestamp
CAST(to_timestamp("timestamp" / 1000000000) AS time)

Any help is appreciated. Thanks!

Answer 1

Score: 1

If ts is your timestamp, you should partition on the following expression:

partition by queue, (ts - interval '18' hour)::date

Note that you subtract 18 hours (so that the day starts at 18:00) and cast to date, so that all rows in the same 18:00-to-18:00 window get a single value.
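
To see how the shift maps times onto days, here is a small sketch using four of the timestamps from the question. It assumes they are plain timestamps in UTC; the column alias day_bucket is made up for illustration.

```sql
-- Timestamps taken from the question's sample data (assumed to be UTC).
SELECT ts,
       (ts - interval '18' hour)::date AS day_bucket  -- hypothetical alias
FROM (VALUES
  (timestamp '2023-01-14 18:30:00'),
  (timestamp '2023-01-15 05:30:00'),
  (timestamp '2023-01-15 17:10:00'),
  (timestamp '2023-01-15 18:10:00')
) AS v(ts);
-- The first three rows get day_bucket 2023-01-14; the last one gets 2023-01-15.
```

Everything from 18:00 on one calendar day up to just before 18:00 on the next lands in the same bucket, which is what makes the expression usable as a partition key.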

Sample Query

The query below computes the row_number of each row per day starting at 18:00, so the first and second rows of each day get rn 1 and 2 (the rest is left out as not important).

with dt as (
select 'q1' queue, date'2023-01-01' + n * interval '3' hour ts from generate_series(1,20) t(n) union all
select 'q2' queue, date'2023-01-01' + n * interval '5' hour ts from generate_series(1,20) t(n)
)
select
  queue, ts,
  row_number() over (partition by queue, (ts - interval '18' hour)::date order by ts) as rn,
  (ts - interval '18' hour)::date
from dt

queue|ts                 |rn|date      |
-----+-------------------+--+----------+
q1   |2023-01-01 03:00:00| 1|2022-12-31|
q1   |2023-01-01 06:00:00| 2|2022-12-31|
q1   |2023-01-01 09:00:00| 3|2022-12-31|
...
q1   |2023-01-01 18:00:00| 1|2023-01-01|
q1   |2023-01-01 21:00:00| 2|2023-01-01|
q1   |2023-01-02 00:00:00| 3|2023-01-01|
...
q2   |2023-01-01 05:00:00| 1|2022-12-31|
q2   |2023-01-01 10:00:00| 2|2022-12-31|
q2   |2023-01-01 15:00:00| 3|2022-12-31|
q2   |2023-01-01 20:00:00| 1|2023-01-01|
q2   |2023-01-02 01:00:00| 2|2023-01-01|
q2   |2023-01-02 06:00:00| 3|2023-01-01|
...
q2   |2023-01-02 21:00:00| 1|2023-01-02|
q2   |2023-01-03 02:00:00| 2|2023-01-02|
q2   |2023-01-03 07:00:00| 3|2023-01-02|
...
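
Applied to the question's table, the same partition expression can be combined with an rn > 2 filter. What follows is a minimal sketch rather than part of the original answer. It assumes the bigint column holds nanoseconds since the Unix epoch and that the times shown in the question are UTC, and it inlines the sample rows as a CTE named mytable so that it runs on its own.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
-- Assumption: "timestamp" is nanoseconds since the Unix epoch, displayed in UTC.
WITH mytable (id, queue, "timestamp") AS (
  VALUES
    (1,  'q1', 1673721000000000000),
    (2,  'q1', 1673728200000000000),
    (3,  'q1', 1673760600000000000),
    (4,  'q1', 1673806000000000000),
    (5,  'q2', 1673721000000000000),
    (6,  'q2', 1673728200000000000),
    (7,  'q2', 1673760600000000000),
    (8,  'q2', 1673802600000000000),
    (9,  'q2', 1674067800000000000),
    (10, 'q2', 1674075000000000000),
    (11, 'q2', 1674096600000000000),
    (12, 'q2', 1674132600000000000),
    (13, 'q3', 1673721000000000000),
    (14, 'q3', 1673728200000000000)
)
SELECT id, queue, "timestamp"
FROM (
  SELECT t.*,
         row_number() OVER (
           -- Convert nanoseconds to a UTC timestamp, shift back 18 hours,
           -- and truncate to a date so each 18:00-to-18:00 window is one group.
           PARTITION BY queue,
                        ((to_timestamp("timestamp" / 1000000000) AT TIME ZONE 'UTC')
                          - interval '18' hour)::date
           ORDER BY "timestamp"
         ) AS rn
  FROM mytable t
) ranked
WHERE rn > 2   -- skip the first two entries of each day per queue
ORDER BY id;
```

With the sample data this should return ids 3, 7, 8, 11 and 12. Against a real table, drop the CTE; if the 18:00 boundary is meant in a local time zone rather than UTC, replace 'UTC' with that zone's name.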
