如何检查SQL中每天是否都有条目,而没有间隙。

huangapple go评论67阅读模式
英文:

how to check if there is an entry for each day without gaps SQL

问题

以下是翻译好的内容:

我有一个名为“table1”的日志表,每天都会附加值,并需要从“column1”中查找“column3”值和日期,以确定哪些数据未记录。例如,我的表格如下:

column1 column2 column3
2022-07-14 274,5 markus
2022-07-14 251,2 tess
2022-07-14 162,6 mike
2022-07-15 286,9 markus
2022-07-15 254,8 tess
2022-07-16 289,1 markus
2022-07-17 295,2 markus
2022-07-17 260,0 tess
2022-07-17 182,3 mike

“column3 = 'markus'”的一切都没问题,但我需要得到类似以下的输出:

column1 column3
2022-07-15 mike
2022-07-16 tess
2022-07-17 mike
英文:

I have a log table "table1" with values appended every day and need to find "column3" values and dates from "column1"​​, for which the data wasn't recorded. For example, my table looks like this:

column1 column2 column3
2022-07-14 274,5 markus
2022-07-14 251,2 tess
2022-07-14 162,6 mike
2022-07-15 286,9 markus
2022-07-15 254,8 tess
2022-07-16 289,1 markus
2022-07-17 295,2 markus
2022-07-17 260,0 tess
2022-07-17 182,3 mike

Everything is ok with column3 = 'markus', but I need to get something like this as output:

column1 column3
2022-07-15 mike
2022-07-16 tess
2022-07-16 mike

答案1

得分: 3

One way of addressing this problem is by:

  • rebuilding all combinations of names and dates
  • left-joining this table with your original tables
  • filtering on records whose table values are null
WITH cte_dates AS (
    SELECT DISTINCT column1 AS "date" FROM tab
), cte_names AS (
    SELECT DISTINCT column3 AS "name" FROM tab
)
SELECT cte_dates.date,
       cte_names.name
FROM cte_dates
CROSS JOIN cte_names
LEFT JOIN tab
       ON cte_dates.date = tab.column1
      AND cte_names.name = tab.column3
WHERE tab.column2 IS NULL

Output:

date name
2022-07-16T00:00:00.000Z mike
2022-07-16T00:00:00.000Z tess
2022-07-15T00:00:00.000Z mike

Check the demo here.

If gaps can be found among your dates, you need to use generate_series with boundary dates to generate the corresponding calendar in the first CTE:

WITH cte_dates AS (
    SELECT "date"
    FROM (SELECT MIN(column1) AS startdt, 
                 MAX(column1) AS enddt 
          FROM tab) boundaries
    CROSS JOIN generate_series(startdt :: timestamp, enddt :: timestamp, '1 day' :: interval) "date"
),
...

Check the demo here.

英文:

One way of addressing this problem is by:

  • rebuilding all combinations of names and dates
  • left-joining this table with your original tables
  • filtering on records whose table values is null
WITH cte_dates AS (
	SELECT DISTINCT column1 AS "date" FROM tab
), cte_names AS (
	SELECT DISTINCT column3 AS "name" FROM tab
)
SELECT cte_dates.date,
       cte_names.name
FROM       cte_dates
CROSS JOIN cte_names
LEFT JOIN tab
       ON cte_dates.date = tab.column1
      AND cte_names.name = tab.column3
WHERE tab.column2 IS NULL

Output:

date name
2022-07-16T00:00:00.000Z mike
2022-07-16T00:00:00.000Z tess
2022-07-15T00:00:00.000Z mike

Check the demo here.


If gaps can be found among your dates, you need to use generate_series with boundary dates to generate the corresponding calendar in the first cte:

WITH cte_dates AS (
    SELECT "date"
    FROM (SELECT MIN(column1) AS startdt, 
                 MAX(column1) AS enddt 
          FROM tab) boundaries
    CROSS JOIN generate_series( startdt :: timestamp 
                              , enddt   :: timestamp
                              , '1 day' :: interval ) "date"
),
...

Check the demo here.

答案2

得分: 1

以下是翻译好的代码部分:

首先,您必须为整个时间段建立一个时间基准,然后与column3的唯一值交叉连接以创建所有可能性,然后从基础数据中减去它,如下所示:

with cross_sql as (select * from 
(SELECT timebase 
FROM   generate_series(timestamp '2022-07-14'
                     , timestamp '2022-07-17'
                     , interval  '1 day') as timebase) A
cross join					 
(select distinct column3 from table1) B )

select column3, timebase from cross_sql 
except
select column3, column1 from table1;

请注意,这是代码的中文翻译部分。如果您需要进一步的解释或帮助,请随时提出。

英文:

At first, you must build a time base for the whole period, then cross-join it with distinct of column3 to create all possibilities, then subtract it from your base data as follows:

with cross_sql as (select * from 
(SELECT timebase 
FROM   generate_series(timestamp '2022-07-14'
                     , timestamp '2022-07-17'
                     , interval  '1 day') as timebase) A
cross join					 
(select distinct column3 from table1) B )

select column3, timebase from cross_sql 
except
select column3, column1 from table1;

huangapple
  • 本文由 发表于 2023年5月17日 19:37:43
  • 转载请务必保留本文链接:https://go.coder-hub.com/76271681.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定