如何检查SQL中每天是否都有条目,而没有间隙。

huangapple go评论88阅读模式
英文:

how to check if there is an entry for each day without gaps SQL

问题

以下是翻译好的内容:

我有一个名为“table1”的日志表,每天都会附加值,并需要从“column1”中查找“column3”值和日期,以确定哪些数据未记录。例如,我的表格如下:

column1 column2 column3
2022-07-14 274,5 markus
2022-07-14 251,2 tess
2022-07-14 162,6 mike
2022-07-15 286,9 markus
2022-07-15 254,8 tess
2022-07-16 289,1 markus
2022-07-17 295,2 markus
2022-07-17 260,0 tess
2022-07-17 182,3 mike

“column3 = 'markus'”的一切都没问题,但我需要得到类似以下的输出:

column1 column3
2022-07-15 mike
2022-07-16 tess
2022-07-17 mike
英文:

I have a log table "table1" with values appended every day and need to find "column3" values and dates from "column1"​​, for which the data wasn't recorded. For example, my table looks like this:

column1 column2 column3
2022-07-14 274,5 markus
2022-07-14 251,2 tess
2022-07-14 162,6 mike
2022-07-15 286,9 markus
2022-07-15 254,8 tess
2022-07-16 289,1 markus
2022-07-17 295,2 markus
2022-07-17 260,0 tess
2022-07-17 182,3 mike

Everything is ok with column3 = 'markus', but I need to get something like this as output:

column1 column3
2022-07-15 mike
2022-07-16 tess
2022-07-16 mike

答案1

得分: 3

One way of addressing this problem is by:

  • rebuilding all combinations of names and dates
  • left-joining this table with your original tables
  • filtering on records whose table values are null
  1. WITH cte_dates AS (
  2. SELECT DISTINCT column1 AS "date" FROM tab
  3. ), cte_names AS (
  4. SELECT DISTINCT column3 AS "name" FROM tab
  5. )
  6. SELECT cte_dates.date,
  7. cte_names.name
  8. FROM cte_dates
  9. CROSS JOIN cte_names
  10. LEFT JOIN tab
  11. ON cte_dates.date = tab.column1
  12. AND cte_names.name = tab.column3
  13. WHERE tab.column2 IS NULL

Output:

date name
2022-07-16T00:00:00.000Z mike
2022-07-16T00:00:00.000Z tess
2022-07-15T00:00:00.000Z mike

Check the demo here.

If gaps can be found among your dates, you need to use generate_series with boundary dates to generate the corresponding calendar in the first CTE:

  1. WITH cte_dates AS (
  2. SELECT "date"
  3. FROM (SELECT MIN(column1) AS startdt,
  4. MAX(column1) AS enddt
  5. FROM tab) boundaries
  6. CROSS JOIN generate_series(startdt :: timestamp, enddt :: timestamp, '1 day' :: interval) "date"
  7. ),
  8. ...

Check the demo here.

英文:

One way of addressing this problem is by:

  • rebuilding all combinations of names and dates
  • left-joining this table with your original tables
  • filtering on records whose table values is null
  1. WITH cte_dates AS (
  2. SELECT DISTINCT column1 AS "date" FROM tab
  3. ), cte_names AS (
  4. SELECT DISTINCT column3 AS "name" FROM tab
  5. )
  6. SELECT cte_dates.date,
  7. cte_names.name
  8. FROM cte_dates
  9. CROSS JOIN cte_names
  10. LEFT JOIN tab
  11. ON cte_dates.date = tab.column1
  12. AND cte_names.name = tab.column3
  13. WHERE tab.column2 IS NULL

Output:

date name
2022-07-16T00:00:00.000Z mike
2022-07-16T00:00:00.000Z tess
2022-07-15T00:00:00.000Z mike

Check the demo here.


If gaps can be found among your dates, you need to use generate_series with boundary dates to generate the corresponding calendar in the first cte:

  1. WITH cte_dates AS (
  2. SELECT "date"
  3. FROM (SELECT MIN(column1) AS startdt,
  4. MAX(column1) AS enddt
  5. FROM tab) boundaries
  6. CROSS JOIN generate_series( startdt :: timestamp
  7. , enddt :: timestamp
  8. , '1 day' :: interval ) "date"
  9. ),
  10. ...

Check the demo here.

答案2

得分: 1

以下是翻译好的代码部分:

  1. 首先,您必须为整个时间段建立一个时间基准,然后与column3的唯一值交叉连接以创建所有可能性,然后从基础数据中减去它,如下所示:
  2. with cross_sql as (select * from
  3. (SELECT timebase
  4. FROM generate_series(timestamp '2022-07-14'
  5. , timestamp '2022-07-17'
  6. , interval '1 day') as timebase) A
  7. cross join
  8. (select distinct column3 from table1) B )
  9. select column3, timebase from cross_sql
  10. except
  11. select column3, column1 from table1;

请注意,这是代码的中文翻译部分。如果您需要进一步的解释或帮助,请随时提出。

英文:

At first, you must build a time base for the whole period, then cross-join it with distinct of column3 to create all possibilities, then subtract it from your base data as follows:

  1. with cross_sql as (select * from
  2. (SELECT timebase
  3. FROM generate_series(timestamp '2022-07-14'
  4. , timestamp '2022-07-17'
  5. , interval '1 day') as timebase) A
  6. cross join
  7. (select distinct column3 from table1) B )
  8. select column3, timebase from cross_sql
  9. except
  10. select column3, column1 from table1;

huangapple
  • 本文由 发表于 2023年5月17日 19:37:43
  • 转载请务必保留本文链接:https://go.coder-hub.com/76271681.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定