How to check in SQL that there is an entry for each day, without gaps
Question
I have a log table, "table1", to which values are appended every day. I need to find the "column3" values and the "column1" dates for which no data was recorded. For example, my table looks like this:
column1 | column2 | column3 |
---|---|---|
2022-07-14 | 274,5 | markus |
2022-07-14 | 251,2 | tess |
2022-07-14 | 162,6 | mike |
2022-07-15 | 286,9 | markus |
2022-07-15 | 254,8 | tess |
2022-07-16 | 289,1 | markus |
2022-07-17 | 295,2 | markus |
2022-07-17 | 260,0 | tess |
2022-07-17 | 182,3 | mike |
Everything is fine for column3 = 'markus', but I need to get something like this as output:
column1 | column3 |
---|---|
2022-07-15 | mike |
2022-07-16 | tess |
2022-07-16 | mike |
Answer 1
Score: 3
One way of addressing this problem is by:
- rebuilding all combinations of names and dates
- left-joining these combinations with your original table
- keeping only the rows for which the original table has no match (its columns are NULL)
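-- "tab" is the table name used in the demo; substitute your own (table1 in the question).
WITH cte_dates AS (
    SELECT DISTINCT column1 AS "date" FROM tab
), cte_names AS (
    SELECT DISTINCT column3 AS "name" FROM tab
)
SELECT cte_dates.date,
       cte_names.name
FROM cte_dates
CROSS JOIN cte_names              -- every (date, name) combination
LEFT JOIN tab
       ON cte_dates.date = tab.column1
      AND cte_names.name = tab.column3
WHERE tab.column2 IS NULL         -- no matching row was logged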
Output:
date | name |
---|---|
2022-07-16T00:00:00.000Z | mike |
2022-07-16T00:00:00.000Z | tess |
2022-07-15T00:00:00.000Z | mike |
Check the demo here.
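For readers who want to try the cross join / left join idea without a PostgreSQL server, here is a minimal sketch that runs the same query shape against the question's sample data. It assumes SQLite (through Python's standard sqlite3 module) instead of PostgreSQL, stores the dates as ISO text, and uses a hypothetical helper name, find_missing. It also filters on the join key column1 rather than column2, on the assumption that column2 could legitimately be NULL in a logged row.

```python
import sqlite3

# Sample rows from the question (decimal commas written as decimal points).
rows = [
    ("2022-07-14", 274.5, "markus"),
    ("2022-07-14", 251.2, "tess"),
    ("2022-07-14", 162.6, "mike"),
    ("2022-07-15", 286.9, "markus"),
    ("2022-07-15", 254.8, "tess"),
    ("2022-07-16", 289.1, "markus"),
    ("2022-07-17", 295.2, "markus"),
    ("2022-07-17", 260.0, "tess"),
    ("2022-07-17", 182.3, "mike"),
]

QUERY = """
WITH cte_dates AS (
    SELECT DISTINCT column1 AS "date" FROM tab
), cte_names AS (
    SELECT DISTINCT column3 AS "name" FROM tab
)
SELECT cte_dates."date", cte_names."name"
FROM cte_dates
CROSS JOIN cte_names
LEFT JOIN tab
       ON cte_dates."date" = tab.column1
      AND cte_names."name" = tab.column3
WHERE tab.column1 IS NULL          -- assumption: test the join key, not column2
ORDER BY cte_dates."date", cte_names."name"
"""


def find_missing(sample_rows):
    """Load the rows into an in-memory SQLite table and return missing (date, name) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (column1 TEXT, column2 REAL, column3 TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", sample_rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


missing = find_missing(rows)
print(missing)
```

The ORDER BY is only there to make the result deterministic; without it the rows can come back in any order, as in the output shown above.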
If entire dates can be missing from the table (days on which nobody logged anything), SELECT DISTINCT column1 will not list them. In that case, build the calendar in the first CTE with PostgreSQL's generate_series, using the earliest and latest dates as boundaries:
WITH cte_dates AS (
    SELECT "date"
    FROM (SELECT MIN(column1) AS startdt,
                 MAX(column1) AS enddt
          FROM tab) boundaries
    -- one row per day between the first and last logged date
    CROSS JOIN generate_series(startdt :: timestamp,
                               enddt :: timestamp,
                               '1 day' :: interval) "date"
),
...
Check the demo here.
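The calendar idea can be sketched in a runnable form as follows. This is an illustration under stated assumptions, not the answer's exact code: it uses SQLite through Python's sqlite3 module, and because generate_series is not reliably available there, it swaps in a recursive CTE to produce one row per day. The sample rows are hypothetical and deliberately contain no entry at all for 2022-07-16.

```python
import sqlite3

# Hypothetical sample: nobody logged anything on 2022-07-16.
rows = [
    ("2022-07-14", 274.5, "markus"),
    ("2022-07-14", 251.2, "tess"),
    ("2022-07-15", 286.9, "markus"),
    ("2022-07-17", 295.2, "markus"),
    ("2022-07-17", 260.0, "tess"),
]

QUERY = """
WITH RECURSIVE cte_dates(day) AS (
    -- start at the earliest logged date ...
    SELECT MIN(column1) FROM tab
    UNION ALL
    -- ... and add one day at a time until the latest logged date
    SELECT date(day, '+1 day')
    FROM cte_dates
    WHERE day < (SELECT MAX(column1) FROM tab)
), cte_names AS (
    SELECT DISTINCT column3 AS name FROM tab
)
SELECT cte_dates.day, cte_names.name
FROM cte_dates
CROSS JOIN cte_names
LEFT JOIN tab
       ON cte_dates.day = tab.column1
      AND cte_names.name = tab.column3
WHERE tab.column1 IS NULL
ORDER BY cte_dates.day, cte_names.name
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (column1 TEXT, column2 REAL, column3 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
missing = conn.execute(QUERY).fetchall()
conn.close()

print(missing)
```

With a plain SELECT DISTINCT column1 calendar, 2022-07-16 would never appear in the result, because no row carries that date; the generated calendar is what makes the fully missing day visible.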
Answer 2
Score: 1
First, build a time base for the whole period. Then cross-join it with the distinct values of column3 to create all possible combinations, and finally subtract your base data from that set with EXCEPT, as follows:
-- all (day, column3) combinations for the period
with cross_sql as (
    select *
    from (select timebase
          from generate_series(timestamp '2022-07-14',
                               timestamp '2022-07-17',
                               interval '1 day') as timebase) A
    cross join
         (select distinct column3 from table1) B
)
select column3, timebase from cross_sql
except
-- remove the combinations that were actually logged
select column3, column1 from table1;
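As a rough way to check the EXCEPT approach against the question's sample data, the sketch below runs an equivalent query in SQLite via Python's sqlite3 module. The assumptions: SQLite stands in for PostgreSQL, a recursive CTE replaces generate_series, the dates are stored as ISO text, and the period boundaries are hard-coded just as in the query above.

```python
import sqlite3

# Sample rows from the question (decimal commas written as decimal points).
rows = [
    ("2022-07-14", 274.5, "markus"),
    ("2022-07-14", 251.2, "tess"),
    ("2022-07-14", 162.6, "mike"),
    ("2022-07-15", 286.9, "markus"),
    ("2022-07-15", 254.8, "tess"),
    ("2022-07-16", 289.1, "markus"),
    ("2022-07-17", 295.2, "markus"),
    ("2022-07-17", 260.0, "tess"),
    ("2022-07-17", 182.3, "mike"),
]

QUERY = """
WITH RECURSIVE timebase(day) AS (
    SELECT '2022-07-14'
    UNION ALL
    SELECT date(day, '+1 day') FROM timebase WHERE day < '2022-07-17'
), cross_sql AS (
    -- every (column3, day) combination for the period
    SELECT B.column3, timebase.day
    FROM timebase
    CROSS JOIN (SELECT DISTINCT column3 FROM table1) AS B
)
SELECT column3, day FROM cross_sql
EXCEPT
SELECT column3, column1 FROM table1   -- combinations that were logged
ORDER BY 2, 1                         -- by day, then by column3
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, column2 REAL, column3 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
not_logged = conn.execute(QUERY).fetchall()
conn.close()

print(not_logged)
```

EXCEPT compares whole rows, so both SELECT lists must produce the columns in the same order and with comparable types; here both sides yield (column3, date-as-text).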