SQL查询以检查计数。

huangapple go评论103阅读模式
英文:

SQL query to check the count

问题

我有一个在BigQuery中的列,其格式如下所示。时间(时间以UTC为准)以15分钟的间隔增加(从00:10开始),并持续一整天直到23:......

现在我想使用查询来检查数据是否及时更新。也就是说,在每个时间点,我们应该有多个条目(按时间分组时计数应该大于0)。如果在特定时间点没有计数,我的查询应该返回一个引发该情况的标志。
如果我们在01:05运行查询,我们知道应该有四个点(00:10、00:25、00:40、00:55)的计数不为零。
如何使用查询进行比较和确保这一点?

  1. select timecolumn as time, count(*) as cnt
  2. from 'project.dataset.table'
  3. group by 1
  4. order by 1

将会得到以下结果:

  1. 00:10 5
  2. 00:25 5
  3. 00:40 10
  4. ....
英文:

I have a column in bigquery which is time in the below format. The time(time is in utc) increases in 15 minute interval(starting from 00:10) and it continues for the whole day till 23:.....

  1. 00:10
  2. 00:25
  3. 00:40
  4. ...

now i want to check if we have data uptodate using a query. That is every time point we should have multiple entries(count should be >0 when grouped by time). If there is no count at a particular point my query should be returning a flag whih raises that.
if we are running the query at 01:05 we know that there should be four points(00:10 00:25 00:40 00:55) with non zero count
How do we compare and ensure this using a query?

  1. select timecolumn as time, count(*) as cnt
  2. from 'project.dataset.table'
  3. group by 1
  4. order by 1

will give the below

  1. 00:10 5
  2. 00:25 5
  3. 00:40 10
  4. ....
  5. select timecolumn as time, count(*) as cnt
  6. from 'project.dataset.table'

答案1

得分: 0

I believe you can skip group by and do something like shown below.

required_times 返回的行数将在一天内从0增长到最多24*4条记录。

my_data_table 仅用于模拟您的数据表。如果它与 required_times 的值不完全匹配,您可能需要应用 timestamp_trunc(timecolumn, MINUTE)

  1. with required_times as (
  2. select * from unnest(GENERATE_TIMESTAMP_ARRAY(
  3. date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE),
  4. current_timestamp(),
  5. INTERVAL 15 MINUTE)) as timecolumn
  6. ),
  7. my_data_table as (
  8. select * from unnest(GENERATE_TIMESTAMP_ARRAY(
  9. date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE),
  10. date_add(current_timestamp(), INTERVAL -1 HOUR),
  11. INTERVAL 15 MINUTE)) as timecolumn
  12. )
  13. select
  14. required_times.timecolumn
  15. from
  16. required_times
  17. left join my_data_table on required_times.timecolumn = my_data_table.timecolumn
  18. where
  19. my_data_table.timecolumn is null

将返回类似以下的结果,因为缺少一天中最后一个小时的记录在 my_data_table 中。

SQL查询以检查计数。

英文:

I believe you can skip group by and do something like shown below.

Number of rows returned by required_times will grow from 0 to a max of 24*4 records during the day.

my_data_table is just for emulating your data table. You may need to apply timestamp_trunc(timecolumn, MINUTE) for your data if it does not match exactly with values from required_times.

  1. with required_times as (
  2. select * from unnest(GENERATE_TIMESTAMP_ARRAY(
  3. date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE),
  4. current_timestamp(),
  5. INTERVAL 15 MINUTE)) as timecolumn
  6. ),
  7. my_data_table as (
  8. select * from unnest(GENERATE_TIMESTAMP_ARRAY(
  9. date_add(timestamp_trunc(current_timestamp(), DAy), INTERVAL 10 MINUTE),
  10. date_add(current_timestamp(), INTERVAL -1 HOUR),
  11. INTERVAL 15 MINUTE)) as timecolumn
  12. )
  13. select
  14. required_times.timecolumn
  15. from
  16. required_times
  17. left join my_data_table on required_times.timecolumn = my_data_table.timecolumn
  18. where
  19. my_data_table.timecolumn is null

will return e.g. like this because records for the last hour of day is missing from my_data_table

SQL查询以检查计数。

huangapple
  • 本文由 发表于 2023年4月19日 21:52:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/76055358.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定