SQL查询以检查计数。

huangapple go评论72阅读模式
英文:

SQL query to check the count

问题

我有一个在BigQuery中的列,其格式如下所示。时间(时间以UTC为准)以15分钟的间隔增加(从00:10开始),并持续一整天直到23:......

现在我想使用查询来检查数据是否及时更新。也就是说,在每个时间点,我们应该有多个条目(按时间分组时计数应该大于0)。如果在特定时间点没有计数,我的查询应该返回一个引发该情况的标志。
如果我们在01:05运行查询,我们知道应该有四个点(00:10、00:25、00:40、00:55)的计数不为零。
如何使用查询进行比较和确保这一点?

select timecolumn as time, count(*) as cnt 
from 'project.dataset.table' 
group by 1 
order by 1

将会得到以下结果:

00:10     5 
00:25     5
00:40     10
    
....
英文:

I have a column in bigquery which is time in the below format. The time(time is in utc) increases in 15 minute interval(starting from 00:10) and it continues for the whole day till 23:.....

00:10   
00:25   
00:40   
...

now i want to check if we have data uptodate using a query. That is every time point we should have multiple entries(count should be >0 when grouped by time). If there is no count at a particular point my query should be returning a flag whih raises that.
if we are running the query at 01:05 we know that there should be four points(00:10 00:25 00:40 00:55) with non zero count
How do we compare and ensure this using a query?

select timecolumn as time, count(*) as cnt 
from 'project.dataset.table' 
group by 1 
order by 1 

will give the below

00:10     5 
00:25     5
00:40     10

....



select timecolumn as time, count(*) as cnt 
from 'project.dataset.table'

答案1

得分: 0

I believe you can skip group by and do something like shown below.

required_times 返回的行数将在一天内从0增长到最多24*4条记录。

my_data_table 仅用于模拟您的数据表。如果它与 required_times 的值不完全匹配,您可能需要应用 timestamp_trunc(timecolumn, MINUTE)

with required_times as (
  select * from unnest(GENERATE_TIMESTAMP_ARRAY(
    date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE), 
    current_timestamp(), 
    INTERVAL 15 MINUTE)) as timecolumn
),
my_data_table as (
  select * from unnest(GENERATE_TIMESTAMP_ARRAY(
    date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE), 
    date_add(current_timestamp(), INTERVAL -1 HOUR), 
    INTERVAL 15 MINUTE)) as timecolumn
)

select
  required_times.timecolumn
from
  required_times
  left join my_data_table on required_times.timecolumn = my_data_table.timecolumn
where
  my_data_table.timecolumn is null 

将返回类似以下的结果,因为缺少一天中最后一个小时的记录在 my_data_table 中。

SQL查询以检查计数。

英文:

I believe you can skip group by and do something like shown below.

Number of rows returned by required_times will grow from 0 to a max of 24*4 records during the day.

my_data_table is just for emulating your data table. You may need to apply timestamp_trunc(timecolumn, MINUTE) for your data if it does not match exactly with values from required_times.

with required_times as (
  select * from unnest(GENERATE_TIMESTAMP_ARRAY(
    date_add(timestamp_trunc(current_timestamp(), DAY), INTERVAL 10 MINUTE), 
    current_timestamp(), 
    INTERVAL 15 MINUTE)) as timecolumn
),
my_data_table as (
  select * from unnest(GENERATE_TIMESTAMP_ARRAY(
    date_add(timestamp_trunc(current_timestamp(), DAy), INTERVAL 10 MINUTE), 
    date_add(current_timestamp(), INTERVAL -1 HOUR), 
    INTERVAL 15 MINUTE)) as timecolumn
)

select
  required_times.timecolumn
from
  required_times
  left join my_data_table on required_times.timecolumn = my_data_table.timecolumn
where
  my_data_table.timecolumn is null 

will return e.g. like this because records for the last hour of day is missing from my_data_table

SQL查询以检查计数。

huangapple
  • 本文由 发表于 2023年4月19日 21:52:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/76055358.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定