Aggregate Data Using UNIX Time in SQL

Question

I've seen many posts discussing how to use DATEPART() to aggregate data based on a datetime, but I haven't found anything that covers UNIX time specifically. My goal is to aggregate data by second, minute, hour, or day, based on the initial and final timestamps provided. For example, say my datetimes are:

tstart = datetime('2023-05-22 07:00:00','TimeZone','UTC');
tstop = datetime('2023-05-23 07:00:00','TimeZone','UTC');

These convert to:

Tstart = "1684738800000000"
Tstop = "1684825200000000"

This is the number of microseconds that have passed since the Unix epoch (1970-01-01 00:00:00 UTC). I convert to UNIX time because the dataset on the server side only has this timestamp format and no datetime column. How can I do this for different aggregation types such as sec, min, hr, and day, and for multiples of those (e.g. 2 seconds, 7 hours)? For instance, if my data has microsecond frequency and I convert it to every 3 hours over the timestamps provided above, I would get something like this:

Output

I haven't tried anything yet because I don't know where to start with this as I haven't worked with UNIX time before.
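As a sanity check on the values above, here is a minimal Python sketch of the datetime-to-microseconds conversion. Python is an assumption made only for illustration (the question itself uses MATLAB's datetime), and the helper names to_unix_micros and from_unix_micros are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_unix_micros(dt: datetime) -> int:
    """Microseconds since the Unix epoch for a timezone-aware datetime."""
    delta = dt - EPOCH
    # Pure integer arithmetic, so no float rounding at microsecond precision.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds

def from_unix_micros(us: int) -> datetime:
    """Inverse conversion: microseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(microseconds=us)

tstart = to_unix_micros(datetime(2023, 5, 22, 7, 0, 0, tzinfo=timezone.utc))
tstop = to_unix_micros(datetime(2023, 5, 23, 7, 0, 0, tzinfo=timezone.utc))
print(tstart)  # 1684738800000000
print(tstop)   # 1684825200000000
```

The difference tstop - tstart is 86,400,000,000 microseconds, i.e. exactly one day.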

The Azure data structure I am trying to aggregate and run calculations on:

Data

Answer 1

Score: 0

To aggregate the data in the table into 3-hour intervals and calculate the average of data1 and data2 for each interval, you can use the approach below.

SQL code

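with
   -- cte1: convert the microsecond Unix timestamp to a datetime (whole seconds since 1970-01-01)
   cte1 as (select dateadd(second, [timestamp] / 1000000, '19700101') as dt, * from Mytable1),
   -- cte2: round the hour of day down to a multiple of 3 (0, 3, ..., 21) and keep the calendar date
   cte2 as (select datepart(hour, dt) - (datepart(hour, dt) % 3) as [interval], cast(dt as date) as date_col, * from cte1)
-- one output row per (date, 3-hour interval)
select min([timestamp]) as [timestamp], avg(data1) as data1, avg(data2) as data2 from cte2 group by date_col, [interval];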

The first CTE (cte1) converts the Unix timestamps in the Mytable1 table to datetime values using the DATEADD function. The second CTE (cte2) calculates the 3-hour interval for each datetime value by rounding the hour returned by DATEPART down to a multiple of 3, so the intervals start at 00:00, 03:00, 06:00, and so on. The final SELECT statement groups the data by date and interval and calculates the minimum timestamp and the average of data1 and data2 for each group.
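The query above is specific to 3-hour intervals aligned to midnight. For other units and multipliers (2 seconds, 7 hours, and so on), one possible alternative is to skip the datetime conversion and bucket the raw microsecond values with integer division, anchoring the buckets at tstart. The sketch below is not the answer's method. It runs the SQL through SQLite via Python's sqlite3 module purely so the example is self-contained, the sample rows are made up, and US_PER_UNIT, bucket_width_us, and aggregate are hypothetical names. Only Mytable1, timestamp, data1, and data2 come from the answer.

```python
import sqlite3

# Microseconds per base unit; multiply by the desired multiplier (e.g. 3 for "3 hr").
US_PER_UNIT = {
    "sec": 1_000_000,
    "min": 60_000_000,
    "hr": 3_600_000_000,
    "day": 86_400_000_000,
}

def bucket_width_us(unit: str, multiplier: int) -> int:
    """Bucket width in microseconds, e.g. ("hr", 3) -> 10_800_000_000."""
    return US_PER_UNIT[unit] * multiplier

def aggregate(conn, tstart: int, tstop: int, width_us: int):
    """Average data1/data2 per bucket of width_us, with buckets anchored at tstart."""
    query = """
        SELECT :tstart + ((timestamp - :tstart) / :width) * :width AS bucket_start,
               AVG(data1) AS data1,
               AVG(data2) AS data2
        FROM Mytable1
        WHERE timestamp >= :tstart AND timestamp < :tstop
        GROUP BY bucket_start
        ORDER BY bucket_start
    """
    params = {"tstart": tstart, "tstop": tstop, "width": width_us}
    return conn.execute(query, params).fetchall()

# Made-up sample data, for illustration only.
TSTART, TSTOP = 1684738800000000, 1684825200000000
HOUR = US_PER_UNIT["hr"]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mytable1 (timestamp INTEGER, data1 REAL, data2 REAL)")
conn.executemany(
    "INSERT INTO Mytable1 VALUES (?, ?, ?)",
    [
        (TSTART, 1.0, 10.0),                 # first bucket
        (TSTART + 1 * HOUR, 3.0, 20.0),      # first bucket
        (TSTART + 3 * HOUR, 5.0, 30.0),      # second bucket
        (TSTART + 5 * HOUR + 1, 7.0, 50.0),  # second bucket
        (TSTOP, 100.0, 100.0),               # excluded: range is [tstart, tstop)
    ],
)
rows = aggregate(conn, TSTART, TSTOP, bucket_width_us("hr", 3))
for row in rows:
    print(row)
```

Because every operand is an integer, the division truncates, which is what assigns each row to the start of its bucket. Integer division on a T-SQL BIGINT column truncates the same way, so the bucket_start expression should carry over, although that has not been checked against the Azure table in the question.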

db-fiddle

huangapple
  • Posted on July 28, 2023, 04:14:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76783120.html