Aggregate Data Using UNIX Time in SQL
Question
I've seen many posts discussing how to use DATEPART() to aggregate data based on a datetime, but I haven't found anything that covers UNIX time specifically. My goal is to aggregate data by second, minute, hour, or day, based on the initial and final timestamps provided. For example, say my datetimes are:
tstart = datetime('2023-05-22 07:00:00','TimeZone','UTC');
tstop = datetime('2023-05-23 07:00:00','TimeZone','UTC');
These convert to:
Tstart = "1684738800000000"
Tstop = "1684825200000000"
which is the number of microseconds elapsed since the UNIX epoch. I convert to UNIX time because the dataset on the server side only has this timestamp format and no datetime column. How can I do this for different aggregation types such as sec, min, hr, and day, and for multiples of those (e.g. 2 seconds, 7 hours)? That is, if my data has microsecond frequency and I convert it to every 3 hours over the timestamps provided above, I get something like this:
I haven't tried anything yet because I don't know where to start with this as I haven't worked with UNIX time before.
Azure Data Structure I am trying to do agg/calc on:
Answer 1
Score: 0
To aggregate the data in the table by 3-hour intervals and calculate the average of data1 and data2 for each interval, you can use the approach below.
SQL code
-- cte1: convert the microsecond Unix timestamp to a datetime (UTC)
-- cte2: floor the hour of day to a multiple of 3 and keep the calendar date
WITH
cte1 AS (SELECT DATEADD(SECOND, timestamp / 1000000, '19700101') AS dt, * FROM Mytable1),
cte2 AS (SELECT DATEPART(hour, dt) - (DATEPART(hour, dt) % 3) AS interval, CAST(dt AS date) AS date_col, * FROM cte1)
SELECT MIN(timestamp) AS timestamp, AVG(data1) AS data1, AVG(data2) AS data2
FROM cte2
GROUP BY date_col, interval;
The first CTE (cte1) converts the Unix timestamps in the Mytable1 table to datetime values using the DATEADD function. The second CTE (cte2) calculates the 3-hour interval for each datetime value using the DATEPART function. The final SELECT statement groups the data by date and interval and calculates the minimum timestamp and the average of data1 and data2 for each group.
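The query above buckets by clock hour within each calendar day, so it only fits hour multiples that divide 24 evenly, and its buckets start at 00:00, 03:00, and so on rather than at the requested start time. For arbitrary units and multipliers (2 seconds, 7 hours, and so on), one possible alternative is to bucket the raw microsecond value with integer division. The following is only a sketch under assumptions: it assumes the timestamp column is stored as BIGINT microseconds, and the variable names (@tstart, @tstop, @n, @unit_us, @bucket_us) and the bucketed CTE are hypothetical names introduced here for illustration.

```sql
-- Sketch only: assumes Mytable1(timestamp BIGINT, data1, data2) as in the answer above.
DECLARE @tstart    BIGINT = 1684738800000000;  -- 2023-05-22 07:00:00 UTC, in microseconds
DECLARE @tstop     BIGINT = 1684825200000000;  -- 2023-05-23 07:00:00 UTC, in microseconds
DECLARE @n         BIGINT = 3;                 -- multiplier, e.g. 3 for "every 3 hours"
-- Microseconds per unit: second = 1000000, minute = 60000000,
-- hour = 3600000000, day = 86400000000
DECLARE @unit_us   BIGINT = 3600000000;
DECLARE @bucket_us BIGINT = @n * @unit_us;     -- bucket width in microseconds

WITH bucketed AS (
    -- Integer division floors the offset from @tstart to a bucket number (0, 1, 2, ...)
    SELECT (timestamp - @tstart) / @bucket_us AS bucket_no, data1, data2
    FROM Mytable1
    WHERE timestamp >= @tstart AND timestamp < @tstop
)
SELECT
    @tstart + bucket_no * @bucket_us AS bucket_start,           -- bucket start, microseconds
    DATEADD(SECOND,
            CAST((@tstart + bucket_no * @bucket_us) / 1000000 AS INT),
            '19700101') AS bucket_start_dt,                     -- same instant as a datetime (UTC)
    AVG(data1) AS data1,
    AVG(data2) AS data2
FROM bucketed
GROUP BY bucket_no
ORDER BY bucket_start;
```

With the values shown, the 24-hour range is split into eight 3-hour buckets, the first starting at @tstart. Changing @n and @unit_us switches the aggregation level without touching the query. Buckets that contain no rows do not appear in the result.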