英文:
mysql: get running count over time based on start and end timestamps
问题
有一个 workflows
表格,包含列 (processID
, started_at
, ended_at
)。
我该如何根据下表格中的数据,构建给定时间戳下活跃运行的 processID
的运行计数作为时间序列:
进程时间戳表格:
id started_at ended_at
------- -------------------- --------------------
1203914 2023-04-20T04:54:29Z 2023-04-20T20:43:53Z
1197674 2023-04-20T06:00:28Z 2023-04-20T21:17:53Z
1212050 2023-04-20T18:47:29Z 0001-01-01T00:00:00Z
1198434 2023-04-22T18:16:53Z 2023-04-22T19:02:59Z
1210450 2023-04-22T19:06:53Z 2023-04-26T03:23:39Z
1210466 2023-04-23T05:34:53Z 2023-04-25T07:09:39Z
1201986 2023-04-24T06:30:53Z 2023-04-24T23:49:53Z
1200122 2023-04-24T17:22:53Z 2023-04-25T05:29:39Z
1209114 2023-04-25T01:07:53Z 2023-04-26T23:03:39Z
1198570 2023-04-25T01:10:53Z 2023-04-27T00:59:38Z
期望的运行中进程列表:
timestamp running_process_count
-------------------- ---------------------
2023-04-20T04:54:29Z 1
2023-04-20T06:00:28Z 2
2023-04-20T18:47:29Z 3
2023-04-22T18:16:53Z 1
2023-04-22T19:06:53Z 1
2023-04-23T05:34:53Z 2
2023-04-24T06:30:53Z 3
2023-04-24T17:22:53Z 4
2023-04-25T01:07:53Z 4
我正在寻找类似于以下链接中的方式进行操作:
我可以使用以下查询获得特定小时的进程ID计数,但我寻找的是每个时间戳下的“运行”进程计数(可以是started_at),其中我们显示已开始_at < timestamp < ended_at的进程的计数。
是否需要使用MySQL窗口函数来实现这一点?(lag、lead、partition等)- 抱歉,我不熟悉高级MySQL运算符。
目前我的进展:
SELECT
started_at,
count(*) AS running_count
FROM workflows
GROUP BY
YEAR(started_at),
MONTH(started_at),
DAY(started_at),
HOUR(started_at)
ORDER BY
YEAR(started_at),
MONTH(started_at),
DAY(started_at),
HOUR(started_at);
英文:
I have a workflows
table with columns (processID
, started_at
, ended_at
)
How can I build running counts
of actively running process IDs per a given timestamp as a timeseries from data tabulated below:
Table of process timestamps:
id started_at ended_at
------- -------------------- --------------------
1203914 2023-04-20T04:54:29Z 2023-04-20T20:43:53Z
1197674 2023-04-20T06:00:28Z 2023-04-20T21:17:53Z
1212050 2023-04-20T18:47:29Z 0001-01-01T00:00:00Z
1198434 2023-04-22T18:16:53Z 2023-04-22T19:02:59Z
1210450 2023-04-22T19:06:53Z 2023-04-26T03:23:39Z
1210466 2023-04-23T05:34:53Z 2023-04-25T07:09:39Z
1201986 2023-04-24T06:30:53Z 2023-04-24T23:49:53Z
1200122 2023-04-24T17:22:53Z 2023-04-25T05:29:39Z
1209114 2023-04-25T01:07:53Z 2023-04-26T23:03:39Z
1198570 2023-04-25T01:10:53Z 2023-04-27T00:59:38Z
expected running process list:
timestamp running_process_count
-------------------- ---------------------
2023-04-20T04:54:29Z 1
2023-04-20T06:00:28Z 2
2023-04-20T18:47:29Z 3
2023-04-22T18:16:53Z 1
2023-04-22T19:06:53Z 1
2023-04-23T05:34:53Z 2
2023-04-24T06:30:53Z 3
2023-04-24T17:22:53Z 4
2023-04-25T01:07:53Z 4
I'm looking for something similar to how it's done in:
I can get counts of process IDs for a particular HOUR by using the following query, however what I'm looking for is "running" process count per timestamp (can be started_at) where we display count of processes that have started_at < timestamp < ended_at.
Do I need to use MySQL windowing functions to achieve this? (lag, lead, partition etc) - apologize as I'm not familiar with advanced MySQL operators.
What I have so far:
SELECT
started_at,
count(*) AS running_count
FROM workflows
GROUP BY
YEAR(started_at),
MONTH(started_at),
DAY(started_at),
HOUR(started_at)
ORDER BY
YEAR(started_at),
MONTH(started_at),
DAY(started_at),
HOUR(started_at);
答案1
得分: 1
执行自连接并聚合如下所示:
select t1.started_at,
count(t2.id) cnt
from workflows t1 left join workflows t2
on t1.started_at between t2.started_at and t2.ended_at
group by t1.started_at
order by t1.started_at
英文:
Do a self-join and aggregate as the following:
select t1.started_at,
count(t2.id) cnt
from workflows t1 left join workflows t2
on t1.started_at between t2.started_at and t2.ended_at
group by t1.started_at
order by t1.started_at
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论