May 10, 2023, 11:40:51

MySQL: get running count over time based on start and end timestamps

Question

I have a workflows table with columns (processID, started_at, ended_at).

How can I build a time series of the number of actively running processes at each timestamp from the data tabulated below?

Table of process timestamps:

    id       started_at            ended_at
    -------  --------------------  --------------------
    1203914  2023-04-20T04:54:29Z  2023-04-20T20:43:53Z
    1197674  2023-04-20T06:00:28Z  2023-04-20T21:17:53Z
    1212050  2023-04-20T18:47:29Z  0001-01-01T00:00:00Z
    1198434  2023-04-22T18:16:53Z  2023-04-22T19:02:59Z
    1210450  2023-04-22T19:06:53Z  2023-04-26T03:23:39Z
    1210466  2023-04-23T05:34:53Z  2023-04-25T07:09:39Z
    1201986  2023-04-24T06:30:53Z  2023-04-24T23:49:53Z
    1200122  2023-04-24T17:22:53Z  2023-04-25T05:29:39Z
    1209114  2023-04-25T01:07:53Z  2023-04-26T23:03:39Z
    1198570  2023-04-25T01:10:53Z  2023-04-27T00:59:38Z

Expected running process list:

    timestamp             running_process_count
    --------------------  ---------------------
    2023-04-20T04:54:29Z  1
    2023-04-20T06:00:28Z  2
    2023-04-20T18:47:29Z  3
    2023-04-22T18:16:53Z  1
    2023-04-22T19:06:53Z  1
    2023-04-23T05:34:53Z  2
    2023-04-24T06:30:53Z  3
    2023-04-24T17:22:53Z  4
    2023-04-25T01:07:53Z  4

I'm looking for something similar to how it's done in:

https://stackoverflow.com/questions/26290314/r-calculate-a-count-of-items-over-time-using-start-and-end-dates

I can get the number of process IDs started in a particular hour with the following query. What I'm looking for, however, is the count of "running" processes per timestamp (the timestamp can be started_at), that is, the number of processes for which started_at < timestamp < ended_at.

Do I need MySQL window functions (LAG, LEAD, PARTITION BY, etc.) to achieve this? Apologies, I'm not familiar with advanced MySQL operators.

What I have so far:

    SELECT
      started_at,
      COUNT(*) AS running_count
    FROM workflows
    GROUP BY
      YEAR(started_at),
      MONTH(started_at),
      DAY(started_at),
      HOUR(started_at)
    ORDER BY
      YEAR(started_at),
      MONTH(started_at),
      DAY(started_at),
      HOUR(started_at);

Answer 1

Score: 1

Do a self-join and aggregate as follows:

    SELECT t1.started_at,
           COUNT(t2.id) AS cnt
    FROM workflows t1
    LEFT JOIN workflows t2
      ON t1.started_at BETWEEN t2.started_at AND t2.ended_at
    GROUP BY t1.started_at
    ORDER BY t1.started_at;

See a demo
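
Two points may be worth checking against the sample data. The sketch below is one possible way to handle them, not a definitive implementation. It assumes MySQL 8.0 or later (needed for the CTE and the window function), a unique `id` column as in the table above, and that an unfinished process is stored with the placeholder `0001-01-01 00:00:00` (or NULL) in `ended_at`. Under that assumption `BETWEEN` never matches an unfinished row, so the first statement adds `t2.id = t1.id` to count every process at its own start. The second statement illustrates the window-function route the question asks about. The names `ts`, `delta`, `events`, and `running` are made up for illustration.

```sql
-- Variant 1 (assumption: unfinished rows carry a placeholder or NULL ended_at).
-- Same self-join idea, but each process always counts at its own start,
-- even when its ended_at is earlier than its started_at.
SELECT t1.started_at AS ts,
       COUNT(t2.id)  AS running_process_count
FROM workflows AS t1
JOIN workflows AS t2
  ON t2.id = t1.id
  OR t1.started_at BETWEEN t2.started_at AND t2.ended_at
GROUP BY t1.id, t1.started_at
ORDER BY t1.started_at;

-- Variant 2 (assumption: MySQL 8.0+): turn every start into +1 and every
-- real end into -1, then take a running sum ordered by time.
-- Unfinished processes never emit a -1, so they stay counted as running.
-- A process ending at exactly the same instant as another starts is
-- already subtracted at that instant (unlike the inclusive BETWEEN above).
WITH events AS (
    SELECT started_at AS ts, 1 AS delta
    FROM workflows
    UNION ALL
    SELECT ended_at AS ts, -1 AS delta
    FROM workflows
    WHERE ended_at >= started_at      -- skips placeholder and NULL ends
),
running AS (
    SELECT ts,
           delta,
           SUM(delta) OVER (ORDER BY ts) AS running_process_count
    FROM events
)
SELECT ts, running_process_count
FROM running
WHERE delta = 1                       -- report only the start timestamps
ORDER BY ts;
```

On the sample rows, the first statement should give the counts listed in the question, plus one more row for the last start time, 2023-04-25T01:10:53Z. The second treats the unfinished process as still running, so from 2023-04-22 onward its counts come out one higher than the expected list. Which behaviour is right depends on what the placeholder `ended_at` is meant to signify.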
