2023年6月6日 03:23:11go评论71阅读模式

英文:

Counting sequenced records in ranges, based on grouped varchar values

问题

I'm trying to calculate count of same values, in a certain pattern, where the pattern might occur again.

To simply, please see below data, and expected result.
Solution is implemented in SQL Server 2016 and above ...

The blocks are sequenced based on ts column.

The expected result should look like:

MAX_OF_PROP1_IN_BLOCK	PROP1	PROP2	COUNT_PROP1_IN_BLOCK	COUNT_OF_PROP2_IN_BLOCK
2023-05-01 10:51:00	AAAA	01	3	3
2023-05-01 10:48:00	BBBB	01	3	3
2023-05-01 10:45:00	AAAA	01	4	2
2023-05-01 10:45:00	AAAA	02	4	2
2023-05-01 10:41:00	CCCC	01	3	2
2023-05-01 10:41:00	CCCC	02	3	1
2023-05-01 10:38:00	BBBB	01	3	3
2023-05-01 10:35:00	AAAA	01	2	2
2023-05-01 10:33:00	CCCC	02	9	2
2023-05-01 10:33:00	CCCC	01	9	5
2023-05-01 10:33:00	CCCC	02	9	2

I've tried experimenting with window functions, but I'm keep getting always count/max values based on prop1/prop2 within whole scope, and not in ranges defined by prop1/prop2 values.
It is possible to preorganize data with CTE query before...

I believe the solution is quite simple, but after several tries, and searching the internet, I could not find an appropriate example. I believe, RANGE should be used, but I could not get this to work.

Any help/advice would be greatly appreciated.

英文:

I'm trying to calculate count of same values, in a certain pattern, where the pattern might occur again.

To simply, please see below data, and expected result.
Solution is implemented in SQL Server 2016 and above ...

DECLARE @dataTable TABLE
                   (
                       ts DATETIME, 
                       prop1 VARCHAR(4), 
                       prop2 VARCHAR(2)
                   );

INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:51:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:50:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:49:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:48:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:47:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:46:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:45:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:44:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:43:00&#39;, &#39;AAAA&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:32:00&#39;, &#39;AAAA&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:41:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:40:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:39:00&#39;, &#39;CCCC&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:38:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:37:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:36:00&#39;, &#39;BBBB&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:35:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:34:00&#39;, &#39;AAAA&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:33:00&#39;, &#39;CCCC&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:32:00&#39;, &#39;CCCC&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:31:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:30:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:29:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:28:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:27:00&#39;, &#39;CCCC&#39;, &#39;01&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:26:00&#39;, &#39;CCCC&#39;, &#39;02&#39;);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES (&#39;2023-05-01 10:25:00&#39;, &#39;CCCC&#39;, &#39;02&#39;);

SELECT 
    *, 
    MAX(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC), 
    COUNT(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC )
FROM 
    @dataTable 
ORDER BY 
    ts DESC;

The blocks are sequenced based on ts column.

The expected result should look like:

MAX_OF_PROP1_IN_BLOCK	PROP1	PROP2	COUNT_PROP1_IN_BLOCK	COUNT_OF_PROP2_IN_BLOCK
2023-05-01 10:51:00	AAAA	01	3	3
2023-05-01 10:48:00	BBBB	01	3	3
2023-05-01 10:45:00	AAAA	01	4	2
2023-05-01 10:45:00	AAAA	02	4	2
2023-05-01 10:41:00	CCCC	01	3	2
2023-05-01 10:41:00	CCCC	02	3	1
2023-05-01 10:38:00	BBBB	01	3	3
2023-05-01 10:35:00	AAAA	01	2	2
2023-05-01 10:33:00	CCCC	02	9	2
2023-05-01 10:33:00	CCCC	01	9	5
2023-05-01 10:33:00	CCCC	02	9	2

SELECT * 
, MAX(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC)
, COUNT(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC )
FROM @dataTable
ORDER BY ts DESC;

I believe the solution is quite simple, but after several tries, and searching the internet, I could not find appropriate example. I believe, RANGE should be used, but I could not get this to work.

Any help/advice would be greatly appreciated.

答案1

得分: 2

对于假设有两行包含 10:32:00 是一个打字错误的情况，您面临一个 gaps and islands 问题。

我更喜欢的方法是使用 lag 来在下一行的值不同时设置一个标志，然后将其求和以提供代表每个 island 的累积总数；有了这个设置后，您可以计算和汇总最终的数据。

请查看此演示示例。

以下是代码部分，不需要翻译。

英文:

On the assumption that having two rows with 10:32:00 is a typo, you have a gaps and islands problem.

The method I prefer is to set a flag by using lag whenever the value in the next row is different, then sum this to provide a running total that represents each island; with this in place you can then count and aggregate the final data.

See this Demo Fiddle

with diff as (
  select *, 
    case when lag(prop1) over(order by ts desc ) = prop1 then 0 else 1 end gp1,
    case when lag(prop2) over(order by ts desc ) = prop2 then 0 else 1 end gp2
  from t
), grp as (
  select *, 
    Sum(gp1) over(order by ts desc) grp1, Sum(gp2) over(order by ts desc) grp2
  from diff
)
select Max_Of_Prop_In_Block, prop1, prop2, 
  grp1Count Count_Of_Prop1_In_Block, 
  grp2Count Count_Of_Prop2_In_Block
from (
  select *, 
	Max(ts) over(partition by grp1) Max_Of_Prop_In_Block, 
	Count(*) over(partition by grp1) grp1Count, 
	Count(*) over(partition by grp1, grp2) grp2Count
  from grp
)t
group by Max_Of_Prop_In_Block, grp1Count, prop1, prop2, grp2Count, grp2
order by Max_Of_Prop_In_Block desc, Max(grp2);

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

统计基于分组的字符型值的范围内的已排序记录数。

问题

答案1

更简洁的在GO语言中获取SQL结果集的函数

How to connect to mssql database in linux with GO

Django在多对多关系上使用count筛选

SQL Server：AFTER INSERT触发器和INSERT INTO

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论