英文:
Counting sequenced records in ranges, based on grouped varchar values
问题
I'm trying to calculate count of same values, in a certain pattern, where the pattern might occur again.
To simply, please see below data, and expected result.
Solution is implemented in SQL Server 2016 and above ...
The blocks are sequenced based on ts column.
The expected result should look like:
MAX_OF_PROP1_IN_BLOCK | PROP1 | PROP2 | COUNT_PROP1_IN_BLOCK | COUNT_OF_PROP2_IN_BLOCK |
---|---|---|---|---|
2023-05-01 10:51:00 | AAAA | 01 | 3 | 3 |
2023-05-01 10:48:00 | BBBB | 01 | 3 | 3 |
2023-05-01 10:45:00 | AAAA | 01 | 4 | 2 |
2023-05-01 10:45:00 | AAAA | 02 | 4 | 2 |
2023-05-01 10:41:00 | CCCC | 01 | 3 | 2 |
2023-05-01 10:41:00 | CCCC | 02 | 3 | 1 |
2023-05-01 10:38:00 | BBBB | 01 | 3 | 3 |
2023-05-01 10:35:00 | AAAA | 01 | 2 | 2 |
2023-05-01 10:33:00 | CCCC | 02 | 9 | 2 |
2023-05-01 10:33:00 | CCCC | 01 | 9 | 5 |
2023-05-01 10:33:00 | CCCC | 02 | 9 | 2 |
I've tried experimenting with window functions, but I'm keep getting always count/max values based on prop1/prop2 within whole scope, and not in ranges defined by prop1/prop2 values.
It is possible to preorganize data with CTE query before...
I believe the solution is quite simple, but after several tries, and searching the internet, I could not find an appropriate example. I believe, RANGE should be used, but I could not get this to work.
Any help/advice would be greatly appreciated.
英文:
I'm trying to calculate count of same values, in a certain pattern, where the pattern might occur again.
To simply, please see below data, and expected result.
Solution is implemented in SQL Server 2016 and above ...
DECLARE @dataTable TABLE
(
ts DATETIME,
prop1 VARCHAR(4),
prop2 VARCHAR(2)
);
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:51:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:50:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:49:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:48:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:47:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:46:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:45:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:44:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:43:00', 'AAAA', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:32:00', 'AAAA', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:41:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:40:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:39:00', 'CCCC', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:38:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:37:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:36:00', 'BBBB', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:35:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:34:00', 'AAAA', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:33:00', 'CCCC', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:32:00', 'CCCC', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:31:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:30:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:29:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:28:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:27:00', 'CCCC', '01');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:26:00', 'CCCC', '02');
INSERT INTO @dataTable (ts, prop1, prop2) VALUES ('2023-05-01 10:25:00', 'CCCC', '02');
SELECT
*,
MAX(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC),
COUNT(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC )
FROM
@dataTable
ORDER BY
ts DESC;
The blocks are sequenced based on ts column.
The expected result should look like:
MAX_OF_PROP1_IN_BLOCK | PROP1 | PROP2 | COUNT_PROP1_IN_BLOCK | COUNT_OF_PROP2_IN_BLOCK |
---|---|---|---|---|
2023-05-01 10:51:00 | AAAA | 01 | 3 | 3 |
2023-05-01 10:48:00 | BBBB | 01 | 3 | 3 |
2023-05-01 10:45:00 | AAAA | 01 | 4 | 2 |
2023-05-01 10:45:00 | AAAA | 02 | 4 | 2 |
2023-05-01 10:41:00 | CCCC | 01 | 3 | 2 |
2023-05-01 10:41:00 | CCCC | 02 | 3 | 1 |
2023-05-01 10:38:00 | BBBB | 01 | 3 | 3 |
2023-05-01 10:35:00 | AAAA | 01 | 2 | 2 |
2023-05-01 10:33:00 | CCCC | 02 | 9 | 2 |
2023-05-01 10:33:00 | CCCC | 01 | 9 | 5 |
2023-05-01 10:33:00 | CCCC | 02 | 9 | 2 |
I've tried experimenting with window functions, but I'm keep getting always count/max values based on prop1/prop2 within whole scope, and not in ranges defined by prop1/prop2 values.
It is possible to preorganized data with CTE query before...
SELECT *
, MAX(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC)
, COUNT(ts) OVER (PARTITION BY prop1 ORDER BY ts DESC )
FROM @dataTable
ORDER BY ts DESC;
I believe the solution is quite simple, but after several tries, and searching the internet, I could not find appropriate example. I believe, RANGE should be used, but I could not get this to work.
Any help/advice would be greatly appreciated.
答案1
得分: 2
对于假设有两行包含 10:32:00
是一个打字错误的情况,您面临一个 gaps and islands 问题。
我更喜欢的方法是使用 lag
来在下一行的值不同时设置一个标志,然后将其求和以提供代表每个 island 的累积总数;有了这个设置后,您可以计算和汇总最终的数据。
请查看此演示示例。
以下是代码部分,不需要翻译。
英文:
On the assumption that having two rows with 10:32:00
is a typo, you have a gaps and islands problem.
The method I prefer is to set a flag by using lag
whenever the value in the next row is different, then sum this to provide a running total that represents each island; with this in place you can then count and aggregate the final data.
See this Demo Fiddle
with diff as (
select *,
case when lag(prop1) over(order by ts desc ) = prop1 then 0 else 1 end gp1,
case when lag(prop2) over(order by ts desc ) = prop2 then 0 else 1 end gp2
from t
), grp as (
select *,
Sum(gp1) over(order by ts desc) grp1, Sum(gp2) over(order by ts desc) grp2
from diff
)
select Max_Of_Prop_In_Block, prop1, prop2,
grp1Count Count_Of_Prop1_In_Block,
grp2Count Count_Of_Prop2_In_Block
from (
select *,
Max(ts) over(partition by grp1) Max_Of_Prop_In_Block,
Count(*) over(partition by grp1) grp1Count,
Count(*) over(partition by grp1, grp2) grp2Count
from grp
)t
group by Max_Of_Prop_In_Block, grp1Count, prop1, prop2, grp2Count, grp2
order by Max_Of_Prop_In_Block desc, Max(grp2);
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论