英文:
Give number to rows based on condition
问题
你想要为给定的数据创建一个名为 'sessionid' 的新列,当出现新会话行('YES')时,希望 'sessionid' 的下一组行的值递增1,直到找到另一个 'YES'。你想知道是否有办法在rank函数中添加条件,但不确定如何操作。感谢你的帮助!
英文:
all. I am trying to create another column called 'sessionid' for the following data; When the newsession row = 'YES'. I want the next group of rows value for 'sessionid' to be incremented by 1 until we find another 'YES'. So for example:
newsession | empid | date_time | sessionid |
---|---|---|---|
NO | 12345 | 2023-01-18 00:01:40.000000 | 1 |
NO | 12345 | 2023-01-18 00:01:46.000000 | 1 |
NO | 12345 | 2023-01-18 00:01:42.000000 | 1 |
YES | 12345 | 2023-01-18 00:01:46.000000 | 1 |
NO | 12345 | 2023-01-18 01:34:48.000000 | 2 |
NO | 12345 | 2023-01-18 02:01:59.000000 | 2 |
NO | 12345 | 2023-01-18 02:02:06.000000 | 2 |
YES | 12345 | 2023-01-18 02:02:12.000000 | 2 |
NO | 12345 | 2023-01-18 12:50:45.000000 | 3 |
NO | 12345 | 2023-01-18 12:50:45.000000 | 3 |
I am wondering if there is any way to condition the rank function, but not sure how to do it; Any help will be really appreciated!
答案1
得分: 1
在子查询中,每当前一条记录为“是”时,在新列中放入1
:
SELECT
CASE WHEN LAG(newsession) OVER (PARTITION BY empid ORDER BY date_Time ASC) = 'Yes' THEN 1 ELSE 0 END as newsession_found,
yourtable.*
FROM yourtable
然后在外部查询中,使用这个子查询的结果集,在这个新列上运行累积求和:
SELECT
newsession,
empid,
date_time,
SUM(newsession_found) OVER (PARTITION BY empid ORDER BY date_Time ASC ROWS UNBOUNDED PRECEDING) AS sessionid
FROM
(
SELECT
CASE WHEN LAG(newsession) OVER (PARTITION BY empid ORDER BY date_Time ASC) = 'Yes' THEN 1 ELSE 0 END as newsession_found,
yourtable.*
FROM yourtable
) sub;
英文:
In a subquery, put a 1
in a new column every time the previous record is Yes
:
SELECT
CASE WHEN LAG(newsession) OVER (PARTITION BY empid ORDER BY date_Time ASC) = 'Yes' THEN 1 ELSE 0 END as newsession_found,
yourtable.*
FROM yourtable
Then in an outer query using this subquery's result set, run a cumulative sum on this new column:
SELECT
newsession,
empid,
date_time
SUM(newsession_found) OVER (PARTITION BY empid ORDER BY date_Time ASC ROWS UNBOUNDED PRECEDING) AS sessionid
FROM
(
SELECT
CASE WHEN LAG(newsession) OVER (PARTITION BY empid ORDER BY date_Time ASC) = 'Yes' THEN 1 ELSE 0 END as newsession_found,
yourtable.*
FROM yourtable
) sub;
答案2
得分: 0
我无法按你的要求提供代码的翻译。
英文:
I didn't managed to get the results you're looking for in a single query, but if you can iterate over the records, then your problem becomes much simpler.
You'd have to test the performance of a query like this with a much larger data set to see how it holds up.
------------------------------------------------------------------------------------------------------------------
-- build sample data
------------------------------------------------------------------------------------------------------------------
declare @sessiontable table(
newsession varchar(55),
empid int,
date_time datetime,
sessionId int)
insert into @sessiontable(newsession, empid, date_time)
values
('NO','12345','2023-01-18 00:01:40'),
('NO','12345','2023-01-18 00:01:46'),
('NO','12345','2023-01-18 00:01:42'),
('YES','12345','2023-01-18 00:01:46'),
('NO','12345','2023-01-18 01:34:48'),
('NO','12345','2023-01-18 02:01:59'),
('NO','12345','2023-01-18 02:02:06'),
('YES','12345','2023-01-18 02:02:12'),
('NO','12345','2023-01-18 12:50:45'),
('NO','12345','2023-01-18 12:50:45')
------------------------------------------------------------------------------------------------------------------
-- id each record
------------------------------------------------------------------------------------------------------------------
drop table if exists #numberedRows
select *, row_number() over (order by date_time) as row_number
into #numberedRows
from @sessiontable
------------------------------------------------------------------------------------------------------------------
-- iterate over the records
------------------------------------------------------------------------------------------------------------------
declare @currentRowNumber int = 1
declare @currentSessionId int = 1
while exists(select 1 from #numberedRows where sessionId is null)
begin
declare @newssion varchar(55)
select @newssion = isnull(newsession, 'NO') from #numberedRows where row_number = (@currentRowNumber - 1)
if (@newssion = 'YES')
begin
set @currentSessionId = @currentSessionId + 1
end
update #numberedRows
set sessionId = @currentSessionId
where row_number = @currentRowNumber
set @currentRowNumber = @currentRowNumber + 1
end
select newsession,
empid,
date_time,
sessionId
from #numberedRows
Results:
newsession | empid | date_time | sessionId |
---|---|---|---|
NO | 12345 | 2023-01-18 00:01:40.000 | 1 |
NO | 12345 | 2023-01-18 00:01:42.000 | 1 |
YES | 12345 | 2023-01-18 00:01:46.000 | 1 |
NO | 12345 | 2023-01-18 00:01:46.000 | 2 |
NO | 12345 | 2023-01-18 01:34:48.000 | 2 |
NO | 12345 | 2023-01-18 02:01:59.000 | 2 |
NO | 12345 | 2023-01-18 02:02:06.000 | 2 |
YES | 12345 | 2023-01-18 02:02:12.000 | 2 |
NO | 12345 | 2023-01-18 12:50:45.000 | 3 |
NO | 12345 | 2023-01-18 12:50:45.000 | 3 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论