英文:
DB2: Flag the first event and rows that are 60 days after the last flagged event
问题
| 用户ID | 访问日期 | 新访问 | 
|---|---|---|
| 001 | 1/1/21 | 1 | 
| 001 | 2/7/21 | 0 | 
| 001 | 3/6/21 | 1 | 
| 002 | 2/8/21 | 1 | 
| 002 | 6/3/22 | 1 | 
| 003 | 4/9/21 | 1 | 
| 003 | 5/4/21 | 0 | 
英文:
I have a dataset that looks like this:
| UserID | VisitDate | 
|---|---|
| 001 | 1/1/21 | 
| 001 | 2/7/21 | 
| 001 | 3/6/21 | 
| 002 | 2/8/21 | 
| 002 | 6/3/22 | 
| 003 | 4/9/21 | 
| 003 | 5/4/21 | 
I need to create a logic that would flag the very first visit as a NewVisit per user and then also flag rows if it's more than 60 days since the last "NewVisit".
| UserID | VisitDate | NewVisit | 
|---|---|---|
| 001 | 1/1/21 | 1 | 
| 001 | 2/7/21 | 0 | 
| 001 | 3/6/21 | 1 | 
| 002 | 2/8/21 | 1 | 
| 002 | 6/3/22 | 1 | 
| 003 | 4/9/21 | 1 | 
| 003 | 5/4/21 | 0 | 
I tried using lag/lead but I think I'm missing something.
答案1
得分: 1
建议使用MAX OLAP函数来简化操作。以下是已翻译的代码部分:
WITH MYTAB (UserID, VisitDate) AS
(
VALUES 
  ('001', DATE (TO_DATE ('1/1/21', 'MM/DD/YY'))
  , ('001', DATE (TO_DATE ('2/7/21', 'MM/DD/YY'))
  , ('001', DATE (TO_DATE ('3/6/21', 'MM/DD/YY'))
  , ('002', DATE (TO_DATE ('2/8/21', 'MM/DD/YY'))
  , ('002', DATE (TO_DATE ('6/3/22', 'MM/DD/YY'))
  , ('003', DATE (TO_DATE ('4/9/21', 'MM/DD/YY'))
  , ('003', DATE (TO_DATE ('5/4/21', 'MM/DD/YY'))
)
SELECT 
  UserID
, VisitDate
, CASE 
	WHEN 
	  VisitDate = VisitDate_max 
	  OR DAYS (VisitDate_max) - DAYS (VisitDate) > 60
	THEN 1
	ELSE 0
  END AS NewVisit
FROM 
(
SELECT 
  *
, MAX (VisitDate) OVER (PARTITION BY UserID) AS VisitDate_max
FROM MYTAB
)
ORDER BY UserID, VisitDate
希望这对您有所帮助。
英文:
It's easier to use the MAX OLAP function instead.
WITH MYTAB (UserID, VisitDate) AS
(
VALUES 
  ('001', DATE (TO_DATE ('1/1/21', 'MM/DD/YY')))
, ('001', DATE (TO_DATE ('2/7/21', 'MM/DD/YY')))
, ('001', DATE (TO_DATE ('3/6/21', 'MM/DD/YY')))
, ('002', DATE (TO_DATE ('2/8/21', 'MM/DD/YY')))
, ('002', DATE (TO_DATE ('6/3/22', 'MM/DD/YY')))
, ('003', DATE (TO_DATE ('4/9/21', 'MM/DD/YY')))
, ('003', DATE (TO_DATE ('5/4/21', 'MM/DD/YY')))
)
SELECT 
  UserID
, VisitDate
, CASE 
	WHEN 
	  VisitDate = VisitDate_max 
	  OR DAYS (VisitDate_max) - DAYS (VisitDate) > 60
	THEN 1
	ELSE 0
  END AS NewVisit
FROM 
(
SELECT 
  *
, MAX (VisitDate) OVER (PARTITION BY UserID) AS VisitDate_max
FROM MYTAB
)
ORDER BY UserID, VisitDate
| USERID | VISITDATE | NEWVISIT | 
|---|---|---|
| 001 | 2021-01-01 | 1 | 
| 001 | 2021-02-07 | 0 | 
| 001 | 2021-03-06 | 1 | 
| 002 | 2021-02-08 | 1 | 
| 002 | 2022-06-03 | 1 | 
| 003 | 2021-04-09 | 0 | 
| 003 | 2021-05-04 | 1 | 
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论