SAS: How do I delete records with a date within 30 days of the previous record, recursively?
Question
Say I have a dataset of laboratory records for a set of patients, ordered by ID and by date from earliest to latest, in the format shown below. I want to remove all the bolded labs and keep all the non-bolded ones.
Basically, for each patient I want to keep only the labs done more than 30 days after the previous lab (labs from different patients should not be compared against each other). Removed labs should not count when evaluating the next lab: if a lab is marked for removal, I want to ignore it when deciding whether the next lab falls within 30 days and should therefore be removed too.
You can see that for patient 1111 the 3rd lab is within 30 days of the 2nd lab, but because the 2nd lab is already marked for removal it does not count, so the 3rd lab should be kept.
Does anyone have advice or suggestions for how this could be accomplished in SAS?
ID    Lab_Date
1111  Jan 1 2023
1111  Jan 15 2023  (bolded)
1111  Feb 3 2023
1111  Feb 16 2023  (bolded)
2222  Jan 2 2023
2222  Jan 20 2023  (bolded)
2222  Feb 8 2023
2222  Feb 10 2023  (bolded)
2222  Feb 12 2023  (bolded)
3333  Jan 15 2023
3333  Feb 5 2023   (bolded)
3333  Feb 18 2023
I've tried retaining/lagging the lab date from the previous record and then comparing it to the current record, but this ends up removing records that shouldn't be removed.
Answer 1
Score: 0
You can use retain to do this:
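/* Sample data: one row per lab, already ordered by id and lab_date */
data have;
  /* anydtdte11. reads dates such as "Jan 15 2023" into SAS date values */
  input id$ lab_date anydtdte11.;
  format lab_date yymmdd10.;
  cards;
1111 Jan 1 2023
1111 Jan 15 2023
1111 Feb 3 2023
1111 Feb 16 2023
2222 Jan 2 2023
2222 Jan 20 2023
2222 Feb 8 2023
2222 Feb 10 2023
2222 Feb 12 2023
3333 Jan 15 2023
3333 Feb 5 2023
3333 Feb 18 2023
;
run;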
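data want;
  set have;
  by id;                 /* provides first.id; input must be sorted by id */
  retain lab_date_pre;   /* benchmark: date of the most recent kept lab */
  /* first lab of each patient is always kept and starts the benchmark */
  if first.id then lab_date_pre=lab_date;
  /* within 30 days of the benchmark: drop, benchmark stays unchanged */
  else if lab_date-lab_date_pre<=30 then delete;
  /* more than 30 days later: keep and move the benchmark forward */
  else lab_date_pre=lab_date;
run;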
For the first record of each id, lab_date becomes the benchmark date, stored in lab_date_pre.
For each following record: if lab_date is within 30 days of the benchmark date, the record is deleted and the benchmark stays where it was. Otherwise (lab_date is more than 30 days after the benchmark date), the record is kept and lab_date becomes the new benchmark date.
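To check the logic before actually deleting anything, one option is to flag each record instead of dropping it. The following is only a sketch under the same assumptions as above (the have dataset exists and is sorted by id and lab_date); the names check, days_since and keep_flag are illustrative and not part of the original answer.

```sas
/* Sketch: flag records instead of deleting them, so each decision can be inspected. */
/* check, days_since and keep_flag are hypothetical names chosen for illustration.   */
data check;
  set have;
  by id;
  retain lab_date_pre;                       /* date of the most recent kept lab */
  if first.id then do;
    days_since = .;                          /* no earlier lab for this patient */
    keep_flag = 1;                           /* first lab is always kept */
    lab_date_pre = lab_date;
  end;
  else do;
    days_since = lab_date - lab_date_pre;    /* days since the last KEPT lab */
    keep_flag = (days_since > 30);           /* 1 = keep, 0 = would be removed */
    if keep_flag then lab_date_pre = lab_date;
  end;
  drop lab_date_pre;
run;

proc print data=check noobs;
  var id lab_date days_since keep_flag;
run;
```

Provided the sample dates are read as intended, the rows with keep_flag = 1 should be Jan 1 and Feb 3 for 1111, Jan 2 and Feb 8 for 2222, and Jan 15 and Feb 18 for 3333. The key point is that days_since is measured against the last kept lab rather than the immediately preceding row, which is why a plain lag() comparison removes too many records.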