英文:
SAS. calculate % from DISTINCT COUNT
问题
我在使用SAS Studio 版本:2022.09 进行工作。
我正在处理调查数据,将跟踪在三周以上没有提交调查的区域-设施。调查是自愿的,但理想情况下,设施会每周提交一次新调查。
为了获得每个区域之前提交过但已经三周未提交的设施列表。因为我们不指望每个设施都会成功,我们将在跟踪设施十周后停止。
Data have;
set want;
DaysDiff=intck('day', Date, today());
run;
proc sort data=have;
by Facility Region Date;
run;
data have;
set have;
by Facility;
if last.Facility;
run;
proc sort data=have
out=SurveysMissing;
BY Region Facility;
WHERE DaysDiff>21 AND DaysDiff<70;
run;
为了帮助确定最近未提交调查的设施的重要性,我想获得一个百分比。
[每个区域未提交调查超过21但少于70天的设施总数] / [过去10周内已报告的每个区域设施的总数]
/* 未提交调查>21且<70天的设施数 */
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff>21 AND DaysDiff <70
GROUP BY Region;
run;
/* 每个区域的不同设施数量 */
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff <70
GROUP BY Region;
run;
我需要创建表并执行左连接来计算百分比吗?
谢谢。
英文:
I am working in SAS Studio Version: 2022.09.
I am working with survey data and will be tracking Region-Facility that has not submitted a survey in over 3 weeks. Surveys are voluntary but ideally facilities will submit a new survey weekly.
Region | Facility (Type&Name) | Date Survey Submitted |
---|---|---|
North | Hospital-Baptist Hospital | 1/01/2023 |
South | PCP-Family Care | 1/01/2023 |
North | PCP- Primary Medical | 1/08/2023 |
South | PCP-Family Care | 1/08/2023 |
North | Hospital-Baptist Hospital | 1/15/2023 |
North | Hospital-St Mary Hospital | 1/15/2023 |
West | Daycare-Early Learning | 1/15/2023 |
West | Hospital-Methodist | 1/15/2023 |
South | Daycare-Early Learning | 1/15/2023 |
To obtain a list of facilities by region that submitted before but have not submitted in 3 weeks. Since we do not expect to be successful with every facility, we will stop following facilities after 10 weeks.
Data have;
set want;
DaysDiff=intck('day', Date, today());
run;
proc sort data=have;
by Facility Region Date;
run;
data have;
set have;
by Facility;
if last.Facility;
run;
proc sort data=have
out=SurveysMissing;
BY Region Facility;
WHERE DaysDiff>21 AND DaysDiff<70;
run;
To assist in determining significance of losing facilities that had not submitted recently, I would like to obtain a %.
[Total # of facilities per REGION that have not submitted survey >21 <70] / [Total # of facilities per REGION that have reported in the last 10 weeks]
/*#facilities not submitted >21 AND <70 /*
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff>21 AND DaysDiff <70
GROUP BY Region;
run;
/*Count of Distinct Facilities per Region*/
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff <70
GROUP BY Region;
run;
Would I need to create tables and do a left join to calculate %?
Thanks.
答案1
得分: 1
在Proc SQL中,真条件解析为1
,假条件解析为0
。 您可以利用这一特性来计算表达式或二进制标志的总和比率。
示例:
基于标记设施的子查询计算比率
proc sql;
create table want as
select
region, sum (isquiet_flag) / sum (submitted_flag) label = 'Fraction of quiet facilities'
from
( select region, facility
, min(today() - date_submitted ) > 21 as isquiet_flag
, min(today() - date_submitted ) < 70 as submitted_flag
from have
where today() - date_submitted < 70
group by region, facility
)
group by
region
;
英文:
In Proc SQL a true condition resolves to 1
and false to 0
. You can leverage this feature to compute the ratio of sums of expressions or binary flags.
Example:
Compute the ratio based on a subquery that flags facilities
proc sql;
create table want as
select
region, sum (isquiet_flag) / sum (submitted_flag) label = 'Fraction of quiet facilities'
from
( select region, facility
, min(today() - date_submitted ) > 21 as isquiet_flag
, min(today() - date_submitted ) < 70 as submitted_flag
from have
where today() - date_submitted < 70
group by region, facility
)
group by
region
;
</details>
# 答案2
**得分**: 1
在你上一个针对 have
数据的步骤中,为缺失调查添加一个指示器。
data have;
set have;
by Facility;
if last.Facility;
surverymissing = (daysdiff > 21); * 如果条件为真,则包含1,否则为0;
run;
然后使用 proc summary
计算每个地区的分子和分母。分子是 surveymissing
的总和,而分母是相同项目的计数。
proc summary data=have nway;
where daysdiff < 70;
class region;
var surveymissing;
output out=region_summary(drop=_) sum=SurveysMissing n=TotalFacilities;
run;
<details>
<summary>英文:</summary>
In your last data step for `have`, add an indicator for missing survery.
data have;
set have;
by Facility;
if last. Facility;
surverymissing = (daysdiff > 21); * contains 1 if condition is true, otherwise 0;
run;
Then use `proc summary` to compute your numerator and denominator for each region. The numerator is the sum of `surveymissing` while the denominator is the count of the same.
proc summary data=have nway;
where daysdiff < 70;
class region;
var surveymissing;
output out=region_summary (drop=_:) sum=SurveysMissing n=TotalFacilities;
run;
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论