计算 % 从不同计数中

huangapple go评论50阅读模式
英文:

SAS. calculate % from DISTINCT COUNT

问题

我在使用SAS Studio 版本:2022.09 进行工作。

我正在处理调查数据,将跟踪在三周以上没有提交调查的区域-设施。调查是自愿的,但理想情况下,设施会每周提交一次新调查。

为了获得每个区域之前提交过但已经三周未提交的设施列表。因为我们不指望每个设施都会成功,我们将在跟踪设施十周后停止。

Data have;
    set want;
    DaysDiff=intck('day', Date, today());
run;

proc sort data=have;
    by Facility Region Date;
run;

data have;
    set have;
    by Facility;
    if last.Facility;
run;

proc sort data=have
   out=SurveysMissing;
   BY Region Facility;
   WHERE DaysDiff>21 AND DaysDiff<70;
run;

为了帮助确定最近未提交调查的设施的重要性,我想获得一个百分比。

[每个区域未提交调查超过21但少于70天的设施总数] / [过去10周内已报告的每个区域设施的总数]

/* 未提交调查&gt;21且&lt;70天的设施数 */
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff>21 AND DaysDiff <70
GROUP BY Region;
run;
/* 每个区域的不同设施数量 */
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff <70
GROUP BY Region;
run;

我需要创建表并执行左连接来计算百分比吗?

谢谢。

英文:

I am working in SAS Studio Version: 2022.09.

I am working with survey data and will be tracking Region-Facility that has not submitted a survey in over 3 weeks. Surveys are voluntary but ideally facilities will submit a new survey weekly.

Region Facility (Type&Name) Date Survey Submitted
North Hospital-Baptist Hospital 1/01/2023
South PCP-Family Care 1/01/2023
North PCP- Primary Medical 1/08/2023
South PCP-Family Care 1/08/2023
North Hospital-Baptist Hospital 1/15/2023
North Hospital-St Mary Hospital 1/15/2023
West Daycare-Early Learning 1/15/2023
West Hospital-Methodist 1/15/2023
South Daycare-Early Learning 1/15/2023

To obtain a list of facilities by region that submitted before but have not submitted in 3 weeks. Since we do not expect to be successful with every facility, we will stop following facilities after 10 weeks.

Data have;
    set want;
    DaysDiff=intck(&#39;day&#39;, Date, today());
run;

proc sort data=have;
    by Facility Region Date;
run;

data have;
    set have;
    by Facility;
    if last.Facility;
run;

proc sort data=have
   out=SurveysMissing;
   BY Region Facility;
   WHERE DaysDiff&gt;21 AND DaysDiff&lt;70;
run;

To assist in determining significance of losing facilities that had not submitted recently, I would like to obtain a %.

[Total # of facilities per REGION that have not submitted survey &gt;21 &lt;70] / [Total # of facilities per REGION that have reported in the last 10 weeks]

/*#facilities not submitted &gt;21 AND &lt;70 /*

proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff&gt;21 AND DaysDiff &lt;70
GROUP BY Region;
run;
/*Count of Distinct Facilities per Region*/
proc sql;
SELECT Count(Distinct Facility) AS Count, Region
FROM have
WHERE DaysDiff &lt;70
GROUP BY Region;
run;

Would I need to create tables and do a left join to calculate %?

Thanks.

答案1

得分: 1

在Proc SQL中,真条件解析为1,假条件解析为0。 您可以利用这一特性来计算表达式或二进制标志的总和比率。

示例:

基于标记设施的子查询计算比率

proc sql;
  create table want as
  select
    region, sum (isquiet_flag) / sum (submitted_flag) label = 'Fraction of quiet facilities'
  from 
    ( select region, facility
      , min(today() - date_submitted ) > 21 as isquiet_flag
      , min(today() - date_submitted ) < 70 as submitted_flag
      from have
      where today() - date_submitted < 70
      group by region, facility
    )
  group by 
    region
  ;
英文:

In Proc SQL a true condition resolves to 1 and false to 0. You can leverage this feature to compute the ratio of sums of expressions or binary flags.

Example:

Compute the ratio based on a subquery that flags facilities

proc sql;
  create table want as
  select
    region, sum (isquiet_flag) / sum (submitted_flag) label = &#39;Fraction of quiet facilities&#39;
  from 
    ( select region, facility
      , min(today() - date_submitted ) &gt; 21 as isquiet_flag
      , min(today() - date_submitted ) &lt; 70 as submitted_flag
      from have
      where today() - date_submitted &lt; 70
      group by region, facility
    )
  group by 
    region
  ;

</details>



# 答案2
**得分**: 1

在你上一个针对 have 数据的步骤中,为缺失调查添加一个指示器。

data have;
set have;
by Facility;
if last.Facility;
surverymissing = (daysdiff > 21); * 如果条件为真,则包含1,否则为0;
run;

然后使用 proc summary 计算每个地区的分子和分母。分子是 surveymissing 的总和,而分母是相同项目的计数。

proc summary data=have nway;
where daysdiff < 70;
class region;
var surveymissing;
output out=region_summary(drop=_) sum=SurveysMissing n=TotalFacilities;
run;


<details>
<summary>英文:</summary>

In your last data step for `have`, add an indicator for missing survery.

data have;
set have;
by Facility;
if last. Facility;
surverymissing = (daysdiff > 21); * contains 1 if condition is true, otherwise 0;
run;

Then use `proc summary` to compute your numerator and denominator for each region. The numerator is the sum of `surveymissing` while the denominator is the count of the same.

proc summary data=have nway;
where daysdiff < 70;
class region;
var surveymissing;
output out=region_summary (drop=_:) sum=SurveysMissing n=TotalFacilities;
run;



</details>



huangapple
  • 本文由 发表于 2023年2月16日 06:38:32
  • 转载请务必保留本文链接:https://go.coder-hub.com/75466072.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定