Count open cases with and without time cut-off

Question

I have this dataset with variables Person, RelevantCase, StartDate, and EndDate:

df <- data.frame(Person = c('111','334','334','334','334','334','888','888','888','888','888','888','888','888'),
                 RelevantCase = c(0,1,1,0,1,0,1,0,1,0,0,1,0,1),
                 StartDate = c('2017-03-04','2015-11-14','2018-04-26','2020-01-24','2020-01-25','2020-02-29','2015-08-09',
                               '2015-08-09','2018-04-10','2019-09-20','2020-06-30','2020-11-01','2021-08-13','2022-11-11'),
                 EndDate = c('2017-12-12','2022-01-25','2020-03-01','2021-02-24','2020-01-30','2022-02-02','2019-10-20',
                             '2019-10-30','2018-10-10','2021-10-10','2020-07-20','2022-11-20','2021-11-12','2023-01-01')
)

I want to create two new variables:

  1. A count of the number of relevant open cases per Person. That is, for each case I want to count how many of the same Person's relevant cases have

    1.1. a StartDate before the current case's StartDate and

    1.2. an EndDate on or after the current case's StartDate.

By "relevant case" I mean that only observations with RelevantCase==1 should be counted.

  2. A count of the number of relevant open cases per Person that started within the two years before the current StartDate. This is the same as the first new variable, except that it does not count relevant open cases whose StartDate is more than two years before the current StartDate.
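
To make the two conditions concrete, here is a minimal base R sketch that checks them for a single row: Person 334's case starting on 2020-01-24, with the values copied from the example data. The names `cases`, `current`, `is_open`, and `is_open_2y` are only illustrative, and treating "two years" as 730 days is an assumption rather than something the question specifies.

```r
# Cases for Person 334, copied from the example data
cases <- data.frame(
  RelevantCase = c(1, 1, 0, 1, 0),
  StartDate = as.Date(c("2015-11-14", "2018-04-26", "2020-01-24",
                        "2020-01-25", "2020-02-29")),
  EndDate   = as.Date(c("2022-01-25", "2020-03-01", "2021-02-24",
                        "2020-01-30", "2022-02-02"))
)

current <- cases$StartDate[3]  # the case starting on 2020-01-24

# Relevant cases that started earlier and are still open on the current StartDate
is_open <- cases$RelevantCase == 1 &
  cases$StartDate < current &
  cases$EndDate >= current

# Same, restricted to cases that started less than 730 days earlier
# (730 days is an assumed approximation of "two years")
is_open_2y <- is_open & as.numeric(current - cases$StartDate) < 730

sum(is_open)     # 2: the cases starting 2015-11-14 and 2018-04-26
sum(is_open_2y)  # 1: only the case starting 2018-04-26
```

These two counts correspond to the values wanted for that row in NumberOpenCases and NumberOpenCases_2y.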

The resulting dataset should look like this:

df2 <- data.frame(Person = c('111','334','334','334','334','334','888','888','888','888','888','888','888','888'),
                 RelevantCase = c(0,1,1,0,1,0,1,0,1,0,0,1,0,1),
                 StartDate = c('2017-03-04','2015-11-14','2018-04-26','2020-01-24','2020-01-25','2020-02-29','2015-08-09',
                               '2015-08-09','2018-04-10','2019-09-20','2020-06-30','2020-11-01','2021-08-13','2022-11-11'),
                 EndDate = c('2017-12-12','2022-01-25','2020-03-01','2021-02-24','2020-01-30','2022-02-02','2019-10-20',
                             '2019-10-30','2018-10-10','2021-10-10','2020-07-20','2022-11-20','2021-11-12','2023-01-01'),
                 NumberOpenCases = c(0,0,1,2,2,2,0,0,1,1,0,0,1,1),
                 NumberOpenCases_2y = c(0,0,0,1,1,1,0,0,0,0,0,0,1,0)
)

Answer 1

Score: 1

This computes the number of relevant open cases by looping over the StartDate column within each Person group and counting the rows that meet the desired conditions. The two-year window is approximated as 730 days.

library(dplyr)
library(purrr)

df %>%
  # Convert the character columns to Date so they can be compared and subtracted
  mutate(StartDate = as.Date(StartDate),
         EndDate = as.Date(EndDate)) %>%
  arrange(Person, StartDate, EndDate) %>%
  group_by(Person) %>%
  # For each StartDate (.x), count the Person's relevant cases that started
  # earlier and had not yet ended on that date
  mutate(NumberOpenCases    = map_int(StartDate, ~sum(StartDate < .x  &
                                                      EndDate >= .x &
                                                      RelevantCase == 1)),
         # Same count, restricted to cases that started less than 730 days earlier
         NumberOpenCases_2y = map_int(StartDate, ~sum(StartDate < .x  &
                                                      EndDate >= .x &
                                                      RelevantCase == 1 &
                                                      .x - StartDate < 730)))
#> # A tibble: 14 x 6
#> # Groups:   Person [3]
#>    Person RelevantCase StartDate  EndDate    NumberOpenCases NumberOpenCases_2y
#>    <chr>         <dbl> <date>     <date>               <int>              <int>
#>  1 111               0 2017-03-04 2017-12-12               0                  0
#>  2 334               1 2015-11-14 2022-01-25               0                  0
#>  3 334               1 2018-04-26 2020-03-01               1                  0
#>  4 334               0 2020-01-24 2021-02-24               2                  1
#>  5 334               1 2020-01-25 2020-01-30               2                  1
#>  6 334               0 2020-02-29 2022-02-02               2                  1
#>  7 888               1 2015-08-09 2019-10-20               0                  0
#>  8 888               0 2015-08-09 2019-10-30               0                  0
#>  9 888               1 2018-04-10 2018-10-10               1                  0
#> 10 888               0 2019-09-20 2021-10-10               1                  0
#> 11 888               0 2020-06-30 2020-07-20               0                  0
#> 12 888               1 2020-11-01 2022-11-20               0                  0
#> 13 888               0 2021-08-13 2021-11-12               1                  1
#> 14 888               1 2022-11-11 2023-01-01               1                  0
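
Because 730 days only approximates two years (leap years add a day), a calendar-based cut-off may be preferable in some cases. The following base R sketch is one possible variant, not part of the answer above: the helper names `two_years_before` and `count_open_2y_calendar` and the column `NumberOpenCases_2y_cal` are hypothetical, and counting a case that started exactly two calendar years earlier is an assumed reading of "not more than two years prior".

```r
df <- data.frame(
  Person = c('111','334','334','334','334','334','888','888','888','888','888','888','888','888'),
  RelevantCase = c(0,1,1,0,1,0,1,0,1,0,0,1,0,1),
  StartDate = as.Date(c('2017-03-04','2015-11-14','2018-04-26','2020-01-24','2020-01-25','2020-02-29','2015-08-09',
                        '2015-08-09','2018-04-10','2019-09-20','2020-06-30','2020-11-01','2021-08-13','2022-11-11')),
  EndDate = as.Date(c('2017-12-12','2022-01-25','2020-03-01','2021-02-24','2020-01-30','2022-02-02','2019-10-20',
                      '2019-10-30','2018-10-10','2021-10-10','2020-07-20','2022-11-20','2021-11-12','2023-01-01'))
)

# Date two calendar years before a single Date `d` (hypothetical helper)
two_years_before <- function(d) seq(d, by = "-2 years", length.out = 2)[2]

# For each case of one person, count the relevant cases that started earlier,
# are still open, and started on or after the calendar cut-off (hypothetical helper)
count_open_2y_calendar <- function(start, end, relevant) {
  vapply(seq_along(start), function(i) {
    cutoff <- two_years_before(start[i])
    sum(relevant == 1 & start < start[i] & end >= start[i] & start >= cutoff)
  }, integer(1))
}

# Apply the helper separately to each person's rows
df$NumberOpenCases_2y_cal <- NA_integer_
for (p in unique(df$Person)) {
  rows <- df$Person == p
  df$NumberOpenCases_2y_cal[rows] <- count_open_2y_calendar(
    df$StartDate[rows], df$EndDate[rows], df$RelevantCase[rows]
  )
}

df$NumberOpenCases_2y_cal
#>  [1] 0 0 0 1 1 1 0 0 0 0 0 0 1 0
```

For this example data the calendar-based counts are the same as the 730-day ones; the two approaches can only differ when a case started within a day or so of the two-year boundary.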

huangapple
  • Posted on April 17, 2023, 04:53:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76030283.html