Time since event calculations in R


Question


I have the table below and want to count the number of days since each user's most recent acceptance. For example, user 1 accepted on multiple occasions, but only one of those acceptances is the most recent. I want to do two things:

Flag each row with a logical value, mostRecent (TRUE/FALSE), and calculate the number of days from that most recent acceptance to today's date.

  user  status_name invitationDate
  <fct> <fct>       <date>
1 1     Accepted    2021-09-09
2 1     Declined    2021-09-10
3 1     Accepted    2021-09-30
4 4     Accepted    2021-11-10
5 1     Accepted    2021-11-22
6 4     Declined    2021-11-29


I have included the code to recreate the table below.

library(dplyr)   # mutate(), group_by(), filter(), select(), %>%
library(tibble)  # tribble()

df <- tribble(
  ~user, ~status_name, ~invitationDate,
  "1", "Declined", "2021-07-13",
  "4", "Declined", "2021-07-31",
  "1", "Accepted", "2021-09-09",
  "1", "Declined", "2021-09-10",
  "1", "Accepted", "2021-09-30",
  "4", "Accepted", "2021-11-10",
  "3", "Declined", "2021-11-12",
  "2", "Declined", "2021-11-18",
  "1", "Accepted", "2021-11-22",
  "4", "Declined", "2021-11-29"
) %>%
  mutate(
    user = as.factor(user),
    status_name = as.factor(status_name),
    invitationDate = as.Date(invitationDate, format = "%Y-%m-%d")
  ) %>%
  group_by(user) %>%
  # Running count of acceptances per user; it stays 0 until the first one
  mutate(cumsum = cumsum(status_name == "Accepted")) %>%
  # Drop rows before each user's first acceptance (and users with none)
  filter(cumsum > 0) %>%
  select(-cumsum)

Answer 1

Score: 4


We can create the logical index by subsetting 'invitationDate' to the rows where 'status_name' is "Accepted", taking the max of those dates, and checking which 'invitationDate' values match it within each user.

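library(dplyr)
# Relies on df still being grouped by user, as in the question's code
df %>%
  mutate(
    # TRUE where the row's date equals the user's latest accepted date
    mostRecent = invitationDate %in%
      max(invitationDate[status_name == "Accepted"]),
    # Days from that acceptance to today, recycled across the user's rows
    Diff = Sys.Date() - invitationDate[mostRecent]
  ) %>%
  ungroup()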
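
As a variation on the approach above, here is a minimal sketch that also requires the flagged row itself to be "Accepted" and computes the day count from the group maximum rather than from the flagged row. The column names lastAccepted and daysSince, the fixed reference date today, and the reduced sample data are assumptions made for illustration. The sketch also assumes every user has at least one "Accepted" row, which the question's cumsum filter guarantees.

```r
library(dplyr)
library(tibble)

# Sample rows taken from the table shown in the question
df <- tribble(
  ~user, ~status_name, ~invitationDate,
  "1", "Accepted", "2021-09-09",
  "1", "Declined", "2021-09-10",
  "1", "Accepted", "2021-09-30",
  "4", "Accepted", "2021-11-10",
  "1", "Accepted", "2021-11-22",
  "4", "Declined", "2021-11-29"
) %>%
  mutate(invitationDate = as.Date(invitationDate))

# Assumption: a fixed reference date keeps the result reproducible;
# use Sys.Date() instead to measure up to the current day.
today <- as.Date("2021-12-01")

result <- df %>%
  group_by(user) %>%
  mutate(
    # Latest acceptance date for each user
    lastAccepted = max(invitationDate[status_name == "Accepted"]),
    # TRUE only on the accepted row that carries that date
    mostRecent = status_name == "Accepted" & invitationDate == lastAccepted,
    # Whole days between the latest acceptance and the reference date
    daysSince = as.integer(today - lastAccepted)
  ) %>%
  ungroup()

print(result)
```

With the reference date above, user 1's latest acceptance (2021-11-22) is 9 days old and user 4's (2021-11-10) is 21 days old. Computing daysSince from lastAccepted avoids a length mismatch if two rows of the same user ever share the latest date.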

huangapple
  • Published on May 17, 2023 at 15:18:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76269452.html