Time since event calculations in R

Question

I have the table below and want to count the number of days since each user's most recent acceptance. For example, user 1 accepted on multiple occasions, but only one of those acceptances is the most recent. I want to do two things:

Add a logical column, mostRecent (TRUE/FALSE), that flags the most recent acceptance, and calculate the number of days from that acceptance to today's date.

      user  status_name invitationDate
      <fct> <fct>       <date>
    1 1     Accepted    2021-09-09
    2 1     Declined    2021-09-10
    3 1     Accepted    2021-09-30
    4 4     Accepted    2021-11-10
    5 1     Accepted    2021-11-22
    6 4     Declined    2021-11-29

I have included the code to recreate the table below.

    library(dplyr)
    library(tibble)

    df <- tribble(
      ~user, ~status_name, ~invitationDate,
      "1", "Declined", "2021-07-13",
      "4", "Declined", "2021-07-31",
      "1", "Accepted", "2021-09-09",
      "1", "Declined", "2021-09-10",
      "1", "Accepted", "2021-09-30",
      "4", "Accepted", "2021-11-10",
      "3", "Declined", "2021-11-12",
      "2", "Declined", "2021-11-18",
      "1", "Accepted", "2021-11-22",
      "4", "Declined", "2021-11-29"
    ) %>%
      mutate(
        user = as.factor(user),
        status_name = as.factor(status_name),
        invitationDate = as.Date(invitationDate, format = "%Y-%m-%d")
      ) %>%
      group_by(user) %>%
      # keep only the rows from each user's first acceptance onward
      mutate(cumsum = cumsum(status_name == "Accepted")) %>%
      filter(cumsum > 0) %>%
      select(-cumsum)
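
The cumsum/filter step in the code above may be easier to follow on a smaller case. The sketch below uses a hypothetical four-row data frame (`toy`, not part of the original data) for a single user and shows that rows before the first "Accepted" are dropped. It assumes the rows are already in date order, as they are in the question's data, because `cumsum()` depends on row order.

```r
library(dplyr)

# Hypothetical toy data for one user, already in date order.
toy <- data.frame(
  user = "1",
  status_name = c("Declined", "Accepted", "Declined", "Accepted")
)

toy_kept <- toy %>%
  group_by(user) %>%
  # Running count of acceptances seen so far for this user.
  mutate(n_accepted_so_far = cumsum(status_name == "Accepted")) %>%
  # A count of 0 means no acceptance yet, so those leading rows are dropped.
  filter(n_accepted_so_far > 0) %>%
  ungroup()

print(toy_kept)
```

Only the leading "Declined" row is removed; a "Declined" row that follows an acceptance is kept because the running count is already above zero.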

Answer 1

Score: 4

We could create the logical index by subsetting 'invitationDate' where 'status_name' is "Accepted", taking the max, and checking it against 'invitationDate'. Because df is still grouped by user, the max is computed per user.

    library(dplyr)

    df %>%
      # df is still grouped by user, so max() is taken within each user
      mutate(mostRecent = invitationDate %in%
               max(invitationDate[status_name == "Accepted"]),
             # days from the most recent acceptance to today
             Diff = Sys.Date() - invitationDate[mostRecent]) %>%
      ungroup()
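
A possible variant of the answer, offered as a sketch rather than as the answerer's own method: it stores the latest acceptance date in a helper column, so a non-accepted row that happens to share that date is not flagged, and it returns the gap as a plain integer. The names `lastAccepted`, `daysSince` and `today` are illustrative, and the fixed reference date stands in for `Sys.Date()` only to keep the result reproducible. It assumes every user has at least one "Accepted" row, which the question's filter guarantees.

```r
library(dplyr)

# The six rows from the question's printed table.
df <- data.frame(
  user = factor(c("1", "1", "1", "4", "1", "4")),
  status_name = factor(c("Accepted", "Declined", "Accepted",
                         "Accepted", "Accepted", "Declined")),
  invitationDate = as.Date(c("2021-09-09", "2021-09-10", "2021-09-30",
                             "2021-11-10", "2021-11-22", "2021-11-29"))
)

# Assumption: a fixed reference date instead of Sys.Date(), for reproducibility.
today <- as.Date("2021-12-01")

result <- df %>%
  group_by(user) %>%
  mutate(
    # Latest acceptance date per user (one value per group).
    lastAccepted = max(invitationDate[status_name == "Accepted"]),
    # Flag only the Accepted row on that date.
    mostRecent = status_name == "Accepted" & invitationDate == lastAccepted,
    # Whole days from the latest acceptance to the reference date.
    daysSince = as.integer(today - lastAccepted)
  ) %>%
  ungroup()

print(result)
```

With the reference date 2021-12-01, `daysSince` is 9 for user 1 (last accepted 2021-11-22) and 21 for user 4 (last accepted 2021-11-10). Replacing `today` with `Sys.Date()` gives the count up to the current day.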

Posted by huangapple on May 17, 2023 at 15:18:11
When reposting, please keep the link to this article: https://go.coder-hub.com/76269452.html