在每个分组中添加两行。日期的开始和结束。

huangapple go评论91阅读模式
英文:

Add two rows per Group. Start and end of date

问题

我的数据集包含了护士在医院房间内的扫描时间。我的目标是确定房间无人看护的时间有多少分钟。

首先,我需要为每个房间每天创建2行新记录。这些新记录应该表示一天的开始和结束(00:00:00 和 23:59:59)。创建这两个记录的逻辑是,例如,如果2023-04-24的最后一次扫描是在下午3点,房间1,我们想知道从2023-04-24 23:59:59到2023-04-24 3:00:00经过了多少分钟。

我需要的结果如下:

  1. Room ScanDT NewRecord
  2. 1 room1 2023-04-24 00:00:00 New
  3. 2 room1 2023-04-24 10:08:38 Old
  4. 3 room1 2023-04-24 10:09:36 Old
  5. 4 room1 2023-04-24 11:54:35 Old
  6. 5 room1 2023-04-24 23:59:59 New
  7. 6 room1 2023-05-24 00:00:00 New
  8. 7 room1 2023-05-24 13:51:10 Old
  9. 8 room1 2023-05-24 23:59:59 New
  10. 9 room2 2023-04-24 00:00:00 New
  11. 10 room2 2023-04-24 18:51:10 Old
  12. 11 room2 2023-04-24 20:51:10 Old
  13. 12 room2 2023-04-24 23:23:23 New

样本数据如下:

  1. df <- data.frame(Room = c('room1', 'room1', 'room1', 'room1',
  2. 'room2', 'room2'),
  3. ScanDT = as.POSIXct(c('2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35', '2023-05-24 13:51:10',
  4. '2023-04-24 18:51:10', '2023-04-24 20:51:10')))
英文:

My dataset contains scan times by nurses in hospital rooms. My goal is to determine how many minutes the the room was unattended.

The first thing I need to do is to create 2 new rows per room for each day. The new records should represent the start and end of the day (00:00:00 and 23:59:59). The logic for creating these two records is that for example if the last scan for 2023-04-24 was at 3pm for Room1, we would like to know how many minutes had passed from 2023-04-24 23:59:59 to 2023-04-24 3:00:00.

  1. Room ScanDT
  2. 1 room1 2023-04-24 10:08:38
  3. 2 room1 2023-04-24 10:09:36
  4. 3 room1 2023-04-24 11:54:35
  5. 4 room1 2023-05-24 13:51:10
  6. 5 room2 2023-04-24 18:51:10
  7. 6 room2 2023-04-24 20:51:10

What I need:

  1. Room ScanDT NewRecord
  2. 1 room1 2023-04-24 00:00:00 New
  3. 2 room1 2023-04-24 10:08:38 Old
  4. 3 room1 2023-04-24 10:09:36 Old
  5. 4 room1 2023-04-24 11:54:35 Old
  6. 5 room1 2023-04-24 23:59:59 New
  7. 6 room1 2023-05-24 00:00:00 New
  8. 7 room1 2023-05-24 13:51:10 Old
  9. 8 room1 2023-05-24 23:59:59 New
  10. 9 room2 2023-04-24 00:00:00 New
  11. 10 room2 2023-04-24 18:51:10 Old
  12. 11 room2 2023-04-24 20:51:10 Old
  13. 12 room2 2023-04-24 23:23:23 New

Sample data

  1. df &lt;- data.frame(Room = c(&#39;room1&#39;, &#39;room1&#39;, &#39;room1&#39;, &#39;room1&#39;,
  2. &#39;room2&#39;, &#39;room2&#39;),
  3. ScanDT = as.POSIXct(c(&#39;2023-04-24 10:08:38&#39;, &#39;2023-04-24 10:09:36&#39;, &#39;2023-04-24 11:54:35&#39;, &#39;2023-05-24 13:51:10&#39;,
  4. &#39;2023-04-24 18:51:10&#39;, &#39;2023-04-24 20:51:10&#39;)))

答案1

得分: 1

以下是使用 data.table 包的代码尝试:

  1. library(data.table)
  2. setDT(df)
  3. ## 添加一个扫描ID以展示如何保留它
  4. df[, ScanID := .I]
  5. df[, ScanDate := as.POSIXct(trunc(ScanDT, units="days"))]
  6. dfout <- df[, .(
  7. ScanID=c(NA,ScanID,NA),
  8. ScanDT = c(ScanDate, ScanDT, ScanDate + as.difftime(1, units="days") - 1)),
  9. by=.(Room, ScanDate)]
  10. dfout
  11. ## Room ScanDate ScanID ScanDT
  12. ## <char> <POSc> <int> <POSc>
  13. ## 1: room1 2023-04-24 NA 2023-04-24 00:00:00
  14. ## 2: room1 2023-04-24 1 2023-04-24 10:08:38
  15. ## 3: room1 2023-04-24 2 2023-04-24 10:09:36
  16. ## 4: room1 2023-04-24 3 2023-04-24 11:54:35
  17. ## 5: room1 2023-04-24 NA 2023-04-24 23:59:59
  18. ## 6: room1 2023-05-24 NA 2023-05-24 00:00:00
  19. ## 7: room1 2023-05-24 4 2023-05-24 13:51:10
  20. ## 8: room1 2023-05-24 NA 2023-05-24 23:59:59
  21. ## 9: room2 2023-04-24 NA 2023-04-24 00:00:00
  22. ##10: room2 2023-04-24 5 2023-04-24 18:51:10
  23. ##11: room2 2023-04-24 6 2023-04-24 20:51:10
  24. ##12: room2 2023-04-24 NA 2023-04-24 23:59:59

如有需要,可以进一步计算差异:

  1. dfout[, ScanDTdiff := c(NA, `units<-`(diff(ScanDT), "mins")), by=.(Room, ScanDate)]
  2. ## Room ScanDate ScanID ScanDT ScanDTdiff
  3. ## <char> <POSc> <int> <POSc> <difftime>
  4. ## 1: room1 2023-04-24 NA 2023-04-24 00:00:00 NA mins
  5. ## 2: room1 2023-04-24 1 2023-04-24 10:08:38 608.6333333 mins
  6. ## 3: room1 2023-04-24 2 2023-04-24 10:09:36 0.9666667 mins
  7. ## 4: room1 2023-04-24 3 2023-04-24 11:54:35 104.9833333 mins
  8. ## 5: room1 2023-04-24 NA 2023-04-24 23:59:59 725.4000000 mins
  9. ## 6: room1 2023-05-24 NA 2023-05-24 00:00:00 NA mins
  10. ## 7: room1 2023-05-24 4 2023-05-24 13:51:10 831.1666667 mins
  11. ## 8: room1 2023-05-24 NA 2023-05-24 23:59:59 608.8166667 mins
  12. ## 9: room2 2023-04-24 NA 2023-04-24 00:00:00 NA mins
  13. ##10: room2 2023-04-24 5 2023-04-24 18:51:10 1131.1666667 mins
  14. ##11: room2 2023-04-24 6 2023-04-24 20:51:10 120.0000000 mins
  15. ##12: room2 2023-04-24 NA 2023-04-24 23:59:59 188.8166667 mins
英文:

Here's an attempt using the data.table package:

  1. library(data.table)
  2. setDT(df)
  3. ## add a scan id to show how to keep it
  4. df[, ScanID := .I]
  5. df[, ScanDate := as.POSIXct(trunc(ScanDT, units=&quot;days&quot;))]
  6. dfout &lt;- df[, .(
  7. ScanID=c(NA,ScanID,NA),
  8. ScanDT = c(ScanDate, ScanDT, ScanDate + as.difftime(1, units=&quot;days&quot;) - 1)),
  9. by=.(Room, ScanDate)]
  10. dfout
  11. ## Room ScanDate ScanID ScanDT
  12. ## &lt;char&gt; &lt;POSc&gt; &lt;int&gt; &lt;POSc&gt;
  13. ## 1: room1 2023-04-24 NA 2023-04-24 00:00:00
  14. ## 2: room1 2023-04-24 1 2023-04-24 10:08:38
  15. ## 3: room1 2023-04-24 2 2023-04-24 10:09:36
  16. ## 4: room1 2023-04-24 3 2023-04-24 11:54:35
  17. ## 5: room1 2023-04-24 NA 2023-04-24 23:59:59
  18. ## 6: room1 2023-05-24 NA 2023-05-24 00:00:00
  19. ## 7: room1 2023-05-24 4 2023-05-24 13:51:10
  20. ## 8: room1 2023-05-24 NA 2023-05-24 23:59:59
  21. ## 9: room2 2023-04-24 NA 2023-04-24 00:00:00
  22. ##10: room2 2023-04-24 5 2023-04-24 18:51:10
  23. ##11: room2 2023-04-24 6 2023-04-24 20:51:10
  24. ##12: room2 2023-04-24 NA 2023-04-24 23:59:59

Further calculations of differences if required:

  1. dfout[, ScanDTdiff := c(NA, `units&lt;-`(diff(ScanDT), &quot;mins&quot;)), by=.(Room, ScanDate)]
  2. ## Room ScanDate ScanID ScanDT ScanDTdiff
  3. ## &lt;char&gt; &lt;POSc&gt; &lt;int&gt; &lt;POSc&gt; &lt;difftime&gt;
  4. ## 1: room1 2023-04-24 NA 2023-04-24 00:00:00 NA mins
  5. ## 2: room1 2023-04-24 1 2023-04-24 10:08:38 608.6333333 mins
  6. ## 3: room1 2023-04-24 2 2023-04-24 10:09:36 0.9666667 mins
  7. ## 4: room1 2023-04-24 3 2023-04-24 11:54:35 104.9833333 mins
  8. ## 5: room1 2023-04-24 NA 2023-04-24 23:59:59 725.4000000 mins
  9. ## 6: room1 2023-05-24 NA 2023-05-24 00:00:00 NA mins
  10. ## 7: room1 2023-05-24 4 2023-05-24 13:51:10 831.1666667 mins
  11. ## 8: room1 2023-05-24 NA 2023-05-24 23:59:59 608.8166667 mins
  12. ## 9: room2 2023-04-24 NA 2023-04-24 00:00:00 NA mins
  13. ##10: room2 2023-04-24 5 2023-04-24 18:51:10 1131.1666667 mins
  14. ##11: room2 2023-04-24 6 2023-04-24 20:51:10 120.0000000 mins
  15. ##12: room2 2023-04-24 NA 2023-04-24 23:59:59 188.8166667 mins

答案2

得分: 1

这样怎么样?

  1. df <- data.frame(
  2. Room = c(
  3. 'room1', 'room1', 'room1', 'room1', 'room2', 'room2'),
  4. ScanDT = as.POSIXct(c(
  5. '2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35',
  6. '2023-05-24 13:51:10', '2023-04-24 18:51:10', '2023-04-24 20:51:10')))
  7. df$day = as.Date(df$ScanDT)
  8. res = do.call(rbind, by(df, df[,c("Room", "day")], function(x) {
  9. x$NewRecord = "Old"
  10. rbind(
  11. data.frame(
  12. Room=x$Room[1],
  13. ScanDT=as.POSIXct(paste(x$day[1], "00:00:00")),
  14. day=x$day[1],
  15. NewRecord = "New"),
  16. x,
  17. data.frame(
  18. Room=x$Room[1],
  19. ScanDT=as.POSIXct(paste(x$day[1], "23:59:59")),
  20. day=x$day[1],
  21. NewRecord = "New")
  22. )
  23. }))
  24. Output:
  25. Room ScanDT day NewRecord
  26. 1 room1 2023-04-24 00:00:00 2023-04-24 New
  27. 2 room1 2023-04-24 10:08:38 2023-04-24 Old
  28. 3 room1 2023-04-24 10:09:36 2023-04-24 Old
  29. 4 room1 2023-04-24 11:54:35 2023-04-24 Old
  30. 5 room1 2023-04-24 23:59:59 2023-04-24 New
  31. 12 room2 2023-04-24 00:00:00 2023-04-24 New
  32. 51 room2 2023-04-24 18:51:10 2023-04-24 Old
  33. 11 room2 2023-04-24 23:59:59 2023-04-24 New
  34. 13 room2 2023-04-25 00:00:00 2023-04-25 New
  35. 6 room2 2023-04-24 20:51:10 2023-04-25 Old
  36. 111 room2 2023-04-25 23:59:59 2023-04-25 New
  37. 14 room1 2023-05-24 00:00:00 2023-05-24 New
  38. 41 room1 2023-05-24 13:51:10 2023-05-24 Old
  39. 112 room1 2023-05-24 23:59:59 2023-05-24 New
  40. 如果需要保留排序顺序,则需要进行更多的工作。
英文:

How about this?

  1. df &lt;- data.frame(
  2. Room = c(
  3. &#39;room1&#39;, &#39;room1&#39;, &#39;room1&#39;, &#39;room1&#39;, &#39;room2&#39;, &#39;room2&#39;),
  4. ScanDT = as.POSIXct(c(
  5. &#39;2023-04-24 10:08:38&#39;, &#39;2023-04-24 10:09:36&#39;, &#39;2023-04-24 11:54:35&#39;,
  6. &#39;2023-05-24 13:51:10&#39;, &#39;2023-04-24 18:51:10&#39;, &#39;2023-04-24 20:51:10&#39;)))
  7. df$day = as.Date(df$ScanDT)
  8. res = do.call(rbind, by(df, df[,c(&quot;Room&quot;, &quot;day&quot;)], function(x) {
  9. x$NewRecord = &quot;Old&quot;
  10. rbind(
  11. data.frame(
  12. Room=x$Room[1],
  13. ScanDT=as.POSIXct(paste(x$day[1], &quot;00:00:00&quot;)),
  14. day=x$day[1],
  15. NewRecord = &quot;New&quot;),
  16. x,
  17. data.frame(
  18. Room=x$Room[1],
  19. ScanDT=as.POSIXct(paste(x$day[1], &quot;23:59:59&quot;)),
  20. day=x$day[1],
  21. NewRecord = &quot;New&quot;)
  22. )
  23. }))

Output:

  1. Room ScanDT day NewRecord
  2. 1 room1 2023-04-24 00:00:00 2023-04-24 New
  3. 2 room1 2023-04-24 10:08:38 2023-04-24 Old
  4. 3 room1 2023-04-24 10:09:36 2023-04-24 Old
  5. 4 room1 2023-04-24 11:54:35 2023-04-24 Old
  6. 5 room1 2023-04-24 23:59:59 2023-04-24 New
  7. 12 room2 2023-04-24 00:00:00 2023-04-24 New
  8. 51 room2 2023-04-24 18:51:10 2023-04-24 Old
  9. 11 room2 2023-04-24 23:59:59 2023-04-24 New
  10. 13 room2 2023-04-25 00:00:00 2023-04-25 New
  11. 6 room2 2023-04-24 20:51:10 2023-04-25 Old
  12. 111 room2 2023-04-25 23:59:59 2023-04-25 New
  13. 14 room1 2023-05-24 00:00:00 2023-05-24 New
  14. 41 room1 2023-05-24 13:51:10 2023-05-24 Old
  15. 112 room1 2023-05-24 23:59:59 2023-05-24 New

If the sort order needs to be preserved, then there is a little more work to do.

huangapple
  • 本文由 发表于 2023年5月25日 04:57:34
  • 转载请务必保留本文链接:https://go.coder-hub.com/76327356.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定