英文:
Add two rows per Group. Start and end of date
问题
我的数据集包含了护士在医院房间内的扫描时间。我的目标是确定房间无人看护的时间有多少分钟。
首先,我需要为每个房间每天创建2行新记录。这些新记录应该表示一天的开始和结束(00:00:00 和 23:59:59)。创建这两个记录的逻辑是,例如,如果2023-04-24的最后一次扫描是在下午3点,房间1,我们想知道从2023-04-24 23:59:59到2023-04-24 3:00:00经过了多少分钟。
我需要的结果如下:
Room ScanDT NewRecord
1 room1 2023-04-24 00:00:00 New
2 room1 2023-04-24 10:08:38 Old
3 room1 2023-04-24 10:09:36 Old
4 room1 2023-04-24 11:54:35 Old
5 room1 2023-04-24 23:59:59 New
6 room1 2023-05-24 00:00:00 New
7 room1 2023-05-24 13:51:10 Old
8 room1 2023-05-24 23:59:59 New
9 room2 2023-04-24 00:00:00 New
10 room2 2023-04-24 18:51:10 Old
11 room2 2023-04-24 20:51:10 Old
12 room2 2023-04-24 23:23:23 New
样本数据如下:
df <- data.frame(Room = c('room1', 'room1', 'room1', 'room1',
'room2', 'room2'),
ScanDT = as.POSIXct(c('2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35', '2023-05-24 13:51:10',
'2023-04-24 18:51:10', '2023-04-24 20:51:10')))
英文:
My dataset contains scan times by nurses in hospital rooms. My goal is to determine how many minutes the the room was unattended.
The first thing I need to do is to create 2 new rows per room for each day. The new records should represent the start and end of the day (00:00:00 and 23:59:59). The logic for creating these two records is that for example if the last scan for 2023-04-24 was at 3pm for Room1, we would like to know how many minutes had passed from 2023-04-24 23:59:59 to 2023-04-24 3:00:00.
Room ScanDT
1 room1 2023-04-24 10:08:38
2 room1 2023-04-24 10:09:36
3 room1 2023-04-24 11:54:35
4 room1 2023-05-24 13:51:10
5 room2 2023-04-24 18:51:10
6 room2 2023-04-24 20:51:10
What I need:
Room ScanDT NewRecord
1 room1 2023-04-24 00:00:00 New
2 room1 2023-04-24 10:08:38 Old
3 room1 2023-04-24 10:09:36 Old
4 room1 2023-04-24 11:54:35 Old
5 room1 2023-04-24 23:59:59 New
6 room1 2023-05-24 00:00:00 New
7 room1 2023-05-24 13:51:10 Old
8 room1 2023-05-24 23:59:59 New
9 room2 2023-04-24 00:00:00 New
10 room2 2023-04-24 18:51:10 Old
11 room2 2023-04-24 20:51:10 Old
12 room2 2023-04-24 23:23:23 New
Sample data
df <- data.frame(Room = c('room1', 'room1', 'room1', 'room1',
'room2', 'room2'),
ScanDT = as.POSIXct(c('2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35', '2023-05-24 13:51:10',
'2023-04-24 18:51:10', '2023-04-24 20:51:10')))
答案1
得分: 1
以下是使用 data.table 包的代码尝试:
library(data.table)
setDT(df)
## 添加一个扫描ID以展示如何保留它
df[, ScanID := .I]
df[, ScanDate := as.POSIXct(trunc(ScanDT, units="days"))]
dfout <- df[, .(
ScanID=c(NA,ScanID,NA),
ScanDT = c(ScanDate, ScanDT, ScanDate + as.difftime(1, units="days") - 1)),
by=.(Room, ScanDate)]
dfout
## Room ScanDate ScanID ScanDT
## <char> <POSc> <int> <POSc>
## 1: room1 2023-04-24 NA 2023-04-24 00:00:00
## 2: room1 2023-04-24 1 2023-04-24 10:08:38
## 3: room1 2023-04-24 2 2023-04-24 10:09:36
## 4: room1 2023-04-24 3 2023-04-24 11:54:35
## 5: room1 2023-04-24 NA 2023-04-24 23:59:59
## 6: room1 2023-05-24 NA 2023-05-24 00:00:00
## 7: room1 2023-05-24 4 2023-05-24 13:51:10
## 8: room1 2023-05-24 NA 2023-05-24 23:59:59
## 9: room2 2023-04-24 NA 2023-04-24 00:00:00
##10: room2 2023-04-24 5 2023-04-24 18:51:10
##11: room2 2023-04-24 6 2023-04-24 20:51:10
##12: room2 2023-04-24 NA 2023-04-24 23:59:59
如有需要,可以进一步计算差异:
dfout[, ScanDTdiff := c(NA, `units<-`(diff(ScanDT), "mins")), by=.(Room, ScanDate)]
## Room ScanDate ScanID ScanDT ScanDTdiff
## <char> <POSc> <int> <POSc> <difftime>
## 1: room1 2023-04-24 NA 2023-04-24 00:00:00 NA mins
## 2: room1 2023-04-24 1 2023-04-24 10:08:38 608.6333333 mins
## 3: room1 2023-04-24 2 2023-04-24 10:09:36 0.9666667 mins
## 4: room1 2023-04-24 3 2023-04-24 11:54:35 104.9833333 mins
## 5: room1 2023-04-24 NA 2023-04-24 23:59:59 725.4000000 mins
## 6: room1 2023-05-24 NA 2023-05-24 00:00:00 NA mins
## 7: room1 2023-05-24 4 2023-05-24 13:51:10 831.1666667 mins
## 8: room1 2023-05-24 NA 2023-05-24 23:59:59 608.8166667 mins
## 9: room2 2023-04-24 NA 2023-04-24 00:00:00 NA mins
##10: room2 2023-04-24 5 2023-04-24 18:51:10 1131.1666667 mins
##11: room2 2023-04-24 6 2023-04-24 20:51:10 120.0000000 mins
##12: room2 2023-04-24 NA 2023-04-24 23:59:59 188.8166667 mins
英文:
Here's an attempt using the data.table package:
library(data.table)
setDT(df)
## add a scan id to show how to keep it
df[, ScanID := .I]
df[, ScanDate := as.POSIXct(trunc(ScanDT, units="days"))]
dfout <- df[, .(
ScanID=c(NA,ScanID,NA),
ScanDT = c(ScanDate, ScanDT, ScanDate + as.difftime(1, units="days") - 1)),
by=.(Room, ScanDate)]
dfout
## Room ScanDate ScanID ScanDT
## <char> <POSc> <int> <POSc>
## 1: room1 2023-04-24 NA 2023-04-24 00:00:00
## 2: room1 2023-04-24 1 2023-04-24 10:08:38
## 3: room1 2023-04-24 2 2023-04-24 10:09:36
## 4: room1 2023-04-24 3 2023-04-24 11:54:35
## 5: room1 2023-04-24 NA 2023-04-24 23:59:59
## 6: room1 2023-05-24 NA 2023-05-24 00:00:00
## 7: room1 2023-05-24 4 2023-05-24 13:51:10
## 8: room1 2023-05-24 NA 2023-05-24 23:59:59
## 9: room2 2023-04-24 NA 2023-04-24 00:00:00
##10: room2 2023-04-24 5 2023-04-24 18:51:10
##11: room2 2023-04-24 6 2023-04-24 20:51:10
##12: room2 2023-04-24 NA 2023-04-24 23:59:59
Further calculations of differences if required:
dfout[, ScanDTdiff := c(NA, `units<-`(diff(ScanDT), "mins")), by=.(Room, ScanDate)]
## Room ScanDate ScanID ScanDT ScanDTdiff
## <char> <POSc> <int> <POSc> <difftime>
## 1: room1 2023-04-24 NA 2023-04-24 00:00:00 NA mins
## 2: room1 2023-04-24 1 2023-04-24 10:08:38 608.6333333 mins
## 3: room1 2023-04-24 2 2023-04-24 10:09:36 0.9666667 mins
## 4: room1 2023-04-24 3 2023-04-24 11:54:35 104.9833333 mins
## 5: room1 2023-04-24 NA 2023-04-24 23:59:59 725.4000000 mins
## 6: room1 2023-05-24 NA 2023-05-24 00:00:00 NA mins
## 7: room1 2023-05-24 4 2023-05-24 13:51:10 831.1666667 mins
## 8: room1 2023-05-24 NA 2023-05-24 23:59:59 608.8166667 mins
## 9: room2 2023-04-24 NA 2023-04-24 00:00:00 NA mins
##10: room2 2023-04-24 5 2023-04-24 18:51:10 1131.1666667 mins
##11: room2 2023-04-24 6 2023-04-24 20:51:10 120.0000000 mins
##12: room2 2023-04-24 NA 2023-04-24 23:59:59 188.8166667 mins
答案2
得分: 1
这样怎么样?
df <- data.frame(
Room = c(
'room1', 'room1', 'room1', 'room1', 'room2', 'room2'),
ScanDT = as.POSIXct(c(
'2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35',
'2023-05-24 13:51:10', '2023-04-24 18:51:10', '2023-04-24 20:51:10')))
df$day = as.Date(df$ScanDT)
res = do.call(rbind, by(df, df[,c("Room", "day")], function(x) {
x$NewRecord = "Old"
rbind(
data.frame(
Room=x$Room[1],
ScanDT=as.POSIXct(paste(x$day[1], "00:00:00")),
day=x$day[1],
NewRecord = "New"),
x,
data.frame(
Room=x$Room[1],
ScanDT=as.POSIXct(paste(x$day[1], "23:59:59")),
day=x$day[1],
NewRecord = "New")
)
}))
Output:
Room ScanDT day NewRecord
1 room1 2023-04-24 00:00:00 2023-04-24 New
2 room1 2023-04-24 10:08:38 2023-04-24 Old
3 room1 2023-04-24 10:09:36 2023-04-24 Old
4 room1 2023-04-24 11:54:35 2023-04-24 Old
5 room1 2023-04-24 23:59:59 2023-04-24 New
12 room2 2023-04-24 00:00:00 2023-04-24 New
51 room2 2023-04-24 18:51:10 2023-04-24 Old
11 room2 2023-04-24 23:59:59 2023-04-24 New
13 room2 2023-04-25 00:00:00 2023-04-25 New
6 room2 2023-04-24 20:51:10 2023-04-25 Old
111 room2 2023-04-25 23:59:59 2023-04-25 New
14 room1 2023-05-24 00:00:00 2023-05-24 New
41 room1 2023-05-24 13:51:10 2023-05-24 Old
112 room1 2023-05-24 23:59:59 2023-05-24 New
如果需要保留排序顺序,则需要进行更多的工作。
英文:
How about this?
df <- data.frame(
Room = c(
'room1', 'room1', 'room1', 'room1', 'room2', 'room2'),
ScanDT = as.POSIXct(c(
'2023-04-24 10:08:38', '2023-04-24 10:09:36', '2023-04-24 11:54:35',
'2023-05-24 13:51:10', '2023-04-24 18:51:10', '2023-04-24 20:51:10')))
df$day = as.Date(df$ScanDT)
res = do.call(rbind, by(df, df[,c("Room", "day")], function(x) {
x$NewRecord = "Old"
rbind(
data.frame(
Room=x$Room[1],
ScanDT=as.POSIXct(paste(x$day[1], "00:00:00")),
day=x$day[1],
NewRecord = "New"),
x,
data.frame(
Room=x$Room[1],
ScanDT=as.POSIXct(paste(x$day[1], "23:59:59")),
day=x$day[1],
NewRecord = "New")
)
}))
Output:
Room ScanDT day NewRecord
1 room1 2023-04-24 00:00:00 2023-04-24 New
2 room1 2023-04-24 10:08:38 2023-04-24 Old
3 room1 2023-04-24 10:09:36 2023-04-24 Old
4 room1 2023-04-24 11:54:35 2023-04-24 Old
5 room1 2023-04-24 23:59:59 2023-04-24 New
12 room2 2023-04-24 00:00:00 2023-04-24 New
51 room2 2023-04-24 18:51:10 2023-04-24 Old
11 room2 2023-04-24 23:59:59 2023-04-24 New
13 room2 2023-04-25 00:00:00 2023-04-25 New
6 room2 2023-04-24 20:51:10 2023-04-25 Old
111 room2 2023-04-25 23:59:59 2023-04-25 New
14 room1 2023-05-24 00:00:00 2023-05-24 New
41 room1 2023-05-24 13:51:10 2023-05-24 Old
112 room1 2023-05-24 23:59:59 2023-05-24 New
If the sort order needs to be preserved, then there is a little more work to do.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论