Check whether an event occurred in 30-second intervals

Question

I have a data set with event IDs and the timestamp at which each event happened, for example 9/2/2019 17:06. I want to build a Markov chain model with two states, noevent and event. To avoid building a continuous-time Markov chain, I want to split the period into 30-second intervals and check whether an event happened in each interval. Could someone help me do this in R? Thank you!

So far I have only prepared the date format and calculated the time between consecutive events, as well as how many no-event intervals fall between two events.

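# Parse timestamps; use "%m/%d/%Y %H:%M" instead if they have no seconds
data$timestamp <- as.POSIXct(data$timestamp, format = "%m/%d/%Y %H:%M:%S")

# Seconds elapsed since the previous event (NA for the first row)
data$diff <- c(NA, as.numeric(diff(data$timestamp), units = "secs"))

# Approximate number of 30-second intervals between consecutive events
data$NUm <- round(data$diff / 30)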

Answer 1

Score: 0

tidyverse solution

Use lubridate::floor_date() to round each timestamp down to the start of its 30-second interval, and tidyr::complete() to fill in the intervals with no events:

library(dplyr)
library(tidyr)
library(lubridate)

data %>%
  mutate(timestamp = floor_date(timestamp, "30 seconds")) %>%
  complete(timestamp = full_seq(timestamp, 30)) %>%
  mutate(
    event = ifelse(!is.na(id), "yes", "no"),
    .keep = "unused"
  )
# A tibble: 8 × 2
  timestamp           event
  <dttm>              <chr>
1 2023-02-19 10:01:00 yes  
2 2023-02-19 10:01:30 no   
3 2023-02-19 10:02:00 yes  
4 2023-02-19 10:02:30 no   
5 2023-02-19 10:03:00 no   
6 2023-02-19 10:03:30 no   
7 2023-02-19 10:04:00 no   
8 2023-02-19 10:04:30 yes

Base R solution

Similar logic as above, using base functions:

times <- as.POSIXlt(data$timestamp)
times$sec <- ifelse(times$sec < 30, 0, 30)
intervals <- seq(min(times), max(times), by = 30)
data.frame(
  intervals,
  event = ifelse(intervals %in% as.POSIXct(times), "yes", "no")
)
            intervals event
1 2023-02-19 10:01:00   yes
2 2023-02-19 10:01:30    no
3 2023-02-19 10:02:00   yes
4 2023-02-19 10:02:30    no
5 2023-02-19 10:03:00    no
6 2023-02-19 10:03:30    no
7 2023-02-19 10:04:00    no
8 2023-02-19 10:04:30   yes
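
If more than one event can land in the same 30-second interval, it may be useful to keep a count per interval alongside the yes/no flag. Here is a minimal base R sketch of that variant. It is not part of the solutions above: the helper name `bin_events`, its `width` argument, and the `n_events` column are made up for illustration, and the extra event at 10:02:10 is added only to show two events sharing an interval. It assumes the input is a POSIXct vector and floors the timestamps arithmetically, so intervals are aligned to the Unix epoch (which lines up with the :00 and :30 marks in ordinary time zones).

```r
# Hypothetical helper: bin event timestamps into fixed-width intervals.
# `timestamps` is a POSIXct vector, `width` is the interval length in seconds.
bin_events <- function(timestamps, width = 30) {
  secs <- as.numeric(timestamps)
  # Index of the interval each event falls into, counted from the epoch
  bin <- floor(secs / width)
  all_bins <- seq(min(bin), max(bin))
  # Count events per interval, including intervals with no events
  n_events <- tabulate(bin - min(bin) + 1, nbins = length(all_bins))
  # Keep the time zone of the input for display (fall back to local time)
  tz <- attr(timestamps, "tzone")
  if (is.null(tz)) tz <- ""
  data.frame(
    intervals = as.POSIXct(all_bins * width, origin = "1970-01-01", tz = tz),
    n_events = n_events,
    event = ifelse(n_events > 0, "yes", "no")
  )
}

# Assumed example input: the answer's three events plus one extra at 10:02:10
events <- as.POSIXct(
  c("2023-02-19 10:01:23", "2023-02-19 10:02:01",
    "2023-02-19 10:02:10", "2023-02-19 10:04:45"),
  tz = "UTC"
)
binned <- bin_events(events)
print(binned)
```

The `event` column matches the yes/no output of the solutions above, while `n_events` records that the 10:02:00 interval holds two events.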

Example data

In the future, it’s best if you include example data in your question. See How to make a great R reproducible example. For these solutions, I used:

data <- data.frame(
  id = 1:3,
  timestamp = as.POSIXct(c(
    "2023-02-19 10:01:23",
    "2023-02-19 10:02:01",
    "2023-02-19 10:04:45"
  ))
)
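
Since the stated goal is a two-state Markov chain, a yes/no sequence like the one in the output above can be turned into transition estimates by tabulating each state against the state that follows it. The sketch below is only an illustration and is not part of the original answer: the `event` vector is copied from the example output, and the names `states`, `counts`, and `trans` are assumptions chosen for this example.

```r
# Event sequence from the example output, one entry per 30-second interval
event <- c("yes", "no", "yes", "no", "no", "no", "no", "yes")

# Fix the state order so the table always has both rows and both columns
states <- c("no", "yes")
from <- factor(head(event, -1), levels = states)  # state at step t
to   <- factor(tail(event, -1), levels = states)  # state at step t + 1

# Count transitions from each state (rows) to the next state (columns)
counts <- table(from, to)

# Normalise each row to get estimated transition probabilities
trans <- prop.table(counts, margin = 1)
print(trans)
```

With only eight intervals the estimates are of course very rough. A state that never appears as a starting state would give a row of NaN, so real data should be checked for that before the matrix is used.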

huangapple
  • Posted on February 19, 2023 at 21:02:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75500332.html