Fill gaps in a grouped tsibble with group-specific starts and ends

Question

I've got a tsibble where timestamped observational data has been aggregated to 30-minute intervals. The data is in several groups, and I'd like to make sure that each 30-minute group appears in the tsibble, even when there were no observations in that time period.

Let's return to the birdwatching example from my previous question about tsibbles. Suppose I'm watching ducks and geese at a certain location from 8:00 to 18:00 each day and recording, for each observation, a) the time, b) the type of bird observed, and c) the number of birds in the flock observed.

library(tidyverse) # includes lubridate
library(tsibble)

N <- 10
set.seed(42)

# suppose we're observing ducks and geese between 8:00 and 18:00.
d       <- as_datetime("2023-03-08 08:00:00")
times   <- d + seconds(unique(round(sort(runif(N, min = 0, max = 36e3)))))
nObs    <- 1 + rpois(length(times), lambda = 1)
birdIdx <- 1 + round(runif(length(times)))
birds   <- c("Duck", "Goose")[birdIdx]

# Tibble of observations
waterfowl <- tibble(Timestamp = times, Count = nObs, Bird = as_factor(birds))

# Convert to tsibble (time series tibble) and aggregate on a 30-minute basis
waterfowl |>
    as_tsibble(index = Timestamp) |>
    group_by(Bird) |>
    index_by(Interval = floor_date(Timestamp, "30 minute")) |>
    summarize(`Total birds` = sum(Count)) -> waterfowlSumm

waterfowlSumm |> print(n = Inf)

This gives

# A tsibble: 10 x 3 [30m] <UTC>
# Key:       Bird [2]
   Bird  Interval            `Total birds`
   <fct> <dttm>                      <dbl>
 1 Goose 2023-03-08 09:00:00             2
 2 Goose 2023-03-08 13:00:00             4
 3 Goose 2023-03-08 14:00:00             1
 4 Goose 2023-03-08 15:00:00             4
 5 Goose 2023-03-08 16:00:00             1
 6 Goose 2023-03-08 17:00:00             2
 7 Duck  2023-03-08 10:30:00             2
 8 Duck  2023-03-08 14:30:00             2
 9 Duck  2023-03-08 15:00:00             4
10 Duck  2023-03-08 17:00:00             2

What I'd like to do is fill missing intervals. I can use fill_gaps for this:

> waterfowlSumm |> fill_gaps(`Total birds` = 0) |> print(n = Inf)
# A tsibble: 31 x 3 [30m] <UTC>
# Key:       Bird [2]
   Bird  Interval            `Total birds`
   <fct> <dttm>                      <dbl>
 1 Goose 2023-03-08 09:00:00             2
 2 Goose 2023-03-08 09:30:00             0
 3 Goose 2023-03-08 10:00:00             0
...
15 Goose 2023-03-08 16:00:00             1
16 Goose 2023-03-08 16:30:00             0
17 Goose 2023-03-08 17:00:00             2
18 Duck  2023-03-08 10:30:00             2
19 Duck  2023-03-08 11:00:00             0
20 Duck  2023-03-08 11:30:00             0
...
29 Duck  2023-03-08 16:00:00             0
30 Duck  2023-03-08 16:30:00             0
31 Duck  2023-03-08 17:00:00             2

However, since I start watching birds at 8:00 and stop at 18:00, I'd like to fill in missing intervals beyond the times where I actually observed birds. So I might do

> waterfowlSumm |> fill_gaps(`Total birds` = 0, .start = d, .end = d + hours(9) + minutes(30)) |> print(n = Inf)
# A tsibble: 40 x 3 [30m] <UTC>
# Key:       Bird [2]
   Bird  Interval            `Total birds`
   <fct> <dttm>                      <dbl>
 1 Goose 2023-03-08 08:00:00             0
 2 Goose 2023-03-08 08:30:00             0
 3 Goose 2023-03-08 09:00:00             2
...
18 Goose 2023-03-08 16:30:00             0
19 Goose 2023-03-08 17:00:00             2
20 Goose 2023-03-08 17:30:00             0
21 Duck  2023-03-08 08:00:00             0
22 Duck  2023-03-08 08:30:00             0
23 Duck  2023-03-08 09:00:00             0
...
38 Duck  2023-03-08 16:30:00             0
39 Duck  2023-03-08 17:00:00             2
40 Duck  2023-03-08 17:30:00             0

This works. However, now suppose that my data has additional grouping variables: say, I'm observing birds at several sites. Of course, since I can't be in two places at the same time, each site has a different observer. And different observers have different working hours, so .start and .end must be set on a per-group basis.

The start/end times are available in my data, but .start and .end apparently can't be pulled from the tsibble being operated on:

> waterfowlSumm |> mutate(Start = d, End = d + hours(9) + minutes(30)) |> fill_gaps(`Total birds` = 0, .start = Start, .end = End)
Error in scan_gaps.tbl_ts(.data, .full = !!enquo(.full), .start = .start,  : 
  object 'Start' not found

So my question is: how do I do this? I'd really like to be able to use grouping (in this example I only have one group to begin with, but in reality there are many) so I only have to invoke fill_gaps once, with the correct start/end being pulled from the tsibble.

Thanks!

Answer 1

Score: 1

The `fill_gaps()` function converts the implicit missing values into explicit missing values, based on either the local (per series) or global (per dataset) start and end dates and the index class.

Using `fill_gaps()` without specifying the `.start` and `.end` date will compute the time range for each series, and fill in any missing time points based on the data's time interval. This should work for your problem of different counting ranges for sites and birds.
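
As a small illustration of that per-series behaviour, here is a minimal sketch on made-up data (the `toy` tsibble and its values are hypothetical, not the question's data). It compares the default `.full = FALSE`, which fills each series only within its own observed range, with `.full = TRUE`, which fills every series over the time span of the whole dataset:

``` r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two series observed over different time ranges
toy <- tibble(
  Bird = c("Goose", "Goose", "Duck", "Duck"),
  Interval = as.POSIXct(
    c("2023-03-08 09:00:00", "2023-03-08 10:00:00",
      "2023-03-08 10:30:00", "2023-03-08 11:00:00"),
    tz = "UTC"
  ),
  `Total birds` = c(2, 4, 2, 1)
) |>
  as_tsibble(key = Bird, index = Interval)

# Default (.full = FALSE): each series is filled between its own first and
# last observation, so Goose covers 09:00-10:00 and Duck covers 10:30-11:00
per_series <- toy |> fill_gaps(`Total birds` = 0)

# .full = TRUE: every series is filled over the range of the whole dataset,
# here 09:00-11:00 for both birds
balanced <- toy |> fill_gaps(`Total birds` = 0, .full = TRUE)

nrow(per_series)
nrow(balanced)
```

Neither variant extends a series beyond the times that actually occur somewhere in the data, which is why `.start` and `.end` (or a separate table of working hours) are still needed when the observation window is wider than the observations themselves.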

However, if you are working with multiple days, the `fill_gaps()` function will also add in the overnight hours between working days (as the interval is 30 minutes, and data is missing overnight). So you might want to instead fill implicit missing values with NA, and then maintain a working hours dataset that can be joined onto your observations data and used to convert NA to 0 if someone was working. For example:

``` r
library(tidyverse) # includes lubridate
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:lubridate':
#> 
#>     interval
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
library(fable)
#> Loading required package: fabletools

N <- 10
set.seed(42)

# suppose we're observing ducks and geese between 8:00 and 18:00.
d       <- as_datetime("2023-03-08 08:00:00")
times   <- d + seconds(unique(round(sort(runif(N, min = 0, max = 36e3)))))
nObs    <- 1 + rpois(length(times), lambda = 1)
birdIdx <- 1 + round(runif(length(times)))
birds   <- c("Duck", "Goose")[birdIdx]

# Tibble of observations
waterfowl <- tibble(Timestamp = times, Count = nObs, Bird = as_factor(birds))

# Add day 2
waterfowl <- bind_rows(waterfowl, waterfowl |> mutate(Timestamp = Timestamp + days(1)))

# Convert to tsibble (time series tibble) and aggregate on a 30-minute basis
waterfowl |>
  as_tsibble(index = Timestamp) |>
  group_by(Bird) |>
  index_by(Interval = floor_date(Timestamp, "30 minute")) |>
  summarize(`Total birds` = sum(Count)) -> waterfowlSumm

waterfowlSumm |> print(n = Inf)
#> # A tsibble: 20 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>    Bird  Interval            `Total birds`
#>    <fct> <dttm>                      <dbl>
#>  1 Goose 2023-03-08 09:00:00             2
#>  2 Goose 2023-03-08 13:00:00             4
#>  3 Goose 2023-03-08 14:00:00             1
#>  4 Goose 2023-03-08 15:00:00             4
#>  5 Goose 2023-03-08 16:00:00             1
#>  6 Goose 2023-03-08 17:00:00             2
#>  7 Goose 2023-03-09 09:00:00             2
#>  8 Goose 2023-03-09 13:00:00             4
#>  9 Goose 2023-03-09 14:00:00             1
#> 10 Goose 2023-03-09 15:00:00             4
#> 11 Goose 2023-03-09 16:00:00             1
#> 12 Goose 2023-03-09 17:00:00             2
#> 13 Duck  2023-03-08 10:30:00             2
#> 14 Duck  2023-03-08 14:30:00             2
#> 15 Duck  2023-03-08 15:00:00             4
#> 16 Duck  2023-03-08 17:00:00             2
#> 17 Duck  2023-03-09 10:30:00             2
#> 18 Duck  2023-03-09 14:30:00             2
#> 19 Duck  2023-03-09 15:00:00             4
#> 20 Duck  2023-03-09 17:00:00             2

# This adds 0 between the days
waterfowlSumm |> fill_gaps(`Total birds` = 0) |> print(n = Inf)
#> # A tsibble: 127 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds`
#>     <fct> <dttm>                      <dbl>
#>   1 Goose 2023-03-08 09:00:00             2
#>   2 Goose 2023-03-08 09:30:00             0
#>   3 Goose 2023-03-08 10:00:00             0
#>   4 Goose 2023-03-08 10:30:00             0
#>   5 Goose 2023-03-08 11:00:00             0
#>   6 Goose 2023-03-08 11:30:00             0
#>   7 Goose 2023-03-08 12:00:00             0
#>   8 Goose 2023-03-08 12:30:00             0
#>   9 Goose 2023-03-08 13:00:00             4
#>  10 Goose 2023-03-08 13:30:00             0
#>  11 Goose 2023-03-08 14:00:00             1
#>  12 Goose 2023-03-08 14:30:00             0
#>  13 Goose 2023-03-08 15:00:00             4
#>  14 Goose 2023-03-08 15:30:00             0
#>  15 Goose 2023-03-08 16:00:00             1
#>  16 Goose 2023-03-08 16:30:00             0
#>  17 Goose 2023-03-08 17:00:00             2
#>  18 Goose 2023-03-08 17:30:00             0
#>  19 Goose 2023-03-08 18:00:00             0
#>  20 Goose 2023-03-08 18:30:00             0
#>  21 Goose 2023-03-08 19:00:00             0
#>  22 Goose 2023-03-08 19:30:00             0
#>  23 Goose 2023-03-08 20:00:00             0
#>  24 Goose 2023-03-08 20:30:00             0
#>  25 Goose 2023-03-08 21:00:00             0
#>  26 Goose 2023-03-08 21:30:00             0
#>  27 Goose 2023-03-08 22:00:00             0
#>  28 Goose 2023-03-08 22:30:00             0
#>  29 Goose 2023-03-08 23:00:00             0
#>  30 Goose 2023-03-08 23:30:00             0
#>  31 Goose 2023-03-09 00:00:00             0
#>  32 Goose 2023-03-09 00:30:00             0
#>  33 Goose 2023-03-09 01:00:00             0
#>  34 Goose 2023-03-09 01:30:00             0
#>  35 Goose 2023-03-09 02:00:00             0
#>  36 Goose 2023-03-09 02:30:00             0
#>  37 Goose 2023-03-09 03:00:00             0
#>  38 Goose 2023-03-09 03:30:00             0
#>  39 Goose 2023-03-09 04:00:00             0
#>  40 Goose 2023-03-09 04:30:00             0
#>  41 Goose 2023-03-09 05:00:00             0
#>  42 Goose 2023-03-09 05:30:00             0
#>  43 Goose 2023-03-09 06:00:00             0
#>  44 Goose 2023-03-09 06:30:00             0
#>  45 Goose 2023-03-09 07:00:00             0
#>  46 Goose 2023-03-09 07:30:00             0
#>  47 Goose 2023-03-09 08:00:00             0
#>  48 Goose 2023-03-09 08:30:00             0
#>  49 Goose 2023-03-09 09:00:00             2
#>  50 Goose 2023-03-09 09:30:00             0
#>  51 Goose 2023-03-09 10:00:00             0
#>  52 Goose 2023-03-09 10:30:00             0
#>  53 Goose 2023-03-09 11:00:00             0
#>  54 Goose 2023-03-09 11:30:00             0
#>  55 Goose 2023-03-09 12:00:00             0
#>  56 Goose 2023-03-09 12:30:00             0
#>  57 Goose 2023-03-09 13:00:00             4
#>  58 Goose 2023-03-09 13:30:00             0
#>  59 Goose 2023-03-09 14:00:00             1
#>  60 Goose 2023-03-09 14:30:00             0
#>  61 Goose 2023-03-09 15:00:00             4
#>  62 Goose 2023-03-09 15:30:00             0
#>  63 Goose 2023-03-09 16:00:00             1
#>  64 Goose 2023-03-09 16:30:00             0
#>  65 Goose 2023-03-09 17:00:00             2
#>  66 Duck  2023-03-08 10:30:00             2
#>  67 Duck  2023-03-08 11:00:00             0
#>  68 Duck  2023-03-08 11:30:00             0
#>  69 Duck  2023-03-08 12:00:00             0
#>  70 Duck  2023-03-08 12:30:00             0
#>  71 Duck  2023-03-08 13:00:00             0
#>  72 Duck  2023-03-08 13:30:00             0
#>  73 Duck  2023-03-08 14:00:00             0
#>  74 Duck  2023-03-08 14:30:00             2
#>  75 Duck  2023-03-08 15:00:00             4
#>  76 Duck  2023-03-08 15:30:00             0
#>  77 Duck  2023-03-08 16:00:00             0
#>  78 Duck  2023-03-08 16:30:00             0
#>  79 Duck  2023-03-08 17:00:00             2
#>  80 Duck  2023-03-08 17:30:00             0
#>  81 Duck  2023-03-08 18:00:00             0
#>  82 Duck  2023-03-08 18:30:00             0
#>  83 Duck  2023-03-08 19:00:00             0
#>  84 Duck  2023-03-08 19:30:00             0
#>  85 Duck  2023-03-08 20:00:00             0
#>  86 Duck  2023-03-08 20:30:00             0
#>  87 Duck  2023-03-08 21:00:00             0
#>  88 Duck  2023-03-08 21:30:00             0
#>  89 Duck  2023-03-08 22:00:00             0
#>  90 Duck  2023-03-08 22:30:00             0
#>  91 Duck  2023-03-08 23:00:00             0
#>  92 Duck  2023-03-08 23:30:00             0
#>  93 Duck  2023-03-09 00:00:00             0
#>  94 Duck  2023-03-09 00:30:00             0
#>  95 Duck  2023-03-09 01:00:00             0
#>  96 Duck  2023-03-09 01:30:00             0
#>  97 Duck  2023-03-09 02:00:00             0
#>  98 Duck  2023-03-09 02:30:00             0
#>  99 Duck  2023-03-09 03:00:00             0
#> 100 Duck  2023-03-09 03:30:00             0
#> 101 Duck  2023-03-09 04:00:00             0
#> 102 Duck  2023-03-09 04:30:00             0
#> 103 Duck  2023-03-09 05:00:00             0
#> 104 Duck  2023-03-09 05:30:00             0
#> 105 Duck  2023-03-09 06:00:00             0
#> 106 Duck  2023-03-09 06:30:00             0
#> 107 Duck  2023-03-09 07:00:00             0
#> 108 Duck  2023-03-09 07:30:00             0
#> 109 Duck  2023-03-09 08:00:00             0
#> 110 Duck  2023-03-09 08:30:00             0
#> 111 Duck  2023-03-09 09:00:00             0
#> 112 Duck  2023-03-09 09:30:00             0
#> 113 Duck  2023-03-09 10:00:00             0
#> 114 Duck  2023-03-09 10:30:00             2
#> 115 Duck  2023-03-09 11:00:00             0
#> 116 Duck  2023-03-09 11:30:00             0
#> 117 Duck  2023-03-09 12:00:00             0
#> 118 Duck  2023-03-09 12:30:00             0
#> 119 Duck  2023-03-09 13:00:00             0
#> 120 Duck  2023-03-09 13:30:00             0
#> 121 Duck  2023-03-09 14:00:00             0
#> 122 Duck  2023-03-09 14:30:00             2
#> 123 Duck  2023-03-09 15:00:00             4
#> 124 Duck  2023-03-09 15:30:00             0
#> 125 Duck  2023-03-09 16:00:00             0
#> 126 Duck  2023-03-09 16:30:00             0
#> 127 Duck  2023-03-09 17:00:00             2

# Instead consider using NA, and adding 0 after
waterfowlSumm |> fill_gaps() |> print(n = Inf)
#> # A tsibble: 127 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds`
#>     <fct> <dttm>                      <dbl>
#>   1 Goose 2023-03-08 09:00:00             2
#>   2 Goose 2023-03-08 09:30:00            NA
#>   3 Goose 2023-03-08 10:00:00            NA
#>   4 Goose 2023-03-08 10:30:00            NA
#>   5 Goose 2023-03-08 11:00:00            NA
#>   6 Goose 2023-03-08 11:30:00            NA
#>   7 Goose 2023-03-08 12:00:00            NA
#>   8 Goose 2023-03-08 12:30:00            NA
#>   9 Goose 2023-03-08 13:00:00             4
#>  10 Goose 2023-03-08 13:30:00            NA
#>  11 Goose 2023-03-08 14:00:00             1
#>  12 Goose 2023-03-08 14:30:00            NA
#>  13 Goose 2023-03-08 15:00:00             4
#>  14 Goose 2023-03-08 15:30:00            NA
#>  15 Goose 2023-03-08 16:00:00             1
#>  16 Goose 2023-03-08 16:30:00            NA
#>  17 Goose 2023-03-08 17:00:00             2
#>  18 Goose 2023-03-08 17:30:00            NA
#>  19 Goose 2023-03-08 18:00:00            NA
#>  20 Goose 2023-03-08 18:30:00            NA
#>  21 Goose 2023-03-08 19:00:00            NA
#>  22 Goose 2023-03-08 19:30:00            NA
#>  23 Goose 2023-03-08 20:00:00            NA
#>  24 Goose 2023-03-08 20:30:00            NA
#>  25 Goose 2023-03-08 21:00:00            NA
#>  26 Goose 2023-03-08 21:30:00            NA
#>  27 Goose 2023-03-08 22:00:00            NA
#>  28 Goose 2023-03-08 22:30:00            NA
#>  29 Goose 2023-03-08 23:00:00            NA
#>  30 Goose 2023-03-08 23:30:00            NA
#>  31 Goose 2023-03-09 00:00:00            NA
#>  32 Goose 2023-03-09 00:30:00            NA
#>  33 Goose 2023-03-09 01:00:00            NA
#>  34 Goose 2023-03-09 01:30:00            NA
#>  35 Goose 2023-03-09 02:00:00            NA
#>  36 Goose 2023-03-09 02:30:00            NA
#>  37 Goose 2023-03-09 03:00:00            NA
#>  38 Goose 2023-03-09 03:30:00            NA
#>  39 Goose 2023-03-09 04:00:00            NA
#>  40 Goose 2023-03-09 04:30:00            NA
#>  41 Goose 2023-03-09 05:00:00            NA
#>  42 Goose 2023-03-09 05:30:00            NA
#>  43 Goose 2023-03-09 06:00:00            NA
#>  44 Goose 2023-03-09 06:30:00            NA
#>  45 Goose 2023-03-09 07:00:00            NA
#>  46 Goose 2023-03-09 07:30:00            NA
#>  47 Goose 2023-03-09 08:00:00            NA
#>  48 Goose 2023-03-09 08:30:00            NA
#>  49 Goose 2023-03-09 09:00:00             2
#>  50 Goose 2023-03-09 09:30:00            NA
#>  51 Goose 2023-03-09 10:00:00            NA
#>  52 Goose 2023-03-09 10:30:00            NA
#>  53 Goose 2023-03-09 11:00:00            NA
#>  54 Goose 2023-03-09 11:30:00            NA
#>  55 Goose 2023-03-09 12:00:00            NA
#>  56 Goose 2023-03-09 12:30:00            NA
#>  57 Goose 2023-03-09 13:00:00             4
#>  58 Goose 2023-03-09 13:30:00            NA
#>  59 Goose 2023-03-09 14:00:00             1
#>  60 Goose 2023-03-09 14:30:00            NA
#>  61 Goose 2023-03-09 15:00:00             4
#>  62 Goose 2023-03-09 15:30:00            NA
#>  63 Goose 2023-03-09 16:00:00             1
#>  64 Goose 2023-03-09 16:30:00            NA
#>  65 Goose 2023-03-09 17:00:00             2
#>  66 Duck  2023-03-08 10:30:00             2
#>  67 Duck  2023-03-08 11:00:00            NA
#>  68 Duck  2023-03-08 11:30:00            NA
#>  69 Duck  2023-03-08 12:00:00            NA
#>  70 Duck  2023-03-08 12:30:00            NA
#>  71 Duck  2023-03-08 13:00:00            NA
#>  72 Duck  2023-03-08 13:30:00            NA
#>  73 Duck  2023-03-08 14:00:00            NA
#>  74 Duck  2023-03-08 14:30:00             2
#>  75 Duck  2023-03-08 15:00:00             4
#>  76 Duck  2023-03-08 15:30:00            NA
#>  77 Duck  2023-03-08 16:00:00            NA
#>  78 Duck  2023-03-08 16:30:00            NA
#>  79 Duck  2023-03-08 17:00:00             2
#>  80 Duck  2023-03-08 17:30:00            NA
#>  81 Duck  2023-03-08 18:00:00            NA
#>  82 Duck  2023-03-08 18:30:00            NA
#>  83 Duck  2023-03-08 19:00:00            NA
#>  84 Duck  2023-03-08 19:30:00            NA
#>  85 Duck  2023-03-08 20:00:00            NA
#>  86 Duck  2023-03-08 20:30:00            NA
#>  87 Duck  2023-03-08 21:00:00            NA
#>  88 Duck  2023-03-08 21:30:00            NA
#>  89 Duck  2023-03-08 22:00:00            NA
#>  90 Duck  2023-03-08 22:30:00            NA
#>  91 Duck  2023-03-08 23:00:00            NA
#>  92 Duck  2023-03-08 23:30:00            NA
#>  93 Duck  2023-03-09 00:00:00            NA
#>  94 Duck  2023-03-09 00:30:00            NA
#>  95 Duck  2023-03-09 01:00:00            NA
#>  96 Duck  2023-03-09 01:30:00            NA
#>  97 Duck  2023-03-09 02:00:00            NA
#>  98 Duck  2023-03-09 02:30:00            NA
#>  99 Duck  2023-03-09 03:00:00            NA
#> 100 Duck  2023-03-09 03:30:00            NA
#> 101 Duck  2023-03-09 04:00:00            NA
#> 102 Duck  2023-03-09 04:30:00            NA
#> 103 Duck  2023-03-09 05:00:00            NA
#> 104 Duck  2023-03-09 05:30:00            NA
#> 105 Duck  2023-03-09 06:00:00            NA
#> 106 Duck  2023-03-09 06:30:00            NA
#> 107 Duck  2023-03-09 07:00:00            NA
#> 108 Duck  2023-03-09 07:30:00            NA
#> 109 Duck  2023-03-09 08:00:00            NA
#> 110 Duck  2023-03-09 08:30:00            NA
#> 111 Duck  2023-03-09 09:00:00            NA
#> 112 Duck  2023-03-09 09:30:00            NA
#> 113 Duck  2023-03-09 10:00:00            NA
#> 114 Duck  2023-03-09 10:30:00             2
#> 115 Duck  2023-03-09 11:00:00            NA
#> 116 Duck  2023-03-09 11:30:00            NA
#> 117 Duck  2023-03-09 12:00:00            NA
#> 118 Duck  2023-03-09 12:30:00            NA
#> 119 Duck  2023-03-09 13:00:00            NA
#> 120 Duck  2023-03-09 13:30:00            NA
#> 121 Duck  2023-03-09 14:00:00            NA
#> 122 Duck  2023-03-09 14:30:00             2
#> 123 Duck  2023-03-09 15:00:00             4
#> 124 Duck  2023-03-09 15:30:00            NA
#> 125 Duck  2023-03-09 16:00:00            NA
#> 126 Duck  2023-03-09 16:30:00            NA
#> 127 Duck  2023-03-09 17:00:00             2

# Then add 0 based on working hours
# I'm adding this with a mutate(), but if it was more complicated you could left join a 'working time' dataset.
waterfowl_complete <- waterfowlSumm |>
  fill_gaps() |>
  mutate(
    working = between(hour(Interval), 9, 17),
    `Total birds` = case_when(
      !is.na(`Total birds`) ~ `Total birds`,
      working ~ 0,
      TRUE ~ NA_real_
    )
  )

waterfowl_complete |>
  print(n=Inf)
#> # A tsibble: 127 x 4 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds` working
#>     <fct> <dttm>                      <dbl> <lgl>
#>   1 Goose 2023-03-08 09:00:00             2 TRUE
#>   2 Goose 2023-03-08 09:30:00             0 TRUE
#>   3 Goose 2023-03-08 10:00:00             0 TRUE
#>   4 Goose 2023-03-08 10:30:00             0 TRUE
#>   5 Goose 2023-03-08 11:00:00             0 TRUE
#>   6 Goose 2023-03-08 11:30:00             0 TRUE
#>   7 Goose 2023-03-08 12:00:00             0 TRUE
#>   8 Goose 2023-03-08 12:30:00             0 TRUE
#>   9 Goose 2023-03-08 13:00:00             4 TRUE
#>  10 Goose 2023-03-08 13:30:00             0 TRUE
#>  11 Goose 2023-03-08 14:00:00             1 TRUE
#>  12 Goose 2023-03-08 14:30:00             0 TRUE
#>  13 Goose 2023-03-08 15:00:00             4 TRUE
#>  14 Goose 2023-03-08 15:30:00             0 TRUE
#>  15 Goose 2023-03-08 16:00:00             1 TRUE
#>  16 Goose 2023-03-08 16:30:00             0 TRUE
#>  17 Goose 2023-03-08 17:00:00             2 TRUE
#>  18 Goose 2023-03-08 17:30:00             0 TRUE
#>  19 Goose 2023-03-08 18:00:00            NA FALSE
#>  20 Goose 2023-03-08 18:30:00            NA FALSE
#>  21 Goose 2023-03-08 19:00:00            NA FALSE
#>  22 Goose 2023-03-08 19:30:00            NA FALSE
#>  23 Goose 2023-03-08 20:00:00            NA FALSE
#>  24 Goose 2023-03-08 20:30:00            NA FALSE
#>  25 Goose 2023-03-08 21:00:00            NA FALSE
#>  26 Goose 2023-03-08 21:30:00            NA FALSE
#>  27 Goose 2023-03-08 22:00:00            NA FALSE
#>  28 Goose 2023-03-08 22:30:00            NA FALSE
#>  29 Goose 2023-03-08 23:00:00            NA FALSE
#>  30 Goose 2023-03-08 23:30:00            NA FALSE
#>  31 Goose 2023-03-09 00:00:00            NA FALSE
#>  32 Goose 2023-03-09 00:30:00            NA FALSE
#>  33 Goose 2023-03-09 01:00:00            NA FALSE
#>  34 Goose 2023-03-09 01:30:00            NA FALSE
#>  35 Goose 2023-03-09 02:00:00            NA FALSE
#>  36 Goose 2023-03-09 02:30:00            NA FALSE
#>  37 Goose 2023-03-09 03:00:00            NA FALSE
#>  38 Goose 2023-03-09 03:30:00            NA FALSE
#>  39 Goose 2023-03-09 04:00:00            NA FALSE
#>  40 Goose 2023-03-09 04:30:00            NA FALSE
#>  41 Goose 2023-03-09 05:00:00            NA FALSE
#>  42 Goose 2023-03-09 05:30:00            NA FALSE
#>  43 Goose 2023-03-09 06:00:00            NA FALSE
#>  44 Goose 2023-03-09 06:30:00            NA FALSE
#>  45 Goose 2023-03-09 07:00:00            NA FALSE
#>  46 Goose 2023-03-09 07:30:00            NA FALSE
#>  47 Goose 2023-03-09 08:00:00            NA FALSE
#>  48 Goose 2023-03-09 08:30:00            NA FALSE
#>  49 Goose 2023-03-09 09:00:00             2 TRUE
#>  50 Goose 2023-03-09 09:30:00             0 TRUE
#>  51 Goose 2023-03-09 10:00:00             0 TRUE
#>  52 Goose 2023-03-09 10:30:00             0 TRUE
#>  53 Goose 2023-03-09 11:00:00             0 TRUE
#>  54 Goose 2023-03-09 11:30:00             0 TRUE
#>  55 Goose 2023-03-09 12:00:00             0 TRUE
#>  56 Goose 2023-03-09 12:30:00             0 TRUE
#>  57 Goose 2023-03-09 13:00:00             4 TRUE
#>  58 Goose 2023-03-09 13:30:00             0 TRUE
#>  59 Goose 2023-03-09 14:00:00             1 TRUE
#>  60 Goose 2023-03-09 14:30:00             0 TRUE
#>  61 Goose 2023-03-09 15:00:00             4 TRUE
#>  62 Goose 2023-03-09 15:30:00             0 TRUE
#>  63 Goose 2023-03-09 16:00:00             1 TRUE
#>  64 Goose 2023-03-09 16:30:00             0 TRUE
#>  65 Goose 2023-03-09 17:00:00             2 TRUE
#>  66 Duck  2023-03-08 10:30:00             2 TRUE
#>  67 Duck  2023-03-08 11:00:00             0 TRUE
#>  68 Duck  2023-03-08 11:30:00             0 TRUE
#>  69 Duck  2023-03-08 12:00:00             0 TRUE
#>  70 Duck  2023-03-08 12:30:00             0 TRUE
#>  71 Duck  2023-03-08 13:00:00             0 TRUE
#>  72 Duck  2023-03-08 13:30:00             0 TRUE
#>  73 Duck  2023-03-08 14:00:00             0 TRUE
#>  74 Duck  2023-03-08 14:30:00             2 TRUE
#>  75 Duck  2023-03-08 15:00:00             4 TRUE
#>  76 Duck  2023-03-08 15:30:00             0 TRUE
#>  77 Duck  2023-03-08 16:00:00             0 TRUE
#>  78 Duck  2023-03-08 16:30:00             0 TRUE
#>  79 Duck  2023-03-08 17:00:00             2 TRUE
#>  80 Duck  2023-03-08 17:30:00             0 TRUE
#>  81 Duck  2023-03-08 18:00:00            NA FALSE
#>  82 Duck  2023-03-08 18:30:00            NA FALSE
#>  83 Duck  2023-03-08 19:00:00            NA FALSE
#>  84 Duck  2023-03-08 19:30:00            NA FALSE
#>  85 Duck  2023-03-08 20:00:00            NA FALSE
#>  86 Duck  2023-03-08 20:30:00            NA FALSE
#>  87 Duck  2023-03-08 21:00:00            NA FALSE
#>  88 Duck  2023-03-08 21:30:00            NA FALSE
#>  89 Duck  2023-03-08 22:00:00            NA FALSE
#>  90 Duck  2023-03-08 22:30:00            NA FALSE
#>  91 Duck  2023-03-08 23:00:00            NA FALSE
#>  92 Duck  2023-03-08 23:30:00            NA FALSE
#>  93 Duck  2023-03-09 00:00:00            NA FALSE
#>  94 Duck  2023-03-09 00:30:00            NA FALSE
#>  95 Duck  2023-03-09 01:00:00            NA FALSE
#>  96 Duck  2023-03-09 01:30:00            NA FALSE
#>  97 Duck  2023-03-09 02:00:00            NA FALSE
#>  98 Duck  2023-03-09 02:30:00            NA FALSE
#>  99 Duck  2023-03-09 03:00:00            NA FALSE
#> 100 Duck  2023-03-09 03:30:00            NA FALSE
#> 101 Duck  2023-03-09 04:00:00            NA FALSE
#> 102 Duck  2023-03-09 04:30:00            NA FALSE
#> 103 Duck  2023-03-09 05:00:00            NA FALSE
#> 104 Duck  2023-03-09 05:30:00            NA FALSE
#> 105 Duck  2023-03-09 06:00:00            NA FALSE
#> 106 Duck  2023-03-09 06:30:00            NA FALSE
#> 107 Duck  2023-03-09 07:00:00            NA FALSE
#> 108 Duck  2023-03-09 07:30:00            NA FALSE
#> 109 Duck  2023-03-09 08:00:00            NA FALSE
#> 110 Duck  2023-03-09 08:30:00            NA FALSE
#> 111 Duck  2023-03-09 09:00:00             0 TRUE
#> 112 Duck  2023-03-09 09:30:00             0 TRUE
#> 113 Duck  2023-03-09 10:00:00             0 TRUE
#> 114 Duck  2023-03-09 10:30:00             2 TRUE
#> 115 Duck  2023-03-09 11:00:00             0 TRUE
#> 116 Duck  2023-03-09 11:30:00             0 TRUE
#> 117 Duck  2023-03-09 12:00:00             0 TRUE
#> 118 Duck  2023-03-09 12:30:00             0 TRUE
#> 119 Duck  2023-03-09 13:00:00             0 TRUE
#> 120 Duck  2023-03-09 13:30:00             0 TRUE
#> 121 Duck  2023-03-09 14:00:00             0 TRUE
#> 122 Duck  2023-03-09 14:30:00             2 TRUE
#> 123 Duck  2023-03-09 15:00:00             4 TRUE
#> 124 Duck  2023-03-09 15:30:00             0 TRUE
#> 125 Duck  2023-03-09 16:00:00             0 TRUE
#> 126 Duck  2023-03-09 16:30:00             0 TRUE
#> 127 Duck  2023-03-09 17:00:00             2 TRUE
waterfowl_complete |>
  autoplot(`Total birds`)
```

[Figure: plot of `Total birds` over time, produced by the `autoplot()` call above.]

<sup>Created on 2023-03-10 with reprex v2.0.2</sup>
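
For the per-site case in the question, the "working time" dataset mentioned in the code comment could look roughly like the sketch below. Everything here is hypothetical illustration data: the `Site` key, the `obs` tsibble, and the `working_hours` table with its `Start` and `End` columns (where `End` is taken to be the start of the last 30-minute interval worked). The idea is to fill every series over the widest window any site needs, join the per-site hours, and keep only the intervals inside each site's own window:

``` r
library(dplyr)
library(tsibble)

# Hypothetical observations at two sites, already aggregated to 30 minutes
obs <- tibble(
  Site = c("A", "A", "B"),
  Interval = as.POSIXct(
    c("2023-03-08 09:00:00", "2023-03-08 10:00:00", "2023-03-08 11:30:00"),
    tz = "UTC"
  ),
  `Total birds` = c(2, 4, 1)
) |>
  as_tsibble(key = Site, index = Interval)

# Hypothetical per-site working hours; End is the start of the last interval
working_hours <- tibble(
  Site  = c("A", "B"),
  Start = as.POSIXct(c("2023-03-08 08:00:00", "2023-03-08 10:00:00"), tz = "UTC"),
  End   = as.POSIXct(c("2023-03-08 10:30:00", "2023-03-08 12:00:00"), tz = "UTC")
)

filled <- obs |>
  # .start/.end are ordinary values, so pull them from the lookup table
  # rather than from columns of the tsibble being filled
  fill_gaps(.start = min(working_hours$Start), .end = max(working_hours$End)) |>
  left_join(working_hours, by = "Site") |>
  # keep only the intervals inside each site's own working window
  filter(Interval >= Start, Interval <= End) |>
  # anything still missing inside the window means nobody saw a bird
  mutate(`Total birds` = coalesce(`Total birds`, 0)) |>
  select(-Start, -End)

filled |> print(n = Inf)
```

This needs only one `fill_gaps()` call regardless of the number of sites. For data spanning several days, the lookup table would presumably need one row per site and day, with the join and filter adjusted to match.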

huangapple
  • Posted on March 9, 2023, 18:34:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75683407.html