Fill gaps in a grouped tsibble with group-specific starts and ends

Question

I've got a tsibble where timestamped observational data has been aggregated to 30-minute intervals. The data is in several groups, and I'd like to make sure that each 30-minute interval appears in the tsibble for every group, even when there were no observations in that time period.

Let's return to the birdwatching example from my previous question about tsibbles. Suppose I'm watching ducks and geese at a certain location from 8:00 to 18:00 each day and recording, for each observation, a) the time, b) the type of bird observed, and c) the number of birds in the flock observed.

    library(tidyverse) # includes lubridate
    library(tsibble)
    N <- 10
    set.seed(42)
    # suppose we're observing ducks and geese between 8:00 and 18:00.
    d <- as_datetime("2023-03-08 08:00:00")
    times <- d + seconds(unique(round(sort(runif(N, min = 0, max = 36e3)))))
    nObs <- 1 + rpois(length(times), lambda = 1)
    birdIdx <- 1 + round(runif(length(times)))
    birds <- c("Duck", "Goose")[birdIdx]
    # Tibble of observations
    waterfowl <- tibble(Timestamp = times, Count = nObs, Bird = as_factor(birds))
    # Convert to tsibble (time series tibble) and aggregate on a 30-minute basis
    waterfowl |>
      as_tsibble(index = Timestamp) |>
      group_by(Bird) |>
      index_by(Interval = floor_date(Timestamp, "30 minute")) |>
      summarize(`Total birds` = sum(Count)) -> waterfowlSumm
    waterfowlSumm |> print(n = Inf)

This gives

    # A tsibble: 10 x 3 [30m] <UTC>
    # Key:       Bird [2]
       Bird  Interval            `Total birds`
       <fct> <dttm>                      <dbl>
     1 Goose 2023-03-08 09:00:00             2
     2 Goose 2023-03-08 13:00:00             4
     3 Goose 2023-03-08 14:00:00             1
     4 Goose 2023-03-08 15:00:00             4
     5 Goose 2023-03-08 16:00:00             1
     6 Goose 2023-03-08 17:00:00             2
     7 Duck  2023-03-08 10:30:00             2
     8 Duck  2023-03-08 14:30:00             2
     9 Duck  2023-03-08 15:00:00             4
    10 Duck  2023-03-08 17:00:00             2

What I'd like to do is fill missing intervals. I can use fill_gaps for this:

    > waterfowlSumm |> fill_gaps(`Total birds` = 0) |> print(n = Inf)
    # A tsibble: 31 x 3 [30m] <UTC>
    # Key:       Bird [2]
       Bird  Interval            `Total birds`
       <fct> <dttm>                      <dbl>
     1 Goose 2023-03-08 09:00:00             2
     2 Goose 2023-03-08 09:30:00             0
     3 Goose 2023-03-08 10:00:00             0
    ...
    15 Goose 2023-03-08 16:00:00             1
    16 Goose 2023-03-08 16:30:00             0
    17 Goose 2023-03-08 17:00:00             2
    18 Duck  2023-03-08 10:30:00             2
    19 Duck  2023-03-08 11:00:00             0
    20 Duck  2023-03-08 11:30:00             0
    ...
    29 Duck  2023-03-08 16:00:00             0
    30 Duck  2023-03-08 16:30:00             0
    31 Duck  2023-03-08 17:00:00             2

However, since I start watching birds at 8:00 and stop at 18:00, I'd like to fill in missing intervals beyond the times where I actually observed birds. So I might do

    > waterfowlSumm |> fill_gaps(`Total birds` = 0, .start = d, .end = d + hours(9) + minutes(30)) |> print(n = Inf)
    # A tsibble: 40 x 3 [30m] <UTC>
    # Key:       Bird [2]
       Bird  Interval            `Total birds`
       <fct> <dttm>                      <dbl>
     1 Goose 2023-03-08 08:00:00             0
     2 Goose 2023-03-08 08:30:00             0
     3 Goose 2023-03-08 09:00:00             2
    ...
    18 Goose 2023-03-08 16:30:00             0
    19 Goose 2023-03-08 17:00:00             2
    20 Goose 2023-03-08 17:30:00             0
    21 Duck  2023-03-08 08:00:00             0
    22 Duck  2023-03-08 08:30:00             0
    23 Duck  2023-03-08 09:00:00             0
    ...
    38 Duck  2023-03-08 16:30:00             0
    39 Duck  2023-03-08 17:00:00             2
    40 Duck  2023-03-08 17:30:00             0

This works. However, now suppose that my data has additional grouping variables; say, I'm observing birds at several sites. Of course, since I can't be in two places at the same time, each site has a different observer. And different observers have different working hours, so `.start` and `.end` must be set on a per-group basis.

The start/end times are available in my data, but .start and .end apparently can't be pulled from the tsibble being operated on:

    > waterfowlSumm |> mutate(Start = d, End = d + hours(9) + minutes(30)) |> fill_gaps(`Total birds` = 0, .start = Start, .end = End)
    Error in scan_gaps.tbl_ts(.data, .full = !!enquo(.full), .start = .start, :
      object 'Start' not found

So my question is: how do I do this? I'd really like to be able to use grouping (in this example I only have one group to begin with, but in reality there are many) so I only have to invoke fill_gaps once, with the correct start/end being pulled from the tsibble.

Thanks!

Answer 1

Score: 1

The `fill_gaps()` function converts the implicit missing values into explicit missing values, based on either the local (per series) or global (per dataset) start and end dates and the index class.

Using `fill_gaps()` without specifying the `.start` and `.end` date will compute the time range for each series, and fill in any missing time points based on the data's time interval. This should work for your problem of different counting ranges for sites and birds.

However, if you are working with multiple days, the `fill_gaps()` function will also add in the overnight hours between working days (as the interval is 30 minutes, and data is missing overnight). So you might want to instead fill implicit missing values with NA, and then maintain a working hours dataset that can be joined onto your observations data and used to convert NA to 0 if someone was working. For example:

``` r
library(tidyverse) # includes lubridate
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:lubridate':
#>
#>     interval
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, union
library(fable)
#> Loading required package: fabletools
N <- 10
set.seed(42)
# suppose we're observing ducks and geese between 8:00 and 18:00.
d <- as_datetime("2023-03-08 08:00:00")
times <- d + seconds(unique(round(sort(runif(N, min = 0, max = 36e3)))))
nObs <- 1 + rpois(length(times), lambda = 1)
birdIdx <- 1 + round(runif(length(times)))
birds <- c("Duck", "Goose")[birdIdx]
# Tibble of observations
waterfowl <- tibble(Timestamp = times, Count = nObs, Bird = as_factor(birds))
# Add day 2
waterfowl <- bind_rows(waterfowl, waterfowl |> mutate(Timestamp = Timestamp + days(1)))
# Convert to tsibble (time series tibble) and aggregate on a 30-minute basis
waterfowl |>
  as_tsibble(index = Timestamp) |>
  group_by(Bird) |>
  index_by(Interval = floor_date(Timestamp, "30 minute")) |>
  summarize(`Total birds` = sum(Count)) -> waterfowlSumm
waterfowlSumm |> print(n = Inf)
#> # A tsibble: 20 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>    Bird  Interval            `Total birds`
#>    <fct> <dttm>                      <dbl>
#>  1 Goose 2023-03-08 09:00:00             2
#>  2 Goose 2023-03-08 13:00:00             4
#>  3 Goose 2023-03-08 14:00:00             1
#>  4 Goose 2023-03-08 15:00:00             4
#>  5 Goose 2023-03-08 16:00:00             1
#>  6 Goose 2023-03-08 17:00:00             2
#>  7 Goose 2023-03-09 09:00:00             2
#>  8 Goose 2023-03-09 13:00:00             4
#>  9 Goose 2023-03-09 14:00:00             1
#> 10 Goose 2023-03-09 15:00:00             4
#> 11 Goose 2023-03-09 16:00:00             1
#> 12 Goose 2023-03-09 17:00:00             2
#> 13 Duck  2023-03-08 10:30:00             2
#> 14 Duck  2023-03-08 14:30:00             2
#> 15 Duck  2023-03-08 15:00:00             4
#> 16 Duck  2023-03-08 17:00:00             2
#> 17 Duck  2023-03-09 10:30:00             2
#> 18 Duck  2023-03-09 14:30:00             2
#> 19 Duck  2023-03-09 15:00:00             4
#> 20 Duck  2023-03-09 17:00:00             2

# This adds 0 between the days
waterfowlSumm |> fill_gaps(`Total birds` = 0) |> print(n = Inf)
#> # A tsibble: 127 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds`
#>     <fct> <dttm>                      <dbl>
#>   1 Goose 2023-03-08 09:00:00             2
#>   2 Goose 2023-03-08 09:30:00             0
#>   3 Goose 2023-03-08 10:00:00             0
#>   4 Goose 2023-03-08 10:30:00             0
#>   5 Goose 2023-03-08 11:00:00             0
#>   6 Goose 2023-03-08 11:30:00             0
#>   7 Goose 2023-03-08 12:00:00             0
#>   8 Goose 2023-03-08 12:30:00             0
#>   9 Goose 2023-03-08 13:00:00             4
#>  10 Goose 2023-03-08 13:30:00             0
#>  11 Goose 2023-03-08 14:00:00             1
#>  12 Goose 2023-03-08 14:30:00             0
#>  13 Goose 2023-03-08 15:00:00             4
#>  14 Goose 2023-03-08 15:30:00             0
#>  15 Goose 2023-03-08 16:00:00             1
#>  16 Goose 2023-03-08 16:30:00             0
#>  17 Goose 2023-03-08 17:00:00             2
#>  18 Goose 2023-03-08 17:30:00             0
#>  19 Goose 2023-03-08 18:00:00             0
#>  20 Goose 2023-03-08 18:30:00             0
#>  21 Goose 2023-03-08 19:00:00             0
#>  22 Goose 2023-03-08 19:30:00             0
#>  23 Goose 2023-03-08 20:00:00             0
#>  24 Goose 2023-03-08 20:30:00             0
#>  25 Goose 2023-03-08 21:00:00             0
#>  26 Goose 2023-03-08 21:30:00             0
#>  27 Goose 2023-03-08 22:00:00             0
#>  28 Goose 2023-03-08 22:30:00             0
#>  29 Goose 2023-03-08 23:00:00             0
#>  30 Goose 2023-03-08 23:30:00             0
#>  31 Goose 2023-03-09 00:00:00             0
#>  32 Goose 2023-03-09 00:30:00             0
#>  33 Goose 2023-03-09 01:00:00             0
#>  34 Goose 2023-03-09 01:30:00             0
#>  35 Goose 2023-03-09 02:00:00             0
#>  36 Goose 2023-03-09 02:30:00             0
#>  37 Goose 2023-03-09 03:00:00             0
#>  38 Goose 2023-03-09 03:30:00             0
#>  39 Goose 2023-03-09 04:00:00             0
#>  40 Goose 2023-03-09 04:30:00             0
#>  41 Goose 2023-03-09 05:00:00             0
#>  42 Goose 2023-03-09 05:30:00             0
#>  43 Goose 2023-03-09 06:00:00             0
#>  44 Goose 2023-03-09 06:30:00             0
#>  45 Goose 2023-03-09 07:00:00             0
#>  46 Goose 2023-03-09 07:30:00             0
#>  47 Goose 2023-03-09 08:00:00             0
#>  48 Goose 2023-03-09 08:30:00             0
#>  49 Goose 2023-03-09 09:00:00             2
#>  50 Goose 2023-03-09 09:30:00             0
#>  51 Goose 2023-03-09 10:00:00             0
#>  52 Goose 2023-03-09 10:30:00             0
#>  53 Goose 2023-03-09 11:00:00             0
#>  54 Goose 2023-03-09 11:30:00             0
#>  55 Goose 2023-03-09 12:00:00             0
#>  56 Goose 2023-03-09 12:30:00             0
#>  57 Goose 2023-03-09 13:00:00             4
#>  58 Goose 2023-03-09 13:30:00             0
#>  59 Goose 2023-03-09 14:00:00             1
#>  60 Goose 2023-03-09 14:30:00             0
#>  61 Goose 2023-03-09 15:00:00             4
#>  62 Goose 2023-03-09 15:30:00             0
#>  63 Goose 2023-03-09 16:00:00             1
#>  64 Goose 2023-03-09 16:30:00             0
#>  65 Goose 2023-03-09 17:00:00             2
#>  66 Duck  2023-03-08 10:30:00             2
#>  67 Duck  2023-03-08 11:00:00             0
#>  68 Duck  2023-03-08 11:30:00             0
#>  69 Duck  2023-03-08 12:00:00             0
#>  70 Duck  2023-03-08 12:30:00             0
#>  71 Duck  2023-03-08 13:00:00             0
#>  72 Duck  2023-03-08 13:30:00             0
#>  73 Duck  2023-03-08 14:00:00             0
#>  74 Duck  2023-03-08 14:30:00             2
#>  75 Duck  2023-03-08 15:00:00             4
#>  76 Duck  2023-03-08 15:30:00             0
#>  77 Duck  2023-03-08 16:00:00             0
#>  78 Duck  2023-03-08 16:30:00             0
#>  79 Duck  2023-03-08 17:00:00             2
#>  80 Duck  2023-03-08 17:30:00             0
#>  81 Duck  2023-03-08 18:00:00             0
#>  82 Duck  2023-03-08 18:30:00             0
#>  83 Duck  2023-03-08 19:00:00             0
#>  84 Duck  2023-03-08 19:30:00             0
#>  85 Duck  2023-03-08 20:00:00             0
#>  86 Duck  2023-03-08 20:30:00             0
#>  87 Duck  2023-03-08 21:00:00             0
#>  88 Duck  2023-03-08 21:30:00             0
#>  89 Duck  2023-03-08 22:00:00             0
#>  90 Duck  2023-03-08 22:30:00             0
#>  91 Duck  2023-03-08 23:00:00             0
#>  92 Duck  2023-03-08 23:30:00             0
#>  93 Duck  2023-03-09 00:00:00             0
#>  94 Duck  2023-03-09 00:30:00             0
#>  95 Duck  2023-03-09 01:00:00             0
#>  96 Duck  2023-03-09 01:30:00             0
#>  97 Duck  2023-03-09 02:00:00             0
#>  98 Duck  2023-03-09 02:30:00             0
#>  99 Duck  2023-03-09 03:00:00             0
#> 100 Duck  2023-03-09 03:30:00             0
#> 101 Duck  2023-03-09 04:00:00             0
#> 102 Duck  2023-03-09 04:30:00             0
#> 103 Duck  2023-03-09 05:00:00             0
#> 104 Duck  2023-03-09 05:30:00             0
#> 105 Duck  2023-03-09 06:00:00             0
#> 106 Duck  2023-03-09 06:30:00             0
#> 107 Duck  2023-03-09 07:00:00             0
#> 108 Duck  2023-03-09 07:30:00             0
#> 109 Duck  2023-03-09 08:00:00             0
#> 110 Duck  2023-03-09 08:30:00             0
#> 111 Duck  2023-03-09 09:00:00             0
#> 112 Duck  2023-03-09 09:30:00             0
#> 113 Duck  2023-03-09 10:00:00             0
#> 114 Duck  2023-03-09 10:30:00             2
#> 115 Duck  2023-03-09 11:00:00             0
#> 116 Duck  2023-03-09 11:30:00             0
#> 117 Duck  2023-03-09 12:00:00             0
#> 118 Duck  2023-03-09 12:30:00             0
#> 119 Duck  2023-03-09 13:00:00             0
#> 120 Duck  2023-03-09 13:30:00             0
#> 121 Duck  2023-03-09 14:00:00             0
#> 122 Duck  2023-03-09 14:30:00             2
#> 123 Duck  2023-03-09 15:00:00             4
#> 124 Duck  2023-03-09 15:30:00             0
#> 125 Duck  2023-03-09 16:00:00             0
#> 126 Duck  2023-03-09 16:30:00             0
#> 127 Duck  2023-03-09 17:00:00             2

# Instead consider using NA, and adding 0 after
waterfowlSumm |> fill_gaps() |> print(n = Inf)
#> # A tsibble: 127 x 3 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds`
#>     <fct> <dttm>                      <dbl>
#>   1 Goose 2023-03-08 09:00:00             2
#>   2 Goose 2023-03-08 09:30:00            NA
#>   3 Goose 2023-03-08 10:00:00            NA
#>   4 Goose 2023-03-08 10:30:00            NA
#>   5 Goose 2023-03-08 11:00:00            NA
#>   6 Goose 2023-03-08 11:30:00            NA
#>   7 Goose 2023-03-08 12:00:00            NA
#>   8 Goose 2023-03-08 12:30:00            NA
#>   9 Goose 2023-03-08 13:00:00             4
#>  10 Goose 2023-03-08 13:30:00            NA
#>  11 Goose 2023-03-08 14:00:00             1
#>  12 Goose 2023-03-08 14:30:00            NA
#>  13 Goose 2023-03-08 15:00:00             4
#>  14 Goose 2023-03-08 15:30:00            NA
#>  15 Goose 2023-03-08 16:00:00             1
#>  16 Goose 2023-03-08 16:30:00            NA
#>  17 Goose 2023-03-08 17:00:00             2
#>  18 Goose 2023-03-08 17:30:00            NA
#>  19 Goose 2023-03-08 18:00:00            NA
#>  20 Goose 2023-03-08 18:30:00            NA
#>  21 Goose 2023-03-08 19:00:00            NA
#>  22 Goose 2023-03-08 19:30:00            NA
#>  23 Goose 2023-03-08 20:00:00            NA
#>  24 Goose 2023-03-08 20:30:00            NA
#>  25 Goose 2023-03-08 21:00:00            NA
#>  26 Goose 2023-03-08 21:30:00            NA
#>  27 Goose 2023-03-08 22:00:00            NA
#>  28 Goose 2023-03-08 22:30:00            NA
#>  29 Goose 2023-03-08 23:00:00            NA
#>  30 Goose 2023-03-08 23:30:00            NA
#>  31 Goose 2023-03-09 00:00:00            NA
#>  32 Goose 2023-03-09 00:30:00            NA
#>  33 Goose 2023-03-09 01:00:00            NA
#>  34 Goose 2023-03-09 01:30:00            NA
#>  35 Goose 2023-03-09 02:00:00            NA
#>  36 Goose 2023-03-09 02:30:00            NA
#>  37 Goose 2023-03-09 03:00:00            NA
#>  38 Goose 2023-03-09 03:30:00            NA
#>  39 Goose 2023-03-09 04:00:00            NA
#>  40 Goose 2023-03-09 04:30:00            NA
#>  41 Goose 2023-03-09 05:00:00            NA
#>  42 Goose 2023-03-09 05:30:00            NA
#>  43 Goose 2023-03-09 06:00:00            NA
#>  44 Goose 2023-03-09 06:30:00            NA
#>  45 Goose 2023-03-09 07:00:00            NA
#>  46 Goose 2023-03-09 07:30:00            NA
#>  47 Goose 2023-03-09 08:00:00            NA
#>  48 Goose 2023-03-09 08:30:00            NA
#>  49 Goose 2023-03-09 09:00:00             2
#>  50 Goose 2023-03-09 09:30:00            NA
#>  51 Goose 2023-03-09 10:00:00            NA
#>  52 Goose 2023-03-09 10:30:00            NA
#>  53 Goose 2023-03-09 11:00:00            NA
#>  54 Goose 2023-03-09 11:30:00            NA
#>  55 Goose 2023-03-09 12:00:00            NA
#>  56 Goose 2023-03-09 12:30:00            NA
#>  57 Goose 2023-03-09 13:00:00             4
#>  58 Goose 2023-03-09 13:30:00            NA
#>  59 Goose 2023-03-09 14:00:00             1
#>  60 Goose 2023-03-09 14:30:00            NA
#>  61 Goose 2023-03-09 15:00:00             4
#>  62 Goose 2023-03-09 15:30:00            NA
#>  63 Goose 2023-03-09 16:00:00             1
#>  64 Goose 2023-03-09 16:30:00            NA
#>  65 Goose 2023-03-09 17:00:00             2
#>  66 Duck  2023-03-08 10:30:00             2
#>  67 Duck  2023-03-08 11:00:00            NA
#>  68 Duck  2023-03-08 11:30:00            NA
#>  69 Duck  2023-03-08 12:00:00            NA
#>  70 Duck  2023-03-08 12:30:00            NA
#>  71 Duck  2023-03-08 13:00:00            NA
#>  72 Duck  2023-03-08 13:30:00            NA
#>  73 Duck  2023-03-08 14:00:00            NA
#>  74 Duck  2023-03-08 14:30:00             2
#>  75 Duck  2023-03-08 15:00:00             4
#>  76 Duck  2023-03-08 15:30:00            NA
#>  77 Duck  2023-03-08 16:00:00            NA
#>  78 Duck  2023-03-08 16:30:00            NA
#>  79 Duck  2023-03-08 17:00:00             2
#>  80 Duck  2023-03-08 17:30:00            NA
#>  81 Duck  2023-03-08 18:00:00            NA
#>  82 Duck  2023-03-08 18:30:00            NA
#>  83 Duck  2023-03-08 19:00:00            NA
#>  84 Duck  2023-03-08 19:30:00            NA
#>  85 Duck  2023-03-08 20:00:00            NA
#>  86 Duck  2023-03-08 20:30:00            NA
#>  87 Duck  2023-03-08 21:00:00            NA
#>  88 Duck  2023-03-08 21:30:00            NA
#>  89 Duck  2023-03-08 22:00:00            NA
#>  90 Duck  2023-03-08 22:30:00            NA
#>  91 Duck  2023-03-08 23:00:00            NA
#>  92 Duck  2023-03-08 23:30:00            NA
#>  93 Duck  2023-03-09 00:00:00            NA
#>  94 Duck  2023-03-09 00:30:00            NA
#>  95 Duck  2023-03-09 01:00:00            NA
#>  96 Duck  2023-03-09 01:30:00            NA
#>  97 Duck  2023-03-09 02:00:00            NA
#>  98 Duck  2023-03-09 02:30:00            NA
#>  99 Duck  2023-03-09 03:00:00            NA
#> 100 Duck  2023-03-09 03:30:00            NA
#> 101 Duck  2023-03-09 04:00:00            NA
#> 102 Duck  2023-03-09 04:30:00            NA
#> 103 Duck  2023-03-09 05:00:00            NA
#> 104 Duck  2023-03-09 05:30:00            NA
#> 105 Duck  2023-03-09 06:00:00            NA
#> 106 Duck  2023-03-09 06:30:00            NA
#> 107 Duck  2023-03-09 07:00:00            NA
#> 108 Duck  2023-03-09 07:30:00            NA
#> 109 Duck  2023-03-09 08:00:00            NA
#> 110 Duck  2023-03-09 08:30:00            NA
#> 111 Duck  2023-03-09 09:00:00            NA
#> 112 Duck  2023-03-09 09:30:00            NA
#> 113 Duck  2023-03-09 10:00:00            NA
#> 114 Duck  2023-03-09 10:30:00             2
#> 115 Duck  2023-03-09 11:00:00            NA
#> 116 Duck  2023-03-09 11:30:00            NA
#> 117 Duck  2023-03-09 12:00:00            NA
#> 118 Duck  2023-03-09 12:30:00            NA
#> 119 Duck  2023-03-09 13:00:00            NA
#> 120 Duck  2023-03-09 13:30:00            NA
#> 121 Duck  2023-03-09 14:00:00            NA
#> 122 Duck  2023-03-09 14:30:00             2
#> 123 Duck  2023-03-09 15:00:00             4
#> 124 Duck  2023-03-09 15:30:00            NA
#> 125 Duck  2023-03-09 16:00:00            NA
#> 126 Duck  2023-03-09 16:30:00            NA
#> 127 Duck  2023-03-09 17:00:00             2

# Then add 0 based on working hours
# I'm adding this with a mutate(), but if it was more complicated you could left join a 'working time' dataset.
waterfowl_complete <- waterfowlSumm |>
  fill_gaps() |>
  mutate(
    working = between(hour(Interval), 9, 17),
    `Total birds` = case_when(
      !is.na(`Total birds`) ~ `Total birds`,
      working ~ 0,
      TRUE ~ NA_real_
    )
  )
waterfowl_complete |>
  print(n = Inf)
#> # A tsibble: 127 x 4 [30m] <UTC>
#> # Key:       Bird [2]
#>     Bird  Interval            `Total birds` working
#>     <fct> <dttm>                      <dbl> <lgl>
#>   1 Goose 2023-03-08 09:00:00             2 TRUE
#>   2 Goose 2023-03-08 09:30:00             0 TRUE
#>   3 Goose 2023-03-08 10:00:00             0 TRUE
#>   4 Goose 2023-03-08 10:30:00             0 TRUE
#>   5 Goose 2023-03-08 11:00:00             0 TRUE
#>   6 Goose 2023-03-08 11:30:00             0 TRUE
#>   7 Goose 2023-03-08 12:00:00             0 TRUE
#>   8 Goose 2023-03-08 12:30:00             0 TRUE
#>   9 Goose 2023-03-08 13:00:00             4 TRUE
#>  10 Goose 2023-03-08 13:30:00             0 TRUE
#>  11 Goose 2023-03-08 14:00:00             1 TRUE
#>  12 Goose 2023-03-08 14:30:00             0 TRUE
#>  13 Goose 2023-03-08 15:00:00             4 TRUE
#>  14 Goose 2023-03-08 15:30:00             0 TRUE
#>  15 Goose 2023-03-08 16:00:00             1 TRUE
#>  16 Goose 2023-03-08 16:30:00             0 TRUE
#>  17 Goose 2023-03-08 17:00:00             2 TRUE
#>  18 Goose 2023-03-08 17:30:00             0 TRUE
#>  19 Goose 2023-03-08 18:00:00            NA FALSE
#>  20 Goose 2023-03-08 18:30:00            NA FALSE
#>  21 Goose 2023-03-08 19:00:00            NA FALSE
#>  22 Goose 2023-03-08 19:30:00            NA FALSE
#>  23 Goose 2023-03-08 20:00:00            NA FALSE
#>  24 Goose 2023-03-08 20:30:00            NA FALSE
#>  25 Goose 2023-03-08 21:00:00            NA FALSE
#>  26 Goose 2023-03-08 21:30:00            NA FALSE
#>  27 Goose 2023-03-08 22:00:00            NA FALSE
#>  28 Goose 2023-03-08 22:30:00            NA FALSE
#>  29 Goose 2023-03-08 23:00:00            NA FALSE
#>  30 Goose 2023-03-08 23:30:00            NA FALSE
#>  31 Goose 2023-03-09 00:00:00            NA FALSE
#>  32 Goose 2023-03-09 00:30:00            NA FALSE
#>  33 Goose 2023-03-09 01:00:00            NA FALSE
#>  34 Goose 2023-03-09 01:30:00            NA FALSE
#>  35 Goose 2023-03-09 02:00:00            NA FALSE
#>  36 Goose 2023-03-09 02:30:00            NA FALSE
#>  37 Goose 2023-03-09 03:00:00            NA FALSE
#>  38 Goose 2023-03-09 03:30:00            NA FALSE
#>  39 Goose 2023-03-09 04:00:00            NA FALSE
#>  40 Goose 2023-03-09 04:30:00            NA FALSE
#>  41 Goose 2023-03-09 05:00:00            NA FALSE
#>  42 Goose 2023-03-09 05:30:00            NA FALSE
#>  43 Goose 2023-03-09 06:00:00            NA FALSE
#>  44 Goose 2023-03-09 06:30:00            NA FALSE
#>  45 Goose 2023-03-09 07:00:00            NA FALSE
#>  46 Goose 2023-03-09 07:30:00            NA FALSE
#>  47 Goose 2023-03-09 08:00:00            NA FALSE
#>  48 Goose 2023-03-09 08:30:00            NA FALSE
#>  49 Goose 2023-03-09 09:00:00             2 TRUE
#>  50 Goose 2023-03-09 09:30:00             0 TRUE
#>  51 Goose 2023-03-09 10:00:00             0 TRUE
#>  52 Goose 2023-03-09 10:30:00             0 TRUE
#>  53 Goose 2023-03-09 11:00:00             0 TRUE
#>  54 Goose 2023-03-09 11:30:00             0 TRUE
#>  55 Goose 2023-03-09 12:00:00             0 TRUE
#>  56 Goose 2023-03-09 12:30:00             0 TRUE
#>  57 Goose 2023-03-09 13:00:00             4 TRUE
#>  58 Goose 2023-03-09 13:30:00             0 TRUE
#>  59 Goose 2023-03-09 14:00:00             1 TRUE
#>  60 Goose 2023-03-09 14:30:00             0 TRUE
#>  61 Goose 2023-03-09 15:00:00             4 TRUE
#>  62 Goose 2023-03-09 15:30:00             0 TRUE
#>  63 Goose 2023-03-09 16:00:00             1 TRUE
#>  64 Goose 2023-03-09 16:30:00             0 TRUE
#>  65 Goose 2023-03-09 17:00:00             2 TRUE
#>  66 Duck  2023-03-08 10:30:00             2 TRUE
#>  67 Duck  2023-03-08 11:00:00             0 TRUE
#>  68 Duck  2023-03-08 11:30:00             0 TRUE
#>  69 Duck  2023-03-08 12:00:00             0 TRUE
#>  70 Duck  2023-03-08 12:30:00             0 TRUE
#>  71 Duck  2023-03-08 13:00:00             0 TRUE
#>  72 Duck  2023-03-08 13:30:00             0 TRUE
#>  73 Duck  2023-03-08 14:00:00             0 TRUE
#>  74 Duck  2023-03-08 14:30:00             2 TRUE
#>  75 Duck  2023-03-08 15:00:00             4 TRUE
#>  76 Duck  2023-03-08 15:30:00             0 TRUE
#>  77 Duck  2023-03-08 16:00:00             0 TRUE
#>  78 Duck  2023-03-08 16:30:00             0 TRUE
#>  79 Duck  2023-03-08 17:00:00             2 TRUE
#>  80 Duck  2023-03-08 17:30:00             0 TRUE
#>  81 Duck  2023-03-08 18:00:00            NA FALSE
#>  82 Duck  2023-03-08 18:30:00            NA FALSE
#>  83 Duck  2023-03-08 19:00:00            NA FALSE
#>  84 Duck  2023-03-08 19:30:00            NA FALSE
#>  85 Duck  2023-03-08 20:00:00            NA FALSE
#>  86 Duck  2023-03-08 20:30:00            NA FALSE
#>  87 Duck  2023-03-08 21:00:00            NA FALSE
#>  88 Duck  2023-03-08 21:30:00            NA FALSE
#>  89 Duck  2023-03-08 22:00:00            NA FALSE
#>  90 Duck  2023-03-08 22:30:00            NA FALSE
#>  91 Duck  2023-03-08 23:00:00            NA FALSE
#>  92 Duck  2023-03-08 23:30:00            NA FALSE
#>  93 Duck  2023-03-09 00:00:00            NA FALSE
#>  94 Duck  2023-03-09 00:30:00            NA FALSE
#>  95 Duck  2023-03-09 01:00:00            NA FALSE
#>  96 Duck  2023-03-09 01:30:00            NA FALSE
#>  97 Duck  2023-03-09 02:00:00            NA FALSE
#>  98 Duck  2023-03-09 02:30:00            NA FALSE
#>  99 Duck  2023-03-09 03:00:00            NA FALSE
#> 100 Duck  2023-03-09 03:30:00            NA FALSE
#> 101 Duck  2023-03-09 04:00:00            NA FALSE
#> 102 Duck  2023-03-09 04:30:00            NA FALSE
#> 103 Duck  2023-03-09 05:00:00            NA FALSE
#> 104 Duck  2023-03-09 05:30:00            NA FALSE
#> 105 Duck  2023-03-09 06:00:00            NA FALSE
#> 106 Duck  2023-03-09 06:30:00            NA FALSE
#> 107 Duck  2023-03-09 07:00:00            NA FALSE
#> 108 Duck  2023-03-09 07:30:00            NA FALSE
#> 109 Duck  2023-03-09 08:00:00            NA FALSE
#> 110 Duck  2023-03-09 08:30:00            NA FALSE
#> 111 Duck  2023-03-09 09:00:00             0 TRUE
#> 112 Duck  2023-03-09 09:30:00             0 TRUE
#> 113 Duck  2023-03-09 10:00:00             0 TRUE
#> 114 Duck  2023-03-09 10:30:00             2 TRUE
#> 115 Duck  2023-03-09 11:00:00             0 TRUE
#> 116 Duck  2023-03-09 11:30:00             0 TRUE
#> 117 Duck  2023-03-09 12:00:00             0 TRUE
#> 118 Duck  2023-03-09 12:30:00             0 TRUE
#> 119 Duck  2023-03-09 13:00:00             0 TRUE
#> 120 Duck  2023-03-09 13:30:00             0 TRUE
#> 121 Duck  2023-03-09 14:00:00             0 TRUE
#> 122 Duck  2023-03-09 14:30:00             2 TRUE
#> 123 Duck  2023-03-09 15:00:00             4 TRUE
#> 124 Duck  2023-03-09 15:30:00             0 TRUE
#> 125 Duck  2023-03-09 16:00:00             0 TRUE
#> 126 Duck  2023-03-09 16:30:00             0 TRUE
#> 127 Duck  2023-03-09 17:00:00             2 TRUE

waterfowl_complete |>
  autoplot(`Total birds`)
```

[Figure: output of `autoplot(`Total birds`)` for `waterfowl_complete`]

<sup>Created on 2023-03-10 with reprex v2.0.2</sup>
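
Since `fill_gaps()` fills each series over its own time range by default, one way to get group-specific starts and ends is to add a zero-count row at each group's first and last interval before filling. The following is only a minimal sketch under assumptions: the `Site` key, the `obs` data and the `hours` table (with `Start` and `End` given as the first and last 30-minute interval of each observer's shift) are all hypothetical and not part of the data above.

``` r
library(dplyr)
library(lubridate)
library(tsibble)

# Hypothetical data: counts per site, already aggregated to 30-minute intervals.
obs <- tibble(
  Site = c("A", "A", "B"),
  Interval = as_datetime(c("2023-03-08 09:00:00",
                           "2023-03-08 10:30:00",
                           "2023-03-08 12:00:00")),
  `Total birds` = c(2, 3, 1)
)

# Hypothetical working hours: first and last 30-minute interval per site.
hours <- tibble(
  Site = c("A", "B"),
  Start = as_datetime(c("2023-03-08 08:00:00", "2023-03-08 11:30:00")),
  End = as_datetime(c("2023-03-08 11:00:00", "2023-03-08 13:00:00"))
)

# One zero-count row at each site's first and last interval.
bounds <- bind_rows(
  select(hours, Site, Interval = Start),
  select(hours, Site, Interval = End)
) |>
  mutate(`Total birds` = 0)

filled <- obs |>
  # Skip boundary rows that would duplicate a real observation.
  bind_rows(anti_join(bounds, obs, by = c("Site", "Interval"))) |>
  as_tsibble(key = Site, index = Interval) |>
  # With the default .full = FALSE, each key is filled within its own range.
  fill_gaps(`Total birds` = 0)

print(filled, n = Inf)
```

Two caveats apply to this sketch. tsibble infers the interval from the spacing of the data, so very sparse data could be read as a coarser interval than 30 minutes. And for data spanning several days, the overnight intervals would still be filled, so the working-hours step shown above would still be needed.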

huangapple
  • Published on March 9, 2023, 18:34:59
  • Please keep the link to this article when reposting: https://go.coder-hub.com/75683407.html