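Question

I have a dataset of animals that were radio-tracked for a year. However, the tracking events were uneven: sometimes an animal was tracked 3 times a week, and sometimes only once a month.

I have provided a dummy dataset of the relevant columns.

In this picture, I have the IDs of 3 different animals. If you take animal ID 0-10, you can see that there is a 57 under the value column. This means there is a gap of 57 days between two consecutive tracking days.

The code is as follows: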

date = c("2015-05-01","2015-05-04","2015-05-05","2015-07-01","2015-07-02","2015-07-05","2015-07-06",
 "2015-05-01","2015-05-04","2015-05-05","2015-05-27","2015-05-28","2015-06-05","2015-06-06",
 "2015-05-01","2015-05-02","2015-05-03","2015-05-04","2015-05-05","2015-05-06","2015-05-07")

ID = c("0-10","0-10","0-10","0-10","0-10","0-10","0-10",
 "0-2","0-2","0-2","0-2","0-2","0-2","0-2",
 "0-8","0-8","0-8","0-8","0-8","0-8","0-8")

data2015v2 = data.frame(date,ID)
data2015v2$date = as.Date(data2015v2$date)

delta.days.2015 = with(data2015v2, tapply(date, ID, FUN = function (x) as.integer(diff(x))))

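I want to know which animals have gaps longer than 14 days, without having to go through each list one by one. I think I need to use a loop, but I don't know how to set one up. Any help is appreciated.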

Answer 1

Score: 1

With dplyr, you could group by animal ID with group_by and then use summarize to keep just the maximum gap (in days) for each animal:

library(dplyr)
library(lubridate)

data2015v2 %>%
  mutate(date = ymd(date)) %>%
  group_by(ID) %>%
  summarize(max_gap = max(date - lag(date), na.rm = TRUE))
#> # A tibble: 3 × 2
#>   ID    max_gap
#>   <chr> <drtn> 
#> 1 0-10  57 days
#> 2 0-2   22 days
#> 3 0-8    1 days
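
One thing worth noting: `lag()` works on row order, so the result above relies on each animal's dates already being sorted. A minimal sketch, assuming the rows might arrive out of order (the `shuffled` data frame is hypothetical, made up purely for illustration), sorts within each ID before taking the differences:

```r
library(dplyr)

# Hypothetical example data: tracking records deliberately out of date order
shuffled <- data.frame(
  ID   = c("A", "A", "A", "B", "B", "B"),
  date = as.Date(c("2015-07-01", "2015-05-01", "2015-05-05",
                   "2015-05-03", "2015-05-01", "2015-05-02"))
)

max_gaps <- shuffled %>%
  arrange(ID, date) %>%                               # sort dates within each animal
  group_by(ID) %>%
  summarize(max_gap = max(as.integer(diff(date))))    # largest gap in whole days

max_gaps
```

Here `diff()` is used in place of `date - lag(date)`, which avoids the leading `NA` and returns the gap as a plain integer. An animal with only a single tracking date has no gaps at all, so it would need separate handling.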

To have the summarized data frame only include IDs where a gap in monitoring exceeded 14 days, you could add a filter on the max_gap column:

data2015v2 %>%
  mutate(date = ymd(date)) %>%
  group_by(ID) %>%
  summarize(max_gap = max(date - lag(date), na.rm = TRUE)) %>%
  filter(max_gap > 14)
#> # A tibble: 2 × 2
#>   ID    max_gap
#>   <chr> <drtn> 
#> 1 0-10  57 days
#> 2 0-2   22 days

Created on 2023-05-11 with reprex v2.0.2
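
For comparison, the same check can probably be done in base R without an explicit loop, since the `tapply()` call in the question already returns one vector of gaps per animal. A sketch, assuming the `delta.days.2015` object from the question (rebuilt here on a shortened copy of the data so the block runs on its own; `has_long_gap` and `long_gap_ids` are names introduced for illustration):

```r
# Shortened copy of the question's data: 4 tracking dates per animal
date <- as.Date(c("2015-05-01", "2015-05-04", "2015-05-05", "2015-07-01",
                  "2015-05-01", "2015-05-05", "2015-05-27", "2015-05-28",
                  "2015-05-01", "2015-05-02", "2015-05-03", "2015-05-04"))
ID <- rep(c("0-10", "0-2", "0-8"), each = 4)
data2015v2 <- data.frame(date, ID)

# One integer vector of day gaps per animal, as in the question
delta.days.2015 <- with(data2015v2,
                        tapply(date, ID, function(x) as.integer(diff(x))))

# TRUE for every animal that has at least one gap longer than 14 days
has_long_gap <- sapply(delta.days.2015, function(gaps) any(gaps > 14))

# IDs of the animals with a long gap
long_gap_ids <- names(has_long_gap)[has_long_gap]
long_gap_ids
```

`sapply()` applies the test to each animal's gap vector in turn, which is the job the loop in the question was meant to do.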
