How to obtain specific data based on a condition in a list in R?

Question

I have a dataset of animals that were radio-tracked for a year. However, the radio-tracking events were uneven: sometimes animals were tracked 3 times a week, and sometimes only once a month.

I have provided a dummy dataset of relevant columns.

In this picture, I have the IDs of 3 different animals. If you take animal ID 0-10, you can see that under the value column there is a 57. This means there is a gap of 57 days between consecutive tracking days.

The code is as follows:

    date = c("2015-05-01","2015-05-04","2015-05-05","2015-07-01","2015-07-02","2015-07-05","2015-07-06",
             "2015-05-01","2015-05-04","2015-05-05","2015-05-27","2015-05-28","2015-06-05","2015-06-06",
             "2015-05-01","2015-05-02","2015-05-03","2015-05-04","2015-05-05","2015-05-06","2015-05-07")

    ID = c("0-10","0-10","0-10","0-10","0-10","0-10","0-10",
           "0-2","0-2","0-2","0-2","0-2","0-2","0-2",
           "0-8","0-8","0-8","0-8","0-8","0-8","0-8")

    data2015v2 = data.frame(date,ID)
    data2015v2$date = as.Date(data2015v2$date)

    delta.days.2015 = with(data2015v2, tapply(date, ID, FUN = function (x) as.integer(diff(x))))

I want to know which animals have gaps longer than 14 days, without having to go over each list one by one. I think I need to use a loop, but I don't know how to set one up. Any help is appreciated.

Answer 1

Score: 1

With dplyr, you could group by animal ID and summarize the data to include just the maximum gap (in days):

    library(dplyr)
    library(lubridate)

    data2015v2 %>%
      mutate(date = ymd(date)) %>%
      group_by(ID) %>%
      summarize(max_gap = max(date - lag(date), na.rm = TRUE))
    #> # A tibble: 3 × 2
    #>   ID    max_gap
    #>   <chr> <drtn>
    #> 1 0-10  57 days
    #> 2 0-2   22 days
    #> 3 0-8    1 days
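
To see why `na.rm = TRUE` is needed, here is a minimal sketch on a toy vector (the name `d` is made up for illustration). `lag()` shifts the dates by one position, so the first date has no predecessor and its gap comes out as `NA`; the remaining gaps match what base R's `diff()` returns. This assumes the dates are already in ascending order, as they are in the example data.

```r
library(dplyr)

# Hypothetical toy vector of tracking dates for a single animal
d <- as.Date(c("2015-05-01", "2015-05-04", "2015-05-05", "2015-07-01"))

# lag() shifts the vector by one position, so the first element
# has no predecessor and the first gap is NA
gaps <- d - lag(d)
gaps

# Dropping the leading NA leaves the same day counts as diff()
all.equal(as.integer(gaps[-1]), as.integer(diff(d)))
```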

To have the resulting data frame only include IDs where there was a gap in monitoring that exceeded 14 days, you could filter on the max_gap column:

    data2015v2 %>%
      mutate(date = ymd(date)) %>%
      group_by(ID) %>%
      summarize(max_gap = max(date - lag(date), na.rm = TRUE)) %>%
      filter(max_gap > 14)
    #> # A tibble: 2 × 2
    #>   ID    max_gap
    #>   <chr> <drtn>
    #> 1 0-10  57 days
    #> 2 0-2   22 days
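
If it also matters when the long gaps occurred, a variation along the same lines could keep one row per gap instead of summarizing. This is only a sketch: the names `prev_date`, `gap`, and `long_gaps` are made up here, and the example data from the question is rebuilt so that the block runs on its own.

```r
library(dplyr)

# Rebuild the example data from the question
data2015v2 <- data.frame(
  date = as.Date(c(
    "2015-05-01", "2015-05-04", "2015-05-05", "2015-07-01", "2015-07-02", "2015-07-05", "2015-07-06",
    "2015-05-01", "2015-05-04", "2015-05-05", "2015-05-27", "2015-05-28", "2015-06-05", "2015-06-06",
    "2015-05-01", "2015-05-02", "2015-05-03", "2015-05-04", "2015-05-05", "2015-05-06", "2015-05-07"
  )),
  ID = rep(c("0-10", "0-2", "0-8"), each = 7)
)

long_gaps <- data2015v2 %>%
  arrange(ID, date) %>%                      # ensure dates are ordered within each animal
  group_by(ID) %>%
  mutate(prev_date = lag(date),              # previous tracking date for the same animal
         gap = as.integer(date - prev_date)) %>%
  ungroup() %>%
  filter(gap > 14)                           # the NA gap of each animal's first row is dropped

long_gaps
```

Each remaining row then shows the animal, the date tracking resumed, the previous tracking date, and the length of the gap in days.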

Created on 2023-05-11 with reprex v2.0.2
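
A loop is not strictly required to work with the `delta.days.2015` list from the question either. The following base R sketch (the name `has_long_gap` is introduced here for illustration) checks every list element with `sapply()` and returns the IDs that have at least one gap longer than 14 days. The data is rebuilt so the block is self-contained.

```r
# Rebuild the example data from the question
date <- as.Date(c(
  "2015-05-01", "2015-05-04", "2015-05-05", "2015-07-01", "2015-07-02", "2015-07-05", "2015-07-06",
  "2015-05-01", "2015-05-04", "2015-05-05", "2015-05-27", "2015-05-28", "2015-06-05", "2015-06-06",
  "2015-05-01", "2015-05-02", "2015-05-03", "2015-05-04", "2015-05-05", "2015-05-06", "2015-05-07"
))
ID <- rep(c("0-10", "0-2", "0-8"), each = 7)

# One integer vector of day gaps per animal, as in the question
delta.days.2015 <- tapply(date, ID, FUN = function(x) as.integer(diff(x)))

# TRUE for every animal with at least one gap longer than 14 days
has_long_gap <- sapply(delta.days.2015, function(g) any(g > 14))

# IDs of the animals that have such a gap
names(which(has_long_gap))
```

`sapply()` still iterates over the list internally, but it avoids writing and managing an explicit `for` loop.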

huangapple
  • Posted on May 11, 2023 at 08:55:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76223466.html