问题

以下是您要翻译的内容：

"A subset of my data looks like this but with many more groupings (IDs):

 ID                          time                class    
   &lt;chr&gt;                       &lt;dttm&gt;              &lt;fct&gt;    
 1 BBR-b172021-M_fall_winter_4 2022-11-01 19:03:31 migrating
 2 BBR-b172021-M_fall_winter_4 2022-11-04 22:03:33 migrating 
 3 BBR-b172021-M_fall_winter_4 2022-11-07 18:03:34 migrating 
 4 BBR-b172021-M_fall_winter_4 2022-11-08 21:03:34 stopover 
 5 BBR-b172021-M_fall_winter_4 2022-11-10 21:03:39 stopover 
 6 BBR-b172021-M_fall_winter_4 2022-11-14 18:03:37 migrating 
 7 BBR-b172021-M_fall_winter_4 2022-11-17 06:04:08 migrating 
 8 BBR-b172021-M_fall_winter_4 2022-11-18 06:04:08 stopover 
 9 BBR-b172021-M_fall_winter_4 2022-11-19 00:03:41 winter 
10 BBR-b172021-M_fall_winter_4 2022-11-27 00:03:51 winter 
11 LINWR-b1282020-M_fall_winter_3 2022-01-14 11:00:08 migrating
12 LINWR-b1282020-M_fall_winter_3 2022-01-15 13:59:45 stopover
13 LINWR-b1282020-M_fall_winter_3 2022-01-20 02:59:54 stopover
14 LINWR-b1282020-M_fall_winter_3 2022-01-21 03:00:14 migrating
15 LINWR-b1282020-M_fall_winter_3 2022-01-21 16:59:47 stopover
16 LINWR-b1282020-M_fall_winter_3 2022-01-22 16:59:45 winter

I am trying to create unique columns either through mapping or group_by and mutate but I don't know where to begin. I would like several new columns describing unique sequential events, their sum, and their duration. New columns added to the dataframe I would suspect would look something like this:


newcols &lt;- data.frame(unique_class = c(&quot;migrating1&quot;, &quot;migrating1&quot;, &quot;migrating1&quot;, &quot;stopover1&quot;, 
                                       &quot;stopover1&quot;, &quot;migrating2&quot;, &quot;migrating2&quot;, &quot;stopover2&quot;, 
                                       &quot;winter1&quot;, &quot;winter1&quot;, &quot;migrating1&quot;, &quot;stopover1&quot;, 
                                       &quot;stopover1&quot;, &quot;migrating2&quot;, &quot;stopover2&quot;, &quot;winter1&quot;),
                      migrate_sum = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
                      stopover_sum = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
                      winter_sum = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1),
                      event_duration = c(6,6,6,2,2,3,3,0,8,8,0,5,5,0,0,0))

...where event_duration column would equate to time in days or hours. I know I need to group_by(ID) but and mutate() but not sure how to get the unique classes or lagged duration of times for each class. Any help appreciated."

NOT SURE WHERE TO PUT THIS SO EDITING MY QUESTION: I TRIED @AKRUN SOLUTION BUT IT DIDN'T QUITE WORK. IT PRODUCED THE UNIQUE_CLASS WELL BUT SUMMARIES ARE NOT INCORRECT. HERE'S AN EXAMPLE OF A DATAFRAME PRODUCED USING SOLUTION BELOW AND SUBSET BY UNIQUE ID: fall_mig2 %>% filter(BirdsID_season == "BBR-b432021-M_fall_winter_4") %>% select(BirdsID_season, x, y, time, unique_class, class, stopover_sum) slice_head <- fall_mig2 %>% filter(BirdsID_season == "BBR-b432021-M_fall_winter_4") %>% slice_head(n = 10) slice_tail <- fall_mig2 %>% filter(BirdsID_season == "BBR-b432021-M_fall_winter_4") %>% slice_tail(n = 10) bind_rows(slice_head, slice_tail) %>% select(BirdsID_season, x, y, time, stopover_sum) and the result:

 BirdsID_season                  x     y time                unique_class class     stopover_sum
   &lt;chr&gt;                       &lt;dbl&gt; &lt;dbl&gt; &lt;dttm&gt;              &lt;chr&gt;        &lt;chr&gt;            &lt;int&gt;
 1 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-09 19:09:01 migrating1   migrating            3
 2 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-09 21:08:36 migrating1   migrating            3
 3 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-09 23:08:55 migrating1   migrating            3
 4 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-10 01:09:11 migrating1   migrating            3
 5 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-10 03:08:50 migrating1   migrating            3
 6 BBR-b432021-M_fall_winter_4 -99.2  48.1 2022-11-

<details>
<summary>英文:</summary>

A subset of my data looks like this but with many more groupings (IDs):

ID time class
<chr> <dttm> <fct>
1 BBR-b172021-M_fall_winter_4 2022-11-01 19:03:31 migrating
2 BBR-b172021-M_fall_winter_4 2022-11-04 22:03:33 migrating
3 BBR-b172021-M_fall_winter_4 2022-11-07 18:03:34 migrating
4 BBR-b172021-M_fall_winter_4 2022-11-08 21:03:34 stopover
5 BBR-b172021-M_fall_winter_4 2022-11-10 21:03:39 stopover
6 BBR-b172021-M_fall_winter_4 2022-11-14 18:03:37 migrating
7 BBR-b172021-M_fall_winter_4 2022-11-17 06:04:08 migrating
8 BBR-b172021-M_fall_winter_4 2022-11-18 06:04:08 stopover
9 BBR-b172021-M_fall_winter_4 2022-11-19 00:03:41 winter
10 BBR-b172021-M_fall_winter_4 2022-11-27 00:03:51 winter
11 LINWR-b1282020-M_fall_winter_3 2022-01-14 11:00:08 migrating
12 LINWR-b1282020-M_fall_winter_3 2022-01-15 13:59:45 stopover
13 LINWR-b1282020-M_fall_winter_3 2022-01-20 02:59:54 stopover
14 LINWR-b1282020-M_fall_winter_3 2022-01-21 03:00:14 migrating
15 LINWR-b1282020-M_fall_winter_3 2022-01-21 16:59:47 stopover
16 LINWR-b1282020-M_fall_winter_3 2022-01-22 16:59:45 winter

I am trying to create unique columns either through `mapping` or `group_by` and `mutate` but I don&#39;t know where to begin. I would like several new columns describing unique sequential events, their sum, and their duration. New columns added to the dataframe I would suspect would look something like this:

newcols <- data.frame(unique_class = c("migrating1", "migrating1", "migrating1", "stopover1",
"stopover1", "migrating2", "migrating2", "stopover2",
"winter1", "winter1", "migrating1", "stopover1",
"stopover1", "migrating2", "stopover2", "winter1"),
migrate_sum = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
stopover_sum = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
winter_sum = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1),
event_duration = c(6,6,6,2,2,3,3,0,8,8,0,5,5,0,0,0))


...where event_duration column would equate to time in days or hours. I know I need to `group_by(ID)` but and `mutate()` but not sure how to get the unique classes or lagged duration of times for each class. Any help appreciated.


NOT SURE WHERE TO PUT THIS SO EDITING MY QUESTION: I TRIED @AKRUN SOLUTION BUT IT DIDN&#39;T QUITE WORK. IT PRODUCED THE UNIQUE_CLASS WELL BUT SUMMARIES ARE NOT INCORRECT. HERE&#39;S AN EXAMPLE OF A DATAFRAME PRODUCED USING SOLUTION BELOW AND SUBSET BY UNIQUE ID: `fall_mig2 %&gt;% filter(BirdsID_season == &quot;BBR-b432021-M_fall_winter_4&quot;) %&gt;% select(BirdsID_season, x, y, time, unique_class, class, stopover_sum)
slice_head &lt;-  fall_mig2 %&gt;% filter(BirdsID_season == &quot;BBR-b432021-M_fall_winter_4&quot;) %&gt;% slice_head(n = 10)
slice_tail &lt;- fall_mig2 %&gt;% filter(BirdsID_season == &quot;BBR-b432021-M_fall_winter_4&quot;) %&gt;% slice_tail(n = 10)
bind_rows(slice_head, slice_tail) %&gt;% select(BirdsID_season, x, y, time, stopover_sum)` and the result:

BirdsID_season x y time unique_class class stopover_sum
<chr> <dbl> <dbl> <dttm> <chr> <chr> <int>
1 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-09 19:09:01 migrating1 migrating 3
2 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-09 21:08:36 migrating1 migrating 3
3 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-09 23:08:55 migrating1 migrating 3
4 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 01:09:11 migrating1 migrating 3
5 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 03:08:50 migrating1 migrating 3
6 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 05:09:06 migrating1 migrating 3
7 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 07:08:43 migrating1 migrating 3
8 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 09:08:54 migrating1 migrating 3
9 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 11:09:07 migrating1 migrating 3
10 BBR-b432021-M_fall_winter_4 -99.2 48.1 2022-11-10 13:08:39 migrating1 migrating 3
11 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-13 23:08:30 winter1 winter 1
12 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 01:08:45 winter1 winter 1
13 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 03:08:45 winter1 winter 1
14 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 05:08:26 winter1 winter 1
15 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 07:08:22 winter1 winter 1
16 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 09:08:45 winter1 winter 1
17 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 11:08:54 winter1 winter 1
18 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 13:08:19 winter1 winter 1
19 BBR-b432021-M_fall_winter_4 -89.3 36.7 2022-12-14 15:08:47 winter1 winter 1
20 BBR-b432021-M_fall_winter_4 -89.4 36.7 2022-12-14 17:08:19 winter1 winter 1


`stopover_sum` should be 1 (which is in the middle of the subsetted df)
. I&#39;m not sure where the 3 is coming from. Trying to dissect the solution now.

</details>


# 答案1
**得分**: 1

以下是代码的翻译部分：

我们可以创建一个按'Class'分组的run-length-id列，将'time'转换为`Date`类，然后再按'ID'、'class'分组，获取'grp'中不同元素的数量（`n_distinct`），还有'unique_class'是由'paste'函数创建的，将'class'与'unique'的'grp'索引组合起来。然后再次按'ID'、'unique_class'分组，计算'event_duration'，即`max/min`日期值之间的天数，选择感兴趣的列，使用`pivot_wider`将数据重塑为'wide'形式，使用`fill`将'_sum'中的值填充到前一个非NA值。

请注意，由于代码中包含许多特殊字符和函数，因此翻译时保留了原始的函数和列名，以确保准确性。

<details>
<summary>英文:</summary>

We may create a run-length-id column grouped by &#39;Class&#39;, convert the &#39;time&#39; to `Date` class, then grouped by &#39;ID&#39;, &#39;class&#39;, get the number of distinct (`n_distinct`) elements in &#39;grp&#39;, as well as the `unique_class` is created by `paste`ing the &#39;class&#39; with the `unique` &#39;grp&#39; indexes. Do a second grouping by &#39;ID&#39;, &#39;unique_class&#39; to calculate the &#39;event_duration&#39; ie. the number of days between the `max/min` &#39;date&#39; values, select the columns of interest, reshape to &#39;wide&#39; with `pivot_wider` and `fill` the values in the `_sum` to previous non-NA values

library(dplyr)
library(lubridate)
library(tidyr)
library(stringr)
library(data.table)
df1 %>%
mutate(grp = rleid(class), date = as.Date(ymd_hms(time))) %>%
group_by(ID, class) %>%
mutate(Count = n_distinct(grp),
unique_class = str_c(class, match(grp, unique(grp)))) %>%
group_by(ID, unique_class) %>%
mutate(event_duration = as.integer(max(date) - min(date))) %>%
ungroup %>%
transmute(rn = row_number(), class = str_c(class, '_sum'),
Count, unique_class, event_duration) %>%
pivot_wider(names_from = class, values_from = Count) %>%
fill(ends_with("_sum"), .direction = "downup") %>%
select(-rn) %>%
relocate(event_duration, .after = last_col())

-output

A tibble: 16 × 5

unique_class migrating_sum stopover_sum winter_sum event_duration
<chr> <int> <int> <int> <int>
1 migrating1 2 2 1 6
2 migrating1 2 2 1 6
3 migrating1 2 2 1 6
4 stopover1 2 2 1 2
5 stopover1 2 2 1 2
6 migrating2 2 2 1 3
7 migrating2 2 2 1 3
8 stopover2 2 2 1 0
9 winter1 2 2 1 8
10 winter1 2 2 1 8
11 migrating1 2 2 1 0
12 stopover1 2 2 1 5
13 stopover1 2 2 1 5
14 migrating2 2 2 1 0
15 stopover2 2 2 1 0
16 winter1 2 2 1 0


### data

df1 <- structure(list(ID = c("BBR-b172021-M_fall_winter_4",
"BBR-b172021-M_fall_winter_4",
"BBR-b172021-M_fall_winter_4", "BBR-b172021-M_fall_winter_4",
"BBR-b172021-M_fall_winter_4", "BBR-b172021-M_fall_winter_4",
"BBR-b172021-M_fall_winter_4", "BBR-b172021-M_fall_winter_4",
"BBR-b172021-M_fall_winter_4", "BBR-b172021-M_fall_winter_4",
"LINWR-b1282020-M_fall_winter_3", "LINWR-b1282020-M_fall_winter_3",
"LINWR-b1282020-M_fall_winter_3", "LINWR-b1282020-M_fall_winter_3",
"LINWR-b1282020-M_fall_winter_3", "LINWR-b1282020-M_fall_winter_3"
), time = c("2022-11-01 19:03:31", "2022-11-04 22:03:33", "2022-11-07 18:03:34",
"2022-11-08 21:03:34", "2022-11-10 21:03:39", "2022-11-14 18:03:37",
"2022-11-17 06:04:08", "2022-11-18 06:04:08", "2022-11-19 00:03:41",
"2022-11-27 00:03:51", "2022-01-14 11:00:08", "2022-01-15 13:59:45",
"2022-01-20 02:59:54", "2022-01-21 03:00:14", "2022-01-21 16:59:47",
"2022-01-22 16:59:45"), class = c("migrating", "migrating", "migrating",
"stopover", "stopover", "migrating", "migrating", "stopover",
"winter", "winter", "migrating", "stopover", "stopover", "migrating",
"stopover", "winter")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16"))


</details>



# 答案2
**得分**: 1

感谢@akrun。我的问题/预期输出措辞不当；您的解决方案确切地完成了我所要求的。我应该明确指出我想保留整个数据集。为了实现这一点，我对您的解决方案进行了调整，使用`mutate`而不是`transmute`，并添加了持续时间列，使用相同的`pivot_wider`过程。我复制了`class`和`event_duration`列，以便保留它们。尽管有些笨拙，但仍然有效。再次感谢。以下应该是最终的解决方案：

<details>
<summary>英文:</summary>

Again thank you @akrun. My question/intended output was a poorly worded; you&#39;re solution did exactly what I asked. I should have specified that I wanted to retain the entire dataset. To do so, I adjusted your solution with `mutate` instead of `transmute` and added duration columns, using the same `pivot_wider` procedure. I duplicated `class` and `event_duration` columns so I could retain them. Admittedly, quite clumsy but effective nonetheless. Thanks again. Below should be the final solution:

df <- df %>%
mutate(grp = data.table::rleid(class),
date = as.Date(ymd_hms(time))) %>%
group_by(ID, class) %>%
mutate(count = n_distinct(grp),
unique_class = str_c(class, match(grp, unique(grp)))) %>%
group_by(ID, unique_class) %>%
mutate(event_duration = difftime(max(time), min(time), units = "days")) %>%
ungroup() %>%
mutate(class = str_c(class, '_sum')) %>%
pivot_wider(names_from = class, values_from = count) %>%
relocate(event_duration, .after = last_col()) %>%
mutate(class = str_sub(unique_class, start = 1, end = -2),
class2 = class,
event_duration2 = event_duration) %>%
pivot_wider(names_from = class2, values_from = event_duration2) %>%
mutate(mig_dur = migrating,
stop_dur = stopover,
winter_dur = winter) %>%
dplyr::select(-migrating, -stopover, -winter)


</details>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

在R中，为每个分组创建新列以存储唯一事件，然后按组计算事件数量。

问题

A tibble: 16 × 5

ggplot2：在数据集旋转后删除图例中的字母顺序

R Shiny selectInput – 预选列表仅以输入字段中的字母开头。

更新数据框中的名称以始终保持相同。

ggplot2绘图出现错误: “由于纬度* pi而引起的错误”

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论