Assign column values based on condition and list with dplyr
Question
I have a tibble with a column of dates along with other data. I would like to add a column of labels based on which date range each date falls in. I could do this with mutate and nested conditionals, for instance:
data <- data %>%
  mutate(season = ifelse(date < "2020-03-01", "Winter 20",
                  ifelse(date < "2020-07-01", "Spring 20",
                  ifelse(date < "2020-11-01", "Fall 20", "Winter 21"))))
But this seems somewhat inelegant and also inflexible. Ideally, I would like to be able to specify a named list, e.g.
season_breaks <- c("Winter 20" = "2020-03-01",
                   "Spring 20" = "2020-07-01",
                   "Fall 20"   = "2020-11-01",
                   "Winter 21" = "2021-03-01")
And use this separately specified list to modify the tibble with the new season column. (I use this same set of date cutoffs in various other places, which is why it's helpful to have it as a separate list; also it's easier to modify in one place later if needed.)
Is there a way to create a new column like this?
Answer 1
Score: 2
Here is a solution using cut:
season_breaks <- c("Winter 20" = as.Date("2020-03-01"),
                   "Spring 20" = as.Date("2020-07-01"),
                   "Fall 20"   = as.Date("2020-11-01"),
                   "Winter 21" = as.Date("2021-03-01"))
data %>%
  # 1970-01-01 is an arbitrary lower bound so the first label covers all earlier dates
  mutate(season = cut(date,
                      breaks = c(as.Date("1970-01-01"), unname(season_breaks)),
                      labels = names(season_breaks)))
date season
<date> <fct>
1 2020-01-01 Winter 20
2 2020-04-01 Spring 20
3 2020-08-01 Fall 20
4 2020-12-01 Winter 21
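If the cutoffs are kept as the plain character vector from the question, base R's findInterval can do the same lookup. The following is a minimal sketch, not part of the original answer; the helper name label_season is hypothetical, and dates on or after the last cutoff come back as NA.

```r
# Cutoffs exactly as written in the question: names are the labels, values
# are the (exclusive) end dates of each season.
season_breaks <- c("Winter 20" = "2020-03-01",
                   "Spring 20" = "2020-07-01",
                   "Fall 20"   = "2020-11-01",
                   "Winter 21" = "2021-03-01")

# Hypothetical helper: returns the label of the first season whose end date
# is strictly later than each date, or NA when no such season exists.
label_season <- function(date, breaks) {
  ends <- as.numeric(as.Date(unname(breaks)))  # must be sorted ascending
  idx <- findInterval(as.numeric(as.Date(date)), ends) + 1
  names(breaks)[idx]
}

dates <- as.Date(c("2020-01-01", "2020-04-01", "2020-08-01",
                   "2020-12-01", "2021-03-01"))
seasons <- label_season(dates, season_breaks)
print(data.frame(date = dates, season = seasons))
```

Inside a pipeline this would be called as mutate(season = label_season(date, season_breaks)), so the same vector can still be reused elsewhere.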
Answer 2
Score: 1
Using dplyr's new (>= v1.1.0) [rolling join](https://dplyr.tidyverse.org/reference/join_by.html) feature:
library(dplyr)
# make table of season breaks
# note: this overwrites the named vector season_breaks with a data frame
season_breaks <- data.frame(
  season = names(season_breaks),
  end_date = as.Date(unname(season_breaks))
)
data %>%
  left_join(season_breaks, join_by(closest(date < end_date))) %>%
  select(!end_date)
date season
1 2020-01-15 Winter 20
2 2020-04-15 Spring 20
3 2020-07-15 Fall 20
4 2020-10-15 Fall 20
5 2021-01-15 Winter 21
Example data:
data <- data.frame(
  date = seq(as.Date("2020-01-15"), length.out = 5, by = "3 months")
)
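One thing to keep in mind with this approach: a left join keeps rows that have no match, so a date on or after the last end_date gets an NA season. Here is a small sketch of that edge case, assuming dplyr >= 1.1.0 is installed. The names season_table and late_dates are made up for illustration, and the table gets its own name so that the original named vector stays available.

```r
library(dplyr)  # join_by() and closest() require dplyr >= 1.1.0

# Lookup table with one row per season (hypothetical name: season_table)
season_table <- data.frame(
  season = c("Winter 20", "Spring 20", "Fall 20", "Winter 21"),
  end_date = as.Date(c("2020-03-01", "2020-07-01", "2020-11-01", "2021-03-01"))
)

# One date just inside the last season, one exactly on the final cutoff
late_dates <- data.frame(date = as.Date(c("2021-02-28", "2021-03-01")))

result <- late_dates %>%
  left_join(season_table, join_by(closest(date < end_date))) %>%
  select(!end_date)

print(result)
```

The second row has no end_date strictly after it, so its season is NA; adding a far-future row to the table would be one way to give such dates a label.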
Answer 3
Score: 0
Example data:
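library(tidyverse)
library(lubridate)  # provides quarter(), month(), year(); attached automatically by tidyverse >= 2.0.0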
data <- data.frame(date = as.Date(c("2020-02-01", "2020-06-01", "2020-10-01", "2020-11-01", "2021-02-01")))
I only show the code, since you may define seasons differently:
Define season by quarter:
data %>%
  mutate(season = paste(quarter(date) %>% case_match(1 ~ "Spring", 2 ~ "Summer", 3 ~ "Fall", 4 ~ "Winter"), year(date)))
date season
1 2020-02-01 Spring 2020
2 2020-06-01 Summer 2020
3 2020-10-01 Winter 2020
4 2020-11-01 Winter 2020
5 2021-02-01 Spring 2021
Define season by month:
You may need to adjust the values to fit your definition of seasons.
data %>%
  mutate(season = paste(month(date) %>% case_match(
    c(3:5) ~ "Spring",
    c(6:8) ~ "Summer",
    c(9:10) ~ "Fall",
    c(11, 12, 1, 2) ~ "Winter"
  ), year(date)))
date season
1 2020-02-01 Winter 2020
2 2020-06-01 Summer 2020
3 2020-10-01 Fall 2020
4 2020-11-01 Winter 2020
5 2021-02-01 Winter 2021
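The month-based mapping can also be written as a 12-element lookup vector indexed by month number, which keeps the definition in one place. This is a base R sketch under the same month-to-season assumption as the case_match() call above; season_by_month is a hypothetical name.

```r
# Season for each calendar month, January (1) through December (12),
# mirroring the mapping used in the case_match() call.
season_by_month <- c("Winter", "Winter",
                     "Spring", "Spring", "Spring",
                     "Summer", "Summer", "Summer",
                     "Fall", "Fall",
                     "Winter", "Winter")

dates <- as.Date(c("2020-02-01", "2020-06-01", "2020-10-01",
                   "2020-11-01", "2021-02-01"))

# Extract the month number and index into the lookup vector
month_number <- as.integer(format(dates, "%m"))
season <- paste(season_by_month[month_number], format(dates, "%Y"))

print(data.frame(date = dates, season = season))
```

Changing a season's boundaries then only means editing the entries of season_by_month.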