dplyr: swap end date with start date if end date is less than start date
Question
I have a data frame with start and end dates, and I am calculating the time difference between them using the difftime function. However, some of my start dates are later than the corresponding end dates, which results in a negative time difference. When this happens, I need to swap the start date and the end date. How can I do this?
Here is an example data frame:
df1 <- data.frame(matrix(ncol = 3, nrow = 3))
colnames(df1)[1:3] <- c('date_start', 'date_end', 'diff')
df1$date_start <- c(as.Date('2004-11-09'),
                    as.Date('2020-01-01'),
                    as.Date('1992-09-01'))
df1$date_end <- c(as.Date('2005-11-09'),
                  as.Date('2010-12-31'),
                  as.Date('2006-10-31'))
df1$diff <- difftime(df1$date_end, df1$date_start)
df1
The correct data frame should look like this (with the start and end dates swapped in the second row):
  date_start   date_end      diff
1 2004-11-09 2005-11-09  365 days
2 2010-12-31 2020-01-01 3288 days
3 1992-09-01 2006-10-31 5173 days
Answer 1
Score: 1
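df1 <- df1 |>
  transform(date_start = pmin(date_start, date_end),
            date_end = pmax(date_start, date_end))
# diff was computed before the swap, so recompute it from the corrected columns
df1$diff <- difftime(df1$date_end, df1$date_start)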
(base::transform is similar to dplyr::mutate, with one difference: each term is calculated from the incoming data, not from "the data as it exists after modifications to that point." The same code would not work with mutate, because date_start would already be overwritten by the time date_end is computed.)
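Since the question asks about dplyr, here is a minimal sketch of one way to get the same result with mutate. It assumes dplyr is installed and R is version 4.1 or later (for the |> pipe). The helper column start_tmp is a hypothetical name introduced only for this illustration: it holds the earlier date so that date_end can still be computed from the original date_start.

```r
library(dplyr)

# Same sample dates as in the question
df1 <- data.frame(
  date_start = as.Date(c("2004-11-09", "2020-01-01", "1992-09-01")),
  date_end   = as.Date(c("2005-11-09", "2010-12-31", "2006-10-31"))
)

df2 <- df1 |>
  mutate(
    # start_tmp is a hypothetical helper column (not part of the original data)
    start_tmp  = pmin(date_start, date_end),  # earlier of the two dates
    date_end   = pmax(date_start, date_end),  # date_start is still the original here
    date_start = start_tmp,                   # now it is safe to overwrite date_start
    diff       = difftime(date_end, date_start, units = "days")
  ) |>
  select(-start_tmp)

df2
```

Because mutate evaluates its arguments in order, diff is computed from the already corrected columns, so no separate recalculation step is needed in this variant.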
Answer 2
Score: 0
To swap the start and end dates in cases where the start date is greater than the end date, you can use conditional logic.
df1 <- data.frame(matrix(ncol = 3, nrow = 3))
colnames(df1)[1:3] <- c('date_start', 'date_end', 'diff')
df1$date_start <- c(as.Date('2004-11-09'),
                    as.Date('2020-01-01'),
                    as.Date('1992-09-01'))
df1$date_end <- c(as.Date('2005-11-09'),
                  as.Date('2010-12-31'),
                  as.Date('2006-10-31'))
# Swap start and end dates when start > end
df1[df1$date_start > df1$date_end, c('date_start', 'date_end')] <- df1[df1$date_start > df1$date_end, c('date_end', 'date_start')]
df1$diff <- difftime(df1$date_end, df1$date_start)
df1
  date_start   date_end      diff
1 2004-11-09 2005-11-09  365 days
2 2010-12-31 2020-01-01 3288 days
3 1992-09-01 2006-10-31 5173 days
This code checks if the date_start is greater than the date_end and swaps the values for those rows using indexing.
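The same indexing idea can be written column by column, which may be easier to read. The sketch below is a variation, not the answer's exact code: the names swap_rows and old_start are hypothetical, and which() is used on the assumption that the real data might contain missing dates, since which() drops NA comparisons instead of passing them into the assignment.

```r
# Same sample dates as in the question
df1 <- data.frame(
  date_start = as.Date(c("2004-11-09", "2020-01-01", "1992-09-01")),
  date_end   = as.Date(c("2005-11-09", "2010-12-31", "2006-10-31"))
)

# Row numbers where the start date is later than the end date (NA comparisons are dropped)
swap_rows <- which(df1$date_start > df1$date_end)

# Keep a copy of the original start dates, then swap the two columns for those rows
old_start <- df1$date_start[swap_rows]
df1$date_start[swap_rows] <- df1$date_end[swap_rows]
df1$date_end[swap_rows] <- old_start

# Compute the difference only after the swap
df1$diff <- difftime(df1$date_end, df1$date_start, units = "days")
df1
```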