dplyr: swap end date with start date if end date is less than start date

Question

I have a data frame with start and end dates, and I am calculating the difference in time between them using the difftime function. However, some of my start dates are greater than my end dates, which results in a negative time difference. I need to swap the start date for the end date and vice versa when this happens. How can I do this?

Here is an example data frame:

df1 <- data.frame(matrix(ncol = 3, nrow = 3))
colnames(df1)[1:3] <- c('date_start', 'date_end', 'diff')
df1$date_start <- c(as.Date('2004-11-09'),
                    as.Date('2020-01-01'),
                    as.Date('1992-09-01'))
df1$date_end <- c(as.Date('2005-11-09'),
                  as.Date('2010-12-31'),
                  as.Date('2006-10-31'))
df1$diff <- difftime(df1$date_end, df1$date_start)
df1

The correct data frame should look like this (with the start and end dates swapped in the second row):

  date_start   date_end      diff
1 2004-11-09 2005-11-09  365 days
2 2010-12-31 2020-01-01 3288 days
3 1992-09-01 2006-10-31 5173 days

Answer 1

Score: 1

df1 <- df1 |>
  transform(date_start = pmin(date_start, date_end),
            date_end   = pmax(date_start, date_end))

base::transform is similar to dplyr::mutate, with one difference: transform evaluates every expression against the incoming data, whereas mutate evaluates each expression against the data as already modified by the expressions before it. The same code would therefore not work with mutate, because date_start would already be overwritten by the time date_end is computed.
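Since the question asks about dplyr, here is a minimal sketch of one way to get the same result with mutate. It is not from the original answer: the helper column tmp_start is a hypothetical name introduced only to hold the minimum until date_end has been computed from the untouched date_start. It assumes dplyr is installed and R is at least 4.1 (for the native pipe).

```r
library(dplyr)

df1 <- data.frame(
  date_start = as.Date(c("2004-11-09", "2020-01-01", "1992-09-01")),
  date_end   = as.Date(c("2005-11-09", "2010-12-31", "2006-10-31"))
)

result <- df1 |>
  mutate(
    # tmp_start is a hypothetical helper column, not part of the original data
    tmp_start  = pmin(date_start, date_end),
    # date_start still holds its original value at this point
    date_end   = pmax(date_start, date_end),
    # only now is it safe to overwrite date_start
    date_start = tmp_start
  ) |>
  select(-tmp_start) |>
  # recompute the difference from the corrected columns
  mutate(diff = difftime(date_end, date_start, units = "days"))

result
```

The order inside mutate matters: computing date_end before date_start is overwritten is what keeps the second row from ending up with the same date in both columns.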

Answer 2

Score: 0

To swap the start and end dates in cases where the start date is greater than the end date, you can use conditional logic.

df1 <- data.frame(matrix(ncol = 3, nrow = 3))
colnames(df1)[1:3] <- c('date_start', 'date_end', 'diff')
df1$date_start <- c(as.Date('2004-11-09'),
                    as.Date('2020-01-01'),
                    as.Date('1992-09-01'))
df1$date_end <- c(as.Date('2005-11-09'),
                  as.Date('2010-12-31'),
                  as.Date('2006-10-31'))

# Swap start and end dates when start > end
df1[df1$date_start > df1$date_end, c('date_start', 'date_end')] <-
  df1[df1$date_start > df1$date_end, c('date_end', 'date_start')]

df1$diff <- difftime(df1$date_end, df1$date_start)
df1
  date_start   date_end      diff
1 2004-11-09 2005-11-09  365 days
2 2010-12-31 2020-01-01 3288 days
3 1992-09-01 2006-10-31 5173 days

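This code checks whether date_start is greater than date_end and, for the rows where it is, swaps the two values using indexing. The right-hand side is evaluated in full before the assignment, so no temporary copy is needed.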
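One case the example data does not cover is a missing date. The comparison then yields NA, and as far as I know a logical index containing NA is rejected in data-frame assignment. A hedged sketch of a workaround, assuming rows with a missing date should simply be left alone, is to turn the condition into row positions with which(), which drops NA. The NA in the third row below is made up for illustration.

```r
df1 <- data.frame(
  date_start = as.Date(c("2004-11-09", "2020-01-01", NA)),  # NA added for illustration
  date_end   = as.Date(c("2005-11-09", "2010-12-31", "2006-10-31"))
)

# which() keeps only the positions where the comparison is TRUE and drops NA
swap <- which(df1$date_start > df1$date_end)

# Same positional swap as above, restricted to those rows
df1[swap, c("date_start", "date_end")] <- df1[swap, c("date_end", "date_start")]

df1$diff <- difftime(df1$date_end, df1$date_start, units = "days")
df1
```

Rows with a missing date keep their values, and their diff stays NA.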

  • Posted by huangapple on 2023-06-16 05:01:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76485485.html