R difftime output differs depending on input format (with or without an as.character() wrapper)

Question

Example data:

test <- structure(list(date1 = structure(c(1632745800, 1632745800), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), date2 = structure(c(1641468180, 1641468180), tzone = "UTC", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -2L), class = c("tbl_df", "tbl", 
"data.frame"))

Is there a reason why the output of difftime differs based on whether the inputs are wrapped by as.character or not? For example:

library(tidyverse)

test <- structure(list(date1 = structure(c(1632745800, 1632745800), 
                                         tzone = "UTC", class = c("POSIXct", "POSIXt")), 
                       date2 = structure(c(1641468180, 1641468180), tzone = "UTC", class = c("POSIXct", "POSIXt"))), 
                  row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))

test %>%
  mutate(
    date_diff = difftime(date2, date1, units = "days"),
    date_diff2 = difftime(as.character(date2), as.character(date1), units = "days")
  ) %>%
  print.data.frame()
#>                 date1               date2     date_diff    date_diff2
#> 1 2021-09-27 12:30:00 2022-01-06 11:23:00 100.9535 days 100.9951 days
#> 2 2021-09-27 12:30:00 2022-01-06 11:23:00 100.9535 days 100.9951 days

It only differs by ~0.04 in this case, but is there a reason why? And which one would be considered correct? Thank you!

Answer 1

Score: 3

The conversion to character is lossy because you lose the time zone information.
Your original datetimes are specified to be in UTC. If you use as.character() and reparse them, they get interpreted in your local time zone, where it seems that one of the dates falls in daylight saving time and the other does not, resulting in an additional one-hour difference. One hour is 1/24 ≈ 0.0417 days, which matches the gap between 100.9535 and 100.9951 days.

x <- as.POSIXct(1632745800, tz = "UTC")
y <- as.POSIXct(1641468180, tz = "UTC")

x
#> [1] "2021-09-27 12:30:00 UTC"
as.character(x)
#> [1] "2021-09-27 12:30:00"
as.POSIXct(as.character(x))
#> [1] "2021-09-27 12:30:00 BST"
as.POSIXct(as.character(y))
#> [1] "2022-01-06 11:23:00 GMT"
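
To see the effect without depending on the machine's own settings, here is a minimal sketch that reparses the strings in an explicitly named zone. "Europe/London" is an assumption, chosen only because the BST/GMT labels in the output above suggest it; format() with an explicit pattern stands in for as.character() so the string form is the same across R versions, and origin is passed so the numeric conversion also works on older R releases.

```r
# The two instants from the question, stored in UTC.
x <- as.POSIXct(1632745800, origin = "1970-01-01", tz = "UTC")
y <- as.POSIXct(1641468180, origin = "1970-01-01", tz = "UTC")

# Text form: the wall-clock time survives, the time zone does not.
x_chr <- format(x, "%Y-%m-%d %H:%M:%S")
y_chr <- format(y, "%Y-%m-%d %H:%M:%S")

# Reparse as London local time (assumed zone, see the note above).
# 2021-09-27 falls in BST (UTC+1), 2022-01-06 falls in GMT (UTC+0).
x_local <- as.POSIXct(x_chr, tz = "Europe/London")
y_local <- as.POSIXct(y_chr, tz = "Europe/London")

d_utc   <- difftime(y, x, units = "days")              # about 100.9535 days
d_local <- difftime(y_local, x_local, units = "days")  # about 100.9951 days

# The discrepancy, expressed in hours.
extra_hours <- (as.numeric(d_local) - as.numeric(d_utc)) * 24
print(extra_hours)
```

Under that assumption the extra amount comes out as one hour, because only the September timestamp is shifted when it is read as local time.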

Answer 2

Score: 1

It's due to as.POSIXct using the machine's local time zone when converting from the string. Using the original datetime objects is preferable.
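
If the datetimes really are only available as strings, one possible workaround is to state the time zone explicitly when they are parsed. The sketch below assumes the strings represent UTC wall-clock times, as in the question; the variable names are illustrative only. It relies on the tz argument of difftime() and of as.POSIXct(), both part of base R.

```r
# Original UTC datetimes from the question.
x <- as.POSIXct(1632745800, origin = "1970-01-01", tz = "UTC")
y <- as.POSIXct(1641468180, origin = "1970-01-01", tz = "UTC")

# String versions with no time zone attached.
x_chr <- format(x, "%Y-%m-%d %H:%M:%S")
y_chr <- format(y, "%Y-%m-%d %H:%M:%S")

# Reference result from the original datetime objects.
d0 <- difftime(y, x, units = "days")

# Option 1: let difftime() convert the strings, telling it which zone to use.
d1 <- difftime(y_chr, x_chr, tz = "UTC", units = "days")

# Option 2: parse explicitly with a time zone before taking the difference.
d2 <- difftime(as.POSIXct(y_chr, tz = "UTC"),
               as.POSIXct(x_chr, tz = "UTC"),
               units = "days")

# Both options agree with the result from the original objects.
print(c(as.numeric(d0), as.numeric(d1), as.numeric(d2)))
```

When the original POSIXct columns are available, passing them to difftime() directly, as in date_diff in the question, avoids the round trip through text altogether.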

huangapple
  • Published on June 9, 2023, 05:42:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76435874.html