hablar::dte() Issue in converting a datetime of class POSIXct to a date


Question


In R 4.2.3, I found that applying dte() to date-times of class "POSIXct" shifts each date back by one day. The following issue is copied from https://github.com/davidsjoberg/hablar/issues/17; please see the link for more information.

> Thank you for allowing a package that quickly allows me to change
> classes of variables. I found that applying dte() to date times of a
> class "POSIXct" makes the day be one less. Please see the example
> below.
>
>     library(readxl)
>     library(hablar)
>     library(tidyselect)
>     library(magrittr)
>
>     A <- read_excel(
>       readxl_example("deaths.xlsx"),
>       range = "arts!A5:F15",
>       .name_repair = "universal"
>     )
>     #> New names:
>     #> • `Has kids` -> `Has.kids`
>     #> • `Date of birth` -> `Date.of.birth`
>     #> • `Date of death` -> `Date.of.death`
>     class(A$Date.of.birth)
>     #> [1] "POSIXct" "POSIXt"
>     A
>     #> # A tibble: 10 × 6
>     #>    Name            Profe…¹   Age Has.k…² Date.of.birth       Date.of.death
>     #>    <chr>           <chr>   <dbl> <lgl>   <dttm>              <dttm>
>     #>  1 David Bowie     musici…    69 TRUE    1947-01-08 00:00:00 2016-01-10 00:00:00
>     #>  2 Carrie Fisher   actor      60 TRUE    1956-10-21 00:00:00 2016-12-27 00:00:00
>     #>  3 Chuck Berry     musici…    90 TRUE    1926-10-18 00:00:00 2017-03-18 00:00:00
>     #>  4 Bill Paxton     actor      61 TRUE    1955-05-17 00:00:00 2017-02-25 00:00:00
>     #>  5 Prince          musici…    57 TRUE    1958-06-07 00:00:00 2016-04-21 00:00:00
>     #>  6 Alan Rickman    actor      69 FALSE   1946-02-21 00:00:00 2016-01-14 00:00:00
>     #>  7 Florence Hende… actor      82 TRUE    1934-02-14 00:00:00 2016-11-24 00:00:00
>     #>  8 Harper Lee      author     89 FALSE   1926-04-28 00:00:00 2016-02-19 00:00:00
>     #>  9 Zsa Zsa Gábor   actor      99 TRUE    1917-02-06 00:00:00 2016-12-18 00:00:00
>     #> 10 George Michael  musici…    53 FALSE   1963-06-25 00:00:00 2016-12-25 00:00:00
>     #> # … with abbreviated variable names ¹Profession, ²Has.kids
>     A %>% hablar::convert(dte(starts_with("Date")))
>     #> # A tibble: 10 × 6
>     #>    Name               Profession   Age Has.kids Date.of.birth Date.of.death
>     #>    <chr>              <chr>      <dbl> <lgl>    <date>        <date>
>     #>  1 David Bowie        musician      69 TRUE     1947-01-07    2016-01-09
>     #>  2 Carrie Fisher      actor         60 TRUE     1956-10-20    2016-12-26
>     #>  3 Chuck Berry        musician      90 TRUE     1926-10-17    2017-03-17
>     #>  4 Bill Paxton        actor         61 TRUE     1955-05-16    2017-02-24
>     #>  5 Prince             musician      57 TRUE     1958-06-06    2016-04-20
>     #>  6 Alan Rickman       actor         69 FALSE    1946-02-20    2016-01-13
>     #>  7 Florence Henderson actor         82 TRUE     1934-02-13    2016-11-23
>     #>  8 Harper Lee         author        89 FALSE    1926-04-27    2016-02-18
>     #>  9 Zsa Zsa Gábor      actor         99 TRUE     1917-02-05    2016-12-17
>     #> 10 George Michael     musician      53 FALSE    1963-06-24    2016-12-24
>
> Created on 2023-03-27 with [reprex v2.0.2](https://reprex.tidyverse.org/)

>
> For example, instead of the date of birth for Bowie being
> "1947-01-08", the day becomes "1947-01-07". The same is true for all
> dates of these musicians.
>
> I know that package readxl read the data right as this is the
> excel sheet that the data came from. The dates match identically
> between the excel and the resulting R tibble. <img width="502"
> alt="Annotation 2023-03-27 093609"
> src="https://user-images.githubusercontent.com/17706062/227954203-ae72f540-730a-4f1d-b472-57cdce131c16.png">
>
> The packages and versions used:
>
> R version 4.2.3 (2023-03-15 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19045), RStudio 2023.3.0.386
>
> Locale: LC_COLLATE=English_United States.utf8
> LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United
> States.utf8 LC_NUMERIC=C
> LC_TIME=English_United States.utf8
>
> Package version: base64enc_0.1.3 bslib_0.4.2 cachem_1.0.7
> callr_3.7.3 cellranger_1.1.0 cli_3.6.0 clipr_0.8.0
> compiler_4.2.3 cpp11_0.4.3 crayon_1.5.2 digest_0.6.31
> dplyr_1.1.0 ellipsis_0.3.2 evaluate_0.20 fansi_1.0.4
> fastmap_1.1.1 fs_1.6.1 generics_0.1.3 glue_1.6.2
> graphics_4.2.3 grDevices_4.2.3 hablar_0.3.2 highr_0.10
> hms_1.1.2 htmltools_0.5.4 jquerylib_0.1.4 jsonlite_1.8.4
> knitr_1.42 lifecycle_1.0.3 lubridate_1.9.2 magrittr_2.0.3
> memoise_2.0.1 methods_4.2.3 mime_0.12 pillar_1.8.1
> pkgconfig_2.0.3 prettyunits_1.1.1 processx_3.8.0 progress_1.2.2
> ps_1.7.3 purrr_1.0.1 R6_2.5.1 rappdirs_0.3.3
> readxl_1.4.2 rematch_1.0.1 reprex_2.0.2 rlang_1.1.0
> rmarkdown_2.20 rstudioapi_0.14 sass_0.4.5 stats_4.2.3
> stringi_1.7.12 stringr_1.5.0 tibble_3.2.1 tidyselect_1.2.0
> timechange_0.2.0 tinytex_0.44 tools_4.2.3 utf8_1.2.3
> utils_4.2.3 vctrs_0.6.0 withr_2.5.0 xfun_0.37
> yaml_2.3.7

Answer 1

Score: -1


Please see https://github.com/davidsjoberg/hablar/issues/17 for a potential answer. Its contents are reproduced below in case the linked page becomes unavailable:

> For some reason, strftime removes a day for dates that have times at
> midnight. According to the documentation, changes have been made in R
> versions 4.2.0 and following:
>
> > strftime is a wrapper for format.POSIXlt, and it and format.POSIXct
> > first convert to class "POSIXlt" by calling as.POSIXlt (so they also
> > work for class "Date"). Note that only that conversion depends on the
> > time zone. Since R version 4.2.0, that as.POSIXlt() conversion now
> > treats the non-finite numeric -Inf, Inf, NA and NaN differently (where
> > previously all were treated as NA) and also the format() method for
> > POSIXlt now treats these different non-finite times and dates
> > analogously to type double.
>
> Using as.Date() for variables belonging to the POSIXct class solves
> the problem, so the checking of the POSIXct class is not needed. I
> don't have the writing permissions to pull a request.
>
>     as_reliable_dte <- function (.x, ...){
>       if (any(class(.x) == "Date")) {
>         return(.x)
>       }
>       if (is.logical(.x)) {
>         stop("Logical vectors can't be converted to date.")
>       }
>       if (is.factor(.x)) {
>         .x <- as.character(.x)
>       }
>       # if (any(class(.x) == "POSIXct")) {
>       #   .x <- strftime(.x)
>       # }
>       if (TRUE) {
>         return(as.Date(.x, ...))
>       }
>     }
>
> Note to other users: The function as_reliable_dte() is an internal
> function that is called by dte().
>
>     dte <- function (..., .args = list()) {
>       list(vars = dplyr::quos(...), fun = ~as_reliable_dte(., !!!.args))
>     }
>
>     A <- read_excel(
>       readxl_example("deaths.xlsx"),
>       range = "arts!A5:F15",
>       .name_repair = "universal"
>     )
>     A %>%
>       hablar::convert(dte(starts_with("Date")))
>
>     # A tibble: 10 × 6
>        Name               Profession   Age Has.kids Date.of.birth Date.of.death
>        <chr>              <chr>      <dbl> <lgl>    <date>        <date>
>      1 David Bowie        musician      69 TRUE     1947-01-08    2016-01-10
>      2 Carrie Fisher      actor         60 TRUE     1956-10-21    2016-12-27
>      3 Chuck Berry        musician      90 TRUE     1926-10-18    2017-03-18
>      4 Bill Paxton        actor         61 TRUE     1955-05-17    2017-02-25
>      5 Prince             musician      57 TRUE     1958-06-07    2016-04-21
>      6 Alan Rickman       actor         69 FALSE    1946-02-21    2016-01-14
>      7 Florence Henderson actor         82 TRUE     1934-02-14    2016-11-24
>      8 Harper Lee         author        89 FALSE    1926-04-28    2016-02-19
>      9 Zsa Zsa Gábor      actor         99 TRUE     1917-02-06    2016-12-18
>     10 George Michael     musician      53 FALSE    1963-06-25    2016-12-25
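
One possible explanation for the one-day shift, which the quoted issue does not confirm, is a time zone conversion rather than anything specific to midnight. readxl appears to return Excel dates as POSIXct values in UTC, and strftime() with its default tz = "" formats them in the session's local zone. In a zone west of UTC, midnight UTC falls on the previous evening. The sketch below uses only base R and assumes a local zone of "America/New_York", which is an illustrative choice and not taken from the question:

```r
# A midnight timestamp stored in UTC (assumed to match what readxl returns
# for an Excel date cell).
x <- as.POSIXct("1947-01-08 00:00:00", tz = "UTC")

# Formatting in a zone west of UTC (hypothetical local zone) gives the
# previous calendar day, because 00:00 UTC is 19:00 the evening before.
local_string <- strftime(x, format = "%Y-%m-%d", tz = "America/New_York")

# Converting with an explicit UTC time zone keeps the stored calendar day.
utc_date <- as.Date(x, tz = "UTC")

print(local_string)
print(utc_date)
```

If this is the cause, it would also explain why replacing strftime() with as.Date() in as_reliable_dte() restores the expected dates. Passing tz explicitly makes the conversion independent of the machine the code runs on.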

huangapple
  • Published on 2023-07-18 04:59:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76708031.html