hablar::dte() Issue in converting a datetime of class POSIXct to a date
Question
In R 4.2.3, I found that applying dte() to date-times of class "POSIXct" shifts the date back by one day. The following issue is copied from https://github.com/davidsjoberg/hablar/issues/17; please see the link for more information.
> Thank you for a package that lets me quickly change the classes of
> variables. I found that applying dte() to date-times of class
> "POSIXct" makes the day one less. Please see the example below.
>
>     library(readxl)
>     library(hablar)
>     library(tidyselect)
>     library(magrittr)
>
>     A <- read_excel(
>       readxl_example("deaths.xlsx"),
>       range = "arts!A5:F15",
>       .name_repair = "universal"
>     )
>     #> New names:
>     #> • `Has kids` -> `Has.kids`
>     #> • `Date of birth` -> `Date.of.birth`
>     #> • `Date of death` -> `Date.of.death`
>     class(A$Date.of.birth)
>     #> [1] "POSIXct" "POSIXt"
>     A
>     #> # A tibble: 10 × 6
>     #>    Name            Profe…¹   Age Has.k…² Date.of.birth       Date.of.death
>     #>    <chr>           <chr>   <dbl> <lgl>   <dttm>              <dttm>
>     #>  1 David Bowie     musici…    69 TRUE    1947-01-08 00:00:00 2016-01-10 00:00:00
>     #>  2 Carrie Fisher   actor      60 TRUE    1956-10-21 00:00:00 2016-12-27 00:00:00
>     #>  3 Chuck Berry     musici…    90 TRUE    1926-10-18 00:00:00 2017-03-18 00:00:00
>     #>  4 Bill Paxton     actor      61 TRUE    1955-05-17 00:00:00 2017-02-25 00:00:00
>     #>  5 Prince          musici…    57 TRUE    1958-06-07 00:00:00 2016-04-21 00:00:00
>     #>  6 Alan Rickman    actor      69 FALSE   1946-02-21 00:00:00 2016-01-14 00:00:00
>     #>  7 Florence Hende… actor      82 TRUE    1934-02-14 00:00:00 2016-11-24 00:00:00
>     #>  8 Harper Lee      author     89 FALSE   1926-04-28 00:00:00 2016-02-19 00:00:00
>     #>  9 Zsa Zsa Gábor   actor      99 TRUE    1917-02-06 00:00:00 2016-12-18 00:00:00
>     #> 10 George Michael  musici…    53 FALSE   1963-06-25 00:00:00 2016-12-25 00:00:00
>     #> # … with abbreviated variable names ¹Profession, ²Has.kids
>     A %>% hablar::convert(dte(starts_with("Date")))
>     #> # A tibble: 10 × 6
>     #>    Name               Profession   Age Has.kids Date.of.birth Date.of.death
>     #>    <chr>              <chr>      <dbl> <lgl>    <date>        <date>
>     #>  1 David Bowie        musician      69 TRUE     1947-01-07    2016-01-09
>     #>  2 Carrie Fisher      actor         60 TRUE     1956-10-20    2016-12-26
>     #>  3 Chuck Berry        musician      90 TRUE     1926-10-17    2017-03-17
>     #>  4 Bill Paxton        actor         61 TRUE     1955-05-16    2017-02-24
>     #>  5 Prince             musician      57 TRUE     1958-06-06    2016-04-20
>     #>  6 Alan Rickman       actor         69 FALSE    1946-02-20    2016-01-13
>     #>  7 Florence Henderson actor         82 TRUE     1934-02-13    2016-11-23
>     #>  8 Harper Lee         author        89 FALSE    1926-04-27    2016-02-18
>     #>  9 Zsa Zsa Gábor      actor         99 TRUE     1917-02-05    2016-12-17
>     #> 10 George Michael     musician      53 FALSE    1963-06-24    2016-12-24
>
> Created on 2023-03-27 with [reprex v2.0.2](https://reprex.tidyverse.org/)
>
> For example, instead of the date of birth for Bowie being
> "1947-01-08", the day becomes "1947-01-07". The same is true for all
> of the dates in this table.
>
> I know that the readxl package read the data correctly, because the
> tibble matches the Excel sheet the data came from: the dates are
> identical between the Excel file and the resulting R tibble.
>
> <img width="502" alt="Annotation 2023-03-27 093609" src="https://user-images.githubusercontent.com/17706062/227954203-ae72f540-730a-4f1d-b472-57cdce131c16.png">
>
> The packages and versions used:
>
> R version 4.2.3 (2023-03-15 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19045), RStudio 2023.3.0.386
>
> Locale: LC_COLLATE=English_United States.utf8
> LC_CTYPE=English_United States.utf8
> LC_MONETARY=English_United States.utf8
> LC_NUMERIC=C
> LC_TIME=English_United States.utf8
>
> Package version: base64enc_0.1.3 bslib_0.4.2 cachem_1.0.7
> callr_3.7.3 cellranger_1.1.0 cli_3.6.0 clipr_0.8.0
> compiler_4.2.3 cpp11_0.4.3 crayon_1.5.2 digest_0.6.31
> dplyr_1.1.0 ellipsis_0.3.2 evaluate_0.20 fansi_1.0.4
> fastmap_1.1.1 fs_1.6.1 generics_0.1.3 glue_1.6.2
> graphics_4.2.3 grDevices_4.2.3 hablar_0.3.2 highr_0.10
> hms_1.1.2 htmltools_0.5.4 jquerylib_0.1.4 jsonlite_1.8.4
> knitr_1.42 lifecycle_1.0.3 lubridate_1.9.2 magrittr_2.0.3
> memoise_2.0.1 methods_4.2.3 mime_0.12 pillar_1.8.1
> pkgconfig_2.0.3 prettyunits_1.1.1 processx_3.8.0 progress_1.2.2
> ps_1.7.3 purrr_1.0.1 R6_2.5.1 rappdirs_0.3.3
> readxl_1.4.2 rematch_1.0.1 reprex_2.0.2 rlang_1.1.0
> rmarkdown_2.20 rstudioapi_0.14 sass_0.4.5 stats_4.2.3
> stringi_1.7.12 stringr_1.5.0 tibble_3.2.1 tidyselect_1.2.0
> timechange_0.2.0 tinytex_0.44 tools_4.2.3 utf8_1.2.3
> utils_4.2.3 vctrs_0.6.0 withr_2.5.0 xfun_0.37
> yaml_2.3.7
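
The shift can probably be reproduced in base R alone, without readxl or hablar. The sketch below rests on two assumptions that are not stated in the issue itself: that dte() passes POSIXct values through strftime() (as the code quoted in the linked issue suggests), and that the session's time zone is behind UTC. strftime() formats in the zone given by its `tz` argument, which defaults to the local zone, so a value stored as midnight UTC is rendered as the previous evening. "America/New_York" is only an illustrative zone.

```r
# A POSIXct value at midnight UTC, standing in for a date cell read by readxl.
birth <- as.POSIXct("1947-01-08 00:00:00", tz = "UTC")

# strftime() formats in the zone passed as `tz` (by default the session's
# local zone). A zone is given explicitly here so that the result does not
# depend on the machine running the example.
local_text <- strftime(birth, format = "%Y-%m-%d %H:%M:%S",
                       tz = "America/New_York")

# Parsing that text back keeps only the (already shifted) calendar date.
shifted <- as.Date(local_text)   # 1947-01-07

# Converting directly and naming the zone keeps the original calendar date.
direct <- as.Date(birth, tz = "UTC")   # 1947-01-08

print(shifted)
print(direct)
```

If this is indeed the cause, the one-day shift would only show up in sessions whose local zone is west of UTC, which would explain why it is easy to miss.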
Answer 1
Score: -1
Please see https://github.com/davidsjoberg/hablar/issues/17 for a potential answer. The contents are reproduced below in case the link stops working:
> For some reason, strftime removes a day for dates that have times at
> midnight. According to the documentation, changes have been made in R
> versions 4.2.0 and later:
>
> > strftime is a wrapper for format.POSIXlt, and it and format.POSIXct first convert to class "POSIXlt" by calling as.POSIXlt (so they also work for class "Date"). Note that only that conversion depends on the time zone. Since R version 4.2.0, that as.POSIXlt() conversion now treats the non-finite numeric -Inf, Inf, NA and NaN differently (where previously all were treated as NA) and also the format() method for POSIXlt now treats these different non-finite times and dates analogously to type double.
>
> Using as.Date() for variables belonging to the POSIXct class solves
> the problem, so the check for the POSIXct class is not needed. I
> don't have write permission to open a pull request.
>
>     as_reliable_dte <- function(.x, ...) {
>       # Dates need no conversion.
>       if (inherits(.x, "Date")) {
>         return(.x)
>       }
>       if (is.logical(.x)) {
>         stop("Logical vectors can't be converted to date.")
>       }
>       # Convert factors via their labels, not their integer codes.
>       if (is.factor(.x)) {
>         .x <- as.character(.x)
>       }
>       # Disabled: this step is what shifted POSIXct dates by one day.
>       # if (any(class(.x) == "POSIXct")) {
>       #   .x <- strftime(.x)
>       # }
>       as.Date(.x, ...)
>     }
>
> Note to other users: The function as_reliable_dte() is an internal
> function that is called by dte().
>
>     dte <- function(..., .args = list()) {
>       list(vars = dplyr::quos(...),
>            fun = ~as_reliable_dte(., !!!.args))
>     }
>
>     A <- read_excel(
>       readxl_example("deaths.xlsx"),
>       range = "arts!A5:F15",
>       .name_repair = "universal"
>     )
>     A %>%
>       hablar::convert(dte(starts_with("Date")))
>     #> # A tibble: 10 × 6
>     #>    Name               Profession   Age Has.kids Date.of.birth Date.of.death
>     #>    <chr>              <chr>      <dbl> <lgl>    <date>        <date>
>     #>  1 David Bowie        musician      69 TRUE     1947-01-08    2016-01-10
>     #>  2 Carrie Fisher      actor         60 TRUE     1956-10-21    2016-12-27
>     #>  3 Chuck Berry        musician      90 TRUE     1926-10-18    2017-03-18
>     #>  4 Bill Paxton        actor         61 TRUE     1955-05-17    2017-02-25
>     #>  5 Prince             musician      57 TRUE     1958-06-07    2016-04-21
>     #>  6 Alan Rickman       actor         69 FALSE    1946-02-21    2016-01-14
>     #>  7 Florence Henderson actor         82 TRUE     1934-02-14    2016-11-24
>     #>  8 Harper Lee         author        89 FALSE    1926-04-28    2016-02-19
>     #>  9 Zsa Zsa Gábor      actor         99 TRUE     1917-02-06    2016-12-18
>     #> 10 George Michael     musician      53 FALSE    1963-06-25    2016-12-25
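
For anyone who would rather not redefine hablar's functions, the same idea can be applied to the columns directly in base R. This is only a sketch: the two-row data frame is a hypothetical stand-in for the tibble read from deaths.xlsx, and it assumes the date-times are stored in UTC, which to my knowledge is what readxl produces.

```r
# Hypothetical stand-in for the tibble read from deaths.xlsx (two rows only).
A <- data.frame(
  Name = c("David Bowie", "Carrie Fisher"),
  Date.of.birth = as.POSIXct(c("1947-01-08", "1956-10-21"), tz = "UTC"),
  Date.of.death = as.POSIXct(c("2016-01-10", "2016-12-27"), tz = "UTC")
)

# Pick the columns whose names start with "Date" and convert each one with
# as.Date(), naming the time zone so the calendar date cannot shift.
date_cols <- startsWith(names(A), "Date")
A[date_cols] <- lapply(A[date_cols], as.Date, tz = "UTC")

str(A)
```

Passing `tz = "UTC"` explicitly is a deliberate choice here: it avoids relying on the default time zone of as.Date() for POSIXct input, so the result does not depend on the session's local zone.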