Handling 2000 vs 1900 in converting character dates to usable dates in R survey data

Question

I am cleaning up some survey data and trying to convert character dates to usable dates, but the conversion assigns some two-digit years to the 1900s and others to the 2000s. I am having trouble figuring out how to write the code so that any two-digit year less than or equal to 23 is prefixed with 20 (i.e. 2001, 2002, ..., 2023) and all other two-digit years are prefixed with 19 (i.e. 1999, 1998, ..., 1924).

I tried a classic conversion using the following code:

hd$DOB_new <- as.Date(hd$DOB, format='%m/%d/%y')

and the output was this:

[Image: output of the conversion above]
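For reference, the mixed centuries match how R's strptime documentation describes %y: values 00 to 68 are prefixed with 20 and values 69 to 99 with 19. A minimal sketch with made-up dates (the vector two_digit is hypothetical, not from the survey data):

```r
# Made-up sample dates around the documented %y pivot (68/69)
two_digit <- c("6/22/00", "6/22/23", "6/22/68", "6/22/69", "6/22/99")

parsed <- as.Date(two_digit, format = "%m/%d/%y")

# Show only the year that each two-digit value was expanded to
format(parsed, "%Y")
## [1] "2000" "2023" "2068" "1969" "1999"
```

So birth years such as 45 or 60 would come out as 2045 or 2060 unless they are corrected afterwards.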

Answer 1

Score: 4

I would fix it post-hoc. Do the conversion as you have, and then come back and modify, something like this:

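library(lubridate)
# which() drops NAs, so unparseable dates do not break the assignment below
problems <- which(hd$DOB_new >= as.Date('2024-01-01'))
# shift every date that landed in 2024 or later back by one century
year(hd$DOB_new[problems]) <- year(hd$DOB_new[problems]) - 100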
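The same post-hoc idea can be sketched in base R without lubridate, by editing the year field of a POSIXlt copy. This is only an illustration under assumptions: the data frame below is made up, and the names cutoff, late, and lt are hypothetical.

```r
# Made-up stand-in for the survey data, including one missing value
hd <- data.frame(DOB = c("6/22/94", "6/22/01", "3/15/45", NA),
                 stringsAsFactors = FALSE)

# Default conversion: "45" is read as 2045 at this point
hd$DOB_new <- as.Date(hd$DOB, format = "%m/%d/%y")

# Anything on or after this date is assumed to belong to the 1900s
cutoff <- as.Date("2024-01-01")
late <- which(hd$DOB_new >= cutoff)   # which() skips the NA row

# POSIXlt stores the year as a separate field, so subtract 100 there
lt <- as.POSIXlt(hd$DOB_new[late])
lt$year <- lt$year - 100
hd$DOB_new[late] <- as.Date(lt)

hd$DOB_new
```

With this sample input the third date becomes 1945-03-15, the first two are unchanged, and the missing value stays NA.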

Answer 2

Score: 1

Another way is to use sub:

date <- '01-01-24'

as.Date(sub("(..)$", "19\\1", date), "%d-%m-%Y")
[1] "1924-01-01"

as.Date(sub("(..)$", "20\\1", date), "%d-%m-%Y")
[1] "2024-01-01"
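To apply this to a whole column with the cutoff from the question, the prefix has to be chosen per element. A minimal sketch, assuming the m/d/yy layout from the question and a made-up vector dob (the names yy, century, and full are hypothetical):

```r
# Made-up dates in the question's m/d/yy layout
dob <- c("6/22/94", "6/22/01", "12/31/23", "1/1/24")

# Two-digit year as a number: drop everything up to the last slash
yy <- as.integer(sub(".*/", "", dob))

# Years up to 23 go to the 2000s, everything else to the 1900s
century <- ifelse(yy <= 23, "20", "19")

# Strip the old two-digit year and paste on the four-digit one
full <- paste0(sub("[0-9]{2}$", "", dob), century, sprintf("%02d", yy))

result <- as.Date(full, format = "%m/%d/%Y")
result
## [1] "1994-06-22" "2001-06-22" "2023-12-31" "1924-01-01"
```

The prefix is built with paste0 rather than passed to sub as a vector because sub expects a single replacement string.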

Answer 3

Score: 0

This one-liner appends the century to each string and then calls as.Date. The sub call strips everything up to the last slash, leaving the two-digit year, which is compared to "23" as a string. The resulting TRUE/FALSE value is treated as 1 or 0 when added to 19, giving the century. No packages are used.

x <- c("6/22/94", "6/22/01") # test data

as.Date(paste0(x, 19 + (sub(".*/", "", x) <= "23")), "%m/%d/%y%C")
## [1] "1994-06-22" "2001-06-22"
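The one-liner can be unpacked into its intermediate values, which may make the logical-to-number step easier to follow. This is just the same expression split up; the names yy, is_2000s, century, and stamped are hypothetical.

```r
x <- c("6/22/94", "6/22/01")      # test data from the answer

yy <- sub(".*/", "", x)           # "94" "01": the two-digit year as text
is_2000s <- yy <= "23"            # FALSE TRUE: string comparison of two digits
century <- 19 + is_2000s          # 19 20: TRUE counts as 1, FALSE as 0
stamped <- paste0(x, century)     # "6/22/9419" "6/22/0120"

# The century now sits at the end of each string, to be read by %C
as.Date(stamped, "%m/%d/%y%C")
```

The string comparison relies on every year having exactly two digits; a year written as "1" instead of "01" would need padding first.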

huangapple
  • Posted on June 1, 2023, 00:44:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76375700.html