Converting data from a column to a date in a list of dataframes

Question

I have a list of around 50 data frames. A small example:

df1 <- data.frame(d1 = c(1, 2, 3),
                  d2 = c("2021-01-01", "2021-01-02", "2021-01-03"))

df2 <- data.frame(d1 = c(11, 22, 33, 56),
                  d2 = c("43877", "43878", "43879", "43880"))

df3 <- data.frame(d1 = c(0.1, 0.2, 0.3),
                  d2 = c("2022-01-01", "2022-01-02", "2022-01-03"))

dff <- list(df1, df2, df3)

In some of the data frames, the second column is not a date. It is text that, once converted to a number, can be converted to a date along these lines:

  d2 <- as.numeric(d2)
  d2 <- as.Date(d2, origin = "1899-12-30")

How can I apply this change only to the d2 columns that need it? I assume this calls for some form of apply plus a conversion that is conditional on the contents of the d2 column.
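
As a quick check of the idea above, here is a minimal sketch using two of the values from df2. The variable names serial_txt and serial_num are only illustrative, and the origin assumes the numbers are Excel serial dates from the Windows 1900 date system, as the question's own origin value suggests.

```r
# Excel serial dates count days from 1899-12-30 (assumed: 1900 date system),
# so a serial number stored as text converts in two steps.
serial_txt <- c("43877", "43878")                  # values taken from df2
serial_num <- as.numeric(serial_txt)               # text -> number
converted  <- as.Date(serial_num, origin = "1899-12-30")  # number -> Date
print(converted)
# [1] "2020-02-16" "2020-02-17"
```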

Answer 1

Score: 2

Try

lapply(dff, \(x) {
  # all-digit d2 values are Excel serial dates; otherwise parse as ISO dates
  x$d2 <- if (all(grepl("^\\d+$", x$d2))) {
    as.Date(as.numeric(x$d2), origin = "1899-12-30")
  } else as.Date(x$d2)
  x
})

-output

[[1]]
  d1         d2
1  1 2021-01-01
2  2 2021-01-02
3  3 2021-01-03

[[2]]
  d1         d2
1 11 2020-02-16
2 22 2020-02-17
3 33 2020-02-18
4 56 2020-02-19

[[3]]
   d1         d2
1 0.1 2022-01-01
2 0.2 2022-01-02
3 0.3 2022-01-03

Or with tidyverse

library(dplyr)
library(janitor)
library(purrr)
library(lubridate)
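# ymd() gives NA for serial numbers and as.numeric() gives NA for ISO strings
# (both with warnings); coalesce() keeps whichever one parsed
map(dff, ~ .x %>%
    mutate(d2 = coalesce(ymd(d2), excel_numeric_to_date(as.numeric(d2)))))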

-output

[[1]]
  d1         d2
1  1 2021-01-01
2  2 2021-01-02
3  3 2021-01-03

[[2]]
  d1         d2
1 11 2020-02-16
2 22 2020-02-17
3 33 2020-02-18
4 56 2020-02-19

[[3]]
   d1         d2
1 0.1 2022-01-01
2 0.2 2022-01-02
3 0.3 2022-01-03
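
The base R version above decides once per data frame, while coalesce() works element by element. If a single d2 column could mix both representations, a base R per-element variant might look like the sketch below. The helper name to_date_mixed and the mixed data frame are hypothetical and not part of the original answer; the sketch also assumes the non-numeric entries are in "YYYY-MM-DD" form.

```r
# Hypothetical helper (not from the answer): convert a character vector
# that may mix Excel serial numbers and ISO "YYYY-MM-DD" strings.
to_date_mixed <- function(x) {
  x <- as.character(x)
  is_serial <- grepl("^\\d+$", x)          # all-digit entries are serials
  out <- rep(as.Date(NA), length(x))       # Date vector to fill in
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  # An explicit format returns NA for unparseable text instead of an error
  out[!is_serial] <- as.Date(x[!is_serial], format = "%Y-%m-%d")
  out
}

# Made-up data frame whose d2 column mixes both representations
mixed <- data.frame(d1 = c(1, 2, 3),
                    d2 = c("2021-01-01", "43877", "2021-01-03"))
res <- lapply(list(mixed), \(x) { x$d2 <- to_date_mixed(x$d2); x })
print(res[[1]]$d2)
# [1] "2021-01-01" "2020-02-16" "2021-01-03"
```

Entries that match neither pattern come back as NA rather than stopping with an error, which makes them easy to find afterwards with is.na().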

Answer 2

Score: 2

Use transform on each component of dff with the indicated expression. No packages are used.

lapply(dff, transform, d2 = if (any(grepl("-", d2))) as.Date(d2) else
  as.Date(as.numeric(d2), origin = "1899-12-30"))
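
To confirm that the conversion did what was intended across the whole list, one option is a small check like the sketch below. It rebuilds the sample list from the question, applies the expression from this answer, and then tests each d2 column. The names res, is_date and has_na are illustrative only.

```r
# Sample list from the question
df1 <- data.frame(d1 = c(1, 2, 3),
                  d2 = c("2021-01-01", "2021-01-02", "2021-01-03"))
df2 <- data.frame(d1 = c(11, 22, 33, 56),
                  d2 = c("43877", "43878", "43879", "43880"))
df3 <- data.frame(d1 = c(0.1, 0.2, 0.3),
                  d2 = c("2022-01-01", "2022-01-02", "2022-01-03"))
dff <- list(df1, df2, df3)

# Conversion from the answer above
res <- lapply(dff, transform, d2 = if (any(grepl("-", d2))) as.Date(d2) else
  as.Date(as.numeric(d2), origin = "1899-12-30"))

# Every d2 column should now be of class Date, with no NA values
is_date <- vapply(res, \(x) inherits(x$d2, "Date"), logical(1))
has_na  <- vapply(res, \(x) anyNA(x$d2), logical(1))
stopifnot(all(is_date), !any(has_na))
```

With around 50 data frames, which(!is_date) or which(has_na) would point at the list elements that still need attention.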

huangapple
  • Posted on February 24, 2023 at 02:26:00
  • When reposting, please be sure to keep the link to this article: https://go.coder-hub.com/75548866.html