Converting data from a column to a date in a list of dataframes
Question
I have a list of about 50 data frames, for example:
df1 <- data.frame(d1 = c(1, 2, 3),
                  d2 = c("2021-01-01", "2021-01-02", "2021-01-03"))
df2 <- data.frame(d1 = c(11, 22, 33, 56),
                  d2 = c("43877", "43878", "43879", "43880"))
df3 <- data.frame(d1 = c(0.1, 0.2, 0.3),
                  d2 = c("2022-01-01", "2022-01-02", "2022-01-03"))
dff <- list(df1, df2, df3)
In some of the data frames the second column is not a date. It is text that, once converted to a number, can be converted to a date along these lines:
d2 <- as.numeric(d2)
d2 <- as.Date(d2, origin = "1899-12-30")
How can I apply this conversion only to the d2 columns that need it? I assume this calls for some form of apply, with a conversion that is conditional on the data in the d2 column.
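As a quick check of the two-step idea in the question, the sketch below decodes one of the sample values. It assumes the text values are Excel-style serial day counts, which is what the origin "1899-12-30" in the question suggests; the variable names `serial` and `converted` are only illustrative.

```r
# Decode one serial value taken from the question's df2
serial <- as.numeric("43877")                        # text -> number
converted <- as.Date(serial, origin = "1899-12-30")  # number -> Date
print(converted)

# Subtracting the day count from the decoded date gives back the origin
print(converted - serial)
```

The decoded value is 2020-02-16, the same date that appears in the first row of the second data frame in the answers' output.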
Answer 1
Score: 2
Try this in base R:
lapply(dff, \(x) {
  # A column made up only of digits holds serial day counts
  if (all(grepl("^\\d+$", x$d2))) {
    x$d2 <- as.Date(as.numeric(x$d2), origin = "1899-12-30")
  } else x$d2 <- as.Date(x$d2)
  x
})
Output:
[[1]]
  d1         d2
1  1 2021-01-01
2  2 2021-01-02
3  3 2021-01-03

[[2]]
  d1         d2
1 11 2020-02-16
2 22 2020-02-17
3 33 2020-02-18
4 56 2020-02-19

[[3]]
   d1         d2
1 0.1 2022-01-01
2 0.2 2022-01-02
3 0.3 2022-01-03
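Note that lapply() returns a new list and leaves dff itself unchanged, so the result has to be assigned to keep it. A minimal sketch of that step, using the sample data from the question; the helper name `convert_d2` is hypothetical, and `function(x)` is used instead of `\(x)` only so that it also runs on R versions before 4.1:

```r
# Sample data from the question
df1 <- data.frame(d1 = c(1, 2, 3),
                  d2 = c("2021-01-01", "2021-01-02", "2021-01-03"))
df2 <- data.frame(d1 = c(11, 22, 33, 56),
                  d2 = c("43877", "43878", "43879", "43880"))
df3 <- data.frame(d1 = c(0.1, 0.2, 0.3),
                  d2 = c("2022-01-01", "2022-01-02", "2022-01-03"))
dff <- list(df1, df2, df3)

# Hypothetical helper wrapping the same per-data-frame logic as above
convert_d2 <- function(x) {
  if (all(grepl("^\\d+$", x$d2))) {
    # digits only: treat as serial day counts
    x$d2 <- as.Date(as.numeric(x$d2), origin = "1899-12-30")
  } else {
    # otherwise: treat as ISO-formatted date text
    x$d2 <- as.Date(x$d2)
  }
  x
}

# Overwrite the list with the converted data frames
dff <- lapply(dff, convert_d2)

# Check that every d2 column is now of class Date
d2_classes <- vapply(dff, function(x) class(x$d2), character(1))
print(d2_classes)
```

Every element of `d2_classes` should be "Date" if the conversion worked for all data frames.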
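Or with the tidyverse (plus janitor for excel_numeric_to_date):
library(dplyr)
library(janitor)
library(purrr)
library(lubridate)
# Each parser yields NA (possibly with a warning) for the format it cannot
# read; coalesce() then keeps the first non-NA value per row
map(dff, ~ .x %>%
  mutate(d2 = coalesce(ymd(d2), excel_numeric_to_date(as.numeric(d2)))))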
Output:
[[1]]
  d1         d2
1  1 2021-01-01
2  2 2021-01-02
3  3 2021-01-03

[[2]]
  d1         d2
1 11 2020-02-16
2 22 2020-02-17
3 33 2020-02-18
4 56 2020-02-19

[[3]]
   d1         d2
1 0.1 2022-01-01
2 0.2 2022-01-02
3 0.3 2022-01-03
Answer 2
Score: 2
Use transform on each component of dff with the indicated expression. No packages are used.
lapply(dff, transform,
       d2 = if (any(grepl("-", d2))) as.Date(d2)
            else as.Date(as.numeric(d2), origin = "1899-12-30"))
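Both answers decide once per data frame which format the column uses. If a single column could mix ISO dates and serial numbers (an assumption that goes beyond the question's sample data), the decision can be made per element instead. A minimal base R sketch; `to_date` is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: convert a character vector that may mix ISO dates
# ("2021-01-01") and serial day counts ("43877") element by element
to_date <- function(x) {
  x <- as.character(x)
  is_serial <- grepl("^[0-9]+$", x)        # digits only -> serial number
  out <- rep(as.Date(NA), length(x))       # Date vector of NAs to fill in
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  # An explicit format makes unparseable text come back as NA, not an error
  out[!is_serial] <- as.Date(x[!is_serial], format = "%Y-%m-%d")
  out
}

# Usage on a small list shaped like the one in the question
dff <- list(
  data.frame(d1 = c(1, 2), d2 = c("2021-01-01", "2021-01-02")),
  data.frame(d1 = c(11, 22), d2 = c("43877", "2020-02-17"))
)
dff <- lapply(dff, function(df) {
  df$d2 <- to_date(df$d2)
  df
})
print(dff)
```

The trade-off is that values matching neither pattern silently become NA, so it may be worth checking `anyNA()` on the result before relying on it.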