Use a string to assign multiple column types to tibble
Question
The following data already has all of its column types changed:
d <- structure(list(a = as.factor(c(9, 9, 9, 9, 9, 9, 9)),
b = as.factor(c(2018, 2018, 2018, 2018, 2018, 2018, 2018)),
c = as.factor(c("605417CA", "605417CB", "606822AS", "606822AT", "606822AU", "606822AV", "60683MAB")),
d = as.integer(c(NA, NA, NA, NA, NA, NA, NA)),
e = as.integer(c(0, 0, 0, 0, 0, 0, 0)),
f = as.factor(c(2772, 2772, 46367, 46367, 46367, 46367, 47601))),
row.names = c(NA, -7L),
class = c("tbl_df", "tbl", "data.frame"))
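If you want to change the column types by passing a string, you can use the following code:
# String giving the target type of each column ("f" = factor, "i" = integer)
conversion_string <- "fffiif"
# Create an empty data frame with the same shape as the original;
# its columns are filled in below with the types given by the string
new_d <- as.data.frame(matrix(NA, nrow = nrow(d), ncol = ncol(d)))
colnames(new_d) <- colnames(d)
for (i in seq_along(d)) {
  col_type <- substr(conversion_string, i, i)
  # d[[i]] extracts a plain vector; d[, i] on a tibble returns a one-column tibble
  if (col_type == "f") {
    new_d[[i]] <- as.factor(d[[i]])
  } else if (col_type == "i") {
    new_d[[i]] <- as.integer(d[[i]])
  } else {
    new_d[[i]] <- d[[i]]
  }
}
# Give the new data frame the same class as the original
class(new_d) <- class(d)
This creates a new data frame `new_d` whose column types match the string `fffiif` that you specified.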
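One caveat when a loop like this applies `as.integer` to a column that is already a factor: base R returns the internal level codes, not the values shown as labels. The short sketch below illustrates the difference on a made-up vector (the variable names are only for illustration); converting through `as.character` first is one common way to recover the original numbers.

```R
# A factor whose labels look like numbers
x <- factor(c(2772, 46367, 47601))

# as.integer() on a factor returns the level codes
codes <- as.integer(x)
codes
# [1] 1 2 3

# Going through character first recovers the labelled values
values <- as.integer(as.character(x))
values
# [1]  2772 46367 47601
```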
Original (English):
The following data has six columns. I want to change their column types to factor, factor, factor, int, int, and factor, respectively.
d <- structure(list(a = c(9, 9, 9, 9, 9, 9, 9), b = structure(c(2018, 2018, 2018, 2018, 2018, 2018, 2018), class = "yearmon"), c = c("605417CA", "605417CB", "606822AS", "606822AT", "606822AU", "606822AV", "60683MAB"), d = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), e = c(0, 0, 0, 0, 0, 0, 0), f = c(2772, 2772, 46367, 46367, 46367, 46367, 47601)), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"))
If I were reading this data from an external file, I would use `vroom(path, col_types = "fffiif")`, which automatically converts each variable according to the string. But here the data is the result of a previous computation, so I need to do the conversion myself. Is there a way to change all column types with a simple string, as vroom does?
Things I tried:
- Using `mutate` for each of the six variables is quite long.
- Conversions are not one-to-one. For example, "a" and "e" are both double, but I want to convert them to factor and int respectively, so `mutate_if` would not work.
- The magrittr package has `set_colnames` to change column names by passing a vector of strings. There may be something similar for column types, but I haven't found anything.
- `readr::type_convert` seems to only apply to columns of type `character`.

I saved the data locally and imported it with `vroom(path, col_types = "fffiif")`, which works perfectly. So the question is: to what function can I pass the string `fffiif` to do the conversion once I already have the data?
Answer 1
Score: 2
Either use the *for loop* from the [linked post](https://stackoverflow.com/a/72369872/680068), or use *mutate* with *across* to avoid repetition:
```R
library(dplyr)
d %>%
  mutate(across(c(a:c, f), ~ as.factor(.x)),
         across(d:e, ~ as.integer(.x)))
# # A tibble: 7 × 6
#   a     b     c            d     e f
#   <fct> <fct> <fct>    <int> <int> <fct>
# 1 9     2018  605417CA    NA     0 2772
# 2 9     2018  605417CB    NA     0 2772
# 3 9     2018  606822AS    NA     0 46367
# 4 9     2018  606822AT    NA     0 46367
# 5 9     2018  606822AU    NA     0 46367
# 6 9     2018  606822AV    NA     0 46367
# 7 9     2018  60683MAB    NA     0 47601
```
Similar to the linked post, using lapply:
```R
# Map each type code to its conversion function
ff <- list(f = as.factor, i = as.integer)
# Split the string into one code per column
cc <- unlist(strsplit("fffiif", ""))
# d[] <- keeps the tibble class and column names
d[] <- lapply(seq_along(d), \(i) ff[[cc[i]]](d[[i]]))
sapply(d, class)
#         a         b         c         d         e         f
#  "factor"  "factor"  "factor" "integer" "integer"  "factor"
```
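The lapply idea can be wrapped in a small reusable function that takes the type string directly. The sketch below is one possible way to do it in base R, not something taken from the linked post: the name `convert_cols` is hypothetical, and the letter-to-function table is an assumption loosely modelled on the compact `col_types` codes (only a handful of letters are covered). It also checks that the string has one code per column, which the one-liner does not.

```R
# Hypothetical helper: convert each column of `df` according to the
# matching character of `spec` (for example "fffiif").
convert_cols <- function(df, spec) {
  # Assumed mapping from type codes to base R conversion functions
  converters <- list(
    f = as.factor,
    i = as.integer,
    d = as.double,
    c = as.character,
    l = as.logical
  )
  codes <- strsplit(spec, "", fixed = TRUE)[[1]]

  # Fail early on a length mismatch or an unknown type code
  if (length(codes) != ncol(df)) {
    stop("`spec` must contain exactly one code per column")
  }
  unknown <- setdiff(codes, names(converters))
  if (length(unknown) > 0) {
    stop("Unknown type code(s): ", paste(unknown, collapse = ", "))
  }

  # df[] <- keeps the class and column names of the input
  df[] <- Map(function(col, code) converters[[code]](col), df, codes)
  df
}

# Small illustrative data frame (not the data from the question)
example_df <- data.frame(
  a = c(9, 9, 9),
  c = c("605417CA", "605417CB", "606822AS"),
  e = c(0, 0, 0)
)
converted <- convert_cols(example_df, "ffi")
sapply(converted, class)
#         a         c         e
#  "factor"  "factor" "integer"
```

With the data from the question, the call would presumably be `convert_cols(d, "fffiif")`. As with the other base R approaches here, `as.integer` applied to an existing factor column returns level codes, so such columns may need `as.character` first.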