Change/Map levels of data frame columns using a reference list in R
Question
I am trying to change the levels of a data frame's columns and convert each column to the appropriate type using a reference list.
Sample Data:
dt <- data.frame(a=c(1,2,3,4,5,6),b=c(50,10,20,30,99,190),c=c("a","b","c","d","e","f"),stringsAsFactors=TRUE)
ref_list <- list(a=c(1,2,3,4,5,6),b=NULL,c=c("a","b","c","d","e","f"))
str(dt)
'data.frame': 6 obs. of 3 variables:
$ a: num 1 2 3 4 5 6
$ b: num 50 10 20 30 99 190
$ c: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
Desired Output:
Convert a column to numeric if its entry in ref_list is NULL; otherwise convert it to a factor with the levels found in ref_list.
'data.frame': 6 obs. of 3 variables:
$ a: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6
$ b: num 50 10 20 30 99 190
$ c: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
Answer 1
Score: 2
I would do it like this: first identify the variables that need converting, i.e., the variables that are not NULL in the list and also appear in the data frame, then convert them in a quick for loop.
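# Columns that have non-NULL levels in ref_list and also exist in dt
vars_to_convert = intersect(names(ref_list)[lengths(ref_list) > 0], names(dt))
for (var in vars_to_convert) {
  # Rebuild the column as a factor with the reference levels
  dt[[var]] = factor(dt[[var]], levels = ref_list[[var]])
}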
str(dt)
# 'data.frame': 6 obs. of 3 variables:
# $ a: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6
# $ b: num 50 10 20 30 99 190
# $ c: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
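The loop above leaves columns whose ref_list entry is NULL untouched, which works here because b is already numeric. If such a column might arrive as a character vector or a factor, one possible extension is to handle both branches explicitly. The following is only a sketch: convert_by_ref is a hypothetical helper name, not part of base R, and stringsAsFactors = TRUE is set on the assumption that column c should start out as a factor, as in the question's str() output.

```r
# Sample data from the question. stringsAsFactors = TRUE is an assumption
# made so that column c starts out as a factor on any R version.
dt <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(50, 10, 20, 30, 99, 190),
  c = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = TRUE
)
ref_list <- list(
  a = c(1, 2, 3, 4, 5, 6),
  b = NULL,
  c = c("a", "b", "c", "d", "e", "f")
)

# Hypothetical helper: converts every column named in both df and ref.
convert_by_ref <- function(df, ref) {
  for (var in intersect(names(ref), names(df))) {
    lv <- ref[[var]]
    if (is.null(lv)) {
      # NULL entry: force numeric. Going through as.character() means a
      # factor yields its labels rather than its internal integer codes.
      df[[var]] <- as.numeric(as.character(df[[var]]))
    } else {
      # Non-NULL entry: factor with exactly the reference levels.
      df[[var]] <- factor(df[[var]], levels = lv)
    }
  }
  df
}

out <- convert_by_ref(dt, ref_list)
str(out)
```

Note that factor() turns any value missing from the supplied levels into NA, so it may be worth checking for unexpected NAs after the conversion.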