Prevent the list number from appearing in column names after do.call("cbind.data.frame", my_list)
Question
I often use do.call("cbind.data.frame", my_list) after a lapply() call, and usually I face no problems. With the following list, however, the column names change after binding: the list number precedes each name.
My list is something like this:
dput(my_list)
list(`1` = structure(list(`Pulmonary_embolism~f.20002.0` = NA,
`Pulmonary_embolism~f.20002.1` = NA, `Pulmonary_embolism~f.20002.2` = NA,
`Pulmonary_embolism~f.20002.3` = NA, `Pulmonary_embolism~f.20002.all` = NA), row.names = "1", class = "data.frame"),
`2` = structure(list(`Pulmonary_embolism~f.6152.0` = NA,
`Pulmonary_embolism~f.6152.1` = NA, `Pulmonary_embolism~f.6152.2` = NA,
`Pulmonary_embolism~f.6152.3` = NA, `Pulmonary_embolism~f.6152.all` = NA), row.names = "1", class = "data.frame"))
But after do.call("cbind.data.frame", my_list), the column names change:
names(do.call("cbind.data.frame", my_list))
[1] "1.Pulmonary_embolism~f.20002.0" "1.Pulmonary_embolism~f.20002.1" "1.Pulmonary_embolism~f.20002.2" "1.Pulmonary_embolism~f.20002.3" "1.Pulmonary_embolism~f.20002.all"
[6] "2.Pulmonary_embolism~f.6152.0" "2.Pulmonary_embolism~f.6152.1" "2.Pulmonary_embolism~f.6152.2" "2.Pulmonary_embolism~f.6152.3" "2.Pulmonary_embolism~f.6152.all"
How can I prevent the list number from becoming part of the column names?
Answer 1
Score: 1
First, you could extract the column names using lapply with unlist, making sure to pass use.names = FALSE so that the names of the list elements are not prepended to the column names. You can then assign these names to the do.call output like this:
your_names = unlist(lapply(my_list, \(x) colnames(x)), use.names = FALSE)
df = do.call("cbind.data.frame", my_list)
names(df) = your_names
df
#> Pulmonary_embolism~f.20002.0 Pulmonary_embolism~f.20002.1
#> 1 NA NA
#> Pulmonary_embolism~f.20002.2 Pulmonary_embolism~f.20002.3
#> 1 NA NA
#> Pulmonary_embolism~f.20002.all Pulmonary_embolism~f.6152.0
#> 1 NA NA
#> Pulmonary_embolism~f.6152.1 Pulmonary_embolism~f.6152.2
#> 1 NA NA
#> Pulmonary_embolism~f.6152.3 Pulmonary_embolism~f.6152.all
#> 1 NA NA
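A shorter variant of the same idea, as a minimal sketch: the prefix appears to come from the names of the list elements, because do.call passes them on as argument names and data.frame() prepends an argument name to the column names of a multi-column component. Dropping the list names with unname() before binding should therefore avoid the prefix without any renaming afterwards. The list below is a small stand-in with made-up column names (a to d), not the data from the question.

```r
# Stand-in for my_list: a named list of multi-column data frames.
# The column names a, b, c, d are hypothetical, chosen for illustration.
example_list <- list(
  `1` = data.frame(a = NA, b = NA),
  `2` = data.frame(c = NA, d = NA)
)

# With the list names kept, they are passed on as argument names
# and end up as prefixes of the column names.
prefixed <- do.call("cbind.data.frame", example_list)
names(prefixed)

# unname() drops the list names, so the original column names are kept.
plain <- do.call("cbind.data.frame", unname(example_list))
names(plain)
```

Because the names are never altered in the first place, there is no separate vector of names that has to stay in the same order as the bound columns.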
Another option is bind_cols from dplyr, which does not add the list numbers to the names:
library(dplyr)
bind_cols(my_list)
#> Pulmonary_embolism~f.20002.0 Pulmonary_embolism~f.20002.1
#> 1 NA NA
#> Pulmonary_embolism~f.20002.2 Pulmonary_embolism~f.20002.3
#> 1 NA NA
#> Pulmonary_embolism~f.20002.all Pulmonary_embolism~f.6152.0
#> 1 NA NA
#> Pulmonary_embolism~f.6152.1 Pulmonary_embolism~f.6152.2
#> 1 NA NA
#> Pulmonary_embolism~f.6152.3 Pulmonary_embolism~f.6152.all
#> 1 NA NA
Created on 2023-06-29 with reprex v2.0.2