How to create a tibble from a list with named vectors
Question
I am trying to build an edge data frame from a dataset I have, for a network I'm building:
> head(x)
# A tibble: 6 × 8
visitID entry_date propertyID prop_grp_ID entry_type edgeID
<dbl> <dttm> <dbl> <dbl> <fct> <chr>
1 1157349647 2022-04-01 00:00:00 251317243 1915318832 X14 231224220
2 1215457284 2022-04-01 00:00:00 165589924 1915318832 X14 231224220
3 986527720 2022-04-01 00:00:00 659308365 1915318832 X14 231224220
4 986527720 2022-04-01 00:00:00 659308365 1915318832 X14 231224220
5 1106040433 2022-04-01 00:00:00 1659425840 1915318832 X14 231224220
6 1106040433 2022-04-01 00:00:00 1659425840 1915318832 X14 231224220
# create an empty list to add to
edge_list <- vector("list", nrow(x)-1)
# loop through each row number and add the from and to property ids to edge list
for (k in 1 : (nrow(x)-1)) {
  # check to see if the property IDs in adjacent rows are the same or not.
  if (x[[k, 3]] != x[[k+1, 3]]) {
    # create a named vector inside each list element
    edge_list[[k]] = c(from = x[[k,3]], to = x[[k+1,3]]) # if they aren't the same, then add the propertyID to the list
  } else {
    edge_list[[k]] = c(from = NA, to = NA) # if they are the same, then put NA, these will be dropped later.
  }
}
And this is what it looks like:
> head(edge_list)
[[1]]
from to
251317243 165589924
[[2]]
from to
165589924 659308365
[[3]]
from to
NA NA
[[4]]
from to
659308365 1659425840
[[5]]
from to
NA NA
[[6]]
from to
1659425840 1834267060
I am trying to turn it into a tibble that will look like this so I can feed it into the network packages:
from to
251317243 165589924
165589924 659308365
NA NA
659308365 1659425840
NA NA
1659425840 1834267060
I've found these posts: https://stackoverflow.com/questions/45452015/how-to-convert-list-of-list-into-a-tibble-dataframe, https://stackoverflow.com/questions/71225412/convert-named-list-of-lists-to-dataframe-using-tidy-approach?noredirect=1&lq=1, https://stackoverflow.com/questions/69851215/named-list-to-data-frame, https://stackoverflow.com/questions/10432993/named-list-to-from-data-frame, https://stackoverflow.com/questions/74259987/how-to-keep-the-name-of-vector-as-a-column-in-the-tibble. They are not quite what I'm looking for: my list isn't named, but the vectors inside the list are, and those names are what I need for the column names.
I've tried:
test <- enframe(unlist(edge_list))
test <- test |>
pivot_wider(names_from = name, values_from = value)
and got
# A tibble: 1 × 2
from to
<list> <list>
1 <dbl [693]> <dbl [693]>
and warnings that there are duplicates, which I'm expecting.
I've tried versions of this:
tbl_colnames <- c("from", "to")
edge_df <- tibble(
from = numeric(),
to = numeric(),
.name_repair = tbl_colnames
)
I've also found posts using map(), but I don't really understand what they are doing. I also tried starting with a data frame instead of a list, but couldn't figure out how to make an empty tibble with named columns and 693 rows. (I was taught not to 'grow' vectors, so I assumed you shouldn't grow tibbles either.)
Is there a straightforward way of doing this?
Thanks.
Answer 1
Score: 3
Sample data:
edge_list <- list(c(from=1,to=2), c(from=3,to=4), c(from=NA, to=NA))
dplyr
library(dplyr)
bind_rows(edge_list)
# # A tibble: 3 × 2
# from to
# <dbl> <dbl>
# 1 1 2
# 2 3 4
# 3 NA NA
base R
# R-4 or newer
do.call(rbind.data.frame, edge_list) |>
setNames(names(edge_list[[1]]))
# from to
# 1 1 2
# 2 3 4
# 3 NA NA
or
do.call(rbind.data.frame, lapply(edge_list, as.list))
# from to
# 1 1 2
# 2 3 4
# 3 NA NA
(Base R's rbind.data.frame doesn't pick up the column names from a named vector, but it does from a named list.)
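The question mentions that the NA placeholder rows get dropped later. As a minimal sketch, assuming the same sample data as above and base R only, complete.cases() can do that once the list is a data frame (the name edge_df is just for illustration):

```r
# Sample data from the answer above: the last element is an NA placeholder
edge_list <- list(c(from = 1, to = 2), c(from = 3, to = 4), c(from = NA, to = NA))

# Convert each named vector to a named list so the column names carry over
edge_df <- do.call(rbind.data.frame, lapply(edge_list, as.list))

# Keep only rows where neither from nor to is NA
edge_df <- edge_df[complete.cases(edge_df), ]
edge_df
```

With the dplyr route, tidyr::drop_na() or dplyr::filter(!is.na(from)) after bind_rows() should achieve the same thing.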
Answer 2
Score: 2
This is a good use case for do.call:
In addition to @r2evans's solution:
library(magrittr) # provides %>% (also available after library(dplyr))
do.call(rbind, edge_list) %>%
  as.data.frame()
from to
1 251317243 165589924
2 165589924 659308365
3 NA NA
4 659308365 1659425840
5 NA NA
6 1659425840 1834267060
data:
edge_list <- list(
structure(list(from = 251317243, to = 165589924), class = "AsIs"),
structure(list(from = 165589924, to = 659308365), class = "AsIs"),
structure(list(from = NA, to = NA), class = "AsIs"),
structure(list(from = 659308365, to = 1659425840), class = "AsIs"),
structure(list(from = NA, to = NA), class = "AsIs"),
structure(list(from = 1659425840, to = 1834267060), class = "AsIs")
)
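If the intermediate list isn't needed, the loop in the question could probably be replaced by comparing the propertyID column with a shifted copy of itself. This is a rough sketch in base R, not something either answer proposes. It uses the six propertyID values shown in head(x), and the names ids, from, to, keep and edge_df are illustrative:

```r
# propertyID values from the first six rows of x in the question
ids <- c(251317243, 165589924, 659308365, 659308365, 1659425840, 1659425840)

# Pair each row's ID with the next row's ID
from <- ids[-length(ids)]
to   <- ids[-1]

# Keep only pairs where the property changes between adjacent rows
keep <- from != to
edge_df <- data.frame(from = from[keep], to = to[keep])
edge_df
```

With the real data, ids would be x[[3]] (or x$propertyID), and tibble::as_tibble(edge_df) should give a tibble if one is needed. This avoids both the loop and the NA placeholders, since rows with identical neighbours are never created.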