how to create a tibble from a list with named vectors

Question

I am trying to make an edge df from a dataset that I have for a network I'm building:

    > head(x)
    # A tibble: 6 × 8
         visitID entry_date          propertyID prop_grp_ID entry_type edgeID
           <dbl> <dttm>                   <dbl>       <dbl> <fct>      <chr>
    1 1157349647 2022-04-01 00:00:00  251317243  1915318832 X14        231224220
    2 1215457284 2022-04-01 00:00:00  165589924  1915318832 X14        231224220
    3  986527720 2022-04-01 00:00:00  659308365  1915318832 X14        231224220
    4  986527720 2022-04-01 00:00:00  659308365  1915318832 X14        231224220
    5 1106040433 2022-04-01 00:00:00 1659425840  1915318832 X14        231224220
    6 1106040433 2022-04-01 00:00:00 1659425840  1915318832 X14        231224220

    # create an empty list to add to
    edge_list <- vector("list", nrow(x)-1)
    # loop through each row number and add the from and to property ids to edge list
    for (k in 1 : (nrow(x)-1)) {
      # check to see if the property IDs in adjacent rows are the same or not.
      if (x[[k, 3]] != x[[k+1, 3]]) {
        # create a named vector inside each list element
        edge_list[[k]] = c(from = x[[k,3]], to = x[[k+1,3]]) # if they aren't the same, then add the propertyID to the list
      } else {
        edge_list[[k]] = c(from = NA, to = NA) # if they are the same, then put NA, these will be dropped later.
      }
    }

And this is what it looks like:

    > head(edge_list)
    [[1]]
         from        to
    251317243 165589924

    [[2]]
         from        to
    165589924 659308365

    [[3]]
    from   to
      NA   NA

    [[4]]
          from         to
     659308365 1659425840

    [[5]]
    from   to
      NA   NA

    [[6]]
          from         to
    1659425840 1834267060

I am trying to turn it into a tibble that will look like this so I can feed it into the network packages:

          from         to
     251317243  165589924
     165589924  659308365
            NA         NA
     659308365 1659425840
            NA         NA
    1659425840 1834267060

I've found these posts: https://stackoverflow.com/questions/45452015/how-to-convert-list-of-list-into-a-tibble-dataframe, https://stackoverflow.com/questions/71225412/convert-named-list-of-lists-to-dataframe-using-tidy-approach?noredirect=1&lq=1, https://stackoverflow.com/questions/69851215/named-list-to-data-frame, https://stackoverflow.com/questions/10432993/named-list-to-from-data-frame, https://stackoverflow.com/questions/74259987/how-to-keep-the-name-of-vector-as-a-column-in-the-tibble. They are not quite what I'm looking for: my list isn't named, but the vectors inside the list are, and those names are what I need for the column names.

I've tried:

    test <- enframe(unlist(edge_list))
    test <- test |>
      pivot_wider(names_from = name, values_from = value)

and got

    # A tibble: 1 × 2
      from        to
      <list>      <list>
    1 <dbl [693]> <dbl [693]>

and warnings that there are duplicates, which I'm expecting.
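The list columns appear because every value shares the name `from` or `to`, so `pivot_wider()` has nothing that tells one edge apart from the next. A minimal sketch of one way around that, assuming the small sample list below stands in for the real `edge_list` and using a helper column named `edge` (a name chosen here for illustration): add an explicit edge index before pivoting.

```r
library(tibble)
library(tidyr)

# Small stand-in for the real edge_list (assumption: same shape as in the question).
edge_list <- list(c(from = 1, to = 2), c(from = 3, to = 4), c(from = NA, to = NA))

# Long format: one row per value, with the inner vector names in `name`.
long <- enframe(unlist(edge_list))

# Hypothetical helper column: which list element each value came from.
long$edge <- rep(seq_along(edge_list), times = lengths(edge_list))

# With a unique id per edge, each (edge, name) pair identifies exactly one value.
wide <- pivot_wider(long, id_cols = edge, names_from = name, values_from = value)
wide
```

The `edge` column can be dropped afterwards if only `from` and `to` are needed.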

I've tried versions of this:

    tbl_colnames <- c("from", "to")
    edge_df <- tibble(
      from = numeric(),
      to = numeric(),
      .name_repair = tbl_colnames
    )

I've also found posts using map(), but I don't really understand what they are doing. I've also tried starting with a data frame instead of a list, but couldn't figure out how to make an empty tibble with named columns and 693 rows. (I was taught not to 'grow' vectors, so I assumed you shouldn't grow tibbles either.)
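For reference, a pre-sized tibble with named columns can be built by giving each column a vector of `NA` values of the desired length. The sketch below also fills it without a loop by comparing the ID vector with a shifted copy of itself. The vector `property_id` is a made-up stand-in for column 3 of `x`, and the sketch assumes that column contains no missing values.

```r
library(tibble)

# Hypothetical stand-in for x[[3]] (propertyID); assumed to contain no NA.
property_id <- c(251317243, 165589924, 659308365, 659308365, 1659425840, 1659425840)

# Pre-sized tibble: one row per pair of adjacent rows, all NA to start with.
n_edges <- length(property_id) - 1
edge_df <- tibble(from = rep(NA_real_, n_edges), to = rep(NA_real_, n_edges))

# Vectorized comparison of each ID with the one in the next row.
from <- property_id[-length(property_id)]  # rows 1 .. n-1
to   <- property_id[-1]                    # rows 2 .. n
changed <- from != to

# Fill only the rows where the property ID changes; the rest stay NA.
edge_df$from[changed] <- from[changed]
edge_df$to[changed]   <- to[changed]
edge_df
```

Because the columns are allocated once at full length, nothing is grown inside a loop.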

Is there a straightforward way of doing this?
Thanks.

Answer 1

Score: 3

Sample data:

    edge_list <- list(c(from=1,to=2), c(from=3,to=4), c(from=NA, to=NA))

dplyr

    library(dplyr)
    bind_rows(edge_list)
    # # A tibble: 3 × 2
    #    from    to
    #   <dbl> <dbl>
    # 1     1     2
    # 2     3     4
    # 3    NA    NA
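Since the question mentions that the `NA` placeholder rows will be dropped later, that step can be chained onto `bind_rows()`. A minimal sketch, assuming the same sample data as above and R 4.1 or newer for the native `|>` pipe:

```r
library(dplyr)

edge_list <- list(c(from = 1, to = 2), c(from = 3, to = 4), c(from = NA, to = NA))

# Each named vector becomes one row; then keep only the complete edges.
edges <- bind_rows(edge_list) |>
  filter(!is.na(from), !is.na(to))
edges
```

The result keeps the `from` and `to` column names taken from the inner vectors.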

base R

    # R-4 or newer
    do.call(rbind.data.frame, edge_list) |>
      setNames(names(edge_list[[1]]))
    #   from to
    # 1    1  2
    # 2    3  4
    # 3   NA NA

or

    do.call(rbind.data.frame, lapply(edge_list, as.list))
    #   from to
    # 1    1  2
    # 2    3  4
    # 3   NA NA

(Base R's `rbind.data.frame` isn't getting the column names from a named vector, but a named list works.)
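The two base R routes can be compared side by side. This is only a sketch on the same sample data: it prints the column names `rbind.data.frame` produces when given the raw vectors (without asserting what they will be), then checks that both fixes end up with `from` and `to`.

```r
edge_list <- list(c(from = 1, to = 2), c(from = 3, to = 4), c(from = NA, to = NA))

# Raw named vectors: inspect whatever column names come back.
raw <- do.call(rbind.data.frame, edge_list)
print(names(raw))

# Fix 1: set the names afterwards from the first vector.
via_setnames <- setNames(raw, names(edge_list[[1]]))

# Fix 2: convert each vector to a named list before binding.
via_lists <- do.call(rbind.data.frame, lapply(edge_list, as.list))

identical(names(via_setnames), names(via_lists))
```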

Answer 2

Score: 2

This is a good use case for do.call, in addition to @r2evans's solution:

    library(magrittr)  # provides the %>% pipe

    do.call(rbind, edge_list) %>%
      as.data.frame()
    #         from         to
    # 1  251317243  165589924
    # 2  165589924  659308365
    # 3         NA         NA
    # 4  659308365 1659425840
    # 5         NA         NA
    # 6 1659425840 1834267060

data:

    edge_list <- list(
      structure(list(from = 251317243, to = 165589924), class = "AsIs"),
      structure(list(from = 165589924, to = 659308365), class = "AsIs"),
      structure(list(from = NA, to = NA), class = "AsIs"),
      structure(list(from = 659308365, to = 1659425840), class = "AsIs"),
      structure(list(from = NA, to = NA), class = "AsIs"),
      structure(list(from = 1659425840, to = 1834267060), class = "AsIs")
    )
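The same `do.call(rbind, ...)` idea also applies to the list of named numeric vectors from the question, not only to the list-based data above. A sketch under that assumption: `rbind` on named vectors returns a numeric matrix whose column names come from the vector names, and `tibble::as_tibble()` then turns it into a tibble with ordinary double columns.

```r
library(tibble)

# First few elements of the question's edge_list, as named numeric vectors.
edge_list <- list(
  c(from = 251317243, to = 165589924),
  c(from = 165589924, to = 659308365),
  c(from = NA, to = NA)
)

# Stack the vectors as rows; the column names are taken from the vector names.
edge_mat <- do.call(rbind, edge_list)

# Convert the matrix to a tibble, keeping the column names.
edge_tbl <- as_tibble(edge_mat)
edge_tbl
```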

huangapple
  • Published on June 5, 2023, 02:09:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76401804.html