map over a nested list in mutate and extract specific list elements

Question

I have some data which looks like:

       price unleveragedData
       <dbl> <list>
    1 450000 <list [5]>
    2 400000 <list [5]>
    3 400000 <list [5]>
    4 397000 <list [5]>
    5 750000 <list [5]>
    6 550000 <list [5]>

I am trying to put element `5` of each list into a new column, i.e.:

       price unleveragedData element5
       <dbl> <list>
    1 450000 <list [5]>      -----
    2 400000 <list [5]>      -----
    3 400000 <list [5]>
    4 397000 <list [5]>
    5 750000 <list [5]>
    6 550000 <list [5]>      -----

Using the following:

    df$unleveragedData %>%
      map(., ~ pluck(., c(5)))

I can get the output I want:

    [[1]]
    [1] 260551.4
    [[2]]
    [1] 330786.9
    [[3]]
    [1] 330786.9
    [[4]]
    [1] 287739.3
    [[5]]
    [1] 566416
    [[6]]
    [1] 271879.7

However, inside the `mutate` function I can't get it to work.

    df %>%
      mutate(
        element5 = map(unleveragedData, ~ map_dbl(., pluck(., c(5))))
      )

Data:

    df = structure(list(price = c(450000, 400000, 400000, 397000, 750000,
    550000), unleveragedData = list(list(-0.0547083151944441, c(-450000,
    15533.28, 16475.2128, 17473.760928, 18532.32229728, 280205.849576444
    ), "450000-0.08", structure(list(` ` = c("Revenue", "Vacancy",
    "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(16560, 828, 15732, 199, 15533), Year2 = c(17554,
    878, 16676, 201, 16475), Year3 = c(18607, 930, 17676, 203, 17474
    ), Year4 = c(19723, 986, 18737, 205, 18532), Year5 = c(20907,
    1045, 19861, 207, 19654), Year6 = c(22161, 1108, 21053, 209,
    20844), purchasePriceCapRate = c("450000-0.08", "450000-0.08",
    "450000-0.08", "450000-0.08", "450000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 260551.350870592), list(0.0224165566243759,
    c(-400000, 19720.512, 20916.35712, 22184.0790912, 23527.991786112,
    355739.600331834), "400000-0.08", structure(list(` ` = c("Revenue",
    "Vacancy", "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(21024, 1051, 19973, 252, 19721), Year2 = c(22285,
    1114, 21171, 255, 20916), Year3 = c(23623, 1181, 22441, 257,
    22184), Year4 = c(25040, 1252, 23788, 260, 23528), Year5 = c(26542,
    1327, 25215, 263, 24953), Year6 = c(28135, 1407, 26728, 265,
    26463), purchasePriceCapRate = c("400000-0.08", "400000-0.08",
    "400000-0.08", "400000-0.08", "400000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 330786.932409621), list(0.0224165566243759,
    c(-400000, 19720.512, 20916.35712, 22184.0790912, 23527.991786112,
    355739.600331834), "400000-0.08", structure(list(` ` = c("Revenue",
    "Vacancy", "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(21024, 1051, 19973, 252, 19721), Year2 = c(22285,
    1114, 21171, 255, 20916), Year3 = c(23623, 1181, 22441, 257,
    22184), Year4 = c(25040, 1252, 23788, 260, 23528), Year5 = c(26542,
    1327, 25215, 263, 24953), Year6 = c(28135, 1407, 26728, 265,
    26463), purchasePriceCapRate = c("400000-0.08", "400000-0.08",
    "400000-0.08", "400000-0.08", "400000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 330786.932409621), list(-0.00700507916565851,
    c(-397000, 17154.144, 18194.36544, 19297.1098944, 20466.129841344,
    309444.720836595), "397000-0.08", structure(list(` ` = c("Revenue",
    "Vacancy", "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(18288, 914, 17374, 219, 17154), Year2 = c(19385,
    969, 18416, 222, 18194), Year3 = c(20548, 1027, 19521, 224,
    19297), Year4 = c(21781, 1089, 20692, 226, 20466), Year5 = c(23088,
    1154, 21934, 228, 21705), Year6 = c(24473, 1224, 23250, 231,
    23019), purchasePriceCapRate = c("397000-0.08", "397000-0.08",
    "397000-0.08", "397000-0.08", "397000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 287739.317917958), list(0.00205549716813258,
    c(-750000, 33768, 35815.68, 37986.4368, 40287.657168, 609143.15125314
    ), "750000-0.08", structure(list(` ` = c("Revenue", "Vacancy",
    "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(36000, 1800, 34200, 432, 33768), Year2 = c(38160,
    1908, 36252, 436, 35816), Year3 = c(40450, 2022, 38427, 441,
    37986), Year4 = c(42877, 2144, 40733, 445, 40288), Year5 = c(45449,
    2272, 43177, 450, 42727), Year6 = c(48176, 2409, 45767, 454,
    45313), purchasePriceCapRate = c("750000-0.08", "750000-0.08",
    "750000-0.08", "750000-0.08", "750000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 566415.98015346), list(-0.0866171399087425,
    c(-550000, 16208.64, 17191.5264, 18233.489664, 19338.07544064,
    292388.712601507), "550000-0.08", structure(list(` ` = c("Revenue",
    "Vacancy", "Gross Revenue", "Operating Expenses", "Net Operating Income"
    ), Year1 = c(17280, 864, 16416, 207, 16209), Year2 = c(18317,
    916, 17401, 209, 17192), Year3 = c(19416, 971, 18445, 212,
    18233), Year4 = c(20581, 1029, 19552, 214, 19338), Year5 = c(21816,
    1091, 20725, 216, 20509), Year6 = c(23125, 1156, 21968, 218,
    21750), purchasePriceCapRate = c("550000-0.08", "550000-0.08",
    "550000-0.08", "550000-0.08", "550000-0.08")), row.names = c(NA,
    -5L), class = "data.frame"), 271879.670473661))), class = c("rowwise_df",
    "tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
    .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
    "vctrs_vctr", "list"))), row.names = c(NA, -6L), class = c("tbl_df",
    "tbl", "data.frame")))
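The `structure()` call for the data lists `"rowwise_df"` among its classes, so it may be worth checking how a rowwise tibble exposes a list column inside `mutate`. The sketch below uses a small hypothetical stand-in (`toy`, with a made-up `nested` column) rather than the real `df`; the names and values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for df: a rowwise tibble with a list column
toy <- tibble(
  price = c(100, 200),
  nested = list(list(1, "a", 3.5), list(2, "b", 7.5))
) %>%
  rowwise()

# A rowwise data frame carries the "rowwise_df" class
is_rowwise <- inherits(toy, "rowwise_df")

# Number of elements in each row's inner list (via `$`, outside mutate)
inner_lengths <- lengths(toy$nested)

# Inside mutate() on rowwise data, the bare column name refers to
# one row's inner list, so length() reports 3 here, not the row count
seen_length <- toy %>%
  mutate(n = length(nested)) %>%
  pull(n)
```

If `is_rowwise` is `TRUE`, an outer `map()` over the column name inside `mutate` iterates over the elements of a single row's inner list rather than over the rows.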
# Answer 1
**Score**: 4

The data frame has a `rowwise` grouping, which should be removed with `ungroup()` if we want to use `map`:

    library(dplyr)
    library(purrr)
    df %>%
      ungroup() %>%
      mutate(
        element5 = map_dbl(unleveragedData, ~ nth(.x, 5))
      )

Output:

    # A tibble: 6 × 3
       price unleveragedData element5
       <dbl> <list>             <dbl>
    1 450000 <list [5]>       260551.
    2 400000 <list [5]>       330787.
    3 400000 <list [5]>       330787.
    4 397000 <list [5]>       287739.
    5 750000 <list [5]>       566416.
    6 550000 <list [5]>       271880.

---

Also, since the data is `rowwise`, we can extract the element directly:

    df %>%
      mutate(element5 = nth(unleveragedData, 5)) %>%
      ungroup()

Output:

    # A tibble: 6 × 3
       price unleveragedData element5
       <dbl> <list>             <dbl>
    1 450000 <list [5]>       260551.
    2 400000 <list [5]>       330787.
    3 400000 <list [5]>       330787.
    4 397000 <list [5]>       287739.
    5 750000 <list [5]>       566416.
    6 550000 <list [5]>       271880.

Or with `pluck`:

    df %>%
      mutate(element5 = pluck(unleveragedData, 5)) %>%
      ungroup()

Output:

    # A tibble: 6 × 3
       price unleveragedData element5
       <dbl> <list>             <dbl>
    1 450000 <list [5]>       260551.
    2 400000 <list [5]>       330787.
    3 400000 <list [5]>       330787.
    4 397000 <list [5]>       287739.
    5 750000 <list [5]>       566416.
    6 550000 <list [5]>       271880.

---

It is also possible to use `map` on the `rowwise` data if we wrap the column in `pick`:

    df %>%
      mutate(element5 = map_dbl(pick(unleveragedData), pluck, 5))

Output:

    # A tibble: 6 × 3
    # Rowwise:
       price unleveragedData element5
       <dbl> <list>             <dbl>
    1 450000 <list [5]>       260551.
    2 400000 <list [5]>       330787.
    3 400000 <list [5]>       330787.
    4 397000 <list [5]>       287739.
    5 750000 <list [5]>       566416.
    6 550000 <list [5]>       271880.
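The first two approaches in this answer can be tried without the full data set. The following is a minimal sketch on hypothetical toy data (`toy`, `nested`, and the values are made up for illustration, not taken from the question); it assumes each inner list has five elements with a single number in the fifth position, as in the question.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: each row holds a 5-element list, the last one numeric
toy <- tibble(
  price = c(100, 200, 300),
  nested = list(
    list(0.1, c(1, 2), "a", data.frame(x = 1), 11.5),
    list(0.2, c(3, 4), "b", data.frame(x = 2), 22.5),
    list(0.3, c(5, 6), "c", data.frame(x = 3), 33.5)
  )
) %>%
  rowwise()

# Option 1: drop the rowwise grouping, then map over the list column
res_map <- toy %>%
  ungroup() %>%
  mutate(element5 = map_dbl(nested, ~ pluck(.x, 5)))

# Option 2: keep rowwise; `nested` is then one row's inner list,
# so it can be indexed directly
res_row <- toy %>%
  mutate(element5 = nested[[5]]) %>%
  ungroup()
```

Both results should carry the same `element5` column. Option 1 makes the iteration explicit, while option 2 relies on the rowwise behaviour of handing `mutate` one row's list at a time.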

Posted by huangapple on 2023-02-27 09:12:02
Source: https://go.coder-hub.com/75576055.html