Unnesting lists inside a data frame and putting them in different columns in R

Question

I used the Twitter API to collect a large number of tweets, then built a data frame with the fields I want:

    library(dplyr)  # provides %>% and distinct()

    preprocess <- function(df) {
      df_tw <- do.call(rbind, lapply(df, function(m)
        data.frame(text = df$text,
                   lang = df$lang,
                   geo = df$geo,
                   date = df$created_at)))
      # Select unique rows based on the text column only
      df_u <- df_tw %>% distinct(text, .keep_all = TRUE)
      # Return the de-duplicated data frame rather than the untouched input
      return(df_u)
    }

However, the coordinates look like this: c(14.4865036, 35.85288308). How can I put them in different columns in the same df?


    > dput(head(df_mt))
    structure(list(text = c("A tiny little fish dish to round off the day. ",
    "Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija #venezuelanDj ",
    "Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #VenezuelanDj ",
    "Nature’s very own private pool, the blue hole in Gozo is a place that you can enjoy all year round. 📸: @chrissefarbi and @ch.farbmacher \n\n#Malta #VisitMalta #MoreToExplore ",
    "London’s first EV rapid charging hub opened by TfL and Engenie #Taxi #Chauffeur #Malta ",
    "Incredible to see this in Malta 🇲🇹🇵🇱@FlightPolish "
    ), lang = c("en", "en", "en", "en", "en", "en"), geo.place_id = c("0fc3ac0d6915e000",
    "1d834adff5d584df", "07d9d2902f483001", "1d834adff5d584df", "1d834adff5d584df",
    "0fc2ecc63cd4c000"), geo.coordinates = structure(list(type = c(NA,
    NA, NA, NA, "Point", NA), coordinates = list(NULL, NULL, NULL,
    NULL, c(14.4865036, 35.85288308), NULL)), row.names = c(NA,
    6L), class = "data.frame"), date = c("2022-12-30T20:00:29.000Z",
    "2022-12-30T17:21:44.000Z", "2022-12-30T17:16:15.000Z", "2022-12-30T15:54:39.000Z",
    "2022-12-30T14:57:34.000Z", "2022-12-30T14:32:18.000Z"), row.names = c("attachments.3",
    "attachments.4", "attachments.5", "attachments.6", "attachments.7",
    "attachments.8"), class = "data.frame")

Thank you.

Answer 1

Score: 1

With unnest_wider:

    library(tidyr)

    data.frame(df) |>
      unnest_wider(geo.coordinates.coordinates, names_sep = ".")

Output:

    ## A tibble: 6 × 9
    #  text lang geo.p…¹ geo.c…² geo.c…³ geo.c…⁴ date row.n…⁵ class
    #  <chr> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
    #1 "A tiny little fish dish to round off the day. " en 0fc3ac… NA NA NA 2022… attach… data…
    #2 "Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija… en 1d834a… NA NA NA 2022… attach… data…
    #3 "Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #Ven… en 07d9d2… NA NA NA 2022… attach… data…
    #4 "Nature’s very own private pool, the blue hole in Gozo is a pl… en 1d834a… NA NA NA 2022… attach… data…
    #5 "London’s first EV rapid charging hub opened by TfL and Engeni… en 1d834a… Point 14.5 35.9 2022… attach… data…
    #6 "Incredible to see this in Malta \U0001f1f2\U0001f1f9\U0001f1f… en 0fc2ec… NA NA NA 2022… attach… data…
    ## … with abbreviated variable names ¹geo.place_id, ²geo.coordinates.type, ³geo.coordinates.coordinates.1,
    ## ⁴geo.coordinates.coordinates.2, ⁵row.names
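
For anyone who wants to try the idea without the Twitter data, here is a minimal, self-contained sketch. The `tweets` data frame and the `lon`/`lat` column names are made up for illustration and are not part of the original data. It assumes each coordinate pair follows GeoJSON order (longitude first, then latitude), which the Malta values above are consistent with. The second half does the same split in base R, in case adding a tidyr dependency is not desirable.

```r
library(tidyr)

# Hypothetical toy data: a list column where most rows have no coordinates
tweets <- data.frame(text = c("a", "b", "c"))
tweets$coordinates <- list(NULL, c(14.4865036, 35.85288308), NULL)

# tidyr: spread each pair into coordinates_1 / coordinates_2 (NULL becomes NA)
wide <- unnest_wider(tweets, coordinates, names_sep = "_")

# Give the generated columns descriptive names (assumed order: lon, lat)
names(wide)[match(c("coordinates_1", "coordinates_2"), names(wide))] <-
  c("lon", "lat")

# Base R equivalent: pad missing pairs with NA, then pull out each component
pairs <- vapply(
  tweets$coordinates,
  function(xy) if (is.null(xy)) c(NA_real_, NA_real_) else xy,
  numeric(2)
)
tweets$lon <- pairs[1, ]
tweets$lat <- pairs[2, ]
```

Both versions leave rows without a location as `NA` rather than dropping them, so the result keeps one row per tweet.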

Posted by huangapple on February 14, 2023, at 22:13:20
When reposting, please keep the link to this article: https://go.coder-hub.com/75449079.html