Unlisting lists inside a data frame and putting them in different columns in R

Question

I used the Twitter API to collect a large number of tweets, then built a data frame containing only the fields I need:

library(dplyr)

preprocess <- function(df) {
  df_tw <- do.call(rbind, lapply(df, function(m)
    data.frame(text = df$text,
               lang = df$lang,
               geo = df$geo,
               date = df$created_at)))
  # Select unique rows based on the text column only
  df_u <- df_tw %>% distinct(text, .keep_all = TRUE)
  # Return the de-duplicated data frame
  return(df_u)
}

However, each pair of coordinates is stored as a single vector, like this: c(14.4865036, 35.85288308). How can I split them into separate columns of the same data frame?

> dput(head(df_mt))
structure(list(text = c("A tiny little fish dish to round off the day. ", 
"Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija #venezuelanDj ", 
"Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #VenezuelanDj ", 
"Nature’s very own private pool, the blue hole in Gozo is a place that you can enjoy all year round. 📸: @chrissefarbi and @ch.farbmacher \n\n#Malta #VisitMalta #MoreToExplore ", 
"London’s first EV rapid charging hub opened by TfL and Engenie  #Taxi #Chauffeur #Malta ", 
"Incredible to see this in Malta 🇲🇹🇵🇱@FlightPolish "
), lang = c("en", "en", "en", "en", "en", "en"), geo.place_id = c("0fc3ac0d6915e000", 
"1d834adff5d584df", "07d9d2902f483001", "1d834adff5d584df", "1d834adff5d584df", 
"0fc2ecc63cd4c000"), geo.coordinates = structure(list(type = c(NA, 
NA, NA, NA, "Point", NA), coordinates = list(NULL, NULL, NULL, 
    NULL, c(14.4865036, 35.85288308), NULL)), row.names = c(NA, 
6L), class = "data.frame"), date = c("2022-12-30T20:00:29.000Z", 
"2022-12-30T17:21:44.000Z", "2022-12-30T17:16:15.000Z", "2022-12-30T15:54:39.000Z", 
"2022-12-30T14:57:34.000Z", "2022-12-30T14:32:18.000Z"), row.names = c("attachments.3", 
"attachments.4", "attachments.5", "attachments.6", "attachments.7", 
"attachments.8"), class = "data.frame")

Thank you.

Answer 1

Score: 1

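Use unnest_wider() from tidyr. Calling data.frame() first flattens the nested geo.coordinates data frame into the columns geo.coordinates.type and geo.coordinates.coordinates: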

library(tidyr)
data.frame(df) |>
  unnest_wider(geo.coordinates.coordinates, names_sep = ".")

Output:

## A tibble: 6 × 9
#  text                                                            lang  geo.p…¹ geo.c…² geo.c…³ geo.c…⁴ date  row.n…⁵ class
#  <chr>                                                           <chr> <chr>   <chr>     <dbl>   <dbl> <chr> <chr>   <chr>
#1 "A tiny little fish dish to round off the day. "                en    0fc3ac… NA         NA      NA   2022… attach… data…
#2 "Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija… en    1d834a… NA         NA      NA   2022… attach… data…
#3 "Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #Ven… en    07d9d2… NA         NA      NA   2022… attach… data…
#4 "Nature’s very own private pool, the blue hole in Gozo is a pl… en    1d834a… NA         NA      NA   2022… attach… data…
#5 "London’s first EV rapid charging hub opened by TfL and Engeni… en    1d834a… Point      14.5    35.9 2022… attach… data…
#6 "Incredible to see this in Malta \U0001f1f2\U0001f1f9\U0001f1f… en    0fc2ec… NA         NA      NA   2022… attach… data…
## … with abbreviated variable names ¹geo.place_id, ²geo.coordinates.type, ³geo.coordinates.coordinates.1,
##   ⁴geo.coordinates.coordinates.2, ⁵row.names
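
For anyone who wants to try the idea on something smaller, here is a minimal sketch on made-up data. The object and column names (tweets, id, geo, type, coordinates) are assumptions chosen to mimic the structure in the question, not the asker's actual data. It uses tidyr's unpack() to flatten the nested data-frame column, which is one possible alternative to the data.frame() call above, and then unnest_wider() to spread the coordinate vectors into two columns.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: `geo` is a data-frame column whose `coordinates`
# field is a list that is NULL for most rows and a length-2 vector for one.
tweets <- tibble(
  id  = 1:3,
  geo = tibble(
    type        = c(NA, "Point", NA),
    coordinates = list(NULL, c(14.4865036, 35.85288308), NULL)
  )
)

wide <- tweets |>
  # Flatten the nested data frame into geo.type and geo.coordinates
  unpack(geo, names_sep = ".") |>
  # Spread each vector into geo.coordinates.1 and geo.coordinates.2;
  # names_sep is needed because the vectors are unnamed, and NULL rows become NA
  unnest_wider(geo.coordinates, names_sep = ".")

print(wide)
```

The new columns are numbered by position within each vector, so they can be renamed afterwards to whatever the two values represent in your data.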

huangapple
  • Published on February 14, 2023, 22:13:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75449079.html