Unlisting lists inside a data frame and putting them in different columns in R
Question
I used the Twitter API to get lots of tweets. What I did was to create a df with the data I want:
library(dplyr)

preprocess <- function(df) {
  # Build one data frame per element of df, then stack them row-wise
  df_tw <- do.call(rbind, lapply(df, function(m)
    data.frame(text = m$text,
               lang = m$lang,
               geo = m$geo,
               date = m$created_at)))
  # Select unique rows based on the text column only
  df_u <- df_tw %>% distinct(text, .keep_all = TRUE)
  return(df_u)
}
However, the coordinates look like this: c(14.4865036, 35.85288308). How can I put them in different columns in the same df?
> dput(head(df_mt))
structure(list(text = c("A tiny little fish dish to round off the day. ",
"Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija #venezuelanDj ",
"Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #VenezuelanDj ",
"Nature’s very own private pool, the blue hole in Gozo is a place that you can enjoy all year round. 📸: @chrissefarbi and @ch.farbmacher \n\n#Malta #VisitMalta #MoreToExplore ",
"London’s first EV rapid charging hub opened by TfL and Engenie #Taxi #Chauffeur #Malta ",
"Incredible to see this in Malta 🇲🇹🇵🇱@FlightPolish "
), lang = c("en", "en", "en", "en", "en", "en"), geo.place_id = c("0fc3ac0d6915e000",
"1d834adff5d584df", "07d9d2902f483001", "1d834adff5d584df", "1d834adff5d584df",
"0fc2ecc63cd4c000"), geo.coordinates = structure(list(type = c(NA,
NA, NA, NA, "Point", NA), coordinates = list(NULL, NULL, NULL,
NULL, c(14.4865036, 35.85288308), NULL)), row.names = c(NA,
6L), class = "data.frame"), date = c("2022-12-30T20:00:29.000Z",
"2022-12-30T17:21:44.000Z", "2022-12-30T17:16:15.000Z", "2022-12-30T15:54:39.000Z",
"2022-12-30T14:57:34.000Z", "2022-12-30T14:32:18.000Z"), row.names = c("attachments.3",
"attachments.4", "attachments.5", "attachments.6", "attachments.7",
"attachments.8"), class = "data.frame")
Thank you.
Answer 1
Score: 1
With unnest_wider:
library(tidyr)
data.frame(df) |>
unnest_wider(geo.coordinates.coordinates, names_sep = ".")
Output:
## A tibble: 6 × 9
# text lang geo.p…¹ geo.c…² geo.c…³ geo.c…⁴ date row.n…⁵ class
# <chr> <chr> #<chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
#1 "A tiny little fish dish to round off the day. " en 0fc3ac… NA NA NA 2022… attach… data…
#2 "Sharing Music #dj #Malta #house #housemusic #pioneer #xemxija… en 1d834a… NA NA NA 2022… attach… data…
#3 "Dj Abraham Sound en Malta #dj #pioneer #paceville #Malta #Ven… en 07d9d2… NA NA NA 2022… attach… data…
#4 "Nature’s very own private pool, the blue hole in Gozo is a pl… en 1d834a… NA NA NA 2022… attach… data…
#5 "London’s first EV rapid charging hub opened by TfL and Engeni… en 1d834a… Point 14.5 35.9 2022… attach… data…
#6 "Incredible to see this in Malta \U0001f1f2\U0001f1f9\U0001f1f… en 0fc2ec… NA NA NA 2022… attach… data…
## … with abbreviated variable names ¹geo.place_id, ²geo.coordinates.type, ³geo.coordinates.coordinates.1,
## ⁴geo.coordinates.coordinates.2, ⁵row.names
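For comparison, here is a minimal base R sketch of the same idea that does not rely on tidyr. It uses a small made-up data frame (`tweets`, with a list column named `coordinates`) rather than the real `df_mt`, where the list sits one level deeper at `df_mt$geo.coordinates$coordinates`. The helper `pad_pair` and the column names `lon` and `lat` are illustrative, and the longitude-first order is an assumption based on the sample value.

```r
# Toy data mimicking the question's structure; the tweet texts are made up.
tweets <- data.frame(text = c("first tweet", "second tweet", "third tweet"),
                     stringsAsFactors = FALSE)
tweets$coordinates <- list(NULL, c(14.4865036, 35.85288308), NULL)

# Turn each list element into a length-2 numeric vector, using NA for NULL.
pad_pair <- function(x) if (is.null(x)) c(NA_real_, NA_real_) else as.numeric(x)

# vapply returns a 2 x n matrix; transpose it to get one row per tweet.
coord_matrix <- t(vapply(tweets$coordinates, pad_pair, numeric(2)))

# Assumed order: longitude first, then latitude.
tweets$lon <- coord_matrix[, 1]
tweets$lat <- coord_matrix[, 2]
tweets$coordinates <- NULL  # drop the original list column

print(tweets)
```

Rows without coordinates end up with NA in both new columns, which matches what unnest_wider produces for the NULL entries.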