Converting JSON Lists into Data Frames

Question

I extracted the JSON from the following page:

library(jsonlite)
results <-  fromJSON("https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json")
final = results$data

When I inspect the output, I can see that even though it is in "list" format, there appears to be a tabular, data-frame-like structure within it:

t3, NA, gardening, , FALSE, NA, 0, FALSE, Tree surgeon butchered my tree - will it be ok?, r/gardening, FALSE, 6, NA, 0, 140, NA, all_ads, FALSE, t3_1196op

My question: based on the above, is it possible to convert this output into a data frame?

I tried the following code:

dataframe_list = as.data.frame(final)

The code ran, but the output is still not in a tabular (data frame) format.

In the end, I would like to have the result in the following format:

  comment_id                      comment_text
1          1                 I like gardening!
2          2            I dont like to garden!
3          3             its too cold outside?
4          4 try planting something different?
5          5                    garden is fun!

Can someone please show me how to do this?

Thanks!

Note: If you look at the actual JSON at https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json, the desired text appears to sit between the "body" and "edited" fields.

Maybe I am approaching this problem the wrong way and there is a better way of doing this?
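
A note on why as.data.frame() seems to change nothing here: fromJSON() already builds data frames where it can, but nested JSON objects end up as data-frame columns inside the data frame, so the result does not print as a flat table. Below is a minimal sketch on a made-up JSON string. The `json` value and its two comments are assumptions that only imitate the kind/data nesting of Reddit's response; they are not real data.

```r
library(jsonlite)

# Hypothetical two-comment payload imitating Reddit's kind/data nesting
json <- '[
  {"kind": "t1", "data": {"id": "a1", "body": "I like gardening!"}},
  {"kind": "t1", "data": {"id": "b2", "body": "garden is fun!"}}
]'

nested <- fromJSON(json)

# `data` is itself a data frame stored as a single column of `nested`
is.data.frame(nested$data)

# jsonlite::flatten() spreads nested data-frame columns into ordinary columns
# (called with its namespace because purrr also exports a flatten())
flat <- jsonlite::flatten(nested)
names(flat)

# Build the two-column layout asked for in the question
comments <- data.frame(
  comment_id   = seq_len(nrow(flat)),
  comment_text = flat$data.body
)
print(comments)
```

The real response is nested several levels deeper than this, so flattening alone is unlikely to be enough there; the sketch only illustrates where the "hidden" table comes from.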

Answer 1

Score: 2

Here is one approach using pluck(), bind_rows() and unnest():

library(jsonlite)
library(purrr)
library(dplyr)
library(tidyr)

URL <- "https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json"

fromJSON(URL) |>
  pluck("data", "children") |> # .$data$children
  bind_rows() |>               # stack the post and comment listings
  filter(row_number() > 1) |>  # drop the first row (the post itself)
  unnest(data) |>              # expand the nested `data` column
  select(id, author, body) |>
  mutate(comment_id = row_number(), .before = "id")

Output:

# A tibble: 75 × 4
   comment_id id      author         body
        <int> <chr>   <chr>          <chr>
 1          1 j9ktvi3 mikpgod        "It'll grow back, probably won't be able to tell by summer. Except it'll be smaller"
 2          2 j9l0egd hrudnick       "Saw a tree surgeons advert today. Said, \"Don't worry, I hug them first.\""
 3          3 j9kyb1v anonnewengland "It will be covered in new growth in a few months."
 4          4 j9kqqqk Beatnikdan     "He must've been a civil war surgeon. \n\nThey should survive but get a different tree guy to cl…
 5          5 j9n0kp8 Live-Steaky    "Very few people in there comment section actually know what’s up. It’s a fine pruning job, extr…
 6          6 j9l2gxf Luke_low       "Speaking of Tree Butchery, My parents have hired an \"amateur landscaper guy\" a bunch of times…
 7          7 j9npnl1 tomt6371       "In all honesty it looks good,and definitely could have been pollarded further, it's the right s…
 8          8 j9kpkux Amezrou        "Had a tree surgeon round today to take the height off my Hazel and Plum trees and he’s absolute…
 9          9 j9kxjyz testhec10ck    "Those cut angles all look good. This seems pretty standard for an early spring pruning"
10         10 j9laq63 MarieTC        "Lots of new growth will come and the tree will be fuller"
# … with 65 more rows
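
One possible variation on the pipeline above, shown as a sketch rather than a tested replacement: instead of dropping the first row by position, keep rows whose `kind` is "t1", the type prefix Reddit uses for comments ("t3" marks the post). The `json` string below is a made-up payload that only imitates the two listings the endpoint returns, so that the steps can be run without a network call.

```r
library(jsonlite)
library(purrr)
library(dplyr)
library(tidyr)

# Hypothetical payload: first listing holds the post, second the comments
json <- '[
  {"kind": "Listing", "data": {"children": [
    {"kind": "t3", "data": {"id": "p1", "author": "op", "title": "A post"}}
  ]}},
  {"kind": "Listing", "data": {"children": [
    {"kind": "t1", "data": {"id": "c1", "author": "u1", "body": "first"}},
    {"kind": "t1", "data": {"id": "c2", "author": "u2", "body": "second"}}
  ]}}
]'

comments <- fromJSON(json) |>
  pluck("data", "children") |>   # list of two data frames
  bind_rows() |>                 # one row per post or comment
  filter(kind == "t1") |>        # keep comments by kind, not by position
  unnest(data) |>                # expand the nested `data` column
  select(id, author, body) |>
  mutate(comment_id = row_number(), .before = "id")

print(comments)
```

Filtering on `kind` does not depend on the post being the first row, which may make the pipeline a little more robust if the response layout differs.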

Answer 2

Score: 1

For parsing Reddit JSON, you may want to check out the RedditExtractoR package. Its get_thread_content() function returns a list of two data frames, one for the thread and one for the comments:

library(dplyr)
thread <- RedditExtractoR::get_thread_content("https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/")

thread$threads %>%
  select(author, title, text) %>%
  as_tibble()
#> # A tibble: 1 × 3
#>   author  title                                           text 
#>   <chr>   <chr>                                           <chr>
#> 1 Amezrou Tree surgeon butchered my tree - will it be ok? ""

thread$comments %>%
  select(comment_id, author, comment) %>%
  as_tibble()
#> # A tibble: 176 × 3
#>    comment_id  author           comment                                         
#>    <chr>       <chr>            <chr>                                           
#>  1 1           mikpgod          "It'll grow back, probably won't be able to tel…
#>  2 1_1         Amezrou          "I really hope so&"                             
#>  3 1_1_1       mikpgod          "Hazel's difficult to kill."                    
#>  4 1_1_1_1     symetry_myass    "&gt; Hazel's difficult to kill.\n\nI see an Um…
#>  5 1_1_1_2     Amezrou          "Yeah but what\u0019s it going to look like whe…
#>  6 1_1_1_2_1   EpidonoTheFool   "Not very good in my opinion a lot of weak grow…
#>  7 1_1_1_2_2   Cold-Pack-7653   "It will eventually look normal but its going t…
#>  8 1_1_1_2_3   lethal_moustache "Look for images of coppiced trees. Yours will …
#>  9 1_1_1_2_3_1 LeGrandePoobah   "This is an interesting article. I\u0019m not s…
#> 10 1_1_1_2_3_2 treecarefanatic  "this is pollarding not coppicing"              
#> # … with 166 more rows

Created on 2023-02-23 with reprex v2.0.2
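
The output above has 176 rows because it includes nested replies, whereas a pipeline that reads only the top-level children returns 75. If you prefer to stay with jsonlite, replies can be collected with a small recursive walk. The following is a sketch under stated assumptions, not how RedditExtractoR does it: the helper `collect_comments()` and the `json` string are hypothetical, and the code assumes that each comment's `replies` field is either an empty string or another listing with its own `data$children`.

```r
library(jsonlite)

# Hypothetical listing: one comment with a reply, one without
json <- '{
  "kind": "Listing",
  "data": {"children": [
    {"kind": "t1", "data": {"id": "c1", "body": "top-level comment",
      "replies": {"kind": "Listing", "data": {"children": [
        {"kind": "t1", "data": {"id": "c2", "body": "a reply", "replies": ""}}
      ]}}}},
    {"kind": "t1", "data": {"id": "c3", "body": "another top-level comment",
      "replies": ""}}
  ]}
}'

# Keep plain nested lists so the tree can be walked element by element
listing <- fromJSON(json, simplifyVector = FALSE)

# Walk a listing depth-first and return one row per comment
collect_comments <- function(listing, depth = 0) {
  rows <- lapply(listing$data$children, function(child) {
    if (!identical(child$kind, "t1")) return(NULL)  # skip non-comment entries
    own <- data.frame(id = child$data$id, depth = depth,
                      body = child$data$body, stringsAsFactors = FALSE)
    replies <- child$data$replies
    if (is.list(replies)) {
      # Assumed shape: a nested listing holding the replies
      rbind(own, collect_comments(replies, depth + 1))
    } else {
      own
    }
  })
  rows <- Filter(Negate(is.null), rows)
  do.call(rbind, rows)
}

all_comments <- collect_comments(listing)
print(all_comments)
```

The `depth` column records how deeply each comment is nested, which plays a similar role to the underscore-separated `comment_id` values in the RedditExtractoR output.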

Posted by huangapple on 2023-02-23 20:44:36
Original link: https://go.coder-hub.com/75545026.html