R Query of a Wikimedia server
Question
I am trying to query the Cameo database.
If I open the URL https://cameo.mfa.org/api.php?action=query&pageids=17051&prop=extracts&format=json in a browser, I get valid JSON output.
However, if I use:
library(httr)
library(jsonlite)
base_url <- "https://cameo.mfa.org/api.php"
query_param <- list(action = "query",
                    pageids = "17051",
                    format = "json",
                    prop = "extracts"
)
parsed_content <- httr::GET(base_url, query_param)
jsonlite::fromJSON(content(parsed_content, as = "text", encoding = "UTF-8"))
Then `jsonlite` fails because the output is HTML rather than JSON.
Do you have any advice on this?
Answer 1
Score: 2
The second argument to `httr::GET` is `config=`, which is not where you should be assigning `query_param`. Instead, pass it by name as `query = query_param`.
res <- httr::GET(base_url, query = query_param)
res
# Response [https://cameo.mfa.org/api.php?action=query&pageids=17051&format=json&prop=extracts]
# Date: 2023-07-03 15:06
# Status: 200
# Content-Type: application/json; charset=utf-8
# Size: 5.22 kB
str(httr::content(res))
# List of 3
# $ batchcomplete: chr ""
# $ warnings :List of 1
# ..$ extracts:List of 1
# .. ..$ *: chr "HTML may be malformed and/or unbalanced and may omit inline images. Use at your own risk. Known problems are li"| __truncated__
# $ query :List of 1
# ..$ pages:List of 1
# .. ..$ 17051:List of 4
# .. .. ..$ pageid : int 17051
# .. .. ..$ ns : int 0
# .. .. ..$ title : chr "Copper"
# .. .. ..$ extract: chr "<h2><span id=\"Description\">Description</span></h2>\n<p>A reddish-brown, ductile, metallic element. Copper is "| __truncated__
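As a complement to the fix, here is a minimal sketch of a more defensive request that stops early when the reply is not JSON, which is the failure described in the question. It assumes the server labels JSON replies as `application/json`, as in the response printed above; `fetch_json` is a hypothetical helper name, not part of httr or jsonlite.

```r
library(httr)
library(jsonlite)

# Hypothetical helper (not provided by httr): request an API endpoint and
# fail with a clear message if the reply is an error or is not JSON.
fetch_json <- function(base_url, query_param) {
  # Pass the parameters by name so they end up in the query string
  res <- httr::GET(base_url, query = query_param)

  # Turn HTTP 4xx/5xx statuses into R errors
  httr::stop_for_status(res)

  # Assumption: a JSON reply carries the media type "application/json"
  if (httr::http_type(res) != "application/json") {
    stop("Expected JSON but got ", httr::http_type(res), call. = FALSE)
  }

  jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
}
```

Calling `fetch_json(base_url, query_param)` with the values from the question should return the parsed page data, or raise an error naming the unexpected content type.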
Answer 2
Score: 1
A slightly different approach:
``` r
library(httr)
library(jsonlite)
url <- httr::parse_url("https://cameo.mfa.org/api.php")
url$query <- list(
  action = "query",
  pageids = "17051",
  format = "json",
  prop = "extracts"
)
json <- jsonlite::fromJSON(httr::build_url(url))
json$query$pages
#> $`17051`
#> $`17051`$pageid
#> [1] 17051
#>
#> $`17051`$ns
#> [1] 0
#>
#> $`17051`$title
#> [1] "Copper"
#>
#> $`17051`$extract
#> [1] "<h2><span id=\"Description\">Description</span></h2>\n<p>A reddish-brown, ductile, metallic element. Copper is present [...]"
```

<sup>Created on 2023-07-03 with reprex v2.0.2</sup>
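The same URL can probably be built in one step with `httr::modify_url()`, which avoids editing the parsed URL object by hand. The sketch below only constructs and inspects the URL, so it runs without contacting the server; the variable names `query_param` and `api_url` are illustrative.

```r
library(httr)

# Parameters taken from the question
query_param <- list(
  action  = "query",
  pageids = "17051",
  format  = "json",
  prop    = "extracts"
)

# modify_url() parses the URL, sets the given components, and rebuilds it
api_url <- httr::modify_url("https://cameo.mfa.org/api.php", query = query_param)
print(api_url)
```

The resulting string can then be passed to `jsonlite::fromJSON()` as in the answer above, or to `httr::GET()` without any further `query` argument.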