Querying API for every unique resource in a package using R
Question
I'm writing a script to download every unique Excel file in a package from an open data site that runs on CKAN. I'm trying to write a function that loops over a list of the unique dataset IDs, gets the URL for each ID, and downloads the dataset to my computer. However, I'm having trouble writing the function.
So far the function only gives me the first dataset in the package, but there are 3 more that need to be downloaded.
library(tidyverse)
library(ckanr)
library(jsonlite)
library(readxl)
library(curl)
library(janitor)
library(mlr3misc)
url <- "http://osmdatacatalog.alberta.ca/" # set url to access data
ckanr_setup(url = url)
x <- resource_search(q = "name:wetland monitoring benthic invertebrate community", limit = 10) # get id of data
id <- ids(x$results)
id_download <- function(id) {
  for (i in id)
  a <- resource_show(i)
  b <- a$url
  destfile <- paste("C:/Users/Name/Documents/Database_updates/OSM_benthic_invertebrates/",basename(b))
  curl::curl_download(b, destfile)
}
Does anyone know where I'm going wrong?
Answer 1
Score: 0
The for loop needs curly braces after it. Without them, only the single statement immediately following for (i in id) is repeated; the remaining lines run once, after the loop has finished. Everything inside the braces is executed on each iteration.
It also looks like all the files might have the same name. If they do, they will overwrite each other. To be safe, add something to the destfile name so that every file name is unique; the code below prefixes each name with the first four characters of the resource ID. This worked for me:
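To see why the braces matter, here is a minimal sketch that needs no network access. The vector ids_demo and the variables seen_unbraced and seen_braced are made up for illustration; they stand in for the resource IDs and the download step.

```r
# Hypothetical stand-ins for the resource IDs returned by the search
ids_demo <- c("a1", "b2", "c3", "d4")

# Without braces: only the first statement after for (...) is the loop body
seen_unbraced <- character(0)
for (i in ids_demo)
  current <- i                               # repeated for every id
seen_unbraced <- c(seen_unbraced, current)   # runs once, after the loop

# With braces: both statements are repeated for every id
seen_braced <- character(0)
for (i in ids_demo) {
  current <- i
  seen_braced <- c(seen_braced, current)
}

length(seen_unbraced)  # 1
length(seen_braced)    # 4
```

In the unbraced version the "download" step is reached only once, which matches the symptom of getting a single file.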
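library(ckanr)     # ckanr_setup(), resource_search(), resource_show()
library(mlr3misc)  # ids(), as loaded in the question

# Create the output folder (no warning if it already exists)
dir.create("invertebrates", showWarnings = FALSE)

url <- "http://osmdatacatalog.alberta.ca/" # set url to access data
ckanr_setup(url = url)
x <- resource_search(q = "name:wetland monitoring benthic invertebrate community", limit = 10) # get id of data
id <- ids(x$results)

# Download every resource in id into ./invertebrates/
id_download <- function(id) {
  for (i in id) {
    a <- resource_show(i)  # metadata for this resource
    b <- a$url             # download URL
    # Prefix with the first 4 characters of the ID to keep file names unique
    destfile <- paste0("./invertebrates/",
                       substr(i, 1, 4),
                       basename(b))
    curl::curl_download(b, destfile)
  }
}

id_download(id)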
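As a side note on the file names, the sketch below shows one way to build the destination paths and check that none of them collide before downloading anything. make_destfile() is a hypothetical helper (not part of ckanr or curl), and the IDs and URLs are invented. It uses file.path() rather than paste(), because paste() separates its arguments with a space by default, which would put a space between the folder and the file name.

```r
# make_destfile() is a hypothetical helper, not part of ckanr or curl.
# It prefixes the file name with the first characters of the resource ID
# so that resources sharing a file name get distinct destination paths.
make_destfile <- function(id, url, dir = "invertebrates", n_prefix = 4) {
  file.path(dir, paste0(substr(id, 1, n_prefix), "_", basename(url)))
}

# Made-up IDs and URLs that share the same file name
demo_ids  <- c("1a2b3c4d", "9z8y7x6w")
demo_urls <- c("http://example.org/one/data.xlsx",
               "http://example.org/two/data.xlsx")

dest <- make_destfile(demo_ids, demo_urls)
dest

# TRUE when every destination path is unique
anyDuplicated(dest) == 0
```

A four-character prefix is not guaranteed to be unique for every set of IDs, so a check like anyDuplicated() before downloading may be worthwhile; using the full ID as the prefix would remove the doubt.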