Getting 'Undefined error in httr call' when using remDr$navigate with RSelenium: how to fix it?

Question

Calling remDr$navigate(url) fails with this error: Undefined error in httr call. httr output: length(url) == 1 is not TRUE

This happens for the URL
url <- "https://eprel.ec.europa.eu/screen/product/airconditioners"

The remote driver is configured without errors. What could be the cause, and how can I solve it? Is it perhaps that this website cannot be scraped?

I tried:

  • checking the specification of the URL (it's correct)
  • using the get() and navigateTo() functions.

My system is 64-bit, and so is R, but the only available webdriver versions for Chrome and Firefox are 32-bit. Could that be the problem?

ChatGPT says it could be related to the RSelenium setup or to compatibility with the specific website. Could it be that another scraping tool or package should be used for this website?

Answer 1

Score: 0

Try this instead:

library(tidyverse)  # provides %>% and purrr::pluck()
library(httr2)

# Query the JSON API directly instead of driving a browser
"https://eprel.ec.europa.eu/api/products/airconditioners?_page=1&_limit=25&indoorSoundPowerCoolingMin=1&indoorSoundPowerCoolingMax=99&outdoorSoundPowerCoolingMin=0&outdoorSoundPowerCoolingMax=99&indoorSoundPowerHeatingMin=1&indoorSoundPowerHeatingMax=99&outdoorSoundPowerHeatingMin=0&outdoorSoundPowerHeatingMax=99&coolingDesignLoadMin=0.1&coolingDesignLoadMax=99.9&heatingDesignLoadMin=0.1&heatingDesignLoadMax=99.9&sort0=onMarketStartDateTS&order0=DESC&sort1=energyClass&order1=DESC" %>%
  request() %>%
  req_perform() %>%
  resp_body_json(simplifyVector = TRUE, check_type = FALSE) %>%
  # Keep only the "hits" element of the parsed response
  pluck("hits")
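
The query string in the request above is long and hard to edit by hand. As a minimal sketch, assuming the API accepts a subset of those parameters (an assumption, not confirmed here), the same kind of URL can be assembled from a named list with httr2's req_url_query() and inspected before anything is sent. The parameter names are copied from the URL above; the params and req variable names are only illustrative.

```r
library(httr2)

# Illustrative subset of the filters from the URL above. The names come from
# that URL; whether the API accepts this reduced set is an assumption.
params <- list(
  `_page` = 1,
  `_limit` = 25,
  sort0 = "onMarketStartDateTS",
  order0 = "DESC",
  sort1 = "energyClass",
  order1 = "DESC"
)

# Build the request object without sending it
req <- request("https://eprel.ec.europa.eu/api/products/airconditioners")
req <- req_url_query(req, !!!params)

# Inspect the final URL before performing the request
print(req$url)

# To fetch the data, the steps from the answer apply unchanged:
# resp <- req_perform(req)
# hits <- resp_body_json(resp, simplifyVector = TRUE, check_type = FALSE)$hits
```

Changing `_page` in the list and rebuilding the request would be one way to walk through further result pages, assuming the API paginates the way the parameter name suggests.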

Posted on May 24, 2023, at 17:40:25