Getting "Request unsuccessful. Incapsula incident ID: 1220001110238789559-263723949970693063" while scraping website
Question
I'm trying to scrape a website using Scala Scraper.
Code:
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
val document = JsoupBrowser().get("https://esos.nv.gov/NotarySearchOnline/NotarySearchExternal")
println(document)
But it returns an Incapsula error page:
<html style="height:100%">
<head>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
<meta name="format-detection" content="telephone=no">
<meta name="viewport" content="initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<script type="text/javascript" src="/_Incapsula_Resource?SWJIYLWA=719d34d31c8e3a6e6fffd425f7e032f3"></script>
<script src="/bukd-fairers-stayld-his-do-you-she-beth-not-betr" async></script>
</head>
<body style="margin:0px;height:100%">
<iframe id="main-iframe" src="/_Incapsula_Resource?SWUDNSAI=31&amp;xinfo=7-59020918-0%200NNN%20RT%281689249168226%20790%29%20q%280%20-1%20-1%205%29%20r%280%20-1%29%20B12%2811%2c1337794%2c0%29%20U24&amp;incident_id=1220001110238789559-263723949970693063&amp;edet=12&amp;cinfo=0b000000&amp;rpinfo=0&amp;cts=MC1%2bbffyCFI5wK%2bnKF5heCJ8mhMBzBg%2bUp1uiaJLs7YcyEY9lFttNsgaGx7ZImGK&amp;mth=GET" frameborder="0" width="100%" height="100%" marginheight="0px" marginwidth="0px">
Request unsuccessful. Incapsula incident ID: 1220001110238789559-263723949970693063</iframe>
</body>
</html>
A few days ago, scraping this site worked fine with the same code, but now it returns the error above.
I've tried scraping through a VPN, but the error is the same. How can I get past this error?
Answer 1
Score: 3
If you go to the URL returned by the iframe, you will see the following message
I guess it's pretty clear. The site has a security layer that is blocking your request. They don't want automated traffic on their site, and if they detect an anomaly they can block you. I'm not sure whether the site offers an API.
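A block like this can at least be detected in code, so the scraper fails clearly instead of trying to parse the error page. The following is only a minimal sketch using the Scala standard library. The names IncapsulaCheck, incidentId and isBlocked are hypothetical and not part of Scala Scraper, and the matching relies solely on the text visible in the error page from the question, which the security layer may change.

```scala
// Sketch: recognise the Incapsula block page shown in the question.
// IncapsulaCheck, incidentId and isBlocked are hypothetical helper names.
object IncapsulaCheck {
  // Matches the block message from the question and captures the incident ID.
  private val IncidentPattern = """Incapsula incident ID: ([0-9]+-[0-9]+)""".r

  /** Returns the incident ID if the HTML contains the block message. */
  def incidentId(html: String): Option[String] =
    IncidentPattern.findFirstMatchIn(html).map(_.group(1))

  /** True when the HTML contains the block message or the challenge resource path. */
  def isBlocked(html: String): Boolean =
    incidentId(html).isDefined || html.contains("/_Incapsula_Resource")
}
```

Pass it the HTML of the fetched page as a string (for the question's code that is probably document.toHtml, though this is an assumption about the Scala Scraper version in use) and stop or back off when isBlocked returns true.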