Getting a 403 error when trying to web scrape with Python and Jupyter Notebook

Question

I am trying to web scrape some football stats, and I am unsure why I am getting a 403 error. I have searched online and I think it has something to do with restricted access. Is there any way to get past this?

Thank you!

Here is what I have coded so far

I have tried using .session() and .text(), but neither seems to work.

Answer 1

Score: 0

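You should set your User-Agent request header to a common value. By default, requests identifies itself with a User-Agent of the form python-requests/x.y.z, which makes it obvious that the request is not coming from a browser, and many websites block requests that lack a common User-Agent.

The code below sets the User-Agent so that the request looks like it comes from Chrome on Windows:

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'}

# test_url is the URL variable from the question's code
r = requests.get(test_url, headers=headers)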
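Since the question mentions .session() and .text(), here is a minimal sketch of how the same header could be attached to a session so that every request carries it. It assumes the requests library is installed. The helper names make_session and fetch_html and the Accept-Language value are illustrative assumptions, not part of the original answer. Note that in requests the response body is read from the text attribute (response.text), not by calling a method. A browser-like User-Agent is not guaranteed to lift a 403, because some sites apply other checks as well.

```python
import requests

# Browser-like headers. The User-Agent string is the one from the answer;
# the Accept-Language value is an illustrative assumption.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def make_session(headers=None):
    """Return a requests.Session that sends the given headers on every request."""
    session = requests.Session()
    # Session-level headers are merged into each request made with the session.
    session.headers.update(headers or BROWSER_HEADERS)
    return session


def fetch_html(session, url, timeout=10):
    """Fetch url and return the page body, raising if the server refuses."""
    response = session.get(url, timeout=timeout)
    if response.status_code == 403:
        # The server still refused; it may rely on checks other than User-Agent.
        raise PermissionError(f"403 Forbidden for {url}")
    response.raise_for_status()  # raise for any other 4xx/5xx status
    return response.text  # .text is an attribute, not a method
```

With the question's URL variable, usage would look like html = fetch_html(make_session(), test_url).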
