Python Requests Get not Working for "HTTP" request

Question

I have an issue using the Python requests module when trying to get a response from websites over plain HTTP.

requests works for HTTPS sites.

Ideally, I am trying to develop a script that fetches an HTTP website and checks whether it redirects to the HTTPS version.

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'}

url = "https://www.google.com"

r = requests.get(url, headers=headers)

print(r.status_code)

However, using the URL "http://www.google.com" fails. It should redirect to https://www.google.com and return some response code, but it fails.

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'}

url = "http://www.google.com"

r = requests.get(url, headers=headers)

print(r.status_code)

It ends up with the errors below. Please advise.

> sock.connect(sa) TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

> urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='www.google.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001D3A496A9A0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

Answer 1

Score: 1

These errors can have several causes, such as a firewall, proxy settings, or ISP restrictions that block connections to non-secure sites. Note that the traceback shows the TCP connection to port 80 timing out, so the request never got far enough to receive a redirect. There are different solutions; here is one that uses the requests library to follow redirects and detect whether the website redirects to the secure version.

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
}

url = "http://www.google.com"

try:
    # timeout=10 bounds the connect and read waits; redirects are followed.
    r = requests.get(url, headers=headers, timeout=10, allow_redirects=True)
    print(f"Status Code: {r.status_code}")

    # r.url is the final URL after any redirects were followed.
    if r.url.startswith("https://"):
        print("The website has redirected to the secure version (https).")
    else:
        print("The website did not redirect to the secure version (https).")

except requests.exceptions.RequestException as e:
    # Covers connection errors and timeouts such as the one in the question.
    print(f"An error occurred: {e}")
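
If you also want to see the hops rather than only the final URL, the redirect chain can be inspected as well. The following is a minimal sketch and not part of the original answer: `summarize_redirects` and `check_site` are hypothetical helper names, and it assumes that `r.history` (the intermediate responses requests keeps when it follows redirects) together with the final `r.url` gives the visited URLs in order.

```python
from urllib.parse import urlsplit


def summarize_redirects(urls):
    """Given the URLs visited in order, report whether HTTP was upgraded to HTTPS."""
    schemes = [urlsplit(u).scheme.lower() for u in urls]
    started_insecure = bool(schemes) and schemes[0] == "http"
    ended_secure = bool(schemes) and schemes[-1] == "https"
    return {
        "hops": len(urls) - 1 if urls else 0,
        "upgraded_to_https": started_insecure and ended_secure,
        "final_url": urls[-1] if urls else None,
    }


def check_site(url, timeout=10):
    """Fetch url with requests and summarize its redirect chain (hypothetical helper)."""
    # Imported lazily: only this function needs the third-party requests package.
    import requests

    r = requests.get(url, timeout=timeout, allow_redirects=True)
    # r.history holds the intermediate redirect responses, oldest first.
    visited = [resp.url for resp in r.history] + [r.url]
    return summarize_redirects(visited)
```

Calling `check_site("http://www.google.com")` would return the summary dictionary, or raise a `requests.exceptions.RequestException` if the connection cannot be established, as in the question.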

If that still doesn't work, check your network settings and firewall configuration, or reach out to your ISP.
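
One way to narrow this down is to test the raw TCP connection without requests at all. The sketch below uses only the standard library; `can_connect` is a hypothetical helper name, and the 5-second default timeout is an arbitrary assumption. If port 443 is reachable but port 80 is not, that would suggest outbound plain-HTTP traffic is being blocked somewhere between your machine and the site.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS lookup failures.
        return False
```

For example, compare `can_connect("www.google.com", 80)` with `can_connect("www.google.com", 443)` from the machine where the script fails.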

huangapple
  • Posted on 2023-04-04 15:50:26
  • Please keep the link to this article when reposting: https://go.coder-hub.com/75926785.html