Requests in Python not fetching data

Question

Is there a reason why requests isn't getting any data? I tried the script on my VM and it works normally, but when I tested it on my server, it doesn't work at all. I've made sure that the requests library is installed in my environment and that it is up to date. Here is the code I use:

import requests

exist=[]
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    r = requests.get("https://www.instagram.com/"+i+"/")
    try:
        if r.apparent_encoding == 'Windows-1252':
            exist.append(i)
            print('Exist')
            url.append("https://www.instagram.com/"+i+"/")
        else:
            print("Not Exist")
    except:
        print('Not Exist')

When I run this code, it keeps printing "Not Exist" instead of "Exist". I've checked that the accounts exist. How do I fix this problem? Thanks.

Answer 1

Score: 1

The way I found to check whether an Instagram username exists is to set the sessionid cookie on the HTTP request. Adding the cookie makes the request use the login session of your Instagram account. Then use BeautifulSoup to check whether the username appears in the text of the page's title element.

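import requests
from bs4 import BeautifulSoup

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    profile_url = "https://www.instagram.com/" + i + "/"
    # Send the sessionid cookie so the request uses your login session
    r = requests.get(profile_url, cookies={"sessionid": "your-session-id"})
    soup = BeautifulSoup(r.text, 'html.parser')
    # soup.find returns None when the page has no <title> element
    title_tag = soup.find('title')
    title = title_tag.text if title_tag else ""
    # The check used here: the username appears in the page title
    if i in title:
        exist.append(i)
        print('Exist')
        url.append(profile_url)
    else:
        print("Not Exist")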

Without BeautifulSoup

import requests

exist=[]
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    r = requests.get("https://www.instagram.com/"+i+"/", cookies={"sessionid":"your-session-id"})
    if r.apparent_encoding == 'Windows-1252':
        exist.append(i)
        print('Exist')
        url.append("https://www.instagram.com/"+i+"/")
    else:
        print("Not Exist")
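
The `apparent_encoding` check above is an indirect signal: it is only a guess about the character encoding of the response body, made by the charset detection library that requests uses. A minimal sketch of the title check from the first variant using only the standard library, for cases where installing BeautifulSoup is not an option. The names `TitleParser`, `extract_title`, and `username_in_title` are hypothetical helpers, not part of requests, and the case-insensitive match is an assumption.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title, or an empty string when there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def username_in_title(username, html):
    """Assumed check: the username appears in the title, ignoring case."""
    return username.lower() in extract_title(html).lower()
```

In the loop above, `username_in_title(i, r.text)` could stand in for the `r.apparent_encoding == 'Windows-1252'` comparison.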

How to get the session ID:

  1. Log in to your Instagram account in a browser.
  2. After logging in, open the browser's developer tools (Inspect) and select one of the requests on the Network tab.
  3. The sessionid value appears in the cookie entry under Request Headers.
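
Hard-coding the session ID in the script makes it easy to leak, and the cookie carries your account's login session. A minimal sketch that reads it from an environment variable instead. The variable name `INSTAGRAM_SESSIONID` and the helper `load_session_cookies` are assumptions for illustration, not anything Instagram or requests defines.

```python
import os


def load_session_cookies(env_var="INSTAGRAM_SESSIONID"):
    """Return a cookies dict built from an environment variable.

    Raises RuntimeError when the variable is missing or empty, so the
    script fails early instead of sending a request without the cookie.
    """
    session_id = os.environ.get(env_var, "").strip()
    if not session_id:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"sessionid": session_id}
```

The returned dict can be passed as the `cookies` argument in the calls above, for example `requests.get(url, cookies=load_session_cookies())`.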

Answer 2

Score: 0

Try this version. The print(exist) and print(url) statements sit outside the loop, so they show the final lists of existing usernames and URLs.

import requests

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]

for i in cli:
    try:
        r = requests.get("https://www.instagram.com/" + i + "/")
        if r.apparent_encoding == 'Windows-1252':
            exist.append(i)
            print('Exist')
            url.append("https://www.instagram.com/" + i + "/")
        else:
            print("Not Exist")
    except requests.exceptions.RequestException as e:
        print('Error:', e)

print(exist)
print(url)
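
Since the same script behaves differently on the VM and on the server, it may help to record what each machine actually receives before deciding whether an account exists. A minimal sketch, assuming the helper name `describe_response` (hypothetical) and assuming the relevant differences show up in the status code, the final URL after redirects, or the size of the page. It only reads attributes that a `requests.Response` object provides.

```python
def describe_response(r):
    """Summarize a requests.Response-like object for side-by-side comparison."""
    return {
        "status_code": r.status_code,                      # HTTP status of the final response
        "final_url": r.url,                                # URL after any redirects
        "redirects": [h.status_code for h in r.history],   # status of each redirect hop
        "apparent_encoding": r.apparent_encoding,          # the value the script compares
        "body_bytes": len(r.content),                      # size of the downloaded page
    }
```

Calling `print(describe_response(r))` inside the loop on both machines shows whether the server gets a different page than the VM.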

huangapple
  • Posted on February 27, 2023 at 16:41:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75578320.html