Requests python not fetching data
Question
Is there a reason why requests isn't getting any data? I tried it on my VM and it works normally, but when I tested it on my server it doesn't work at all. I've made sure that the requests library is installed in my environment and that it is up to date. Here is the code I use:
import requests

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    r = requests.get("https://www.instagram.com/"+i+"/")
    try:
        if r.apparent_encoding == 'Windows-1252':
            exist.append(i)
            print('Exist')
            url.append("https://www.instagram.com/"+i+"/")
        else:
            print("Not Exist")
    except:
        print('Not Exist')
When I run this code, it keeps printing "Not Exist" instead of "Exist". I've checked that the accounts exist. How do I fix this problem? Thanks.
Answer 1
Score: 1
The way I found to check whether an Instagram username exists is to set the sessionid cookie on the HTTP requests. With the cookie set, each request uses the login session of your Instagram account. BeautifulSoup can then parse the response, and the text of the title element tells you whether the username exists: in the code below, the account is treated as existing when the username appears in the title.
With BeautifulSoup
import requests
from bs4 import BeautifulSoup

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    # Send the login session cookie with every request
    r = requests.get("https://www.instagram.com/"+i+"/", cookies={"sessionid":"your-session-id"})
    soup = BeautifulSoup(r.text, 'html.parser')
    title = soup.find('title').text
    # The account is treated as existing when its name appears in the page title
    if i in title:
        exist.append(i)
        print('Exist')
        url.append("https://www.instagram.com/"+i+"/")
    else:
        print("Not Exist")
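The title check above can be pulled into a small function that also copes with a page that has no title element and ignores letter case. This is only a minimal sketch: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages, and TitleParser and title_mentions are hypothetical names, not part of any library.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so concatenate it
        if self._in_title:
            self.title += data


def title_mentions(html, username):
    """Return True if the page title contains the username (case-insensitive).

    A page without a <title> element yields an empty title, so the result
    is False instead of an AttributeError.
    """
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return username.lower() in parser.title.lower()
```

In the loop, `if title_mentions(r.text, i):` would take the place of `if i in title:`. A plain substring match can still give a false positive when the username happens to be a word in the site's generic title, so treat the result as a hint rather than proof.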
Without BeautifulSoup
import requests

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    # Send the login session cookie with every request
    r = requests.get("https://www.instagram.com/"+i+"/", cookies={"sessionid":"your-session-id"})
    if r.apparent_encoding == 'Windows-1252':
        exist.append(i)
        print('Exist')
        url.append("https://www.instagram.com/"+i+"/")
    else:
        print("Not Exist")
How to get the session ID:
- Log in to your Instagram account in a browser.
- After logging in, open the browser's developer tools (Inspect Element) and select one of the requests on the Network tab.
- The sessionid value is in the cookie header under Request Headers.
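Since the session ID works like a password, it may be safer to keep it out of the script. One possible sketch reads it from an environment variable and builds the cookies dictionary passed to requests.get; the variable name INSTAGRAM_SESSIONID and the function session_cookies are assumptions made for this example.

```python
import os


def session_cookies(env=None):
    """Build the cookies dict for requests from an environment variable.

    INSTAGRAM_SESSIONID is a hypothetical variable name chosen for this
    sketch; any name works as long as the script and the shell agree.
    """
    env = os.environ if env is None else env
    session_id = env.get("INSTAGRAM_SESSIONID", "").strip()
    if not session_id:
        raise RuntimeError("Set INSTAGRAM_SESSIONID to the sessionid cookie value")
    return {"sessionid": session_id}
```

The request then becomes `requests.get(profile_url, cookies=session_cookies())`, and the script fails early with a clear message if the variable is missing on the server.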
Answer 2
Score: 0
Try this: the print(exist) and print(url) statements are moved outside the loop so that they show the final lists of existing usernames and URLs.
import requests

exist = []
url = []
cli = ["nike", "duolingo", "duolingoindo"]
for i in cli:
    try:
        r = requests.get("https://www.instagram.com/" + i + "/")
        if r.apparent_encoding == 'Windows-1252':
            exist.append(i)
            print('Exist')
            url.append("https://www.instagram.com/" + i + "/")
        else:
            print("Not Exist")
    except requests.exceptions.RequestException as e:
        # Report network errors instead of silently printing "Not Exist"
        print('Error:', e)

# Final lists, printed once after the loop
print(exist)
print(url)
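When the same script behaves differently on two machines, it can help to print what each machine actually received before deciding anything from apparent_encoding, which is only a guess made from the response body. The sketch below summarises a response using attributes that requests.Response provides (status_code, url, history, encoding, apparent_encoding, content); describe_response is a hypothetical helper, and the stand-in object exists only so the example runs without network access.

```python
from types import SimpleNamespace


def describe_response(r):
    """Summarise the parts of a response that usually explain a surprise."""
    return {
        "status_code": r.status_code,
        "final_url": r.url,  # differs from the requested URL after a redirect
        "redirects": [h.status_code for h in r.history],
        "declared_encoding": r.encoding,
        "apparent_encoding": r.apparent_encoding,
        "body_length": len(r.content),
    }


# Stand-in with the same attribute names as requests.Response (made-up values)
fake = SimpleNamespace(
    status_code=200,
    url="https://example.com/login",
    history=[SimpleNamespace(status_code=302)],
    encoding="utf-8",
    apparent_encoding="ascii",
    content=b"<html></html>",
)
summary = describe_response(fake)
print(summary)
```

Calling describe_response(r) inside the loop on both the VM and the server and comparing the output should show whether the server is being redirected or is receiving a different page.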