Request for data that generates chart always empty
Question
I am trying to scrape the data behind a chart on a website using Python's requests module.
My code currently looks like this:
# load modules
import os
import json
import requests as r
# url to send the call to
postURL = <insert website>
# use a GET request to pull cookie data
cookie_intel = r.get(postURL, verify = False)
# get cookies
search_cookies = cookie_intel.cookies
#### Request Information ####
# API request data
post_data = <insert request json>
# header information
headers = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"}
# send the POST request
results_post = r.post(postURL, data = post_data, cookies = search_cookies, headers = headers, verify = False)
# print the parsed JSON response
print(results_post.json())
As a quick summary: I first loaded the site and inspected it. In the browser's network tab I identified the URL of the request, then checked the required request data in the payload tab. Finally, I took the user-agent from the request headers tab.
The request itself succeeds, but the response is always empty. I have tried altering all sorts of inputs, without success. I would highly appreciate any tips that would help me solve this issue. Thank you in advance!
Answer 1
Score: 6
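In this case you have to use json= instead of data= when making the POST request, according to the requests documentation. With data=, a dict is sent form-encoded, whereas json= serializes it to a JSON body and sets the Content-Type header to application/json, which is what an endpoint expecting a JSON payload needs. Replacing this part of your code should give you the expected response: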
results_post = r.post(postURL, json = post_data, cookies = search_cookies, headers = headers, verify = False)
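To see why the switch from data= to json= matters, the sketch below prepares the same POST request both ways without sending anything over the network. The URL and the payload keys (chart_id, range) are hypothetical placeholders, and it assumes the payload is a Python dict rather than an already-serialized JSON string.

```python
import json

import requests

# Hypothetical endpoint and payload, used only for illustration.
url = "https://example.com/api/chart"
payload = {"chart_id": 42, "range": "1y"}

# Build (but do not send) the same POST request in both styles.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

# data= form-encodes the dict.
print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)

# json= serializes the dict to JSON and sets the matching Content-Type.
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))
```

If the payload copied from the browser's payload tab is already a JSON string, passing it to json= would encode it a second time. In that case it may be simpler to parse it with json.loads first and pass the resulting dict.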
You can also try other scraping tools such as Scrapy to crawl this data, and perhaps run the crawler in the cloud using estela.