Question

I am trying to scrape the data behind a chart on a website using Python's requests module.

My code currently looks like this:

# load modules
import os
import json
import requests as r

# url to send the call to
postURL = <insert website>

# use a GET request to pull cookie data
cookie_intel = r.get(postURL, verify = False)

# get cookies
search_cookies = cookie_intel.cookies

#### Request Information ####

# API request data
post_data = <insert request json>

# header information
headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"}

# send the POST request
results_post = r.post(postURL, data = post_data, cookies = search_cookies, headers = headers, verify = False)

# print the parsed JSON response
print(results_post.json())

As a quick summary: I loaded the site and inspected it in the browser's developer tools. In the Network tab I identified the URL for the request, then checked the required request data in the Payload tab. Finally, I took the user-agent from the Request Headers tab.

The request itself succeeds, but the response is always empty. I have tried altering all sorts of inputs, without success. I would appreciate any tips that would help me solve this issue. Thank you in advance!

Answer 1

Score: 6

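In this case you have to use json= instead of data= when making the POST request, according to the requests documentation. With data=, a dict is sent form-encoded (and a string is sent as-is, without a JSON Content-Type header), whereas json= serializes the payload to JSON and sets Content-Type: application/json, which is what a JSON API typically expects. By replacing this part of your code you should get the expected response.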

results_post = r.post(postURL, json = post_data, cookies = search_cookies, headers = headers, verify = False)
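
To see what json= changes compared with data=, you can prepare both requests without sending them and inspect what would go over the wire. This is only a minimal sketch: the URL and the payload below are made up for illustration and are not taken from the question.

```python
import json
import requests

# Hypothetical endpoint and payload, used only for illustration.
url = "https://example.com/api/chart"
payload = {"metric": "sales", "year": 2022}

# Prepare (but do not send) the same POST with data= and with json=.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

# data= form-encodes the dict.
print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # metric=sales&year=2022

# json= serializes the dict to JSON and sets the JSON content type.
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))         # parses back to the original dict
```

If the browser's Payload tab shows a JSON body, the json= variant is the one that should match it.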

You can also try other scraping tools such as Scrapy to crawl this data, and possibly run the crawler in the cloud using estela.
