How to scrape data from a website without Selenium or Puppeteer

Question

I need to fetch data from a website for my React app. I set up a Node.js server that returns the data when my React app sends it a request. For example, I can get 5 rows of data from the website with request, fetch, or axios, but I need to click a button to load more. I don't want to use frameworks such as Selenium or Puppeteer, because they cause problems once the server is deployed. Does anyone have suggestions other than these?

I tried running Selenium and Puppeteer as headless browsers for this, but both caused problems during the deployment phase.

Answer 1

Score: 1

Open the Network tab in your browser's devtools and watch what happens when you click the "more" button. The browser is most likely sending a request that includes paging parameters (e.g. start, limit), and you can replicate that request in your own code to get the additional rows.
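
As a rough sketch of what replicating that request could look like on the Node.js side: the endpoint URL, the `start` and `limit` parameter names, and the assumption that the response is a JSON array are all hypothetical here, so substitute whatever the Network tab actually shows for your site. The helper names (`buildPageUrl`, `fetchPage`, `fetchAllRows`) are made up for illustration.

```javascript
// Hypothetical endpoint: replace with the request URL seen in the Network tab.
const BASE_URL = "https://example.com/api/items";

// Build the URL for one page of results.
// "start" and "limit" are assumed parameter names; the real site may use others.
function buildPageUrl(baseUrl, start, limit) {
  const url = new URL(baseUrl);
  url.searchParams.set("start", String(start));
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Fetch one page using the global fetch available in Node.js 18+.
// Assumes the endpoint answers with a JSON array of rows.
async function fetchPage(start, limit) {
  const res = await fetch(buildPageUrl(BASE_URL, start, limit), {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.json();
}

// Request page after page until a short page signals the end,
// or until maxPages is reached (a safety cap against endless loops).
async function fetchAllRows(getPage, limit = 5, maxPages = 10) {
  const rows = [];
  for (let page = 0; page < maxPages; page++) {
    const batch = await getPage(page * limit, limit);
    rows.push(...batch);
    if (batch.length < limit) break;
  }
  return rows;
}
```

In the server's route handler you could then call `fetchAllRows(fetchPage)` and return the result to the React app. If the site expects particular headers, cookies, or a POST body, copy those from the request shown in devtools as well.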

  • Posted on June 26, 2023, 01:55:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76551743.html