Chromedp package: how to get the updated HTML source of a web page with dynamically loaded content using chromedp

Question

I am trying to scrape the video links on this web page: https://www.tokopedia.com/chocoapple/ready-stock-bnib-iphone-128gb-7-plus-jet-black-garansi-apple-1-tahun-10?src=topads
The links are generated by the "webyclip" service, which loads its data after the page itself has loaded. I want the updated HTML source of the page after all the JavaScript and AJAX requests have finished (similar to what "Inspect element" shows in a browser). How can I do this with the chromedp package (https://github.com/knq/chromedp)? It is a Go package for driving a headless browser. Please help; I am new to web scraping.

EDIT: This is not a duplicate of the other question mentioned in the link, because this one is specific to the chromedp package. The one in the comments asks how, or with what tool, to scrape dynamic content.

Answer 1

Score: -3

After many attempts, I finally found a way and solved my problem.
You can check my GitHub repository for the solution.
Thank you.

huangapple
  • Posted on August 29, 2017, 19:45:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/45938288.html