2017-08-29 19:45:12 · go

chromedp: How to get the updated HTML source of a web page with dynamically loaded content

Question

I am trying to scrape the video links on this web page: https://www.tokopedia.com/chocoapple/ready-stock-bnib-iphone-128gb-7-plus-jet-black-garansi-apple-1-tahun-10?src=topads
The links are generated by the "webyclip" service, which loads its data after the page has loaded. I want the updated HTML source of the page after all the JavaScript and AJAX requests have finished (similar to what "Inspect element" shows in a browser). How can I do this with the chromedp package (https://github.com/knq/chromedp)? It is a Go package for driving headless Chrome. Please help; I am new to web scraping.

EDIT: This is not a duplicate of the other question mentioned in the link, because it is specific to the chromedp package. The one in the comments asks how to scrape dynamic content, or what to use for it.

Answer 1

Score: -3

After many attempts, I finally found a way and solved my problem.
You can check my GitHub repository for the solution.
Thank you.
