chromedp Click is not working in my Go code. Can you find what's wrong?
Question
I'm working on a scraper with chromedp.
To get what I want (the page HTML), I have to click a specific button.
So I used chromedp.Click and chromedp.OuterHTML, but I only get the HTML of the page before the click, not the HTML of the page after the click has completed.
Can you look at my code and advise me how to fix it?
func runCrawler(URL string, lineNum string, stationNm string) {
    // settings for crawling
    opts := append(chromedp.DefaultExecAllocatorOptions[:],
        chromedp.Flag("headless", false))
    // create chrome instance
    contextVar, cancelFunc := chromedp.NewExecAllocator(context.Background(), opts...)
    defer cancelFunc()
    contextVar, cancelFunc = chromedp.NewContext(contextVar)
    defer cancelFunc()
    var htmlContent string
    err := chromedp.Run(contextVar,
        chromedp.Navigate(URL),
        chromedp.WaitVisible(".end_footer_area"),
        chromedp.Click(".end_section.station_info_section > div.at_end.sofzqce > div > div.c10jv2ep.wrap_btn_schedule.schedule_time > button"),
        chromedp.OuterHTML("html", &htmlContent, chromedp.ByQuery),
    )
    // check the error before using the result
    checkErr(err)
    fmt.Println("html", htmlContent)
}
I have also included the page and the button I need to click.
Page URL: https://pts.map.naver.com/end-subway/ends/web/11321/home
Button area I need to click:
Thank you very much.
Answer 1
Score: 0
The page you want opens in a new tab (target), so the original context is still attached to the first page and OuterHTML returns that page's HTML.
In this case, call chromedp.WaitNewTarget before the click to create a channel from which you can receive the target ID of the new tab. Then create a new context with the chromedp.WithTargetID option to connect to that tab. From there, everything works the way you are already familiar with.
package main

import (
    "context"
    "fmt"
    "strings"

    "github.com/chromedp/cdproto/target"
    "github.com/chromedp/chromedp"
)

func main() {
    opts := append(chromedp.DefaultExecAllocatorOptions[:],
        chromedp.Flag("headless", false),
    )
    ctx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    defer cancel()
    ctx, cancel = chromedp.NewContext(ctx)
    defer cancel()

    var htmlContent string

    // Register the listener before clicking so the new tab is not missed.
    ch := chromedp.WaitNewTarget(ctx, func(i *target.Info) bool {
        return strings.Contains(i.URL, "/timetable/web/")
    })

    // Load the page and click the button that opens the timetable in a new tab.
    err := chromedp.Run(ctx,
        chromedp.Navigate("https://pts.map.naver.com/end-subway/ends/web/11321/home"),
        chromedp.WaitVisible(".end_footer_area"),
        chromedp.Click(".end_section.station_info_section > div.at_end.sofzqce > div > div.c10jv2ep.wrap_btn_schedule.schedule_time > button"),
    )
    if err != nil {
        panic(err)
    }

    // Attach a new context to the tab whose target ID arrives on the channel.
    newCtx, cancel := chromedp.NewContext(ctx, chromedp.WithTargetID(<-ch))
    defer cancel()

    // Wait for the timetable to be ready, then read the HTML of the new tab.
    if err := chromedp.Run(newCtx,
        chromedp.WaitReady(".table_schedule", chromedp.ByQuery),
        chromedp.OuterHTML("html", &htmlContent, chromedp.ByQuery),
    ); err != nil {
        panic(err)
    }
    fmt.Println("html", htmlContent)
}