Go Colly: how do I find the requested element?

Question

I'm trying to select a specific table and loop through its contents using colly, but the table is not being matched. Here's what I have so far:

package main

import (
	"fmt"
	
	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector(
		colly.AllowedDomains("wikipedia.org", "en.wikipedia.org"),
	)
	
	links := make([]string, 0)

	c.OnHTML("div.mw-parser-output", func(e *colly.HTMLElement) {
	    
		e.ForEach("table.wikitable.sortable.jquery-tablesorter > tbody > tr", func(_ int, elem *colly.HTMLElement) {
			fmt.Println(elem.ChildAttr("a[href]", "href"))
			links = append(links, elem.ChildAttr("a[href]", "href"))
		})
	})
	
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL.String())
	})

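	// Visit returns an error (network failure, disallowed domain, ...); report it instead of ignoring it.
	if err := c.Visit("https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population"); err != nil {
		fmt.Println("Visit failed:", err)
	}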
	fmt.Println("Found urls for", len(links), "countries.")
}

I need to loop through all of the tr elements in the table.

Answer 1

Score: 1

It turns out the class selector is actually wikitable.sortable, even though the Chrome console shows the classes as wikitable sortable jquery-tablesorter. I don't know why the names differ like this, but it solved the problem for me.
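
A likely explanation for the difference: colly parses the HTML as the server sends it and does not run JavaScript, while the browser's table-sorting script adds the jquery-tablesorter class after the page loads. A compound class selector only matches when every listed class is present, so the original selector can never match the served markup. The sketch below uses only the standard library to mimic that rule. The helper name hasAllClasses and the two class strings are illustrative assumptions taken from the answer, not colly's API or a dump of the live page.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAllClasses reports whether classAttr (the raw value of an element's
// class attribute) contains every class in want. This mirrors how a compound
// selector such as table.a.b.c requires all of a, b and c to be present.
// Hypothetical helper for illustration only.
func hasAllClasses(classAttr string, want ...string) bool {
	have := make(map[string]bool)
	for _, c := range strings.Fields(classAttr) {
		have[c] = true
	}
	for _, w := range want {
		if !have[w] {
			return false
		}
	}
	return true
}

func main() {
	// Assumed class attribute in the HTML that colly receives (no scripts run).
	served := "wikitable sortable"
	// Class attribute as shown in the Chrome console, per the answer.
	inBrowser := "wikitable sortable jquery-tablesorter"

	// Original selector: table.wikitable.sortable.jquery-tablesorter
	fmt.Println(hasAllClasses(served, "wikitable", "sortable", "jquery-tablesorter"))    // false
	fmt.Println(hasAllClasses(inBrowser, "wikitable", "sortable", "jquery-tablesorter")) // true

	// Corrected selector: table.wikitable.sortable
	fmt.Println(hasAllClasses(served, "wikitable", "sortable")) // true
}
```

In the question's code, this corresponds to changing the ForEach selector to "table.wikitable.sortable > tbody > tr". When a selector copied from the browser's developer tools matches nothing, comparing it against the page source (View Source, or the response body) rather than the live DOM is usually the quickest check.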

Posted by huangapple on 2022-12-28 22:32:51
Source: https://go.coder-hub.com/74941492.html