Iterate over HTMLElement attributes with colly?


Question

As seen in the HTMLElement struct, attributes is an unexported (private) field:

// HTMLElement is the representation of a HTML tag.
type HTMLElement struct {
	// Name is the name of the tag
	Name       string
	Text       string
	attributes []html.Attribute
	// Request is the request object of the element's HTML document
	Request *Request
	// Response is the Response object of the element's HTML document
	Response *Response
	// DOM is the goquery parsed DOM object of the page. DOM is relative
	// to the current HTMLElement
	DOM *goquery.Selection
	// Index stores the position of the current element within all the elements matched by an OnHTML callback
	Index int
}

There are functions such as .Attr() for fetching a single attribute, but how would I iterate over all attributes? There seems to be no obvious way to access the attributes field, nor any method on the struct that exposes it.

Answer 1

Score: 1

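The exported DOM field is a goquery selection, and its Nodes slice holds the underlying html.Node values. Each node carries its attributes in the exported Attr field, so we can iterate over them: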

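for _, node := range e.DOM.Nodes {
	// node.Attr is a []html.Attribute; each entry has a Key and a Val.
	for _, attr := range node.Attr {
		fmt.Println(attr.Key, attr.Val)
	}
}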

huangapple
  • Posted on July 27, 2022, 08:39:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73131106.html