How can I get the names of all elements in an HTML document? (My bad code runs in a loop)

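Question

I want to get the document tree.
As a first step, I tried to print the names of all elements.
But my code seems to run in an endless loop.
What should I do?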

package main

import (
	"github.com/PuerkitoBio/goquery"
	"golang.org/x/net/html"
)

func getTagName(s *goquery.Selection) {
	for _, n := range s.Nodes {
		if n.Type != html.ElementNode {
			continue
		}
		println(n.Data)
		getTagName(s.Children())
	}
}

func main() {
	doc, _ := goquery.NewDocument("https://news.ycombinator.com/")
	doc.Find("html body").Each(func(_ int, s *goquery.Selection) {
		getTagName(s)
	})
}

Answer 1

Score: 1

It seems to work with this:

package main

import (
	"os"

	"github.com/PuerkitoBio/goquery"
	"golang.org/x/net/html"
)

var areWeLooping = make(map[*goquery.Selection]struct{})

func getTagName(s *goquery.Selection) {
	if _, weAreLooping := areWeLooping[s]; weAreLooping {
		println("loop detected")
		os.Exit(1)
	}

	areWeLooping[s] = struct{}{}

	for _, n := range s.Nodes {
		if n.Type != html.ElementNode {
			continue
		}
		println(n.Data)
	}

	s.Children().Each(func(_ int, s *goquery.Selection) {
		getTagName(s)
	})
}

func main() {
	doc, _ := goquery.NewDocument("https://news.ycombinator.com/")
	doc.Find("html body").Children().Each(func(_ int, s *goquery.Selection) {
		getTagName(s)
	})
}

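Having getTagName(s.Children()) inside the loop was causing trouble. s.Children() returns the children of every node in the selection, so calling it once per node walks the same subtrees again and again. The repeated work multiplies at each level of the tree, which is why the program appears to loop.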


Posted on 2015-10-17 16:44:12