goquery: stop parsing when another element is reached
Question
Suppose I have this HTML page. I want to parse it using Go and goquery:
<html>
<head><!--Page header stuff--></head>
<body>
<h1 class="h1-class">Heading 1</h1>
<div class="div-class">Stuff1</div>
<div class="div-class">Stuff2</div>
<h1 class="h1-class">Heading 2</h1>
<div class="div-class">Stuff3</div>
<div class="div-class">Stuff4</div>
</body>
</html>
I only want the DIVs that come before Heading 2, and I want to skip the rest. This code works great for getting all of the DIVs:
doc := GetGoQueryDocument(url) // defined elsewhere
doc.Find("div.div-class").Each(func(_ int, theDiv *goquery.Selection) {
	// Do stuff with each theDiv.
	// The problem is that it also finds the div.div-class elements below Heading 2.
	// I want to skip those.
})
Is there any way to tell goquery to skip elements located beneath a certain tag and classname? Thanks for any tips!
Answer 1
Score: 2
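Yes, it's actually pretty easy. Select the first heading, then take its following siblings up to (but not including) the next heading: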
doc.Find(".h1-class").First().NextUntil(".h1-class")
I would recommend you read through the godoc: https://godoc.org/github.com/PuerkitoBio/goquery
It explains all of the different ways you can manipulate the selection.