goquery: stop parsing when another element is reached

Question

Suppose I have this HTML page. I want to parse it using Go and goquery:

<html>
    <head><!--Page header stuff--></head>
    <body>
        <h1 class="h1-class">Heading 1</h1>
            <div class="div-class">Stuff1</div>
            <div class="div-class">Stuff2</div>
        <h1 class="h1-class">Heading 2</h1>
            <div class="div-class">Stuff3</div>
            <div class="div-class">Stuff4</div>
    </body>
</html>

As it happens, I'd like to get only the DIVs that come before Heading 2 and skip the rest. This code works great for getting all of the DIVs:

    doc := GetGoQueryDocument(url) // Defined elsewhere
    doc.Find("div.div-class").Each(func(_ int, theDiv *goquery.Selection) {
        // do stuff with each theDiv
        // The problem is that it finds div.div-class elements below Heading 2.
        // I want to skip those.
    })

Is there any way to tell goquery to skip elements located beneath a certain tag and classname? Thanks for any tips!

Answer 1

Score: 2

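Yes, and it's actually pretty easy:

    doc.Find(".h1-class").First().NextUntil(".h1-class")

This selects the first .h1-class element and then every following sibling up to, but not including, the next .h1-class element. In your page, that is the two DIVs under Heading 1.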

I would recommend you read through the godoc: https://godoc.org/github.com/PuerkitoBio/goquery

It explains all of the different ways you can manipulate the selection.

Posted by huangapple on January 11, 2017 at 05:29:41
Source: https://go.coder-hub.com/41578768.html