GoQuery selection extractor not working

Question

I am trying to extract the target attribute from each link in the HTML snippet below and add the values to a slice.

<div class="pagination pagination-responsive">
    <ul class="list-unstyled">
        <li class="active">
            <a rel="start" target="1" href="/s/Cambridge--MA--United-States">1</a>
        </li>
        <li>
            <a rel="next" target="2" href="/s/Cambridge--MA--United-States?page=2">2</a>
        </li>
        <li>
            <a target="3" href="/s/Cambridge--MA--United-States?page=3">3</a>
        </li>
        <li class="gap"><span class="gap">&hellip;</span></li>
        <li>
            <a target="17" href="/s/Cambridge--MA--United-States?page=17">17</a>
        </li>
        <li class="next next_page">
            <a target="2" rel="next" href="/s/Cambridge--MA--United-States?page=2">
                <span class="screen-reader-only">Next</span>
                <i class="icon icon-caret-right"></i>
            </a>
        </li>
    </ul>
</div>

pageCounts := doc.Find(".pagination-responsive .list-unstyled")
for page := range pageCounts.Nodes {
    pageIterator := pageCounts.Eq(page)
    li := pageIterator.Find("li a")
    href, _ := li.Attr("target")
    fmt.Println(href)
}

Can someone please point out what I might be missing here?

Answer 1

Score: 3

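li := pageIterator.Find("li a") actually selects a whole set of elements, but Attr returns the attribute of only the first element in that set. goquery behaves like jQuery in this respect. What you want to do is iterate over all the links, and Each is going to be your friend here. I find it much easier than iterating with Eq.

This snippet works for me: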

package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

var html = `
<div class="pagination pagination-responsive">
    <ul class="list-unstyled">
        <li class="active">
            <a rel="start" target="1" href="/s/Cambridge--MA--United-States">1</a>
        </li>
        <li>
            <a rel="next" target="2" href="/s/Cambridge--MA--United-States?page=2">2</a>
        </li>
        <li>
            <a target="3" href="/s/Cambridge--MA--United-States?page=3">3</a>
        </li>
        <li class="gap"><span class="gap">&hellip;</span></li>
        <li>
            <a target="17" href="/s/Cambridge--MA--United-States?page=17">17</a>
        </li>
        <li class="next next_page">
            <a target="2" rel="next" href="/s/Cambridge--MA--United-States?page=2">
                <span class="screen-reader-only">Next</span>
                <i class="icon icon-caret-right"></i>
            </a>
        </li>
    </ul>
</div>
`

func main() {
	doc, err := goquery.NewDocumentFromReader(strings.NewReader(html))
	if err != nil {
		log.Fatal(err)
	}

	// Find may match more than one <ul>; Each visits every match.
	pageCounts := doc.Find(".pagination-responsive .list-unstyled")
	pageCounts.Each(func(_ int, ul *goquery.Selection) {
		// Visit every link instead of reading only the first one.
		links := ul.Find("li a")
		links.Each(func(_ int, a *goquery.Selection) {
			if val, ok := a.Attr("target"); ok {
				fmt.Println(val)
			}
		})
	})
}
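
The question asks for the values in a slice rather than printed output. One way to get there is to append each value inside the Each callback and then tidy up the resulting strings. The sketch below covers only that tidy-up step and uses just the standard library. pageNumbers is a hypothetical helper name, not part of goquery, and the input literal stands in for what the loop above would collect from the snippet (1, 2, 3, 17, 2, because the "Next" link repeats page 2). It also assumes the target values are meant as page numbers, which the question does not state.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// pageNumbers converts raw target attribute values into a sorted,
// de-duplicated slice of page numbers. Values that do not parse as
// integers are skipped. (Hypothetical helper, not part of goquery.)
func pageNumbers(targets []string) []int {
	seen := make(map[int]bool)
	pages := make([]int, 0, len(targets))
	for _, t := range targets {
		n, err := strconv.Atoi(t)
		if err != nil || seen[n] {
			continue
		}
		seen[n] = true
		pages = append(pages, n)
	}
	sort.Ints(pages)
	return pages
}

func main() {
	// Stand-in for the values appended inside links.Each(...).
	targets := []string{"1", "2", "3", "17", "2"}
	fmt.Println(pageNumbers(targets)) // prints [1 2 3 17]
}
```

If the raw strings are all that is needed, the helper can be dropped and the slice built by the append alone.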

huangapple
  • Posted on 2015-10-21 13:12:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/33251406.html