What does `:first-of-type` really mean?
Question
Given the following HTML:
<p>paragraph text 1</p>
<p>paragraph text 1</p>
<div class="heading-h3">Category Title 1</div>
<p>1. <a href="#item1">
<strong>Item One</strong>
</a>
<br>2. <a href="#item2">
<strong>Item Two</strong>
</a>
<br>3. <a href="#item3">
<strong>Item Three</strong>
</a>
<br>4. <a href="#item4">
<strong>Item Four</strong>
</a>
<div class="heading-h3">Category Title 2</div>
<p>1. <a href="#item11">
<strong>Item Eleven</strong>
</a>
<br>2. <a href="#item12">
<strong>Item Twelve</strong>
</a>
<br>3. <a href="#item13">
<strong>Item Thirteen</strong>
</a>
<br>4. <a href="#item14">
<strong>Item Fourteen</strong>
</a>
I would like to use a single jsoup selector expression that returns only the first of the two `<div class="heading-h3">` elements.
That is, if `select("div.heading-h3")` returns two elements and `select("div.heading-h3").first()` returns only the first of the two, I would like a single jsoup selector expression that limits the result set to that single (first) element without resorting to `Elements.first()`.
At first, I thought that `"div.heading-h3:first-of-type"` would accomplish that, but when tested, it returns no elements at all.
What am I missing in my interpretation of the `:first-of-type` "structural pseudo selector"? Is it possible to accomplish what I want in a single jsoup selector, i.e. without resorting to `Elements.first()`?
Answer 1
Score: 1
Trying `div.heading-h3:first-of-type` (with exactly the HTML typed in the question) at https://try.jsoup.org/ actually works as I originally expected:
<img src="https://i.stack.imgur.com/2iERW.png" width="262" height="441">
But in my Java program this doesn't work because the actual HTML being parsed by my program is much larger.
Assuming that the jsoup version at https://try.jsoup.org/ is the latest and greatest, I can only conclude that there are some practical limitations to jsoup that make it behave inconsistently when dealing with huge or "difficult" (to jsoup) HTML.
This comment in a different SO thread suggests that "jsoup can alter (fix) the DOM...", which to me means "consistency or correctness is not guaranteed".
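For comparison, the CSS definition of `:first-of-type` is worth keeping in mind: it matches an element that is the first among its siblings with the same tag name, and the class part of the selector is only an extra filter applied on top. The following is a minimal sketch of that rule in plain Java. It is not jsoup's implementation, and `Node`, `isFirstOfType`, `select`, `questionSiblings` and `largerPageSiblings` are hypothetical names made up for illustration. The second sibling list assumes that the larger page has some other `div` earlier under the same parent, which is a guess, since the real page is not shown.

```java
import java.util.ArrayList;
import java.util.List;

class FirstOfTypeSketch {
    // Hypothetical stand-in for a DOM element: a tag name plus one class (may be empty).
    static final class Node {
        final String tag;
        final String cssClass;
        Node(String tag, String cssClass) { this.tag = tag; this.cssClass = cssClass; }
    }

    // True if siblings.get(index) is the first sibling with its tag name.
    // The class attribute plays no part in this check.
    static boolean isFirstOfType(List<Node> siblings, int index) {
        String tag = siblings.get(index).tag;
        for (int i = 0; i < index; i++) {
            if (siblings.get(i).tag.equals(tag)) return false;
        }
        return true;
    }

    // Models "tag.cssClass:first-of-type" over the children of a single parent.
    static List<Node> select(List<Node> siblings, String tag, String cssClass) {
        List<Node> out = new ArrayList<>();
        for (int i = 0; i < siblings.size(); i++) {
            Node n = siblings.get(i);
            if (n.tag.equals(tag) && n.cssClass.equals(cssClass) && isFirstOfType(siblings, i)) {
                out.add(n);
            }
        }
        return out;
    }

    // Sibling order of the HTML in the question: p, p, div, p, div, p.
    static List<Node> questionSiblings() {
        return List.of(new Node("p", ""), new Node("p", ""),
                new Node("div", "heading-h3"), new Node("p", ""),
                new Node("div", "heading-h3"), new Node("p", ""));
    }

    // Assumed shape of a larger page: an unrelated div comes first under the same parent.
    static List<Node> largerPageSiblings() {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("div", "banner"));
        nodes.addAll(questionSiblings());
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(select(questionSiblings(), "div", "heading-h3").size());   // prints 1
        System.out.println(select(largerPageSiblings(), "div", "heading-h3").size()); // prints 0
    }
}
```

Under this rule, the small snippet matches because the first `div.heading-h3` happens to be the first `div` among its siblings. If the real page has any other `div` before it under the same parent, the same selector would return nothing, without any inconsistency on jsoup's side.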