How can I get elements by tag from a span class which has the same name as other classes?

Question

I am having trouble getting content from a single span class that shares its name with other span classes. My previous attempts also returned content I did not want from those other span classes. I found a solution that works, but I do not believe it is sturdy enough.

First I connect:

Document doc = Jsoup.connect("https://www.imdb.com/list/ls005750764/").get();

Then I select a class:

Elements rating = doc.select("div.ipl-rating-star.small");

This div contains two span classes, one of which is the one I want:

"span.ipl-rating__star"

The other span classes are outside of "div.ipl-rating-star.small", so the tag name does not recur within my selection. Here I add the content to an array list.

Add to array list:

for(Element g: rating) {
    ratings.add(g.getElementsByTag("span").text());
}

When I print the contents of the array list, I get exactly what I want. More importantly, it only comes from the span class I want, because parsing is forced to stay within the div class I selected.

My main concern now is the other span class within this div class: the program somehow does not get confused by it, even though both tags are span. Any ideas on this would be helpful.
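
A minimal sketch of the scoping described above, run against hypothetical markup. The HTML string, the "genre" span, and the class name "ipl-rating-star__star" for the icon span are assumptions, and the real page may differ. If the icon span carries no text, getElementsByTag("span").text() combines the text of both spans in the div and yields only the rating, which would explain why the second span causes no confusion.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ScopedSpanDemo {
    // Hypothetical markup: one span outside the rating div, two inside it.
    static final String HTML =
          "<div class=\"lister-item\">"
        +   "<span class=\"genre\">Drama</span>"
        +   "<div class=\"ipl-rating-star small\">"
        +     "<span class=\"ipl-rating-star__star\"></span>"
        +     "<span class=\"ipl-rating-star__rating\">9.2</span>"
        +   "</div>"
        + "</div>"
        + "<div class=\"lister-item\">"
        +   "<span class=\"genre\">Crime</span>"
        +   "<div class=\"ipl-rating-star small\">"
        +     "<span class=\"ipl-rating-star__star\"></span>"
        +     "<span class=\"ipl-rating-star__rating\">8.7</span>"
        +   "</div>"
        + "</div>";

    static List<String> scopedRatings(String html) {
        Document doc = Jsoup.parse(html);
        List<String> ratings = new ArrayList<>();
        for (Element g : doc.select("div.ipl-rating-star.small")) {
            // Only spans inside g are searched, so the "genre" span is never seen.
            // text() joins the text of every matched span; the icon span is empty,
            // so only the rating remains. trim() guards against stray whitespace.
            ratings.add(g.getElementsByTag("span").text().trim());
        }
        return ratings;
    }

    public static void main(String[] args) {
        System.out.println(scopedRatings(HTML));
    }
}
```

If the second span ever gained text of its own, it would be joined into the same string, which is why selecting the wanted span by its class is the sturdier choice.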

Answer 1

Score: 0

I managed to get the star ratings with:

doc.select("div[class^=\"ipl-rating-star small\"]").select("span[class=\"ipl-rating-star__rating\"]").text().split(" ")

This returns a String[] which has the star ratings you're after.

Please check the docs for some good examples to copy from: https://jsoup.org/cookbook/extracting-data/dom-navigation
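
A possible variation on the selector above, shown as a sketch rather than a definitive fix. It uses class selectors with a descendant combinator, so it does not depend on the exact order of names in the class attribute, and it collects one string per element with eachText() instead of splitting the combined text on spaces. The sample HTML and the class and method names RatingSelector and ratings are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.util.List;

public class RatingSelector {
    // Hypothetical markup: the first span reuses the rating class outside the
    // rating div and should not be matched.
    static final String HTML =
          "<span class=\"ipl-rating-star__rating\">outside</span>"
        + "<div class=\"ipl-rating-star small\">"
        +   "<span class=\"ipl-rating-star__star\"></span>"
        +   "<span class=\"ipl-rating-star__rating\">9.2</span>"
        + "</div>"
        + "<div class=\"ipl-rating-star small\">"
        +   "<span class=\"ipl-rating-star__star\"></span>"
        +   "<span class=\"ipl-rating-star__rating\">8.7</span>"
        + "</div>";

    static List<String> ratings(Document doc) {
        // "A B" matches B elements that are descendants of A, so only rating
        // spans inside a div carrying both classes are returned.
        return doc
            .select("div.ipl-rating-star.small span.ipl-rating-star__rating")
            .eachText();
    }

    public static void main(String[] args) {
        // For the live page, the document would come from Jsoup.connect(...).get().
        Document doc = Jsoup.parse(HTML);
        System.out.println(ratings(doc));
    }
}
```

Returning a List<String> keeps each rating tied to one element, which avoids surprises if a rating's text ever contains a space.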

Posted by huangapple on July 30, 2020, 00:49:53
Source: https://go.coder-hub.com/63158582.html