What is the <cbn-root> HTML element, and how can I parse it with Java?

Question

I am trying to write a Java program that monitors whether reserved spots become available on this website: https://www.drpciv.ro/drpciv-booking/formular/23/exchangingForeignDriverLicence

But when I view the page source in Chrome or Edge, the body contains only <cbn-root></cbn-root>. Using Chrome's Inspect function, however, I can see the complete body. When I try to get the content of the web page in Java with HtmlUnit, it also gets only <cbn-root></cbn-root> and no real content.

I tried to google <cbn-root> but didn't find any useful information.
I wonder what the <cbn-root> element is and how to read the real content in Java in this case.

Thank you

Answer 1

Score: 0

Try https://stackoverflow.com/questions/44867425/beautiful-soup-cant-find-tags
It explains that the JS is loaded asynchronously, so your GET request can't actually get the tag. Read more here.
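
To make the point above concrete, here is a minimal sketch that checks whether a fetched HTML string still holds only the empty application shell, i.e. the scripts have not filled it in yet. The class name EmptyShellCheck, the method isEmptyElement, and the sample strings are all made up for illustration, and the regex check is a simplification, not a real HTML parser.

```java
import java.util.regex.Pattern;

public class EmptyShellCheck {

    // Returns true if the HTML contains <tag ...></tag> with nothing but
    // whitespace between the opening and closing tag (hypothetical helper).
    public static boolean isEmptyElement(String html, String tag) {
        String t = Pattern.quote(tag);
        Pattern p = Pattern.compile(
                "<" + t + "(\\s[^>]*)?>\\s*</" + t + "\\s*>",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(html).find();
    }

    public static void main(String[] args) {
        // What "view source" or a plain GET request typically returns (assumed sample).
        String rawSource = "<html><body><cbn-root></cbn-root></body></html>";
        // What the DOM might look like after the scripts have run (assumed sample).
        String rendered = "<html><body><cbn-root><form>...</form></cbn-root></body></html>";

        System.out.println(isEmptyElement(rawSource, "cbn-root"));
        System.out.println(isEmptyElement(rendered, "cbn-root"));
    }
}
```

If a check like this reports an empty element for the downloaded page, the content is most likely produced by client-side scripts, and an HTTP client that does not execute JavaScript will not see it.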

Answer 2

Score: 0

At least with the upcoming HtmlUnit version 2.43.0, the tag gets replaced.

import java.io.IOException;

// Package names as used by HtmlUnit 2.x.
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Main {
    public static void main(String[] args) throws IOException {
        String url = "https://www.drpciv.ro/drpciv-booking/formular/23/exchangingForeignDriverLicence";

        try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
            // Keep going even if a script on the page throws an error.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            System.out.println(" ---- ");
            // Give the asynchronously loaded JavaScript up to 10 seconds to finish.
            webClient.waitForBackgroundJavaScript(10_000);

            System.out.println(" ---- ");
            // Print the DOM as it looks after the scripts have run.
            System.out.println(page.asXml());
        }
    }
}
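
Since the goal in the question is to monitor the page, one possible alternative to a single fixed wait is to poll until the element is no longer empty. The sketch below uses only the standard library. RenderPoller and pollUntil are hypothetical names, and the page is simulated by a fake supplier that returns the empty tag for the first two reads. With HtmlUnit, the supplier would presumably wrap a call such as page.asXml().

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class RenderPoller {

    // Calls source repeatedly until ready accepts its value or the timeout elapses.
    // Returns the accepted value, or Optional.empty() on timeout or interruption.
    public static Optional<String> pollUntil(Supplier<String> source,
                                             Predicate<String> ready,
                                             long timeoutMillis,
                                             long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            String value = source.get();
            if (value != null && ready.test(value)) {
                return Optional.of(value);
            }
            // Subtraction keeps the comparison safe against nanoTime overflow.
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty();
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Simulated page source: the element stays empty for the first two reads.
        int[] reads = {0};
        Supplier<String> fakePage = () -> reads[0]++ < 2
                ? "<cbn-root></cbn-root>"
                : "<cbn-root><form>booking form</form></cbn-root>";

        Optional<String> result = pollUntil(
                fakePage,
                html -> !html.contains("<cbn-root></cbn-root>"),
                2_000, 50);
        System.out.println(result.orElse("timed out"));
    }
}
```

Polling returns as soon as the content appears instead of always waiting the full period, and the timeout keeps the monitor from hanging if the page never renders.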

huangapple
  • Posted on August 9, 2020 at 16:38:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63324230.html