Question

How can I get specific words from a URL in Java? For example, I want to extract data from an element whose class is named something like blablabla.
Here is my code.

    URL url = new URL("https://www.doviz.com/");
    URLConnection connect = url.openConnection();
    InputStream is = connect.getInputStream();
    BufferedReader br = new BufferedReader(new InputStreamReader(is));
    String line = null;
    while((line = br.readLine()) != null)
    {
        System.out.println(line);
    }

Answer 1

Score: 1

Take a look at Jsoup. It lets you get the text content of a web page rather than the raw HTML. It plays a role similar to a browser: it parses the HTML tags and gives you human-readable text.

Once you have the content of the page in a String, you can count the occurrences of your word with any occurrence-counting algorithm.

A simple example of using it:

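    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    /*   ........  */
    String url = "https://www.doviz.com/";
    // get() fetches and parses the page; it throws IOException on failure
    Document doc = Jsoup.connect(url).get();
    // text() returns the visible text of <body> with the tags stripped
    String text = doc.body().text();
    System.out.println(text);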
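
Once the page text is in a String, counting a word could look roughly like the sketch below. It uses only the standard library. The class name `WordCounter`, the method `countOccurrences`, and the sample sentence are made up for illustration, and the whole-word, case-insensitive matching is an assumption about what "occurrence" should mean here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup or of the original answer.
public class WordCounter {

    // Counts whole-word, case-insensitive occurrences of `word` in `text`.
    public static int countOccurrences(String text, String word) {
        // Pattern.quote treats the word literally, even if it contains regex metacharacters.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample text standing in for doc.body().text(); it is not real page content.
        String text = "Dolar rises. Euro falls. The dolar is volatile.";
        System.out.println(countOccurrences(text, "dolar"));
    }
}
```

The `\b` word boundaries keep the search word from matching inside a longer word; drop them if substring matches should count too.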

EDIT

If you don't want to use a parser (you mentioned in a comment that you don't want external libraries), you will get the whole HTML code of the page instead. Here is how you can do it:

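    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    /*   ........  */
    StringBuilder html = new StringBuilder();
    // try-with-resources closes the reader even if reading fails.
    // UTF-8 is an assumption about the page encoding.
    try (BufferedReader in = new BufferedReader(new InputStreamReader(
            new URL("https://www.doviz.com/").openStream(), StandardCharsets.UTF_8))) {
        String str;
        // Call readLine() only once per iteration; a second call inside the
        // loop would skip every other line and can throw at end of stream.
        while ((str = in.readLine()) != null) {
            System.out.println(str);
            // str holds one line at a time; append it to keep the whole page
            html.append(str).append('\n');
        }
    } catch (IOException e) {
        // Report the failure instead of silently swallowing it
        e.printStackTrace();
    }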
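
If the goal is the text inside elements that carry a particular class, and external libraries are off the table, a regular expression over the downloaded HTML is one possible approach. The sketch below is only that: `ClassTextExtractor`, `extractByClass`, and the sample markup are hypothetical, and it assumes the class attribute is double-quoted, holds exactly one class name, and that the element contains no nested tags. Real pages often break those assumptions, which is why an HTML parser such as Jsoup is usually the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a fragile regex-based sketch, not a general HTML parser.
public class ClassTextExtractor {

    // Returns the trimmed inner text of every element written as
    // <tag ... class="className" ...>text</tag> with no nested tags inside.
    public static List<String> extractByClass(String html, String className) {
        Pattern pattern = Pattern.compile(
                "<(\\w+)[^>]*\\bclass=\"" + Pattern.quote(className) + "\"[^>]*>"
                + "([^<]*)"      // group 2: text up to the next tag
                + "</\\1>");     // closing tag must match the opening tag name
        Matcher matcher = pattern.matcher(html);
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            results.add(matcher.group(2).trim());
        }
        return results;
    }

    public static void main(String[] args) {
        // Made-up markup standing in for the downloaded page source.
        String html = "<div><span class=\"rate\">34,15</span>"
                + "<span class=\"name\">USD</span>"
                + "<span class=\"rate\">36,90</span></div>";
        for (String value : extractByClass(html, "rate")) {
            System.out.println(value);
        }
    }
}
```

In practice the `html` argument would be the string accumulated while reading the page line by line.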
