April 9, 2020, 23:09:04

How can I get the HTML of a website using Spring Boot and store the whole HTML in one String variable?

Question

I tried to find material on how to get the HTML of a website using Spring Boot, but I couldn't find a good example. Can anyone help me with a solution?

Answer 1

Score: 0

You can use an HTML parser such as JSoup to do this.

Demo:

import java.io.IOException;
import org.jsoup.Jsoup;

public class JSoupDemo {
    public static void main(String[] args) throws IOException {
        String webPage = "http://www.example.com";
        // Fetch and parse the page, then serialize the parsed document back
        // to a String (JSoup normalizes and re-indents the markup)
        String html = Jsoup.connect(webPage).get().html();
        System.out.println(html);
    }
}

Output:

<!doctype html>
<html>
 <head> 
  <title>Example Domain</title> 
  <meta charset="utf-8"> 
  <meta http-equiv="Content-type" content="text/html; charset=utf-8"> 
  <meta name="viewport" content="width=device-width, initial-scale=1"> 
  <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style> 
 </head> 
 <body> 
  <div> 
   <h1>Example Domain</h1> 
   <p>This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.</p> 
   <p><a href="https://www.iana.org/domains/example">More information...</a></p> 
  </div>   
 </body>
</html>

**Alternatively,** you can do it using java.io.BufferedReader, as follows:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        // try-with-resources closes the reader and the underlying stream.
        // UTF-8 is assumed here, which is what example.com declares.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new URL("http://www.example.com").openStream(), StandardCharsets.UTF_8))) {
            String line;
            StringBuilder sb = new StringBuilder();
            while ((line = br.readLine()) != null) {
                sb.append(line);
                sb.append(System.lineSeparator());
            }
            // The whole page is now held in one String variable
            String html = sb.toString();
            System.out.println(html);
        } catch (IOException e) {
            // MalformedURLException is a subclass of IOException
            e.printStackTrace();
        }
    }
}
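
If you are on Java 11 or later, the JDK's built-in java.net.http.HttpClient is another option that needs no extra dependency, so it should also work unchanged inside a Spring Boot application. The following is only a minimal sketch, not part of the original answer: the class name HttpClientDemo, the helper fetchHtml, and the 10-second timeout are all illustrative assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class HttpClientDemo {

    // Hypothetical helper (not from the original answer): fetches a page
    // and returns the response body as one String.
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                // Follow redirects, except from HTTPS to HTTP
                .followRedirects(HttpClient.Redirect.NORMAL)
                // Assumed timeout; tune it for your environment
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        // ofString() decodes the body using the charset from the Content-Type
        // header and falls back to UTF-8 when none is given
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String html = fetchHtml("http://www.example.com");
        System.out.println(html);
    }
}
```

Unlike the JSoup version, this returns the body exactly as the server sent it, without re-parsing or re-indenting the markup. Note that it does not check the status code, so an error page would be returned as a String as well.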
