2013年2月18日 00:58:05go评论130阅读模式

英文:

Create pdf from html in golang

问题

如何在Google Go中从HTML输入创建PDF文件？如果目前还不可能，是否有任何旨在解决此问题的初始方案？

我正在寻找类似于PHP中的TCPDF的解决方案。

英文:

How to create PDF files from an HTML input in Google Go? If it is not possible yet, are there any initations that aims to solve this problem?

I'm looking for a solution like TCPDF in php.

答案1

得分: 17

关于gopdf（https://github.com/signintech/gopdf），它似乎是你正在寻找的。

英文:

what about gopdf (https://github.com/signintech/gopdf).

It seems like you are looking for.

答案2

得分: 17

>安装

go get -u github.com/SebastiaanKlippert/go-wkhtmltopdf

go version go1.9.2 linux/amd64

>代码

   import (
    	&quot;fmt&quot;
    	&quot;strings&quot;
    	wkhtml &quot;github.com/SebastiaanKlippert/go-wkhtmltopdf&quot;
    )  
    
      func main(){
                 pdfg, err :=  wkhtml.NewPDFGenerator()
               if err != nil{
            	  return
              }
              htmlStr := `&lt;html&gt;&lt;body&gt;&lt;h1 style=&quot;color:red;&quot;&gt;This is an html
 from pdf to test color&lt;h1&gt;&lt;img src=&quot;http://api.qrserver.com/v1/create-qr-
code/?data=HelloWorld&quot; alt=&quot;img&quot; height=&quot;42&quot; width=&quot;42&quot;&gt;&lt;/img&gt;&lt;/body&gt;&lt;/html&gt;`
            
              pdfg.AddPage(wkhtml.NewPageReader(strings.NewReader(htmlStr)))
            
   
              // 在内部缓冲区中创建PDF文档
              err = pdfg.Create()
              if err != nil {
            	  log.Fatal(err)
              }
            
               //你的PDF名称
               err = pdfg.WriteFile(&quot;./Your_pdfname.pdf&quot;)
              if err != nil {
            	  log.Fatal(err)
              }
            
              fmt.Println(&quot;完成&quot;)
        }

以上代码用于将HTML转换为PDF，具有正确的背景图像和嵌入的CSS样式标签

查看仓库

查看拉取请求改进文档

推荐使用（来自https://wkhtmltopdf.org/status.html）：

不要使用wkhtmltopdf处理任何不受信任的HTML-请确保对任何用户提供的HTML/JS进行清理，否则可能导致完全接管运行的服务器！请考虑使用强制访问控制系统，如AppArmor或SELinux，请参阅推荐的AppArmor策略。

如果您将其用于报告生成（即使用您控制的HTML），还可以考虑使用WeasyPrint或商业工具Prince-请注意，我与这两个项目都没有关联，请进行尽职调查。

如果您将其用于转换使用动态JS的网站，请考虑使用puppeteer或其许多封装之一。

英文:

>Installation

go get -u github.com/SebastiaanKlippert/go-wkhtmltopdf

go version go1.9.2 linux/amd64

>code

   import (
    	&quot;fmt&quot;
    	&quot;strings&quot;
    	wkhtml &quot;github.com/SebastiaanKlippert/go-wkhtmltopdf&quot;
    )  
    
      func main(){
                 pdfg, err :=  wkhtml.NewPDFGenerator()
               if err != nil{
            	  return
              }
              htmlStr := `&lt;html&gt;&lt;body&gt;&lt;h1 style=&quot;color:red;&quot;&gt;This is an html
 from pdf to test color&lt;h1&gt;&lt;img src=&quot;http://api.qrserver.com/v1/create-qr-
code/?data=HelloWorld&quot; alt=&quot;img&quot; height=&quot;42&quot; width=&quot;42&quot;&gt;&lt;/img&gt;&lt;/body&gt;&lt;/html&gt;`
            
              pdfg.AddPage(wkhtml.NewPageReader(strings.NewReader(htmlStr)))
            
   
              // Create PDF document in internal buffer
              err = pdfg.Create()
              if err != nil {
            	  log.Fatal(err)
              }
            
               //Your Pdf Name
               err = pdfg.WriteFile(&quot;./Your_pdfname.pdf&quot;)
              if err != nil {
            	  log.Fatal(err)
              }
            
              fmt.Println(&quot;Done&quot;)
        }

The Above code Works for Converting html to pdf in golang with proper background image and Embedded Css Style Tags

Check repo

See Pull request Documentation Improved

Recommendations (from https://wkhtmltopdf.org/status.html) :

Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on! Please consider using a Mandatory Access Control system like AppArmor or SELinux, see recommended AppArmor policy.

If you’re using it for report generation (i.e. with HTML you control), also consider using WeasyPrint or the commercial tool Prince – note that I’m not affiliated with either project, and do your diligence.

If you’re using it to convert a site which uses dynamic JS, consider using puppeteer or one of the many wrappers it has.

答案3

得分: 7

还有这个包wkhtmltopdf-go，它使用了libwkhtmltox库。不过我不确定它有多稳定。

英文:

There is also this package wkhtmltopdf-go, which uses the libwkhtmltox library. I am not sure how stable it is though.

答案4

得分: 7

The function page.PrintToPDF() works great.

Here is an example using it with chromedp (go get -u github.com/chromedp/chromedp):

import (
	"context"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"time"
	"github.com/chromedp/cdproto/emulation"
	"github.com/chromedp/cdproto/page"
	"github.com/chromedp/chromedp"
)
func main() {
		taskCtx, cancel := chromedp.NewContext(
			context.Background(),
			chromedp.WithLogf(log.Printf),
		)
		defer cancel()
		var pdfBuffer []byte
		if err := chromedp.Run(taskCtx, pdfGrabber("https://www.wikipedia.org", "body", &pdfBuffer)); err != nil {
			log.Fatal(err)
		}
		if err := ioutil.WriteFile("coolsite.pdf", pdfBuffer, 0644); err != nil {
			log.Fatal(err)
		}
}
func pdfGrabber(url string, sel string, res *[]byte) chromedp.Tasks {
	start := time.Now()
	return chromedp.Tasks{
		emulation.SetUserAgentOverride("WebScraper 1.0"),
		chromedp.Navigate(url),
		// wait for footer element is visible (ie, page is loaded)
		// chromedp.ScrollIntoView(`footer`),
		chromedp.WaitVisible("body", chromedp.ByQuery),
		// chromedp.Text(`h1`, &res, chromedp.NodeVisible, chromedp.ByQuery),
		chromedp.ActionFunc(func(ctx context.Context) error {
			buf, _, err := page.PrintToPDF().WithPrintBackground(true).Do(ctx)
			if err != nil {
				return err
			}
			*res = buf
			//fmt.Printf("h1 contains: '%s'\n", res)
			fmt.Printf("\nTook: %f secs\n", time.Since(start).Seconds())
			return nil
		}),
	}
}

The above will load wikipedia.org in chrome headless and wait for body to show up and then save it as pdf.

results in terminal:

$ go run main.go
https://www.wikipedia.org
Scraping url now...
Took: 2.772797 secs

英文:

The function page.PrintToPDF() works great.

Here is an example using it with chromedp (go get -u github.com/chromedp/chromedp):

import (
&quot;context&quot;
&quot;fmt&quot;
&quot;io/ioutil&quot;
&quot;log&quot;
&quot;net/http&quot;
&quot;os&quot;
&quot;time&quot;
&quot;github.com/chromedp/cdproto/emulation&quot;
&quot;github.com/chromedp/cdproto/page&quot;
&quot;github.com/chromedp/chromedp&quot;
)
func main() {
taskCtx, cancel := chromedp.NewContext(
context.Background(),
chromedp.WithLogf(log.Printf),
)
defer cancel()
var pdfBuffer []byte
if err := chromedp.Run(taskCtx, pdfGrabber(&quot;https://www.wikipedia.org&quot;, &quot;body&quot;, &amp;pdfBuffer)); err != nil {
log.Fatal(err)
}
if err := ioutil.WriteFile(&quot;coolsite.pdf&quot;, pdfBuffer, 0644); err != nil {
log.Fatal(err)
}
}
func pdfGrabber(url string, sel string, res *[]byte) chromedp.Tasks {
start := time.Now()
return chromedp.Tasks{
emulation.SetUserAgentOverride(&quot;WebScraper 1.0&quot;),
chromedp.Navigate(url),
// wait for footer element is visible (ie, page is loaded)
// chromedp.ScrollIntoView(`footer`),
chromedp.WaitVisible(`body`, chromedp.ByQuery),
// chromedp.Text(`h1`, &amp;res, chromedp.NodeVisible, chromedp.ByQuery),
chromedp.ActionFunc(func(ctx context.Context) error {
buf, _, err := page.PrintToPDF().WithPrintBackground(true).Do(ctx)
if err != nil {
return err
}
*res = buf
//fmt.Printf(&quot;h1 contains: &#39;%s&#39;\n&quot;, res)
fmt.Printf(&quot;\nTook: %f secs\n&quot;, time.Since(start).Seconds())
return nil
}),
}
}

The above will load wikipedia.org in chrome headless and wait for body to show up and then save it as pdf.

results in terminal:

$ go run main.go
https://www.wikipedia.org
Scraping url now...
Took: 2.772797 secs

答案5

得分: 4

我不认为我理解你的要求。由于HTML是一种标记语言，它需要上下文来渲染（CSS和屏幕尺寸）。我看到的现有实现通常会在无头浏览器中打开页面，并以此方式创建PDF。

就我个人而言，我会使用现有的包并从Go中调用外部程序。这个看起来不错；甚至在这个答案中也推荐了它。

如果你真的决定要完全在Go中实现，可以看看这个WebKit封装。我不确定你会用什么来生成PDF，但至少这是一个起点。

英文:

I don't think I understand your requirements. Since HTML is a markup language, it needs context to render (CSS and a screen size). Existing implementations I've seen generally open the page in a headless browser and create a PDF that way.

Personally, I would just use an existing package and shell out from Go. This one looks good; it's even recommended in this answer.

If you're really determined to implement it all in Go, check out this WebKit wrapper. I'm not sure what you'd use for generating PDFs, but but at least it's a start.

答案6

得分: 3

我正在创建一个替代库，以更简单的方式创建PDF（https://github.com/johnfercher/maroto）。它使用gofpdf，具有网格系统和一些组件，如Bootstrap。

英文:

I'm creating an alternative lib to create PDFs in a simpler way (https://github.com/johnfercher/maroto). It uses gofpdf and have a grid system and some components like Bootstrap.

答案7

得分: 1

另一个选择是Athena。它有一个用Go编写的微服务，或者可以用作命令行界面。

英文:

Another option is Athena. It has a microservice written in Go or it can be used as a CLI.

答案8

得分: -4

另一个选择是UniHTML（基于容器的，带有API），它与UniPDF相互操作，可用于基于HTML模板创建PDF报告等。

它在容器中使用无头Chrome引擎，因此渲染效果完美，并具备所有HTML功能。与UniPDF的结合还提供了其他优势，例如自动生成目录、大纲等。还可以添加密码保护、添加PDF表单、数字签名等功能。

要为磁盘上的HTML模板创建PDF，可以执行以下操作：

package main
import (
	"fmt"
	"os"
	"github.com/unidoc/unihtml"
	"github.com/unidoc/unipdf/v3/common/license"
	"github.com/unidoc/unipdf/v3/creator"
)
func main() {
	// 设置UniDoc许可证。
	if err := license.SetMeteredKey("我的API密钥在这里"); err != nil {
		fmt.Printf("错误：设置计量密钥失败：%v\n", err)
		os.Exit(1)
	}
	// 与UniHTML服务器建立连接。
	if err := unihtml.Connect(":8080"); err != nil {
		fmt.Printf("错误：连接失败：%v\n", err)
		os.Exit(1)
	}
	// 获取新的PDF Creator。
	c := creator.New()
	// AddTOC启用目录生成。
	c.AddTOC = true
	chapter := c.NewChapter("Points")
	// 读取sample.html文件的内容并加载到转换中。
	htmlDocument, err := unihtml.NewDocument("sample.html")
	if err != nil {
		fmt.Printf("错误：NewDocument失败：%v\n", err)
		os.Exit(1)
	}
	// 在创建者的上下文中绘制HTML文档文件。
	if err = chapter.Add(htmlDocument); err != nil {
		fmt.Printf("错误：绘制失败：%v\n", err)
		os.Exit(1)
	}
	if err = c.Draw(chapter); err != nil {
		fmt.Printf("错误：绘制失败：%v\n", err)
		os.Exit(1)
	}
	// 将结果文件写入PDF。
	if err = c.WriteToFile("sample.pdf"); err != nil {
		fmt.Printf("错误：%v\n", err)
		os.Exit(1)
	}
}

我在这里写了一篇关于UniHTML的介绍文章，如果需要更多信息可能会有用（https://www.unidoc.io/post/html-for-pdf-reports-in-go）。

声明：我是UniPDF的原始开发者。

英文:

Another option is UniHTML (container-based with API) which interoperates with UniPDF which is useful to create PDF reports and such based on HTML templates.

It uses a headless-chrome engine in a container, so the rendering is perfect and has all HTML features. The combination with UniPDF gives additional advantages, such as automatic table of content generation, outlines and such. As well as ability to add password protection, add PDF forms, digital signatures and such.

To create a PDF for an HTML template on disk, it can be done by:

package main
import (
&quot;fmt&quot;
&quot;os&quot;
&quot;github.com/unidoc/unihtml&quot;
&quot;github.com/unidoc/unipdf/v3/common/license&quot;
&quot;github.com/unidoc/unipdf/v3/creator&quot;
)
func main() {
// Set the UniDoc license.
if err := license.SetMeteredKey(&quot;my api key goes here&quot;); err != nil {
fmt.Printf(&quot;Err: setting metered key failed: %v\n&quot;, err)
os.Exit(1)
}
// Establish connection with the UniHTML Server.
if err := unihtml.Connect(&quot;:8080&quot;); err != nil {
fmt.Printf(&quot;Err:  Connect failed: %v\n&quot;, err)
os.Exit(1)
}
// Get new PDF Creator.
c := creator.New()
// AddTOC enables Table of Contents generation.
c.AddTOC = true
chapter := c.NewChapter(&quot;Points&quot;)
// Read the content of the sample.html file and load it to the conversion.
htmlDocument, err := unihtml.NewDocument(&quot;sample.html&quot;)
if err != nil {
fmt.Printf(&quot;Err: NewDocument failed: %v\n&quot;, err)
os.Exit(1)
}
// Draw the html document file in the context of the creator.
if err = chapter.Add(htmlDocument); err != nil {
fmt.Printf(&quot;Err: Draw failed: %v\n&quot;, err)
os.Exit(1)
}
if err = c.Draw(chapter); err != nil {
fmt.Printf(&quot;Err: Draw failed: %v\n&quot;, err)
os.Exit(1)
}
// Write the result file to PDF.
if err = c.WriteToFile(&quot;sample.pdf&quot;); err != nil {
fmt.Printf(&quot;Err: %v\n&quot;, err)
os.Exit(1)
}
}

I have written an introduction article to UniHTML [here] which might be useful if more information is needed (https://www.unidoc.io/post/html-for-pdf-reports-in-go).

Disclosure: I am the original developer of UniPDF.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

使用golang从HTML创建PDF

问题

答案1

答案2

答案3

答案4

答案5

答案6

答案7

答案8

Golang anonymous struct in initializing slice of pointers

在VSCode中更严格地检查Golang代码的规范性。

Adding query Parameters to Go Json Rest

为什么 `sync.WaitGroup` 无法完成？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。