Create PDF from HTML in Golang

Question

How can I create PDF files from HTML input in Go? If that is not possible yet, are there any initiatives that aim to solve this problem?

I'm looking for a solution like TCPDF in PHP.

Answer 1

Score: 17

What about gopdf (https://github.com/signintech/gopdf)? It seems to be what you are looking for.

Answer 2

Score: 17

Installation

go get -u github.com/SebastiaanKlippert/go-wkhtmltopdf

go version go1.9.2 linux/amd64

Code

    package main

    import (
    	"fmt"
    	"log"
    	"strings"

    	wkhtml "github.com/SebastiaanKlippert/go-wkhtmltopdf"
    )

    func main() {
    	// Report the error instead of returning silently.
    	pdfg, err := wkhtml.NewPDFGenerator()
    	if err != nil {
    		log.Fatal(err)
    	}

    	htmlStr := `<html><body><h1 style="color:red;">This is an html from pdf to test color</h1><img src="http://api.qrserver.com/v1/create-qr-code/?data=HelloWorld" alt="img" height="42" width="42"></body></html>`

    	// Add the HTML string as a page.
    	pdfg.AddPage(wkhtml.NewPageReader(strings.NewReader(htmlStr)))

    	// Create the PDF document in the internal buffer.
    	if err := pdfg.Create(); err != nil {
    		log.Fatal(err)
    	}

    	// Write the buffer to a file; use your own PDF name here.
    	if err := pdfg.WriteFile("./Your_pdfname.pdf"); err != nil {
    		log.Fatal(err)
    	}

    	fmt.Println("Done")
    }

The above code converts HTML to PDF in Go, with background images and embedded CSS style tags rendered properly.

Check the repo.

See the pull request "Documentation Improved".

Recommendations (from https://wkhtmltopdf.org/status.html):

Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on! Please consider using a Mandatory Access Control system like AppArmor or SELinux, see recommended AppArmor policy.

If you’re using it for report generation (i.e. with HTML you control), also consider using WeasyPrint or the commercial tool Prince – note that I’m not affiliated with either project, and do your diligence.

If you’re using it to convert a site which uses dynamic JS, consider using puppeteer or one of the many wrappers it has.

Answer 3

Score: 7

There is also the package wkhtmltopdf-go, which uses the libwkhtmltox library. I am not sure how stable it is, though.

Answer 4

Score: 7

The function page.PrintToPDF() works great.

Here is an example using it with chromedp (go get -u github.com/chromedp/chromedp):

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/chromedp/cdproto/emulation"
	"github.com/chromedp/cdproto/page"
	"github.com/chromedp/chromedp"
)

func main() {
	taskCtx, cancel := chromedp.NewContext(
		context.Background(),
		chromedp.WithLogf(log.Printf),
	)
	defer cancel()

	var pdfBuffer []byte
	if err := chromedp.Run(taskCtx, pdfGrabber("https://www.wikipedia.org", "body", &pdfBuffer)); err != nil {
		log.Fatal(err)
	}
	// os.WriteFile replaces the deprecated ioutil.WriteFile (Go 1.16+).
	if err := os.WriteFile("coolsite.pdf", pdfBuffer, 0644); err != nil {
		log.Fatal(err)
	}
}

// pdfGrabber navigates to url, waits until the element matching sel is
// visible, and stores the printed PDF in res.
func pdfGrabber(url string, sel string, res *[]byte) chromedp.Tasks {
	start := time.Now()
	return chromedp.Tasks{
		emulation.SetUserAgentOverride("WebScraper 1.0"),
		chromedp.Navigate(url),
		// Wait until the selected element is visible (i.e., the page is loaded).
		chromedp.WaitVisible(sel, chromedp.ByQuery),
		chromedp.ActionFunc(func(ctx context.Context) error {
			buf, _, err := page.PrintToPDF().WithPrintBackground(true).Do(ctx)
			if err != nil {
				return err
			}
			*res = buf
			fmt.Printf("\nTook: %f secs\n", time.Since(start).Seconds())
			return nil
		}),
	}
}

The above loads wikipedia.org in headless Chrome, waits for the body element to become visible, and then saves the page as a PDF.

Result in the terminal:

$ go run main.go
https://www.wikipedia.org
Scraping url now...

Took: 2.772797 secs

Answer 5

Score: 4

I don't think I understand your requirements. Since HTML is a markup language, it needs context to render (CSS and a screen size). The existing implementations I've seen generally open the page in a headless browser and create a PDF that way.

Personally, I would just use an existing package and shell out from Go. This one looks good; it's even recommended in this answer.

If you're really determined to implement it all in Go, check out this WebKit wrapper. I'm not sure what you'd use for generating PDFs, but at least it's a start.
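
The "shell out from Go" approach mentioned above might look roughly like this. The tool this answer originally linked to is not recoverable here, so the sketch assumes the wkhtmltopdf command-line tool mentioned elsewhere in this thread, invoked as `wkhtmltopdf <input> <output>`; the wkhtmltopdfCmd helper and the file names are made up.

```go
package main

import (
	"fmt"
	"os/exec"
)

// wkhtmltopdfCmd builds (but does not run) a command that converts the HTML
// file at htmlPath into a PDF at pdfPath.
// Assumption: bin names a wkhtmltopdf-compatible binary on the PATH.
func wkhtmltopdfCmd(bin, htmlPath, pdfPath string) *exec.Cmd {
	return exec.Command(bin, htmlPath, pdfPath)
}

func main() {
	cmd := wkhtmltopdfCmd("wkhtmltopdf", "input.html", "output.pdf")
	// CombinedOutput runs the command and captures stdout and stderr together.
	out, err := cmd.CombinedOutput()
	if err != nil {
		fmt.Printf("conversion failed: %v\n%s", err, out)
		return
	}
	fmt.Println("wrote output.pdf")
}
```

Passing the paths as separate arguments, rather than building a shell string, avoids shell quoting problems with unusual file names.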

Answer 6

Score: 3

I'm creating an alternative library for creating PDFs in a simpler way (https://github.com/johnfercher/maroto). It uses gofpdf and has a grid system and some components, like Bootstrap.

Answer 7

Score: 1

Another option is Athena. It has a microservice written in Go, or it can be used as a CLI.

Answer 8

Score: -4

Another option is UniHTML (container-based, with an API), which interoperates with UniPDF and is useful for creating PDF reports and similar documents from HTML templates.

It uses a headless Chrome engine in a container, so the rendering is perfect and supports all HTML features. The combination with UniPDF gives additional advantages, such as automatic table of contents generation and outlines, as well as the ability to add password protection, PDF forms, and digital signatures.

To create a PDF from an HTML template on disk, you can do the following:

package main

import (
	"fmt"
	"os"

	"github.com/unidoc/unihtml"
	"github.com/unidoc/unipdf/v3/common/license"
	"github.com/unidoc/unipdf/v3/creator"
)

func main() {
	// Set the UniDoc license.
	if err := license.SetMeteredKey("my api key goes here"); err != nil {
		fmt.Printf("Err: setting metered key failed: %v\n", err)
		os.Exit(1)
	}

	// Establish connection with the UniHTML Server.
	if err := unihtml.Connect(":8080"); err != nil {
		fmt.Printf("Err:  Connect failed: %v\n", err)
		os.Exit(1)
	}

	// Get new PDF Creator.
	c := creator.New()

	// AddTOC enables Table of Contents generation.
	c.AddTOC = true

	chapter := c.NewChapter("Points")

	// Read the content of the sample.html file and load it to the conversion.
	htmlDocument, err := unihtml.NewDocument("sample.html")
	if err != nil {
		fmt.Printf("Err: NewDocument failed: %v\n", err)
		os.Exit(1)
	}

	// Draw the html document file in the context of the creator.
	if err = chapter.Add(htmlDocument); err != nil {
		fmt.Printf("Err: Draw failed: %v\n", err)
		os.Exit(1)
	}

	if err = c.Draw(chapter); err != nil {
		fmt.Printf("Err: Draw failed: %v\n", err)
		os.Exit(1)
	}

	// Write the result file to PDF.
	if err = c.WriteToFile("sample.pdf"); err != nil {
		fmt.Printf("Err: %v\n", err)
		os.Exit(1)
	}
}

I have written an introductory article on UniHTML, which might be useful if more information is needed (https://www.unidoc.io/post/html-for-pdf-reports-in-go).

Disclosure: I am the original developer of UniPDF.

huangapple
  • Published on February 18, 2013, 00:58:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/14923570.html