How to make a PDF from an online ebook that is displayed page by page?

Question

I would like to save books like this one as PDF files: https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3/index.html. The site displays the book one page at a time.

How can I do this?

So far, the only approach that has worked is printing each page to PDF and then merging the separate PDF files.

Is there a way to automate this in Python or another scripting language?

Answer 1

Score: 1

You can download the page images directly with requests and save them as a single PDF with PIL (Pillow). For example:

    import requests
    from io import BytesIO
    from PIL import Image  # pip install Pillow

    pdf_path = "doc.pdf"
    url = "https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3/assets/page-images/page-113088-{}.jpg"

    # Download each page image; the page number in the URL is zero-padded to 4 digits.
    # Note: verify=False disables TLS certificate verification.
    images = [
        Image.open(BytesIO(requests.get(url.format(f"{p:04}"), verify=False).content))
        for p in range(1, 4)  # <-- increase the number of pages here (now it saves the first 3 pages)
    ]

    # Borrowing from this answer: https://stackoverflow.com/a/47283224/10035985
    # The first image is saved as a PDF and the remaining images are appended as extra pages.
    images[0].save(
        pdf_path, "PDF", resolution=100.0, save_all=True, append_images=images[1:]
    )

The resulting doc.pdf can then be opened in Firefox.
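The answer hard-codes the number of pages. As a rough sketch of one way to avoid that, the loop below keeps requesting pages until one is missing. It assumes the server answers with HTTP 404 for a page number past the end of the book, which has not been verified for this site. The names `page_url`, `fetch`, `iter_pages` and `max_pages` are made up for illustration. This version uses the standard library's `urlopen` with default certificate checks, whereas the answer passes `verify=False` to requests, so a requests-based fetcher may be needed for this host.

```python
from typing import Callable, Iterator, Optional
from urllib.error import HTTPError
from urllib.request import urlopen

# URL pattern taken from the answer; {} is replaced by the zero-padded page number.
URL_TEMPLATE = (
    "https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3"
    "/assets/page-images/page-113088-{}.jpg"
)


def page_url(page: int, template: str = URL_TEMPLATE) -> str:
    """Build the image URL for a 1-based page number (1 -> ...-0001.jpg)."""
    return template.format(f"{page:04}")


def fetch(url: str) -> Optional[bytes]:
    """Return the response body, or None if the server reports 404 (assumed end of book)."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None
        raise  # any other HTTP error is unexpected, so surface it


def iter_pages(
    fetcher: Callable[[str], Optional[bytes]] = fetch,
    max_pages: int = 1000,  # hypothetical safety cap so the loop always terminates
) -> Iterator[bytes]:
    """Yield raw page images in order, stopping at the first missing page."""
    for page in range(1, max_pages + 1):
        data = fetcher(page_url(page))
        if data is None:
            break
        yield data
```

Each yielded `bytes` object can be passed to `Image.open(BytesIO(data))` and saved exactly as in the answer. Because the fetcher is a parameter, it can be replaced with a requests-based function or with a stub for testing.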

Posted by huangapple on June 19, 2023 at 15:55:30
Original link: https://go.coder-hub.com/76504653.html