How to make a PDF from an online ebook that is displayed page by page?

Question

I would like to save books like this one, which the site displays page by page, as PDF files: https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3/index.html

How can I do this?

The only thing I have managed so far is to print each page to a PDF and then combine the separate PDF pages.

Is there a way to do this automatically in Python or another scripting language?

Answer 1

Score: 1

You can download the page images directly with requests and save them to a PDF with PIL (Pillow). For example:

import requests
from PIL import Image  # pip install Pillow
from io import BytesIO

pdf_path = "doc.pdf"
url = "https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3/assets/page-images/page-113088-{}.jpg"

# Page numbers in the image URLs are zero-padded to four digits (0001, 0002, ...).
# verify=False disables TLS certificate verification for these requests.
images = [
    Image.open(BytesIO(requests.get(url.format(f"{p:04}"), verify=False).content))
    for p in range(1, 4)  # <-- increase the number of pages here (this saves the first 3 pages)
]

# borrowing from this answer: https://stackoverflow.com/a/47283224/10035985
# save_all=True writes a multi-page PDF; append_images supplies pages 2 onward.
images[0].save(
    pdf_path, "PDF", resolution=100.0, save_all=True, append_images=images[1:]
)

The resulting doc.pdf can then be opened in a PDF viewer such as Firefox.
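The answer hard-codes the page range. One possible way to avoid counting pages by hand is to keep requesting images until the server stops returning them. The sketch below assumes the server answers with a non-200 status for page numbers past the end of the book, which has not been verified for this site. The names `page_url`, `iter_pages`, `fetch_with_requests`, `save_book`, and the `max_pages` cap are hypothetical and introduced only for illustration.

```python
from io import BytesIO
from typing import Callable, Iterator, Optional

# URL pattern taken from the answer; page numbers are zero-padded to 4 digits.
URL_TEMPLATE = (
    "https://kcenter.korean.go.kr/repository/ebook/culture/SB_step3"
    "/assets/page-images/page-113088-{:04d}.jpg"
)


def page_url(page: int) -> str:
    """Return the image URL for a 1-based page number."""
    return URL_TEMPLATE.format(page)


def iter_pages(
    fetch: Callable[[str], Optional[bytes]], max_pages: int = 1000
) -> Iterator[bytes]:
    """Yield image bytes for pages 1, 2, 3, ... until fetch() returns None.

    max_pages is an arbitrary safety cap (an assumption) so the loop always ends.
    """
    for page in range(1, max_pages + 1):
        data = fetch(page_url(page))
        if data is None:  # assumed to mean "no such page"
            return
        yield data


def fetch_with_requests(url: str) -> Optional[bytes]:
    """Download one image; return None on any non-200 status."""
    import requests  # imported lazily so the helpers above work without it

    # verify=False mirrors the answer and disables TLS certificate verification.
    response = requests.get(url, verify=False, timeout=30)
    return response.content if response.status_code == 200 else None


def save_book(pdf_path: str = "doc.pdf") -> int:
    """Download every available page into one PDF and return the page count."""
    from PIL import Image  # pip install Pillow

    images = [
        Image.open(BytesIO(data)).convert("RGB")
        for data in iter_pages(fetch_with_requests)
    ]
    if not images:
        raise RuntimeError("no pages could be downloaded")
    images[0].save(
        pdf_path, "PDF", resolution=100.0, save_all=True, append_images=images[1:]
    )
    return len(images)
```

Calling `save_book("doc.pdf")` would run the whole download. Because `iter_pages` takes the fetch function as a parameter, the stopping logic can be tried with a stand-in function and no network access.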

huangapple
  • Posted on June 19, 2023 at 15:55:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76504653.html