2023-03-12 18:27:57

PyPDF2: Creating EncodedStreamObject is not currently supported

Question

The following code tries to edit part of the text in a PDF file:

from PyPDF2 import PdfReader, PdfWriter
replacements = [("Failed", "Passed")]
pdf = PdfReader(open("2.pdf", "rb"))
writer = PdfWriter()
for page in pdf.pages:
    contents = page.get_contents().get_data()
    #print(contents) old contents
    for (a, b) in replacements:
        contents = contents.replace(str.encode(a), str.encode(b))
    #print(contents) new contents which has 'Passed' as new value
    page.get_contents().set_data(str(contents)) #Issue occurs here
    writer.add_page(page)
with open("2_modified.pdf", "wb") as f:
    writer.write(f)

I keep running into the following error:

> Traceback (most recent call last): 
File "/pdf_editor.py", line 14, in <module> 
    page.get_contents().set_data(str(contents)) #Issue occurs here 
File "/venv/lib/python3.9/site-packages/PyPDF2/generic/_data_structures.py", line 839, in set_data 
    raise PdfReadError("Creating EncodedStreamObject is not currently supported") 
PyPDF2.errors.PdfReadError: Creating EncodedStreamObject is not currently supported

I tried the solutions mentioned here, which did not work. I also found this GitHub link, where the issue is labeled "bug" but has had no further updates.

UPDATE:
I had already tried the library suggested in the comments, but did not pursue it for two reasons:

1. It does not seem to be widely used.
2. I kept running into one issue or another, the latest being an 'apply_redact_annotations' error.

So I would like to know whether there is any other workaround, or any other good library, to achieve this.

Answer 1

Score: 1

I am answering the question rather than the title. While PyPDF2 (now merged back into pypdf) can decode encoded stream objects on the fly to get their data, it does not support implicit encoding. It is probably possible to create encoded streams explicitly, but I find it easier just to work on fully decoded documents. I like using qpdf --qdf in.pdf uncompressed.pdf for that.
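
The qpdf route can be sketched roughly as follows. This is only an illustration under several assumptions: the qpdf command-line tool is installed and on PATH; the text to replace appears literally in an uncompressed content stream (often not true when fonts use custom encodings or the text is split across several operators); and the replacement has the same byte length as the original, so that the byte offsets recorded in the file stay valid. The helper names uncompress_pdf, replace_same_length and edit_pdf_text are made up for this sketch.

```python
import subprocess
import tempfile
from pathlib import Path


def uncompress_pdf(src: str, dst: str) -> None:
    """Write a QDF (uncompressed, editable) copy of src to dst using qpdf."""
    # Same command as in the answer: qpdf --qdf in.pdf uncompressed.pdf
    # check=True raises CalledProcessError if qpdf exits with a non-zero status.
    subprocess.run(["qpdf", "--qdf", src, dst], check=True)


def replace_same_length(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace old with new, refusing replacements that would change the size."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same byte length as the original")
    return data.replace(old, new)


def edit_pdf_text(src: str, dst: str, old: str, new: str) -> None:
    """Uncompress src with qpdf, patch the raw bytes, and write the result to dst."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        qdf_path = Path(tmp_dir) / "uncompressed.pdf"
        uncompress_pdf(src, str(qdf_path))
        patched = replace_same_length(
            qdf_path.read_bytes(),
            old.encode("latin-1"),  # assumption: the text is stored as single-byte characters
            new.encode("latin-1"),
        )
    Path(dst).write_bytes(patched)
```

With the file names from the question, the call would be edit_pdf_text("2.pdf", "2_modified.pdf", "Failed", "Passed"); "Failed" and "Passed" happen to have the same length, which is what makes the plain byte replacement safe here. For replacements of a different length, qpdf ships a fix-qdf tool intended to repair a hand-edited QDF file.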

By the way, "encoded" here means "compressed" (the Deflate filter, /FlateDecode, is the most popular).
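
To illustrate that last point: a stream using the /FlateDecode filter holds zlib-compressed bytes, so editing it means decompressing, changing the bytes, and compressing again. A minimal standard-library sketch, using a made-up content stream rather than a real PDF (the function name replace_in_flate_stream is hypothetical):

```python
import zlib


def replace_in_flate_stream(encoded: bytes, old: bytes, new: bytes) -> bytes:
    """Decode a Flate (zlib) stream, replace bytes, and encode it again."""
    decoded = zlib.decompress(encoded)   # the decoded data a PDF library hands back
    edited = decoded.replace(old, new)   # plain bytes replacement
    return zlib.compress(edited)         # the explicit re-encoding step


# Made-up content stream that draws the text "Failed"; not taken from a real PDF.
raw = b"BT /F1 12 Tf 72 720 Td (Failed) Tj ET"
stored = zlib.compress(raw)

patched = replace_in_flate_stream(stored, b"Failed", b"Passed")
print(zlib.decompress(patched))  # b'BT /F1 12 Tf 72 720 Td (Passed) Tj ET'
```

Note that the data stays bytes throughout. Wrapping it in str(), as the question's code does, would yield the literal text b'...' rather than the original content.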
