PDFBox: ExtractImages JPEG2000 images not extracting
Question
I am trying to extract all the images from a PDF file using PDFBox. It works fine for PDFs containing JPEG and PNG images, but it does not work for OpenJPEG2000 images. I get the following error:
org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
SEVERE: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
The same exception occurs with every PDFBox version I tried. I also tried the standalone jar.
I have included the necessary dependencies in pom.xml as well:
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jbig2-imageio</artifactId>
    </dependency>
    <!-- For legal reasons (incompatible license), these two dependencies
         are to be used only in the tests and may not be distributed. -->
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-core</artifactId>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
    </dependency>
Any help will be appreciated.
Answer 1
Score: 1
Copy the imaging-related .jar files into the lib subdirectory, and then use this command line:
java -cp "pdfbox-app-2.0.21.jar;lib/*" org.apache.pdfbox.tools.PDFBox ExtractImages <parameters>
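Use ";" as the classpath separator on Windows and ":" on Linux.
org.apache.pdfbox.tools.PDFBox is the name of the main class; it has to be given explicitly because the command uses -cp rather than -jar.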
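To confirm that the plugin jars are actually visible on the classpath, a small check like the one below may help. It is only a sketch: the class name JpxReaderCheck is hypothetical, and the format names "JPEG2000" and "JBIG2" are assumed to be the names PDFBox asks ImageIO for. It uses only the standard javax.imageio API to ask whether a reader is registered for each format.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

// Hypothetical helper, not part of PDFBox: reports which ImageIO readers are visible.
class JpxReaderCheck {

    // Returns true if ImageIO finds at least one reader for the given format name.
    static boolean hasReader(String formatName) {
        // Re-scan the classpath so plugins from jars added via -cp get registered.
        ImageIO.scanForPlugins();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "JPEG2000" and "JBIG2" are assumed plugin format names; "png" ships with the JDK.
        for (String name : new String[] {"JPEG2000", "JBIG2", "png"}) {
            System.out.println(name + " reader available: " + hasReader(name));
        }
    }
}
```

Run it with the same -cp value as the ExtractImages command (plus the directory containing the compiled class). If the JPEG2000 line reports false, the jai-imageio jars are probably not being picked up from lib.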