PDFBox application fat jar gives "Cannot read JBIG2 image: jbig2-imageio is not installed" but works fine when run from the IDE
Question
I am having a problem building an application that uses PDFBox.
When I run it from the IDE (NetBeans 8.1), the application can read books that contain JBIG2 images (the Maven dependencies for JBIG2 are in my pom.xml).
The problem appears when I build the application as a fat jar.
When I run the fat jar with the same input PDF, it gives the following error:
"Cannot read JBIG2 image: jbig2-imageio is not installed"
The threads that discuss this error do not seem to solve my problem (they say that a Maven dependency has to be added to the pom, but that dependency is already in my pom).
I have also checked that the JBIG2 library classes are inside the fat jar, so I have no idea what is happening.
I have isolated the problem in a tiny application that looks like this:
    public static void main( String[] args )
    {
        String fileName = null;
        if( args.length == 0 )
        {
            fileName = "test.pdf";
        }
        else
        {
            fileName = args[0];
        }
        PdfDocumentWrapper doc = null;
        try
        {
            PdfboxFactory factory = new PdfboxFactory();
            doc = factory.createPdfDocumentWrapper();
            doc.loadPdf( fileName );
            for( int ii = 0; ii < doc.getNumberOfPages(); ii++ )
            {
                int pageNum = ii+1;
                System.out.println("\n\nProcessing page: " + pageNum + "\n---------------------------------");
                List<ImageWrapper> imageList = doc.getImagesOfPage(ii);
                int jj=0;
                for( ImageWrapper image: imageList )
                {
                    jj++;
                    System.out.println(String.format(" Page[%d]. Image[%d] -> bounds: %s",
                        pageNum, jj, image.getBounds().toString() ) );
                }
            }
        }
        catch( Exception ex )
        {
            ex.printStackTrace();
        }
        finally
        {
            if( doc != null )
            {
                try
                {
                    doc.close();
                }
                catch( Exception ex )
                {
                    ex.printStackTrace();
                }
            }
        }
    }
I have placed the whole isolated example project here, to help with diagnosing the issue:
http://www.frojasg1.com/20200504.PdfImageExtractor.zip
When I run that application from the IDE, it produces the following output:
Processing page: 1
---------------------------------
Page[1]. Image[1] -> bounds: java.awt.Rectangle[x=17,y=33,width=442,height=116]
Page[1]. Image[2] -> bounds: java.awt.Rectangle[x=53,y=513,width=376,height=124]
Page[1]. Image[3] -> bounds: java.awt.Rectangle[x=101,y=250,width=285,height=5]
------------------------------------------------------------------------
When I run the application from the command line, it gives the following output:
$ java -jar ./PdfImageExtractor-v1.0-SNAPSHOT-all.jar
Processing page: 1
---------------------------------
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
Does anybody know why the fat jar cannot read JBIG2 images?
Answer 1
Score: 3
If anyone lands here because of the same error, "Cannot read JBIG2 image: jbig2-imageio is not installed", when reading a PDF file with Apache PDFBox: first make sure you have the additional org.apache.pdfbox:jbig2-imageio dependency on your classpath.
Then you may also need to call ImageIO.scanForPlugins(), as noted in the README.md of the pdfbox-jbig2 library (in my case it was not necessary).
For more information, refer to the section on JAI Image I/O in the PDFBox docs, which also covers reading JBIG2 images.
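A quick way to see whether the plugin is actually visible at runtime is to ask ImageIO directly. The following is only a minimal sketch: the class name Jbig2ReaderCheck and the helper hasReader are made up for illustration, and the format name "JBIG2" is an assumption about the name under which the jbig2-imageio plugin registers its reader.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

public class Jbig2ReaderCheck {

    /** Returns true if ImageIO has at least one reader registered for the given format name. */
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Ask ImageIO to re-scan the classpath for plugins before querying it.
        ImageIO.scanForPlugins();

        // "JBIG2" is assumed to be the format name used by the jbig2-imageio plugin.
        System.out.println("JBIG2 reader registered: " + hasReader("JBIG2"));

        // PNG ships with the JDK, so this line is a sanity check of the lookup itself.
        System.out.println("PNG reader registered:   " + hasReader("png"));
    }
}
```

Running this once from the IDE and once from the fat jar should show whether the two environments differ in what ImageIO can see.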
Answer 2
Score: 2
I posted the same question on the PDFBox users mailing list, and here is the answer:
Your fat-jar consists of several ImageIO libs. You are simply merging all files
to one big jar and overwriting the config files of those ImageIO libs. Have a
look at the directory "/META-INF/services". The files of the JBig plugin are
overwritten by files of another plugin. Either you merge those files or don't
create one big jar of all deps.
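To check this on your own build, you can list every copy of the ImageReaderSpi service file that the class loader can see, together with the providers each copy names. This is a rough diagnostic sketch, not part of PDFBox; the class name ServiceFileLister and its methods are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ServiceFileLister {

    static final String READER_SPI = "META-INF/services/javax.imageio.spi.ImageReaderSpi";

    /** Finds every copy of the given resource that is visible to this class's loader. */
    static List<URL> findAll(String resourceName) throws IOException {
        ClassLoader cl = ServiceFileLister.class.getClassLoader();
        return Collections.list(cl.getResources(resourceName));
    }

    /** Reads provider class names from one service file, skipping blank lines and '#' comments. */
    static List<String> readProviders(URL url) throws IOException {
        List<String> providers = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                int hash = line.indexOf('#');
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    providers.add(line);
                }
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        for (URL url : findAll(READER_SPI)) {
            System.out.println(url);
            for (String provider : readProviders(url)) {
                System.out.println("    " + provider);
            }
        }
    }
}
```

From the IDE you would expect one entry per ImageIO jar. From the fat jar there is room for only one file at that path, so if the JBIG2 provider is missing from it, the file was overwritten during packaging.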
And the solution:
Thank you very much, it was that!
I have been able to create these META-INF files:
    $ find src/serviceManifests/
    src/serviceManifests/
    src/serviceManifests/META-INF
    src/serviceManifests/META-INF/services
    src/serviceManifests/META-INF/services/javax.imageio.spi.ImageReaderSpi
    src/serviceManifests/META-INF/services/javax.imageio.spi.ImageWriterSpi
by merging the ones found in the ImageIO jars, and then adding these lines to pom.xml:
    <properties>
        <service.declaration.dir>src/serviceManifests</service.declaration.dir>
        <service.files.path>META-INF/services</service.files.path>
    </properties>
    <build>
        <resources>
            ...
            <resource>
                <directory>${service.declaration.dir}</directory>
                <includes>
                    <include>${service.files.path}/*</include>
                </includes>
            </resource>
            ...
        </resources>
        ...
    </build>
Problem solved.
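The merge itself can be done by hand, but for illustration here is a small sketch of what "merging" means for these files: take the provider class names from each jar's copy and write them into a single file, one per line, without duplicates. The class ServiceFileMerger and its command-line arguments are hypothetical; it assumes you have already extracted each jar's service file to disk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ServiceFileMerger {

    /** Collects provider names from several service files, keeping first-seen order and dropping duplicates. */
    static Set<String> merge(List<Path> inputs) throws IOException {
        Set<String> providers = new LinkedHashSet<>();
        for (Path input : inputs) {
            for (String line : Files.readAllLines(input)) {
                String name = line.trim();
                // Skip blank lines and whole-line comments.
                if (!name.isEmpty() && !name.startsWith("#")) {
                    providers.add(name);
                }
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ServiceFileMerger <output> <input>...");
            return;
        }
        Path output = Paths.get(args[0]);
        List<Path> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(Paths.get(args[i]));
        }
        if (output.getParent() != null) {
            Files.createDirectories(output.getParent());
        }
        // Write one provider class name per line.
        Files.write(output, merge(inputs));
    }
}
```

The output path would be something like src/serviceManifests/META-INF/services/javax.imageio.spi.ImageReaderSpi, with the extracted copies from each ImageIO jar as inputs; the same applies to the ImageWriterSpi file.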
Answer 3
Score: 0
Calling ImageIO.scanForPlugins(); before the images are read can also solve this problem.