pdfbox application fat jar gives “Cannot read JBIG2 image: jbig2-imageio is not installed” but works ok running from IDE

Question

I am facing a problem when I build an application that uses PDFBox.
When I run it from the IDE (NetBeans 8.1), the application can read books that contain JBIG2 images (my pom.xml already has the Maven dependency for JBIG2).
The problem appears when I build the application as a fat jar.
When I run the fat jar with the same input PDF, it gives the following error:

“Cannot read JBIG2 image: jbig2-imageio is not installed”

The threads that discuss this error do not seem to solve my problem (they say that a Maven dependency has to be added to the pom, but that dependency is already in my pom).

I have also checked that the JBIG2 library classes are inside the fat jar, so I have no idea what is happening.

I have isolated the problem in a tiny application that looks like this:

public static void main( String[] args )
{
    String fileName = null;
    if( args.length == 0 )
    {
        fileName = "test.pdf";
    }
    else
    {
        fileName = args[0];
    }

    PdfDocumentWrapper doc = null;
    try
    {
        PdfboxFactory factory = new PdfboxFactory();
        doc = factory.createPdfDocumentWrapper();
        doc.loadPdf( fileName );
        for( int ii = 0; ii < doc.getNumberOfPages(); ii++ )
        {
            int pageNum = ii+1;
            System.out.println("\n\nProcessing page: " + pageNum + "\n---------------------------------");
            List<ImageWrapper> imageList = doc.getImagesOfPage(ii);

            int jj=0;
            for( ImageWrapper image: imageList )
            {
                jj++;
                System.out.println(String.format("  Page[%d]. Image[%d] -> bounds: %s",
                        pageNum, jj, image.getBounds().toString() ) );
            }
        }
    }
    catch( Exception ex )
    {
        ex.printStackTrace();
    }
    finally
    {
        if( doc != null )
        {
            try
            {
                doc.close();
            }
            catch( Exception ex )
            {
                ex.printStackTrace();
            }
        }
    }
}

I have placed the whole isolated example project here (to help with solving the issue):
http://www.frojasg1.com/20200504.PdfImageExtractor.zip

When I run that application from the IDE, it produces the following output:

Processing page: 1
---------------------------------
  Page[1]. Image[1] -> bounds: java.awt.Rectangle[x=17,y=33,width=442,height=116]
  Page[1]. Image[2] -> bounds: java.awt.Rectangle[x=53,y=513,width=376,height=124]
  Page[1]. Image[3] -> bounds: java.awt.Rectangle[x=101,y=250,width=285,height=5]
------------------------------------------------------------------------

When I run the application from the command line, it gives the following output:

$ java -jar ./PdfImageExtractor-v1.0-SNAPSHOT-all.jar
Processing page: 1
---------------------------------
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed

Does anybody know why the fat jar is not able to read jbig2 images?

Answer 1

Score: 3

If anyone lands here with the same error, “Cannot read JBIG2 image: jbig2-imageio is not installed”, when reading a PDF file with Apache PDFBox: first make sure that the additional org.apache.pdfbox:jbig2-imageio dependency is on your classpath.
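A quick way to see whether the plugin is actually visible at runtime is to ask ImageIO for a reader directly. The following is a minimal sketch, not PDFBox's own code: the class name ReaderCheck is made up for illustration, and "JBIG2" is assumed to be the format name under which the jbig2-imageio reader registers itself.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

class ReaderCheck {

    /** Returns true if at least one ImageIO reader is registered for the given format name. */
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "JBIG2" is an assumption about the name used by the jbig2-imageio plugin.
        System.out.println("JBIG2 reader registered: " + hasReader("JBIG2"));
        // PNG ships with the JDK, so it serves as a sanity check.
        System.out.println("PNG reader registered:   " + hasReader("png"));
    }
}
```

Running this once from the IDE and once from the fat jar should show whether the reader is missing only in the packaged build.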

You may then also need to call ImageIO.scanForPlugins(), as noted in the README.md of the pdfbox-jbig2 library (in my case it was not necessary).
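As a rough illustration of that call, the sketch below rescans the classpath and then lists the format names ImageIO knows readers for. The class and method names (PluginScan, scanAndListFormats) are hypothetical; only ImageIO.scanForPlugins() and ImageIO.getReaderFormatNames() are standard API.

```java
import javax.imageio.ImageIO;
import java.util.Locale;
import java.util.TreeSet;

class PluginScan {

    /** Rescans the classpath for ImageIO plugins and returns the reader format names, lower-cased and sorted. */
    static TreeSet<String> scanAndListFormats() {
        // Asks ImageIO to look again for plugins declared under META-INF/services.
        ImageIO.scanForPlugins();
        TreeSet<String> names = new TreeSet<>();
        for (String name : ImageIO.getReaderFormatNames()) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
        return names;
    }

    public static void main(String[] args) {
        // If the JBIG2 plugin is visible, a jbig2 entry is expected in this list.
        System.out.println(scanAndListFormats());
    }
}
```

Note that a rescan can only find plugins whose service declarations are still present on the classpath, so it may not help if those declarations were lost during packaging.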

For more information, refer to the section on JAI Image I/O in the PDFBox docs, which also covers reading JBIG2 images.

Answer 2

Score: 2

I posted the same question on the PDFBox users mailing list, and here is the answer:

Your fat-jar consists of several ImageIO libs. You are simply merging all files 
to one big jar and overwriting the config files of those ImageIO libs. Have a 
look at the directory "/META-INF/services". The files of the JBig2 plugin are 
overwritten by files of another plugin. Either you merge those files or don't 
create one big jar of all deps.
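To check this on a given build, one option is to list every copy of the reader service file that the class loader can see, together with the providers each one declares. This is only a diagnostic sketch: the class ServiceFileLister and its helpers are invented for illustration, while ClassLoader.getResources is standard API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ServiceFileLister {

    static final String READER_SPI = "META-INF/services/javax.imageio.spi.ImageReaderSpi";

    /** Extracts provider class names from service-file text, skipping comments and blank lines. */
    static List<String> parseProviders(String content) {
        List<String> providers = new ArrayList<>();
        for (String line : content.split("\\R")) {
            int hash = line.indexOf('#');
            String entry = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!entry.isEmpty()) {
                providers.add(entry);
            }
        }
        return providers;
    }

    /** Reads the whole resource behind the URL as UTF-8 text. */
    static String readAll(URL url) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        ClassLoader cl = ServiceFileLister.class.getClassLoader();
        // One URL per jar (or directory) that contains the service file.
        for (URL url : Collections.list(cl.getResources(READER_SPI))) {
            System.out.println(url + " -> " + parseProviders(readAll(url)));
        }
    }
}
```

From the IDE, one entry per plugin jar would be expected. Inside a fat jar there can be only one such file, so if the JBIG2 provider is not listed in it, the declaration was overwritten as described above.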

And the solution:

Thank you very much, it was that!

I have been able to create these META-INF files:

$ find src/serviceManifests/
src/serviceManifests/
src/serviceManifests/META-INF
src/serviceManifests/META-INF/services
src/serviceManifests/META-INF/services/javax.imageio.spi.ImageReaderSpi
src/serviceManifests/META-INF/services/javax.imageio.spi.ImageWriterSpi

by merging the entries of the corresponding files found in the ImageIO jars.

I then added these lines to pom.xml so that the merged files are packaged into the jar:

<properties>
    <service.declaration.dir>src/serviceManifests</service.declaration.dir>
    <service.files.path>META-INF/services</service.files.path>
</properties>
<build>
    <resources>
        ...
        <resource>
            <directory>${service.declaration.dir}</directory>
            <includes>
                <include>${service.files.path}/*</include>
            </includes>
        </resource>
        ...
    </resources>
    ...
</build>

Problem solved.
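The merge itself is simple: a service file is a list of provider class names, one per line, so the merged file is the union of the entries from every jar. The sketch below shows that idea only. The class ServiceFileMerger and the provider names are hypothetical, and it is not the tool used in the answer above, where the files were put together by hand.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ServiceFileMerger {

    /** Concatenates provider entries from several service files, dropping comments, blanks and duplicates. */
    static String merge(List<String> fileContents) {
        Set<String> providers = new LinkedHashSet<>();
        for (String content : fileContents) {
            for (String line : content.split("\\R")) {
                int hash = line.indexOf('#');
                String entry = (hash >= 0 ? line.substring(0, hash) : line).trim();
                if (!entry.isEmpty()) {
                    providers.add(entry);
                }
            }
        }
        return String.join("\n", providers) + "\n";
    }

    public static void main(String[] args) {
        // Hypothetical provider class names, used only for illustration.
        String fromJarA = "com.example.a.FooReaderSpi\n";
        String fromJarB = "com.example.b.BarReaderSpi\ncom.example.a.FooReaderSpi\n";
        System.out.print(merge(Arrays.asList(fromJarA, fromJarB)));
    }
}
```

If the fat jar is built with the Maven Shade plugin, its ServicesResourceTransformer is meant to perform this kind of merge at build time. Whether that fits here depends on how the jar is assembled, which the post does not show.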

Answer 3

Score: 0

Calling ImageIO.scanForPlugins(); before the PDF images are read
can also solve this problem.

Posted by huangapple on May 4, 2020, 21:49:26
Source: https://go.coder-hub.com/61593831.html