Compiling application with Tika with Java 13 - problems loading modules

Question

I'm trying to migrate a Java application that uses Tika from Oracle JDK 1.8 to OpenJDK 13.

My IDE is Eclipse.

I have created the file module-info.java to indicate the required modules for my application.

In order to be able to use Tika classes such as AbstractParser, Detector, etc., I have added requires org.apache.tika.core; in module-info.java.

My code also uses the class org.apache.tika.parser.pdf.PDFParserConfig to extract embedded images:

PDFParserConfig pdfConfig = new PDFParserConfig();
pdfConfig.setExtractInlineImages(true);
context.set(PDFParserConfig.class, pdfConfig);

I get the compilation error:

PDFParserConfig cannot be resolved to a type

Eclipse suggests adding requires org.apache.tika.parsers; to module-info.java.

When I add this module requirement to module-info.java, the application compiles properly.

That is, at this stage we have included in module-info.java:

module myapp {
	/** others ... */ 
	requires org.apache.tika.core;
	requires org.apache.tika.parsers;
}

However, when trying to execute the compiled application, we get the error:

Error occurred during initialization of boot layer
java.lang.module.FindException: Unable to derive module descriptor for C:\Users\Admin\.m2\repository\org\apache\tika\tika-parsers\1.24\tika-parsers-1.24.jar
Caused by: java.lang.module.InvalidModuleDescriptorException: Provider class org.apache.tika.parser.onenote.OneNoteParser not in module

Inspecting the project Libraries in the Eclipse Java Build Path, I can see that tika-core and tika-parsers (v1.24) are both modular.

In conclusion: If I don't add org.apache.tika.parsers as a required module, the application won't compile, and if I add it I get the runtime error saying org.apache.tika.parser.onenote.OneNoteParser is not in the module.

I have inspected the JAR files for these packages to see what dependencies they have. The core package seems to be fine:

$ jar --file=tika-core-1.24.jar --describe-module

No module descriptor found. Derived automatic module.

org.apache.tika.core@1.24 automatic
requires java.base mandated
contains org.apache.tika
contains org.apache.tika.concurrent
contains org.apache.tika.config
contains org.apache.tika.detect
contains org.apache.tika.embedder
contains org.apache.tika.exception
contains org.apache.tika.extractor
contains org.apache.tika.fork
contains org.apache.tika.io
contains org.apache.tika.language
contains org.apache.tika.language.detect
contains org.apache.tika.language.translate
contains org.apache.tika.metadata
contains org.apache.tika.mime
contains org.apache.tika.parser
contains org.apache.tika.parser.digest
contains org.apache.tika.parser.external
contains org.apache.tika.sax
contains org.apache.tika.sax.xpath
contains org.apache.tika.utils

...but the 'parsers' jar gives an error:

$ jar --file=tika-parsers-1.24.jar --describe-module

Unable to derive module descriptor for: tika-parsers-1.24.jar
Provider class org.apache.tika.parser.onenote.OneNoteParser not in module
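The same check can be run from Java, which may be handy in a build script or a test. This is only a minimal sketch: the class name AutomaticModuleCheck and the method describe are made up for illustration, and it relies on the standard java.lang.module.ModuleFinder API, which reports a FindException when a descriptor cannot be read or derived.

```java
import java.lang.module.FindException;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Tika or the JDK): asks the module system
// how it would treat a JAR if it were placed on the module path.
public class AutomaticModuleCheck {

    /** Returns a one-line report describing how the module system sees the given JAR. */
    static String describe(Path jar) {
        try {
            // findAll() forces the finder to read (or derive) the module descriptor.
            for (ModuleReference ref : ModuleFinder.of(jar).findAll()) {
                ModuleDescriptor d = ref.descriptor();
                return d.name() + (d.isAutomatic() ? " (automatic)" : " (explicit)");
            }
            return "no module found at " + jar;
        } catch (FindException e) {
            // Thrown, for example, when an automatic module descriptor cannot be derived.
            Throwable cause = e.getCause();
            return "not usable as a module: "
                    + (cause != null ? cause.getMessage() : e.getMessage());
        }
    }

    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println(arg + " -> " + describe(Paths.get(arg)));
        }
    }
}
```

Run it with the JAR paths as arguments, for example java AutomaticModuleCheck.java tika-core-1.24.jar tika-parsers-1.24.jar. I would expect the second JAR to be reported as not usable, matching the jar tool output above, but I have not verified the exact message text.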

Does this mean the jar package for parsers is not well formed?
Is there any workaround for this?

Thank you.

EDIT:
If I try with version 1.24.1, I get the execution error:

Error occurred during initialization of boot layer
java.lang.module.FindException: Unable to derive module descriptor for C:\Users\Admin\.m2\repository\org\apache\tika\tika-parsers\1.24.1\tika-parsers-1.24.1.jar
Caused by: java.lang.module.InvalidModuleDescriptorException: Provider class org.apache.tika.parser.external.CompositeExternalParser not in module

That is: the failing class is CompositeExternalParser instead of OneNoteParser.

Inspecting META-INF/services/org.apache.tika.parser.Parser in tika-parsers-1.24.1.jar, I can see the entry org.apache.tika.parser.external.CompositeExternalParser, but the package does not contain this class.

So, it seems to be an error in this META-INF file. Is this due to an error when compiling the package and submitting it to Maven Central?
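To list every such entry instead of finding them one failed launch at a time, a small scanner along these lines could help. It is a sketch under assumptions: the class name DanglingServiceEntries and its methods are hypothetical, and it only checks whether a matching .class file exists inside the same JAR, which is the condition the error message above complains about.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical diagnostic: reports META-INF/services entries that name a
// provider class which is not packaged in the same JAR.
public class DanglingServiceEntries {

    static final String SERVICES = "META-INF/services/";

    /** Extracts the provider class name from one line of a services file, or null if none. */
    static String providerOf(String line) {
        int hash = line.indexOf('#');   // '#' starts a comment in provider-configuration files
        String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
        return name.isEmpty() ? null : name;
    }

    /** Lists "service -> provider" pairs whose provider class file is absent from the JAR. */
    static List<String> findDangling(JarFile jar) throws IOException {
        List<String> dangling = new ArrayList<>();
        Enumeration<JarEntry> entries = jar.entries();
        while (entries.hasMoreElements()) {
            JarEntry entry = entries.nextElement();
            if (entry.isDirectory() || !entry.getName().startsWith(SERVICES)) {
                continue;
            }
            String service = entry.getName().substring(SERVICES.length());
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String provider = providerOf(line);
                    // A provider is "dangling" when its class file is not in this JAR.
                    if (provider != null
                            && jar.getEntry(provider.replace('.', '/') + ".class") == null) {
                        dangling.add(service + " -> " + provider);
                    }
                }
            }
        }
        return dangling;
    }

    public static void main(String[] args) throws IOException {
        try (JarFile jar = new JarFile(args[0])) {
            findDangling(jar).forEach(System.out::println);
        }
    }
}
```

Passing the path to tika-parsers-1.24.1.jar as the only argument should print one line per service entry that has no class file behind it.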

I've found a JIRA issue, TIKA-2929, where they say "Apache Tika needs to be on the Java Classpath, not the module path". I've tried this, but, as explained before, I get a compilation error if I don't add it to the module path and set requires org.apache.tika.parsers;.

This is a hard puzzle...

Answer 1

Score: 1

Ran into the same issue.
I also found the faulty entries in org.apache.tika.parser.Parser (and also org.apache.tika.detect.Detector) in META-INF/services/.

A quick fix is to:

  • Unpack those files
  • Delete the lines that seem to reference non-existent classes
  • Pack them back into the JAR

My project compiled after that.
This is certainly no long-term solution, but since even the older versions I tried ran into this problem, it might help some people.
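The unpack, delete, and repack steps can also be scripted. The following is a rough sketch, not a tested recipe: ServiceFileCleaner and its methods are hypothetical names, it assumes the JAR is unsigned (rewriting a signed JAR would invalidate its signatures), and it writes a patched copy instead of modifying the file in the local Maven repository.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.function.Predicate;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical workaround tool: copies a JAR and drops service entries that
// name provider classes missing from that JAR.
public class ServiceFileCleaner {

    static final String SERVICES = "META-INF/services/";

    /** Keeps comments, blank lines, and providers accepted by the predicate; drops the rest. */
    static String filterProviders(String content, Predicate<String> classExists) {
        StringBuilder kept = new StringBuilder();
        for (String line : content.split("\\R")) {
            int hash = line.indexOf('#');
            String provider = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (provider.isEmpty() || classExists.test(provider)) {
                kept.append(line).append('\n');
            }
        }
        return kept.toString();
    }

    /** Copies source to target, rewriting every services file along the way. */
    static void clean(File source, File target) throws IOException {
        try (JarFile jar = new JarFile(source);
             JarOutputStream out = new JarOutputStream(new FileOutputStream(target))) {
            Predicate<String> exists =
                    name -> jar.getEntry(name.replace('.', '/') + ".class") != null;
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                byte[] data = new byte[0];
                if (!entry.isDirectory()) {
                    try (InputStream in = jar.getInputStream(entry)) {
                        data = in.readAllBytes();
                    }
                    if (entry.getName().startsWith(SERVICES)) {
                        String text = new String(data, StandardCharsets.UTF_8);
                        data = filterProviders(text, exists).getBytes(StandardCharsets.UTF_8);
                    }
                }
                // A fresh JarEntry lets the output stream recompute sizes and checksums.
                out.putNextEntry(new JarEntry(entry.getName()));
                out.write(data);
                out.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        clean(new File(args[0]), new File(args[1]));
    }
}
```

Usage would be something like java ServiceFileCleaner.java tika-parsers-1.24.jar tika-parsers-1.24-patched.jar, then pointing the build at the patched copy. Whether the module system accepts the result depends on the JAR having no other problems.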

huangapple
  • Posted on June 5, 2020, 20:32:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/62215464.html