java JarEntry getSize() returns -1

Question

EDIT 3

It turns out that some JAR files properly report entry.getSize() with JarInputStream and some don't. All the JAR files I have created going back to 2013 (on OSX, Java 8 and higher) work, as do various others like mongo-spark-connector-10.0.0.jar. Others, like antlr-runtime-4.7.2.jar, mongodb-driver-core-4.9.0.jar, and hadoop-azure-3.2.0.jar, do not. But all files properly report sizes when accessed via JarFile.

I have a valid JarInputStream js sourced from a database (i.e. not using files on filesystem). In case there are some nuances to byte[], streams, and unzipping, this is how the JarInputStream is set up:

    byte[] bb = /* get complete set of bytes */; // printing bb.length gives 14523, so OK
    ByteArrayInputStream bas = new ByteArrayInputStream(bb);
    JarInputStream js = new JarInputStream(bas);

I iterate it in this way:

    JarEntry entry;
    while ((entry = js.getNextJarEntry()) != null) {
        if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
            // Turn the entry path into a class name and drop the 6-character ".class" suffix.
            String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
            long len = entry.getSize();
            long zlen = entry.getCompressedSize();
            System.out.println(" class [" + className + "]: Z " + zlen + "; unZ " + len);
            if (len > 0) {
                byte[] classBytes = new byte[(int) len]; // an array length must be an int
                js.readNBytes(classBytes, 0, classBytes.length); // a single read() may return fewer bytes
                System.out.println("captured [" + className + "]");
                classBytesMap.put(className, classBytes);
            }
        }
        ...
    }

The loop "works" in that it correctly pulls out all the class names, so it is clearly walking the input stream properly. However, both entry.getSize() and entry.getCompressedSize() are always -1. This is openjdk version "17.0.7" 2023-04-18.

The javadoc states that -1 means the size is unknown but there must be a size or some other means by which to process just this entry before moving further down the stream.

I am not wed to this approach; the ultimate goal is to walk a JarInputStream and extract classname-classbyte entries.

EDIT

As a test, the byte[] bb array is written back to file XX2.jar and then examined using regular JarFile classes. This works:

    JarFile jf = new JarFile("XX2.jar");
    for (java.util.Enumeration<JarEntry> e = jf.entries(); e.hasMoreElements();) {
        entry = e.nextElement();
        if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
            String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
            int x = (int) entry.getSize();
            int cx = (int) entry.getCompressedSize();
            System.out.println(" class [" + className + "]: Z " + cx + "; unZ " + x);
        }
    }

    class [org.bson.AbstractBsonReader$1]: Z 506; unZ 792
    class [org.bson.AbstractBsonReader$Context]: Z 497; unZ 1227
    class [org.bson.AbstractBsonReader$Mark]: Z 813; unZ 2201

However, trying to read that file using a regular FileInputStream as we see in many other SO examples does not work:

    JarInputStream jarInputStream = new JarInputStream(new FileInputStream("XX2.jar"));
    // the loop runs, but size and compressedSize are STILL -1

EDIT 2

Here is a complete example that demos the problem:

    import java.util.jar.JarEntry;
    import java.util.jar.JarInputStream;
    import java.util.jar.JarFile;
    import java.io.FileInputStream;

    class jartest {
        public static void showEntry(JarEntry entry) {
            if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
                int x = (int) entry.getSize();
                int cx = (int) entry.getCompressedSize();
                System.out.println(" class [" + className + "]: Z " + cx + "; unZ " + x);
            }
        }

        public static void main(String[] args) {
            try {
                JarEntry entry;
                String fname = "XX2.jar";

                // THIS WORKS
                JarFile jf = new JarFile(fname);
                for (java.util.Enumeration<JarEntry> e = jf.entries(); e.hasMoreElements();) {
                    entry = e.nextElement();
                    showEntry(entry);
                }

                // THIS DOES NOT WORK
                JarInputStream js = new JarInputStream(new FileInputStream(fname));
                while ((entry = js.getNextJarEntry()) != null) {
                    showEntry(entry);
                }
            } catch (Exception e) {
                System.out.println("fail: " + e);
            }
        }
    }

    $ java jartest
    class [grun$ExecutionContextImpl]: Z 706; unZ 1245
    class [grun]: Z 3210; unZ 6476
    class [grun$ExecutionContextImpl]: Z -1; unZ -1
    class [grun]: Z -1; unZ -1

Any clues? Overall it looks like streams are not working but filename based logic does work.

Answer 1

Score: 1

According to the javadoc, -1 means that the respective size is unknown.

According to the source code, the values of a ZipEntry / JarEntry's size fields are read from the ZIP or JAR file itself. The logic is a bit complicated [1], but the immediate explanation for getting a -1 is that the code was unable to extract the sizes from the LOC (local file) header, or that the sizes were -1 in the LOC header.
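One common way for the LOC header to carry no usable sizes is that the archive was written in streaming mode: the real sizes then sit in a data descriptor after the compressed data, so a stream reader only learns them once the entry has been consumed, whereas JarFile takes them from the central directory. The following is a minimal sketch rather than a definitive fix, assuming Java 9+ (for readAllBytes). The class name StreamedJarReader and the helper buildSampleJar are made up for illustration and do not come from the question. It reads each entry to its end instead of relying on getSize() up front:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; none of these names come from the question.
public class StreamedJarReader {

    // Builds a one-entry JAR in memory so the example needs no files (sample data only).
    public static byte[] buildSampleJar(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (JarOutputStream jos = new JarOutputStream(bos)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(content);
            jos.closeEntry();
        }
        return bos.toByteArray();
    }

    // Collects class name -> class bytes without consulting getSize() up front.
    public static Map<String, byte[]> readClasses(byte[] jarBytes) throws IOException {
        Map<String, byte[]> classBytesMap = new LinkedHashMap<>();
        try (JarInputStream js = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry;
            while ((entry = js.getNextJarEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.endsWith(".class")) {
                    continue;
                }
                String className =
                        name.substring(0, name.length() - ".class".length()).replace('/', '.');
                long sizeBefore = entry.getSize();     // may be -1 at this point
                byte[] classBytes = js.readAllBytes(); // reads up to the end of the current entry
                long sizeAfter = entry.getSize();      // may be filled in once the entry is consumed
                System.out.println(" class [" + className + "]: unZ before " + sizeBefore
                        + "; unZ after " + sizeAfter + "; bytes read " + classBytes.length);
                classBytesMap.put(className, classBytes);
            }
        }
        return classBytesMap;
    }

    public static void main(String[] args) throws IOException {
        byte[] jar = buildSampleJar("demo/Hello.class", new byte[] {1, 2, 3, 4, 5});
        Map<String, byte[]> classes = readClasses(jar);
        System.out.println("captured " + classes.keySet());
    }
}
```

In the question's setup, readClasses(bb) would take the byte[] fetched from the database directly. The design choice is to let the stream signal the end of each entry rather than to pre-allocate a buffer from a size that may not be known yet.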

To figure out what is actually going on in your case, you will need to decode the JAR file's headers by hand. Note that a JAR file is actually a ZIP file with a manifest, so you can use the format description in the ZIP file format Wikipedia page as a reference.
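Decoding by hand can also be done with a few lines of code. Below is a rough sketch that parses the fixed 30-byte part of one local file header, using the field offsets from the ZIP format description. The class LocHeaderDump and its members are hypothetical names, and the sketch assumes that the first header starts at offset 0 and that the entry name is ASCII or UTF-8. General-purpose flag bit 3 (0x08) indicates that the sizes are stored in a data descriptor after the entry data rather than in the LOC header:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for inspecting a ZIP/JAR local file (LOC) header by hand.
public class LocHeaderDump {

    public static final int LOC_SIGNATURE = 0x04034b50; // bytes 'P' 'K' 3 4, read little-endian
    public static final int LOC_FIXED_LENGTH = 30;      // fixed part, before name and extra field

    public static final class LocHeader {
        public final int flags;
        public final int method;            // 0 = stored, 8 = deflated
        public final long compressedSize;
        public final long size;
        public final String name;
        public final boolean hasDataDescriptor;

        LocHeader(int flags, int method, long compressedSize, long size, String name) {
            this.flags = flags;
            this.method = method;
            this.compressedSize = compressedSize;
            this.size = size;
            this.name = name;
            this.hasDataDescriptor = (flags & 0x08) != 0; // general-purpose bit 3
        }
    }

    // Parses the LOC header that starts at the given offset.
    public static LocHeader parse(byte[] bytes, int offset) {
        if (offset < 0 || bytes.length - offset < LOC_FIXED_LENGTH) {
            throw new IllegalArgumentException("not enough bytes for a LOC header");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(offset) != LOC_SIGNATURE) {
            throw new IllegalArgumentException("no LOC signature at offset " + offset);
        }
        int flags = buf.getShort(offset + 6) & 0xFFFF;
        int method = buf.getShort(offset + 8) & 0xFFFF;
        long compressedSize = buf.getInt(offset + 18) & 0xFFFFFFFFL;
        long size = buf.getInt(offset + 22) & 0xFFFFFFFFL;
        int nameLength = buf.getShort(offset + 26) & 0xFFFF;
        // Assumption: the name is ASCII or UTF-8 encoded.
        String name = new String(bytes, offset + LOC_FIXED_LENGTH, nameLength, StandardCharsets.UTF_8);
        return new LocHeader(flags, method, compressedSize, size, name);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0])); // e.g. XX2.jar
        LocHeader h = parse(bytes, 0);
        System.out.println(h.name + ": flags=0x" + Integer.toHexString(h.flags)
                + " method=" + h.method + " csize=" + h.compressedSize + " size=" + h.size
                + " dataDescriptor=" + h.hasDataDescriptor);
    }
}
```

Running it as `java LocHeaderDump XX2.jar` prints the fields of the first entry only. Walking the later entries would additionally require skipping the name, the extra field, and the compressed data, which this sketch does not attempt.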

As suggested by @g00se's comment, the problem could be that the JAR file you are trying to read is incomplete or malformed. So another alternative would be to write the bytes to a file and see if a ZIP tool or the jar command can read the file.
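That check can also be scripted. The sketch below, with the hypothetical class name JarFileSizeCheck, writes the bytes to a temporary file and opens it with JarFile, which reads the central directory. If the archive is incomplete or malformed, opening it would be expected to fail with an exception; otherwise the sizes reported here can be compared with what the stream-based loop prints:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: validates JAR bytes by round-tripping them through a temp file.
public class JarFileSizeCheck {

    // Returns entry name -> uncompressed size as reported by JarFile (central directory).
    public static Map<String, Long> sizesViaJarFile(byte[] jarBytes) throws IOException {
        Path tmp = Files.createTempFile("jar-check-", ".jar");
        try {
            Files.write(tmp, jarBytes);
            Map<String, Long> sizes = new LinkedHashMap<>();
            try (JarFile jf = new JarFile(tmp.toFile())) {
                for (Enumeration<JarEntry> e = jf.entries(); e.hasMoreElements();) {
                    JarEntry entry = e.nextElement();
                    if (!entry.isDirectory()) {
                        sizes.put(entry.getName(), entry.getSize());
                    }
                }
            }
            return sizes;
        } finally {
            Files.deleteIfExists(tmp); // clean up the temporary copy
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the byte[] fetched from the database in the question.
        byte[] jarBytes = Files.readAllBytes(Paths.get(args[0]));
        sizesViaJarFile(jarBytes).forEach((name, size) ->
                System.out.println(name + ": unZ " + size));
    }
}
```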


[1] Classic ZIP and ZIP64 represent the sizes differently in the LOC headers. The ZipEntry code has to unpick this.

huangapple
  • Posted on June 11, 2023, 22:15:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76450896.html