June 11, 2023, 22:15:45


java JarEntry getSize() returns -1

Question

EDIT 3

It turns out that some JAR files properly report entry.getSize() with JarInputStream and some don't. All the JAR files I have created going back to 2013 (on OSX, Java 8 and higher) work, as do various others such as mongo-spark-connector-10.0.0.jar. Others, such as antlr-runtime-4.7.2.jar, mongodb-driver-core-4.9.0.jar, and hadoop-azure-3.2.0.jar, do not. But all files properly report size when accessed via JarFile.

I have a valid JarInputStream js sourced from a database (i.e. not using files on filesystem). In case there are some nuances to byte[], streams, and unzipping, this is how the JarInputStream is set up:

    // fetchJarBytes() is a hypothetical stand-in for "get complete set of bytes" from the database
    byte[] bb = fetchJarBytes(); // printing bb.length gives 14523, so OK
    ByteArrayInputStream bas = new ByteArrayInputStream(bb);
    JarInputStream js = new JarInputStream(bas);

I iterate it in this way:

    JarEntry entry;
    while ((entry = js.getNextJarEntry()) != null) {
        if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
            // Strip the ".class" suffix (6 characters) and turn the path into a class name
            String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
            long len = entry.getSize();
            long zlen = entry.getCompressedSize();
            System.out.println("  class [" + className + "]: Z " + zlen + "; unZ " + len);
            if (len > 0) {
                byte[] classBytes = new byte[(int) len]; // array sizes must be int
                js.read(classBytes);
                System.out.println("captured [" + className + "]");
                classBytesMap.put(className, classBytes);
            }
        ...

The loop "works" in that it correctly pulls out all the class names, so it is clearly walking the input stream properly. However, both entry.getSize() and entry.getCompressedSize() are always -1. This is openjdk version "17.0.7" 2023-04-18.

The javadoc states that -1 means the size is unknown but there must be a size or some other means by which to process just this entry before moving further down the stream.

I am not wed to this approach; the ultimate goal is to walk a JarInputStream and extract classname-classbyte entries.

EDIT

As a test, the byte[] bb array is written back to file XX2.jar and then examined using regular JarFile classes. This works:

    JarFile jf = new JarFile("XX2.jar");
    for (java.util.Enumeration<JarEntry> e = jf.entries(); e.hasMoreElements();) {
        entry = e.nextElement();
        if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
            String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
            int x = (int) entry.getSize();
            int cx = (int) entry.getCompressedSize();
            System.out.println("  class [" + className + "]: Z " + cx + "; unZ " + x);
        }
    }

    class [org.bson.AbstractBsonReader$1]: Z 506; unZ 792
    class [org.bson.AbstractBsonReader$Context]: Z 497; unZ 1227
    class [org.bson.AbstractBsonReader$Mark]: Z 813; unZ 2201

However, trying to read that file using a regular FileInputStream as we see in many other SO examples does not work:

    JarInputStream jarInputStream = new JarInputStream(new FileInputStream("XX2.jar"));

The loop runs, but size and compressedSize are STILL -1.

EDIT 2

Here is a complete example that demos the problem:

    import java.util.jar.JarEntry;
    import java.util.jar.JarInputStream;
    import java.util.jar.JarFile;
    import java.io.FileInputStream;

    class jartest {
        public static void showEntry(JarEntry entry) {
            if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                String className = entry.getName().replace('/', '.').substring(0, entry.getName().length() - 6);
                int x = (int) entry.getSize();
                int cx = (int) entry.getCompressedSize();
                System.out.println("  class [" + className + "]: Z " + cx + "; unZ " + x);
            }
        }

        public static void main(String[] args) {
            try {
                JarEntry entry;
                String fname = "XX2.jar";
                // THIS WORKS
                JarFile jf = new JarFile(fname);
                for (java.util.Enumeration<JarEntry> e = jf.entries(); e.hasMoreElements();) {
                    entry = e.nextElement();
                    showEntry(entry);
                }
                // THIS DOES NOT WORK
                JarInputStream js = new JarInputStream(new FileInputStream(fname));
                while ((entry = js.getNextJarEntry()) != null) {
                    showEntry(entry);
                }
            } catch (Exception e) {
                System.out.println("fail: " + e);
            }
        }
    }

    $ java jartest
    class [grun$ExecutionContextImpl]: Z 706; unZ 1245
    class [grun]: Z 3210; unZ 6476
    class [grun$ExecutionContextImpl]: Z -1; unZ -1
    class [grun]: Z -1; unZ -1

Any clues? Overall it looks like streams are not working but filename based logic does work.

Answer 1

Score: 1

According to the javadoc, -1 means that the respective size is unknown.

According to the source code, the values of a ZipEntry / JarEntry's size fields are read from the ZIP or JAR file itself. The logic is a bit complicated [1], but the immediate explanation for getting a -1 is either that the code was unable to extract the sizes from the LOC header, or that the sizes were -1 in the LOC header.

To figure out what is actually going on in your case, you will need to decode the JAR file's headers by hand. Note that a JAR file is actually a ZIP file with a manifest, so you can use the format description in the ZIP file format Wikipedia page as a reference.
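As a rough sketch of what decoding a header by hand could look like, the class below reads the general-purpose flags and the two size fields from the first local file (LOC) header of an archive held in a byte array. The class and method names (LocHeaderPeek, sampleZip, and so on) are made up for illustration, and the code assumes the archive begins with a LOC header at offset 0. In the ZIP format, bit 3 of the flags means the sizes are stored in a data descriptor after the entry data, with the LOC size fields set to zero, which is one plausible way for a streaming reader to have no size available up front.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class LocHeaderPeek {
    private static final int LOC_SIGNATURE = 0x04034b50; // "PK\3\4" read as little-endian int

    // Wraps the archive bytes and checks that a LOC header starts at offset 0 (an assumption).
    private static ByteBuffer loc(byte[] zip) {
        ByteBuffer b = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        if (zip.length < 30 || b.getInt(0) != LOC_SIGNATURE) {
            throw new IllegalArgumentException("no LOC header at offset 0");
        }
        return b;
    }

    /** General-purpose bit flags (offset 6, 2 bytes). */
    public static int flags(byte[] zip) {
        return loc(zip).getShort(6) & 0xFFFF;
    }

    /** True when bit 3 is set, i.e. sizes live in a trailing data descriptor. */
    public static boolean hasDataDescriptor(byte[] zip) {
        return (flags(zip) & 0x08) != 0;
    }

    /** Compressed size as stored in the LOC header (offset 18, 4 bytes). */
    public static long compressedSize(byte[] zip) {
        return loc(zip).getInt(18) & 0xFFFFFFFFL;
    }

    /** Uncompressed size as stored in the LOC header (offset 22, 4 bytes). */
    public static long uncompressedSize(byte[] zip) {
        return loc(zip).getInt(22) & 0xFFFFFFFFL;
    }

    /** Builds a one-entry archive in memory without setting any sizes on the entry. */
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("demo/Hello.class"));
            zos.write("placeholder, not a real class file".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip();
        System.out.println("flags=0x" + Integer.toHexString(flags(zip))
                + " dataDescriptor=" + hasDataDescriptor(zip)
                + " csize=" + compressedSize(zip)
                + " size=" + uncompressedSize(zip));
    }
}
```

Running the same accessors over the bytes of a JAR that reports -1 and one that does not should show whether the two differ in bit 3 and in the stored sizes.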

As suggested by @g00se's comment, the problem could be that the JAR file you are trying to read is incomplete or malformed. So another alternative would be to write the bytes to a file and see if a ZIP tool or the jar command can read the file.
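If the archive turns out to be well-formed and the sizes are simply not known up front, the stated goal of collecting class name to class bytes pairs does not need getSize() at all: each entry can be read until read() returns -1, which marks the end of the current entry. The following is a minimal sketch under that assumption; JarStreamClasses, readClasses and sampleJar are illustrative names rather than anything from the question, and the sample JAR is built in memory only so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public class JarStreamClasses {

    /** Walks the JAR bytes as a stream and returns class name -> class bytes. */
    public static Map<String, byte[]> readClasses(byte[] jarBytes) throws IOException {
        Map<String, byte[]> classBytesMap = new LinkedHashMap<>();
        try (JarInputStream js = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            byte[] buf = new byte[8192];
            JarEntry entry;
            while ((entry = js.getNextJarEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.endsWith(".class")) {
                    continue;
                }
                // Drop the ".class" suffix and convert the path to a class name
                String className = name.substring(0, name.length() - ".class".length())
                        .replace('/', '.');
                // Read until end of entry instead of trusting entry.getSize()
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = js.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                classBytesMap.put(className, out.toByteArray());
            }
        }
        return classBytesMap;
    }

    /** Builds a one-entry JAR in memory; the payload is placeholder text, not a real class. */
    public static byte[] sampleJar() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry("demo/Hello.class"));
            jos.write("placeholder bytes".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (Map.Entry<String, byte[]> e : readClasses(sampleJar()).entrySet()) {
            System.out.println("captured [" + e.getKey() + "]: " + e.getValue().length + " bytes");
        }
    }
}
```

Looping on read() also avoids a second pitfall in the original loop: a single js.read(classBytes) call is not guaranteed to fill the array even when the size is known.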

[1] Classic ZIP and ZIP64 represent the sizes differently in the LOC headers. The ZipEntry code has to unpick this.
