September 24, 2020 06:15:25

How can I download a single file from a large remote zip file in Java?

Question

I'm trying to download a small file (0.3 KB) from a given zip file that's around 3-5 GB in size.

I currently use the native library libfragmentzip through JNA. It is very fast, but native libraries bring problems of their own (such as not being cross-platform).

I have tried this solution, but it is much slower: it takes minutes, whereas libfragmentzip seems to need only seconds.

This is a URL to a test zip file (the extension is .ipsw but it is really a zip). The file I am trying to download is BuildManifest.plist, in the root of the zip.

Is there a fast way to download a single file from a remote zip file without using a native library?

Answer 1

Score: 2

You can insert BuildManifest.plist at the end of the URL.

For example:

http://updates-http.cdn-apple.com/2021SpringFCS/fullrestores/071-34317/E63B034D-2116-42D0-9FBD-97A3D9060F68/BuildManifest.plist
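
If the server really does publish the manifest next to the archive, as the example URL suggests, the download needs no zip handling at all. Here is a minimal sketch, assuming Java 11+ and a host that serves the file in the same directory as the archive. The class name SiblingUrl, its methods, and the example.com URL are hypothetical and only for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

class SiblingUrl {
    /** Resolves fileName against the directory that contains the archive URL. */
    static URI sibling(String archiveUrl, String fileName) {
        // Resolving a relative reference replaces the last path segment of the base URI.
        return URI.create(archiveUrl).resolve(fileName);
    }

    /** Downloads uri to dest with a plain GET; assumes the server answers 200 with the file body. */
    static Path download(URI uri, Path dest) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<Path> response = client.send(
                HttpRequest.newBuilder(uri).GET().build(),
                HttpResponse.BodyHandlers.ofFile(dest));
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Hypothetical archive URL, not a real firmware link.
        URI manifest = sibling("http://example.com/firmware/build-1/Restore.ipsw", "BuildManifest.plist");
        System.out.println(manifest); // http://example.com/firmware/build-1/BuildManifest.plist
    }
}
```

This only works when the host exposes the file separately. For an arbitrary remote zip, a range-based approach like the one in the next answer is still needed.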

Answer 2

Score: 1

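Using Apache Commons Compress and a custom SeekableByteChannel implementation backed by HTTP:

var url = new URL(...);
var fileName = "file.txt";
var dest = Path.of(fileName);
// ZipFile is org.apache.commons.compress.archivers.zip.ZipFile, not java.util.zip.ZipFile.
// Arguments: channel, archive name (used in error messages), file name encoding,
// useUnicodeExtraFields, ignoreLocalFileHeader.
try (var zip = new ZipFile(new HttpChannel(url), "zip", "UTF8", true, true);
     var stream = zip.getInputStream(zip.getEntry(fileName))) {
    Files.copy(stream, dest, StandardCopyOption.REPLACE_EXISTING);
}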
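
A seekable channel helps because a zip reader locates entries through the central directory at the end of the archive, so it only has to fetch the tail of the file plus the bytes of the one entry it wants. The sketch below illustrates that lookup with the JDK alone. The class ZipTail and its methods are hypothetical names, and it ignores ZIP64, which archives larger than 4 GB require. A library such as Commons Compress performs this lookup, including ZIP64 handling, for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipTail {
    static final int EOCD_SIGNATURE = 0x06054b50; // "PK\005\006" read as a little-endian int
    static final int EOCD_MIN_SIZE = 22;          // record size when the archive comment is empty

    /** Returns {entryCount, centralDirectorySize, centralDirectoryOffset} from the tail bytes. */
    static long[] parseEocd(byte[] tail) {
        ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
        // Scan backwards because an archive comment may follow the record.
        for (int i = tail.length - EOCD_MIN_SIZE; i >= 0; i--) {
            if (buf.getInt(i) == EOCD_SIGNATURE) {
                long entries = buf.getShort(i + 10) & 0xFFFFL;
                long cdSize = buf.getInt(i + 12) & 0xFFFFFFFFL;
                long cdOffset = buf.getInt(i + 16) & 0xFFFFFFFFL;
                return new long[] {entries, cdSize, cdOffset};
            }
        }
        throw new IllegalArgumentException("No end-of-central-directory record found");
    }

    /** Builds a small two-entry archive in memory to stand in for the remote file. */
    static byte[] sampleZip() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : new String[] {"BuildManifest.plist", "other.txt"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip();
        long[] eocd = parseEocd(zip);
        System.out.println("entries=" + eocd[0] + " cdSize=" + eocd[1] + " cdOffset=" + eocd[2]);
    }
}
```

Against a remote file, the tail would come from a ranged request for the last few kilobytes, followed by one request for the central directory and one for the entry itself.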

HttpChannel (modified from JCodec):

import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

// Read-only channel that seeks by reopening the connection with a Range header.
public class HttpChannel implements SeekableByteChannel {
    private final URL url;
    private ReadableByteChannel ch; // current response body; null until first use or after a seek
    private long pos;               // offset of the next byte to read
    private long length;            // total size reported by the server

    public HttpChannel(URL url) {
        this.url = url;
    }

    @Override
    public long position() {
        return pos;
    }

    @Override
    public SeekableByteChannel position(long newPosition) throws IOException {
        if (newPosition == pos) {
            return this;
        } else if (ch != null) {
            // Drop the current stream; the next read reconnects at the new offset.
            ch.close();
            ch = null;
        }
        pos = newPosition;
        return this;
    }

    @Override
    public long size() throws IOException {
        ensureOpen();
        return length;
    }

    @Override
    public SeekableByteChannel truncate(long size) {
        throw new UnsupportedOperationException("Truncate on HTTP is not supported.");
    }

    @Override
    public int read(ByteBuffer buffer) throws IOException {
        ensureOpen();
        int read = ch.read(buffer);
        if (read != -1)
            pos += read;
        return read;
    }

    @Override
    public int write(ByteBuffer buffer) {
        throw new UnsupportedOperationException("Write to HTTP is not supported.");
    }

    @Override
    public boolean isOpen() {
        return ch != null && ch.isOpen();
    }

    @Override
    public void close() throws IOException {
        // ch is null if nothing was read since the last seek.
        if (ch != null) {
            ch.close();
        }
    }

    private void ensureOpen() throws IOException {
        if (ch == null) {
            URLConnection connection = url.openConnection();
            if (pos > 0)
                connection.addRequestProperty("Range", "bytes=" + pos + "-");
            ch = Channels.newChannel(connection.getInputStream());
            // A ranged response carries the total size after the slash in Content-Range.
            String resp = connection.getHeaderField("Content-Range");
            if (resp != null) {
                length = Long.parseLong(resp.split("/")[1]);
            } else {
                resp = connection.getHeaderField("Content-Length");
                length = Long.parseLong(resp);
            }
        }
    }
}
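
One caveat about ensureOpen(): it assumes the server always reports a numeric total. HTTP allows a Content-Range such as "bytes 0-99/*" when the total is unknown, and either header may be missing, in which case Long.parseLong throws. A more defensive parser could look like the sketch below. The class ContentRange and its method are hypothetical helpers, not part of the original answer or of any library.

```java
class ContentRange {
    /**
     * Extracts the total length from a header such as "bytes 100-199/5000".
     * Returns -1 when the header is missing, malformed, or reports "*" (unknown length).
     */
    static long totalLength(String header) {
        if (header == null) {
            return -1;
        }
        int slash = header.lastIndexOf('/');
        if (slash < 0 || slash == header.length() - 1) {
            return -1;
        }
        String total = header.substring(slash + 1).trim();
        if (total.equals("*")) {
            return -1;
        }
        try {
            return Long.parseLong(total);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength("bytes 100-199/5000")); // 5000
        System.out.println(totalLength("bytes 0-99/*"));       // -1
    }
}
```

A caller could treat -1 as "size unknown" and fail with a clear IOException instead of a NumberFormatException.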
