How can I download a single file from a large remote zip file in Java?

Question

I'm trying to download a small file (0.3 KB) from a given zip file that's around 3-5 GB in size.

I currently use the native library libfragmentzip through JNA. It is very fast, but native libraries bring problems of their own, such as not being cross-platform.

I have tried this solution, but it is much slower: it takes minutes, whereas libfragmentzip seems to take only seconds.

This is a URL to a test zip file (the extension is .ipsw but it is really a zip). The file I am trying to download is BuildManifest.plist, in the root of the zip.

Is there a fast way to download a single file from a remote zip file without using a native library?

Answer 1

Score: 2

You can request BuildManifest.plist directly by putting it at the end of the URL, in place of the archive's file name.

For example:

http://updates-http.cdn-apple.com/2021SpringFCS/fullrestores/071-34317/E63B034D-2116-42D0-9FBD-97A3D9060F68/BuildManifest.plist
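As a rough sketch of this approach in Java: `URI.resolve` swaps the last path segment of the archive URL for the wanted file name. This assumes the server really does publish BuildManifest.plist next to the archive. The example URL above suggests it does for this host, but that is not a property of zip files in general. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SiblingDownload {

    // Replaces the last path segment of the archive URL with fileName.
    static URI sibling(URI archive, String fileName) {
        return archive.resolve(fileName);
    }

    // Streams the sibling file straight to disk; the archive itself is never requested.
    static void download(URI archive, String fileName, Path dest) throws IOException {
        try (InputStream in = sibling(archive, fileName).toURL().openStream()) {
            Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

If the server does not host the file separately, the request fails with an HTTP error and a range-based approach is needed instead.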

Answer 2

Score: 1

Using Apache Commons Compress and a custom ByteChannel implementation backed by HTTP:

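import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import org.apache.commons.compress.archivers.zip.ZipFile;

var url = new URL(...);
var fileName = "file.txt";
var dest = Path.of(fileName);

// The last argument (ignoreLocalFileHeader = true) skips reading each entry's local header on open.
try (var zip = new ZipFile(new HttpChannel(url), "zip", "UTF8", true, true);
     var stream = zip.getInputStream(zip.getEntry(fileName))) {
    Files.copy(stream, dest, StandardCopyOption.REPLACE_EXISTING);
}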

HttpChannel (modified from JCodec):

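import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

/** Read-only SeekableByteChannel that serves every seek with a new HTTP Range request. */
public class HttpChannel implements SeekableByteChannel {

    private final URL url;
    private ReadableByteChannel ch; // current response body, opened lazily
    private long pos;
    private long length;
    private boolean closed;

    public HttpChannel(URL url) {
        this.url = url;
    }

    @Override
    public long position() {
        return pos;
    }

    @Override
    public SeekableByteChannel position(long newPosition) throws IOException {
        if (newPosition == pos) {
            return this;
        } else if (ch != null) {
            // Drop the current response; the next read reopens at the new offset.
            ch.close();
            ch = null;
        }
        pos = newPosition;
        return this;
    }

    @Override
    public long size() throws IOException {
        ensureOpen();
        return length;
    }

    @Override
    public SeekableByteChannel truncate(long size) {
        throw new UnsupportedOperationException("Truncate on HTTP is not supported.");
    }

    @Override
    public int read(ByteBuffer buffer) throws IOException {
        ensureOpen();
        int read = ch.read(buffer);
        if (read != -1)
            pos += read;
        return read;
    }

    @Override
    public int write(ByteBuffer buffer) {
        throw new UnsupportedOperationException("Write to HTTP is not supported.");
    }

    @Override
    public boolean isOpen() {
        // Open until close() is called, even before the first request is sent.
        return !closed;
    }

    @Override
    public void close() throws IOException {
        closed = true;
        if (ch != null) { // guard: close() may be called before any read
            ch.close();
            ch = null;
        }
    }

    private void ensureOpen() throws IOException {
        if (closed)
            throw new ClosedChannelException();
        if (ch == null) {
            URLConnection connection = url.openConnection();
            if (pos > 0)
                connection.addRequestProperty("Range", "bytes=" + pos + "-");
            ch = Channels.newChannel(connection.getInputStream());
            String resp = connection.getHeaderField("Content-Range");
            if (resp != null) {
                // Content-Range looks like "bytes 100-999/1000"; the total follows the slash.
                length = Long.parseLong(resp.split("/")[1]);
            } else if (pos > 0) {
                // Without Content-Range the body starts at offset 0, not at pos.
                throw new IOException("Server ignored the Range request.");
            } else {
                length = connection.getContentLengthLong();
            }
        }
    }
}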
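A seekable channel helps because a zip reader does not need the whole archive. It first finds the end-of-central-directory (EOCD) record at the tail of the file, then reads the central directory, then reads only the entry it wants. The following is a minimal sketch of that first step, operating on the last bytes of an archive that have already been fetched. `ZipTail` and its methods are hypothetical names, not part of Commons Compress, and the sketch assumes a plain (non-ZIP64) archive; archives over 4 GB store these fields in a separate ZIP64 record that is not handled here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ZipTail {
    static final int EOCD_SIGNATURE = 0x06054b50; // bytes "PK\005\006" read as little-endian
    static final int EOCD_MIN_SIZE = 22;          // record size with an empty comment

    // Scans backward for the EOCD signature; returns its index in tail, or -1 if absent.
    static int findEocd(byte[] tail) {
        ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = tail.length - EOCD_MIN_SIZE; i >= 0; i--) {
            if (buf.getInt(i) == EOCD_SIGNATURE) {
                return i;
            }
        }
        return -1;
    }

    // Total number of entries in the central directory (unsigned 16-bit field).
    static int entryCount(byte[] tail, int eocd) {
        return ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN).getShort(eocd + 10) & 0xFFFF;
    }

    // Size of the central directory in bytes (unsigned 32-bit field).
    static long centralDirectorySize(byte[] tail, int eocd) {
        return ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN).getInt(eocd + 12) & 0xFFFFFFFFL;
    }

    // Offset of the central directory from the start of the archive (unsigned 32-bit field).
    static long centralDirectoryOffset(byte[] tail, int eocd) {
        return ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN).getInt(eocd + 16) & 0xFFFFFFFFL;
    }
}
```

Since the EOCD comment is at most 65535 bytes, the last 65557 bytes of the archive always contain the record. A caller could fetch them with a suffix range such as `Range: bytes=-65557`, then issue one more range request for the central directory and one for the entry itself.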

huangapple
  • Posted on September 24, 2020 at 06:15:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64037006.html