Unzip specific file from inputstream into another inputstream with same file name and type in java

Question

I have an inputstream that contains a zip file. I need to extract a specific file from this zip file, and put the extracted file in an inputstream, with extracted file type and name.

Example: given a zip file in an inputstream, I need to unzip the ".pdf" file and put it in an inputstream with the ".pdf" file name and type.

I tried the following flow:

private InputStream extractPdfFile(InputStream zipInputStream) {
    try {
        ZipInputStream zip = new ZipInputStream(zipInputStream);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String currentFileName = entry.getName();
            if (currentFileName.endsWith(PP_FILE_EXTENTION)) {
                ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                byte[] buffer = new byte[2048];
                int bytesRead;
                while ((bytesRead = zip.read(buffer)) != -1) {
                    outputStream.write(buffer, 0, bytesRead);
                }
                zip.closeEntry();
                zip.close();
                outputStream.close();
                return new ByteArrayInputStream(outputStream.toByteArray());
            }
            zip.closeEntry();
        }

        zip.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null; // File not found in the zip
}

But it returns the new inputstream in zip format not .pdf.

Any idea how to unzip a file from an inputstream and return the extracted file, unzipped, in another inputstream?

Answer 1

Score: 1

> I have an inputstream that contains a zip file

No, you do not. Presumably you mean that you have an InputStream from which zip-formatted data can be read. This is not the same thing, and the distinction can be important.

> I need to extract a specific file from this zip file, and put the extracted file in an inputstream, with extracted file type and name.

And that's one of the ways in which it's important. The data in a stream do not have an associated filename. A stream is just a conduit for data. You may create a stream that provides content from a particular file, but that's not a recognizable characteristic of the stream itself. The type of the file's content is characteristic of the data, of course, but if by "file type" you mean "filename extension" or similar, then no, that just is not part of an InputStream.
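
To illustrate the point that content type lives in the data rather than in the stream, here is a minimal sketch that peeks at the first bytes of a stream and checks for the "%PDF-" signature that PDF files begin with. The class name ContentSniffer and the method looksLikePdf are made up for this example, and it assumes the stream supports mark/reset (a ByteArrayInputStream does; others can be wrapped in a BufferedInputStream).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ContentSniffer {

    // PDF files begin with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns true if the next bytes of the stream are the PDF signature.
     * The stream is reset afterwards, so the caller can still read it
     * from the same position.
     */
    static boolean looksLikePdf(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(PDF_MAGIC.length);
        try {
            byte[] head = new byte[PDF_MAGIC.length];
            int total = 0;
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n == -1) {
                    break; // stream ended before a full signature was read
                }
                total += n;
            }
            return total == PDF_MAGIC.length && Arrays.equals(head, PDF_MAGIC);
        } finally {
            in.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "%PDF-1.7 sample".getBytes(StandardCharsets.US_ASCII));
        System.out.println(looksLikePdf(in));
    }
}
```

Nothing here involves a filename: the check works on any stream that happens to deliver PDF bytes, whatever the data was called at its source.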

A ZipInputStream in particular does support a higher level of abstraction over the content, so that you can read it as a sequence of ZipEntrys and their associated data, but that's a capability enabled by the specific form of the data (zip). There is no analogue for generic InputStreams.
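
As a rough sketch of that higher-level view, the following walks a ZipInputStream entry by entry and collects the entry names. The class ZipListing, its method names, and the two-entry sample archive are invented for the example; only the java.util.zip calls are the real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipListing {

    /** Returns the names of all entries, in the order they appear in the stream. */
    static List<String> entryNames(InputStream zipData) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            // getNextEntry() skips any unread data of the previous entry.
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Builds a small two-entry zip in memory, purely for demonstration. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("notes.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/report.pdf"));
            zip.write("%PDF-1.7 sample".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(entryNames(new ByteArrayInputStream(sampleZip())));
    }
}
```

The names come from the ZipEntry objects, that is, from metadata stored in the zip data itself. A plain InputStream offers nothing comparable.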

> I tried the following flow: [...] But it returns the new inputstream in zip format not .pdf.

Implausible. The various ZipInputStream.read() methods read the uncompressed data of an entry. This is documented, and it has worked correctly for a very long time. You have implemented an approach that will successfully extract that uncompressed data, and will provide an InputStream from which that uncompressed data can then be read.
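
If you want to convince yourself, a small round trip along these lines should do it. This is only a sketch: ZipRoundTrip, zipOf and extract are names invented here, and the "PDF" is just a few placeholder bytes. It zips some bytes in memory, extracts them with essentially the same loop as in the question, and compares the result with the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipRoundTrip {

    /** Creates an in-memory zip archive holding a single entry. */
    static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /**
     * Returns the uncompressed bytes of the first non-directory entry whose
     * name ends with the given suffix, or null if there is no such entry.
     */
    static byte[] extract(InputStream zipData, String suffix) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().endsWith(suffix)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[2048];
                    int n;
                    // read() returns -1 at the end of the current entry.
                    while ((n = zip.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    return out.toByteArray();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "%PDF-1.7 sample body".getBytes(StandardCharsets.US_ASCII);
        byte[] archive = zipOf("report.pdf", original);
        byte[] extracted = extract(new ByteArrayInputStream(archive), ".pdf");
        System.out.println("archive starts with PK: "
                + (archive[0] == 'P' && archive[1] == 'K'));
        System.out.println("extracted equals original: "
                + Arrays.equals(original, extracted));
    }
}
```

If a check like this passes but the result still looks like zip data in your application, the likelier explanations lie outside this method, for example a zip nested inside the zip, or a caller that uses the original stream instead of the returned one.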

But again, there is no file name or extension associated with the stream that is returned. It's not even backed by a file. The stream just provides a sequence of bytes from an array. If you want to convey an associated filename then you need to do that separately. For example, perhaps you want to provide the ZipEntry describing the file, along with the InputStream from which the data can be read -- almost like ZipInputStream does itself.

You might, for example, bundle that together like so:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class UncompressedData {

    private byte[] bytes;
    private String originalFilename;
    private boolean isDirectory;
    // ...

    private UncompressedData() {
        // empty
    }

    /**
     * Creates and returns an UncompressedData object representing the
     * current entry of the specified ZipInputStream, as described by
     * the specified ZipEntry
     */
    public static UncompressedData fromZip(ZipEntry entry, ZipInputStream stream)
            throws IOException {
        UncompressedData data = new UncompressedData();

        data.originalFilename = entry.getName();
        data.isDirectory = entry.isDirectory();

        if (data.isDirectory) {
            data.bytes = new byte[0];
        } else {
            // Read to the end of the current entry instead of relying on
            // entry.getSize(), which returns -1 when the size is not known.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int numRead;
            while ((numRead = stream.read(buffer)) != -1) {
                out.write(buffer, 0, numRead);
            }
            data.bytes = out.toByteArray();
        }

        return data;
    }

    /**
     * Creates and returns an InputStream from which the bytes of this
     * UncompressedData can be read.
     */
    public InputStream asStream() {
        return new ByteArrayInputStream(bytes);
    }

    /**
     * Returns a filename associated with this UncompressedData.
     * Typically, this will be the name of an original source file from which
     * the data were (somehow) obtained.
     */
    public String getOriginalFilename() {
        return originalFilename;
    }
}

Posted by huangapple on June 8, 2023 at 21:16:14
When reposting, please keep the link to this article: https://go.coder-hub.com/76432256.html