Decompress large binary files
Question
I have a function that decompresses large gzip files using the method below. There are times when I run into an OutOfMemoryError because the file is just too large. Is there a way I can optimize my code? I have read something about breaking the file into smaller parts that fit into memory and decompressing those, but I don't know how to do that. Any help or suggestions are appreciated.
private static String decompress(String s) {
    String pathOfFile = null;
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(new GZIPInputStream(new FileInputStream(s)), Charset.defaultCharset()))) {
        File file = new File(s);
        FileOutputStream fos = new FileOutputStream(file);
        String line;
        while ((line = reader.readLine()) != null) {
            fos.write(line.getBytes());
            fos.flush();
        }
        pathOfFile = file.getAbsolutePath();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return pathOfFile;
}
The stack trace:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3689)
    at java.base/java.util.ArrayList.grow(ArrayList.java:237)
    at java.base/java.util.ArrayList.ensureCapacity(ArrayList.java:217)
Answer 1
Score: 2
Don't use Reader classes, because you don't need to write the output file character by character or line by line. Instead, copy the data as raw bytes with the InputStream.transferTo() method (available since Java 9):
try (var in = new GZIPInputStream(new FileInputStream(inFile));
     var out = new FileOutputStream(outFile)) {
    in.transferTo(out);
}
Also, you probably don't need to call flush() explicitly; doing it after every line is wasteful.
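For reference, here is a minimal sketch of how the whole method might look with this approach. The class name GzipDecompressor and the two-path signature are assumptions made for illustration, not part of the original code. The sketch writes to a separate target path, because opening a FileOutputStream on the same path as the input (as the question's code does) truncates the file being read. It also lets IOException propagate instead of printing it, which is another illustrative choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical wrapper class, named only for this example.
final class GzipDecompressor {

    // Decompresses the gzip file at `source` into `target` and returns the
    // absolute path of the decompressed file. `target` should differ from
    // `source`, otherwise the input would be truncated when it is opened
    // for writing.
    static String decompress(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source));
             OutputStream out = Files.newOutputStream(target)) {
            // transferTo (Java 9+) copies the stream through a fixed-size
            // buffer, so memory use does not grow with the file size.
            in.transferTo(out);
        } // try-with-resources closes both streams; no explicit flush() needed
        return target.toAbsolutePath().toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: GzipDecompressor <input.gz> <output>");
            return;
        }
        System.out.println(decompress(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Because no Reader is involved, the bytes are never decoded into characters or split at line breaks, so binary content is written out unchanged.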