How can I improve my code to avoid GC overhead limit error?

Question

I'm getting this error message:
java.lang.OutOfMemoryError: GC overhead limit exceeded

in the catch block of this code:

try (BufferedReader br = new BufferedReader(new FileReader(fileName)))
{
    System.out.println("Starting try block");

    String line;
    Row row;
    Cell cell;
    int rowIndex = 0;
    while ((line = br.readLine()) != null)
    {
        row = sheet.createRow(rowIndex);
        String[] tokens = line.split("[|]");
        for (int iToken = 0; iToken < tokens.length; iToken++)
        {
            cell = row.createCell(iToken);
            cell.setCellValue(tokens[iToken]);
        }
        rowIndex++;
    }
}
catch (Throwable e)
{
    e.printStackTrace();
}

The file I'm reading is a large text file (~90000 KB).
After increasing the VM memory to 2048K at run time, I stopped getting the heap size error but started getting the GC error instead.
How can I modify the code to avoid the GC error?

Answer 1

Score: 3

I am guessing that Sheet, Row and Cell are classes from some Java spreadsheet API.

Here's the bad news.

The problem is that your code is building a large data structure that represents the spreadsheet in memory, and then writing it to a file. Apparently the data structure is larger than can be accommodated by your JVM's heap.

The second problem is that if you continue to use the aforementioned API in this way, you won't be able to reduce the memory utilization.
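Before choosing among the options, it may help to confirm how much heap the JVM actually has, since a limit set at launch (for example with `-Xmx`) can differ from what was intended. The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapInfo` and its helper methods are hypothetical names introduced for illustration.

```java
// Hypothetical helper (not from the original post): reports heap limits
// as seen by the running JVM, using only java.lang.Runtime.
class HeapInfo {

    private static final long MIB = 1024L * 1024L;

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently occupied by objects (approximate), in MiB.
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("Max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("Used heap: " + usedHeapMiB() + " MiB");
    }
}
```

Calling `HeapInfo.usedHeapMiB()` every few thousand rows inside the read loop would show how quickly the in-memory spreadsheet grows relative to the limit.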

There are a few ways to address this:

  1. Increase the heap size, and keep increasing it until you no longer get OutOfMemoryErrors. If that means you need a machine with more RAM to run your application, get one.

  2. If the API that you are using has a streaming mode for writing the data, use that. Or look for an alternative to this API that supports streaming.

  3. Maybe there is a non-streaming spreadsheet API that uses less memory than the one you are currently using. (See @Holger's comment.)

  4. Don't generate a spreadsheet. Spreadsheets are (IMO) an inefficient way of representing data. Instead, output the data as a CSV file, a JSON file, an XML file or any other format that can easily be streamed.

  5. If the "business types" insist on spreadsheets, you could maybe output the data as a CSV file and then use an external tool to create the spreadsheet from the CSV file.
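Options 4 and 5 can be sketched as follows: each pipe-delimited line is converted to a CSV line and written out immediately, so memory use stays roughly constant regardless of file size. This is a sketch under assumptions, not the asker's actual setup: the class name `PipeToCsv`, its method names, and the UTF-8 encoding are all assumptions introduced here, and the quoting follows common CSV conventions.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical converter (names are illustrative): streams a pipe-delimited
// text file to CSV one line at a time instead of building it in memory.
class PipeToCsv {

    // Wrap a field in double quotes if it contains a comma, a quote, or a
    // line break, doubling any embedded quotes.
    static String escapeCsv(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Convert one pipe-delimited line to one CSV line.
    // Uses the same split("[|]") as the question's code.
    static String toCsvLine(String pipeLine) {
        String[] tokens = pipeLine.split("[|]");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escapeCsv(tokens[i]));
        }
        return sb.toString();
    }

    // Stream the input to the output; only one line is held at a time.
    // Assumes both files are UTF-8. Returns the number of rows written.
    static long convert(Path in, Path out) throws IOException {
        long rows = 0;
        try (BufferedReader br = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter bw = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(toCsvLine(line));
                bw.newLine();
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        long rows = convert(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Wrote " + rows + " rows");
    }
}
```

The resulting CSV file can then be opened directly in a spreadsheet application or handed to an external tool for conversion, as option 5 suggests.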


The streaming version of POI (SXSSF) is described in the Apache POI documentation.

huangapple
  • Posted on 2020-08-12 16:23:45
  • Source: https://go.coder-hub.com/63372652.html