POI reading Excel file with body in String

Question

I am currently trying to read an Excel file that is polled via Apache Camel (2.25.1).
The handler method receives the file contents as a String:

    @Handler
    public void processFile(@Body String body) {

To read the Excel file I use Apache POI and POI-ooxml (both 4.1.2).

However, using the String directly with UTF-8:

    WorkbookFactory.create(new ByteArrayInputStream(body.getBytes(Charset.forName("UTF-8"))))

throws "java.io.IOException: ZIP entry size is too large or invalid".

Using the platform default charset instead (body.getBytes() with no argument):

    WorkbookFactory.create(new ByteArrayInputStream(body.getBytes()))

throws "org.apache.poi.openxml4j.exceptions.NotOfficeXmlFileException: No valid entries or contents found, this is not a valid OOXML (Office Open XML) file".

I also tried:

    File file = exchange.getIn().getBody(File.class);
    Workbook workbook = new XSSFWorkbook(new FileInputStream(file));

This throws a java.io.FileNotFoundException (Invalid file path), probably because the file is read from an FTP server.

However, the following code does work:

    URL url = new URL(fileFtpPath);
    URLConnection urlc = url.openConnection();
    InputStream ftpIs = urlc.getInputStream();
    Workbook workbook = new XSSFWorkbook(ftpIs);

But I would rather not open a connection to the FTP server myself, since Camel has already read the file and the Excel contents I need are available (in the String body).
Is there any way to read the contents of the Excel file from the String with Apache POI?

Answer 1

Score: 0

My routes are defined in XML, so I use Groovy to process Excel files. Perhaps you will find it helpful:

    import org.apache.poi.ss.usermodel.WorkbookFactory

    def workbook = WorkbookFactory.create(request.getBody(File.class))
    def sheet = workbook.getSheetAt(0)
    ...

There is another approach, usually used for large Excel files, that works on a stream. To go this way, implement `XSSFSheetXMLHandler.SheetContentsHandler` from `org.apache.poi.xssf.eventusermodel`.

You can find a copy of the original POI example in this [SO question][1]; for some reason it was recently deleted from [poi svn][2]. If you are interested, my Groovy version looks like this:

    import org.apache.poi.openxml4j.opc.OPCPackage
    import org.apache.poi.ooxml.util.SAXHelper
    import org.apache.poi.xssf.eventusermodel.XSSFReader
    import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler
    import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable
    import org.apache.poi.hssf.usermodel.HSSFDataFormatter
    import org.xml.sax.InputSource

    class MyHandler implements XSSFSheetXMLHandler.SheetContentsHandler {
        ...
    }

    // Open the workbook package straight from the message body stream
    def pkg = OPCPackage.open(request.getBody(InputStream.class))
    def xssfReader = new XSSFReader(pkg)
    def sheetParser = SAXHelper.newXMLReader()

    // Pass a handler instance (not the class itself) to XSSFSheetXMLHandler
    def handler = new XSSFSheetXMLHandler(xssfReader.getStylesTable(), null, new ReadOnlySharedStringsTable(pkg), new MyHandler(), new HSSFDataFormatter(), false)
    sheetParser.setContentHandler(handler)
    // Parse the first sheet only
    sheetParser.parse(new InputSource(xssfReader.getSheetsData().next()))

  [1]: https://stackoverflow.com/questions/51447162/reading-big-xls-and-xlsx-files
  [2]: https://svn.apache.org/repos/asf/poi/trunk/src/examples/



# Answer 2
**Score**: 0

You can convert the message body directly into an InputStream and pass it to the XSSFWorkbook constructor:

```java
// Poll one exchange from the endpoint, then ask Camel for the body as a stream
Exchange exchange = consumerTemplate.receive("file://C:/ftp/?noop=true", pollCount);
InputStream stream = exchange.getIn().getBody(InputStream.class);

XSSFWorkbook workbook = new XSSFWorkbook(stream);

XSSFSheet sheet = workbook.getSheetAt(0);
```
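
A likely reason the String-based attempts in the question fail while an InputStream works is that an .xlsx file is a ZIP archive, that is, binary data, and decoding arbitrary bytes into a String is lossy. The sketch below uses only the JDK to illustrate this. The class name, method name, and sample byte array are made up for the example (the ZIP signature `PK` followed by two bytes that are not valid UTF-8); nothing here is Camel or POI API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {

    // Decode bytes into a String and encode them back with UTF-8, which is
    // roughly what happens when a binary body is received as a String and
    // then turned into bytes again via getBytes(UTF-8).
    public static byte[] roundTripUtf8(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: ZIP local-file-header signature "PK\3\4"
        // followed by two bytes (0x80, 0xFF) that are invalid in UTF-8.
        byte[] original = {0x50, 0x4B, 0x03, 0x04, (byte) 0x80, (byte) 0xFF};

        byte[] restored = roundTripUtf8(original);

        // Invalid sequences are replaced while decoding, so the bytes that
        // come back no longer match what went in.
        System.out.println("original length: " + original.length);
        System.out.println("restored length: " + restored.length);
        System.out.println("identical: " + Arrays.equals(original, restored));
    }
}
```

Asking Camel for the body as an `InputStream`, as in this answer, avoids the text decoding step, so the bytes should reach POI unchanged.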
