Is it possible to use the TessAPI1.TessPDFRendererCreate API of tess4J without creating physical files?
Question
I am using the Tesseract Java API (tess4J) to convert TIFF images to PDFs.
This works nicely, but I am forced to write both the source TIFF image and the output PDF to the local file store as physical files in order to use the TessAPI1.TessPDFRendererCreate API.
Please note the following in the code snippet below:
- The input TIFF is originally a java.awt.image.BufferedImage, but I have to write it to a physical file (sourceTiffFile is a File object).
- I must specify a file path for the output (pdfFullFilepath is a String representing an absolute path for the new PDF file).
try {
    ImageIO.write(bufferedImage, "tiff", sourceTiffFile);
} catch (Exception ioe) {
    //handling code...
}
TessResultRenderer renderer = TessAPI1.TessPDFRendererCreate(pdfFullFilepath, dataPath, 0);
TessAPI1.TessResultRendererInsert(renderer, TessAPI1.TessPDFRendererCreate(pdfFullFilepath, dataPath, 0));
int result = TessAPI1.TessBaseAPIProcessPages(handle, sourceTiffFile.getAbsolutePath(), null, 0, renderer);
I would really like to avoid creating physical files, but I am not sure whether that is possible with this API. Ideally, I would like to pass the TIFF as a java.awt.image.BufferedImage or a byte array and receive the output PDF as a byte array.
Any suggestions would be most welcome, as always. Thank you.
Answer 1
Score: 1
You can pass a Pix, which can be converted from a BufferedImage, to the ProcessPage API method, so the input does not need to be a file. The output, however, will still be a physical file; the Tesseract API dictates that.
https://tesseract-ocr.github.io/tessapi/4.0.0/a01625.html
http://tess4j.sourceforge.net/docs/docs-4.4/net/sourceforge/tess4j/TessAPI1.html
For example:
int result = TessAPI1.TessBaseAPIProcessPage(handle, LeptUtils.convertImageToPix(bufferedImage), page_index, "input file name", null, 0, renderer);
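Since the renderer insists on writing a file, one possible workaround is to point it at a temporary directory, read the PDF back into a byte array, and delete everything afterwards. The sketch below is only an illustration under stated assumptions: TempPdfBytes, PdfWriter and render are hypothetical names, not part of tess4J, and it assumes the PDF renderer appends ".pdf" to the output base it is given. The tess4J calls are mentioned only in comments, and the main method uses a stand-in writer so that the example runs without Tesseract installed.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of tess4J): hides the temporary file
// behind a method that returns the PDF as a byte array.
final class TempPdfBytes {

    // Hypothetical callback. An implementation is expected to produce
    // "<outputBase>.pdf" on disk, for example by creating the renderer with
    // TessAPI1.TessPDFRendererCreate(outputBase, dataPath, 0), running
    // TessBaseAPIProcessPage on a Pix, and releasing the renderer so the
    // file is closed before it is read back.
    @FunctionalInterface
    interface PdfWriter {
        void writeTo(String outputBase) throws Exception;
    }

    static byte[] render(PdfWriter writer) throws Exception {
        Path dir = Files.createTempDirectory("ocr-pdf-");
        try {
            Path base = dir.resolve("out");
            writer.writeTo(base.toString());
            // Assumption: the renderer appended ".pdf" to the base name.
            return Files.readAllBytes(dir.resolve("out.pdf"));
        } finally {
            // Remove whatever was written, then the directory itself.
            try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(dir)) {
                for (Path p : leftovers) {
                    Files.delete(p);
                }
            }
            Files.delete(dir);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in writer that fakes the renderer's output file.
        byte[] pdf = render(base ->
                Files.write(Paths.get(base + ".pdf"),
                        "%PDF-stand-in".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(pdf.length + " bytes read back");
    }
}
```

This does not avoid the file, it only keeps it out of the calling code. Note also that, as far as I can tell, ProcessPage (unlike ProcessPages) does not open and close the document itself, so a real PdfWriter would probably need to wrap the call in TessResultRendererBeginDocument and TessResultRendererEndDocument.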