Is it possible to use the TessAPI1.TessPDFRendererCreate API of tess4J without creating physical files?

Question

I am using the Tesseract Java API (tess4J) to convert Tiff images to PDFs.

This works nicely, but I am forced to write both the source Tiff image and the output PDF to local filestore as actual physical files in order to use the TessAPI1.TessPDFRendererCreate API.

Please note the following in the code snippet below:

  1. The input Tiff is originally a java.awt.image.BufferedImage, but I have to write it to a physical file (sourceTiffFile is a File object).

  2. I must specify a file path for the output (pdfFullFilepath is a String representing an absolute path for the new PDF file).

    try {
        ImageIO.write(bufferedImage, "tiff", sourceTiffFile);
    } catch (Exception ioe) {
        // handling code...
    }

    TessResultRenderer renderer = TessAPI1.TessPDFRendererCreate(pdfFullFilepath, dataPath, 0);
    TessAPI1.TessResultRendererInsert(renderer, TessAPI1.TessPDFRendererCreate(pdfFullFilepath, dataPath, 0));
    int result = TessAPI1.TessBaseAPIProcessPages(handle, sourceTiffFile.getAbsolutePath(), null, 0, renderer);

I would really like to avoid creating physical files, but am not sure if it is possible with this API. Ideally, I would like to pass the Tiff as a java.awt.image.BufferedImage or a byte array and receive the output PDF as a byte array.

Any suggestions would be most welcome, as always. Thank you.

Answer 1

Score: 1

You can pass a Pix, which can be converted from a BufferedImage, to the TessBaseAPIProcessPage API method, so the input does not have to be a file. The output, however, will still be a physical file; the Tesseract API dictates that.

https://tesseract-ocr.github.io/tessapi/4.0.0/a01625.html

http://tess4j.sourceforge.net/docs/docs-4.4/net/sourceforge/tess4j/TessAPI1.html

For example:

    int result = TessAPI1.TessBaseAPIProcessPage(handle, LeptUtils.convertImageToPix(bufferedImage), page_index, "input file name", null, 0, renderer);
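
Since the output has to land on disk, one possible workaround is to hide the file behind a small helper: let the renderer write into a throwaway temp directory, read the bytes back, and delete everything before returning. The sketch below uses only the JDK; `TempPdfCapture`, `capture`, and the `result` base name are hypothetical names introduced for illustration. It assumes the renderer treats its first argument as an output base and appends `.pdf` to it, which is worth verifying against your Tesseract version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Tess4J): captures a file-based renderer's PDF as bytes. */
final class TempPdfCapture {

    private TempPdfCapture() {}

    /**
     * Runs the callback with an output base path inside a fresh temp directory,
     * then returns the contents of "<outputBase>.pdf" and removes the directory.
     * Assumption: the callback finishes writing and closes the file before it returns.
     */
    static byte[] capture(Consumer<String> renderToOutputBase) throws IOException {
        Path dir = Files.createTempDirectory("ocr-pdf-");
        try {
            Path outputBase = dir.resolve("result");
            renderToOutputBase.accept(outputBase.toString());
            return Files.readAllBytes(dir.resolve("result.pdf"));
        } finally {
            // Delete whatever the callback left behind, then the directory itself.
            try (Stream<Path> files = Files.list(dir)) {
                for (Path p : (Iterable<Path>) files::iterator) {
                    Files.delete(p);
                }
            }
            Files.delete(dir);
        }
    }
}
```

With Tess4J, the callback would presumably create the renderer with `TessAPI1.TessPDFRendererCreate(outputBase, dataPath, 0)`, run `TessBaseAPIProcessPage` on the Pix as in the line above, and release the renderer before returning so that the PDF is fully written. A file is still created briefly, so this only keeps the file handling out of the calling code; it does not make the conversion truly in-memory.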

  • Posted on July 29, 2020, 22:34:28