How to use tessedit_write_images with pytesseract?

Question

I'm using pytesseract 0.3.10 with tesseract 5.3.0.
I want to take a look at how tesseract processed my images.
I tried setting tessedit_write_images to true via:

import pytesseract as pt
pt.image_to_string(crop_img, lang='eng+deu+fra+spa', config="--psm 6 -c tessedit_write_images=1")

But this is not working. The tessinput.tif file is nowhere to be found.
(The --psm 6 part is working.)

I also tried to use tessedit_write_images=True or tessedit_write_images=T.
Using pt.run_and_get_output() is also not working.

Is there a way to set the variable tessedit_write_images to true outside my Python script?

Answer 1

Score: 1

Create a text file named "config" containing this line:

tessedit_write_images true

Then run from the command line:

tesseract Text.png out.txt config

This gives you a text file and a .tiff file. If you rename config to config.txt, the same call also works from Python via subprocess:

import subprocess

# Same arguments as the command line above: image, output base, config file.
process = subprocess.run(["tesseract", "Text.png", "out.txt", "config.txt"], shell=False, stdout=subprocess.PIPE)

PS: I used tesseract v5.1.0.20220510 with leptonica-1.78.0.
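
For anyone who wants the whole thing in one script, here is a minimal sketch of the steps above: it writes the config file, then calls the tesseract binary through subprocess, bypassing pytesseract just as the answer does. The helper names (write_config, build_command, run_tesseract) are made up for illustration, and it uses out instead of out.txt as the output base, since tesseract appends .txt itself. It assumes tesseract is on your PATH. Which image file shows up, and under what name, may depend on your tesseract version, so the function simply returns whatever files are in the working directory afterwards.

```python
import subprocess
from pathlib import Path
from typing import List


def write_config(directory: Path, name: str = "config.txt") -> Path:
    """Write a config file that switches tessedit_write_images on."""
    config_path = directory / name
    config_path.write_text("tessedit_write_images true\n", encoding="utf-8")
    return config_path


def build_command(image: str, output_base: str, config_path: Path) -> List[str]:
    """Build the argument list: tesseract IMAGE OUTPUTBASE CONFIGFILE."""
    return ["tesseract", image, output_base, str(config_path)]


def run_tesseract(image: str, workdir: Path) -> List[Path]:
    """Run tesseract inside workdir and list the files found there afterwards."""
    workdir = workdir.resolve()
    config_path = write_config(workdir)
    # Resolve the image path first, because the process runs with cwd=workdir.
    cmd = build_command(str(Path(image).resolve()), "out", config_path)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(cmd, cwd=workdir, check=True, stdout=subprocess.PIPE)
    return sorted(workdir.iterdir())
```

Calling run_tesseract("Text.png", Path.cwd()) returns the contents of the current directory after the run, which should include out.txt next to the image tesseract wrote. Keeping the output in a directory you control makes it easy to see every file tesseract leaves behind.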
