How to detect digits in an image using Tesseract 5?
Question
I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to detect the numbers in an image as shown below, but Tesseract returned the wrong text. How can I get the correct result?
My environment:
- Windows 11 22H2
- WSL2 Ubuntu 22.04.1 LTS
- tesseract 5.3.1-20-g58b7
I ran Tesseract like this:
tesseract hoge.jpg output -l eng
and output.txt contains:
Fb¥;
&/0
Here is hoge.jpg: [image]
Thank you in advance for your help. I'm a Japanese student, so my English may not be very good. If anything is unclear, please edit this post to make it more readable.
Answer 1
Score: 1
A bad picture will never give good results. I played around a bit and ended up with this:
import subprocess
import cv2
import pytesseract
# Image manipulation
# Commands: https://imagemagick.org/script/convert.php
mag_img = r'D:\Programme\ImageMagic\magick.exe'
con_bw = r'D:\Programme\ImageMagic\convert.exe'
in_file = r'ZZ_Numbers.jpg'
out_file = r'ZZ_Numbers_bw.png'
# Play with the black-and-white threshold and the rotation for better results
subprocess.run([con_bw, in_file, "-resize", "70%", "-threshold", "60%", "-rotate", "-17", "-brightness-contrast", "-15x30", out_file], check=True)
# Text processing
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread(out_file)
# Parameters: see the Tesseract documentation (config variables are set with -c)
custom_config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789'
tex = pytesseract.image_to_string(img, config=custom_config)
print(tex)
# Save the recognized text
with open("cartootn.txt", 'w') as f:
    f.write(tex)
# Show the preprocessed image until a key is pressed
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
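One detail in the config string is easy to get wrong: Tesseract only treats `tessedit_char_whitelist=...` as a config variable when it is preceded by `-c`. Below is a minimal sketch of a helper that assembles the string. The function name `build_tesseract_config` and its defaults are my own, chosen for illustration, and are not part of pytesseract.

```python
def build_tesseract_config(psm: int = 11, oem: int = 3,
                           whitelist: str = "0123456789") -> str:
    """Assemble a config string for pytesseract or the tesseract CLI.

    Hypothetical helper: the name and defaults are assumptions, chosen to
    mirror the values used in the answer above.
    """
    parts = ["--psm", str(psm), "--oem", str(oem)]
    if whitelist:
        # Config variables are passed as `-c NAME=VALUE`
        parts += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return " ".join(parts)
```

The result could then be passed along as `pytesseract.image_to_string(img, config=build_tesseract_config())`. A whitelist only restricts which characters Tesseract may output; it does not make a noisy or slanted image any easier to read.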
Answer 2
Score: 0
I tried this image with some basic image manipulation in Python and pytesseract, with mixed results. There seem to be two challenges in this image: the noisy background and the slant of the numbers. Thresholding the pixels to black or white almost got the bottom number right, giving "6/0", but the slanted "1" keeps being recognized as a "/". The top number is read as "SEF", and I haven't figured out how to get a better result there.
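import numpy as np
from PIL import Image
import pytesseract as tess
img = Image.open('zgKoF.jpg')
img_arr = np.array(img)
# Push bright pixels to white and dark pixels to black;
# values from 100 to 150 are left unchanged
img_arr[img_arr > 150] = 255
img_arr[img_arr < 100] = 0
print(tess.image_to_string(img_arr))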
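If the image is known to contain only digits, one possible workaround for the "/" misread is to clean up the OCR text afterwards. The sketch below is a heuristic of my own, not part of pytesseract: the `LOOKALIKES` table and the `digits_only` function are assumptions and should be adapted to the characters that actually show up in your output.

```python
import re
from typing import List

# Assumed look-alike substitutions (hypothetical table, extend or trim as needed)
LOOKALIKES = str.maketrans({"/": "1", "|": "1", "l": "1", "I": "1",
                            "O": "0", "o": "0"})


def digits_only(ocr_text: str) -> List[str]:
    """Map look-alike characters to digits, then keep only digits on each line."""
    lines = []
    for line in ocr_text.splitlines():
        cleaned = re.sub(r"\D", "", line.translate(LOOKALIKES))
        if cleaned:  # drop lines that contain no digits at all
            lines.append(cleaned)
    return lines
```

With this table, a reading such as "6/0" becomes "610". It cannot rescue a line that was misread entirely, such as "SEF"; that still needs better preprocessing (deskewing and noise removal) before OCR.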