2023-05-17 22:10:25

How to detect digits in an image using Tesseract 5?

Question

I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to recognize the digits in an image as follows, but Tesseract returned the wrong text. How can I get the correct result?

My environment:

Windows 11 22H2
WSL2 Ubuntu 22.04.1LTS
tesseract 5.3.1-20-g58b7

I ran Tesseract like this:

tesseract hoge.jpg output -l eng

and output.txt contains:

Fb¥
&/0

Here is hoge.jpg.

Thank you in advance for your help. I'm a Japanese student, so my English may not be very good. If anything is unclear, please edit this post to make it more readable.

Answer 1

Score: 1

You will never get good results from a bad picture. I experimented a bit and arrived at this:

import subprocess
import cv2
import pytesseract

# Image manipulation with ImageMagick
# Command reference: https://imagemagick.org/script/convert.php
mag_img = r'D:\Programme\ImageMagic\magick.exe'
con_bw = r'D:\Programme\ImageMagic\convert.exe'

in_file = r'ZZ_Numbers.jpg'
out_file = r'ZZ_Numbers_bw.png'

# Convert to black and white and rotate for better results
process = subprocess.run([con_bw, in_file, "-resize", "70%", "-threshold", "60%", "-rotate", "-17", "-brightness-contrast", "-15x30", out_file])

# Text processing
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread(out_file)

# For the parameters, see the Tesseract documentation.
# Config variables such as the whitelist must be passed with -c.
custom_config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789'

tex = pytesseract.image_to_string(img, config=custom_config)
print(tex)

with open("cartootn.txt", 'w') as f:
    f.write(tex)

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
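
Since the question runs Tesseract from the command line on WSL, the same recognition options can also be passed to the `tesseract` binary directly. The following is a minimal sketch, not a tested fix for the original image: `build_tesseract_command` is a hypothetical helper name, the option values are the ones used in this answer, and it assumes `tesseract` is on the PATH. On the command line, a config variable such as `tessedit_char_whitelist` is set with `-c`.

```python
import shlex


def build_tesseract_command(image_path, output_base, psm=11, oem=3,
                            whitelist="0123456789", lang="eng"):
    """Return the argument list for a digits-only Tesseract run.

    Hypothetical helper for illustration; the defaults mirror the
    values used in the answer and may need tuning for other images.
    """
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "--psm", str(psm),   # page segmentation mode
        "--oem", str(oem),   # OCR engine mode
        # Config variables are introduced with -c on the command line.
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


cmd = build_tesseract_command("hoge.jpg", "output")
# Show the equivalent shell command.
print(shlex.join(cmd))

# To actually run it (requires the tesseract binary):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the arguments as a list avoids shell quoting problems when the command is later handed to `subprocess.run`.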

Answer 2

Score: 0

I've attempted this image using basic image manipulation in Python with pytesseract, with mixed results. There seem to be two challenges in this image: the noisy background and the slant of the numbers. Using thresholding to set pixels to either black or white, I was almost able to get the bottom number, reading it as "6/0", but the slanted "1" keeps getting recognized as a "/". The top number gets read as "SEF", and I haven't figured out how to get a better result there.

import numpy as np
from PIL import Image
import pytesseract as tess

img = Image.open('zgKoF.jpg')

img_arr = np.array(img)

# Push light pixels to white and dark pixels to black
img_arr[img_arr > 150] = 255
img_arr[img_arr < 100] = 0

print(tess.image_to_string(img_arr))
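
One possible way to attack the slant problem described above is to shear the image so that leaning strokes become upright before OCR. The sketch below is an illustration only and has not been tried on the original image: `deslant` is a hypothetical helper, it assumes a 2-D grayscale array with dark text on a white background, and the `shear` value is an assumption that would have to be tuned by hand.

```python
import numpy as np


def deslant(img, shear):
    """Shift each row horizontally to undo a constant slant.

    img   : 2-D array, dark text on a white (255) background.
    shear : horizontal shift in pixels per row. Positive values move
            upper rows to the left, which straightens strokes that
            lean to the right like "/".
    """
    h, w = img.shape
    # The bottom row stays in place; rows further up are shifted more.
    shifts = [int(round(shear * (h - 1 - y))) for y in range(h)]
    pad = max(abs(s) for s in shifts)
    # Widen the canvas so shifted rows are never cropped.
    out = np.full((h, w + 2 * pad), 255, dtype=img.dtype)
    for y, s in enumerate(shifts):
        out[y, pad - s : pad - s + w] = img[y]
    return out


# Synthetic 5x5 example: a one-pixel stroke leaning like "/".
demo = np.full((5, 5), 255, dtype=np.uint8)
for y in range(5):
    demo[y, 4 - y] = 0

straight = deslant(demo, shear=1.0)
# Columns that still contain dark pixels after the shear.
print(np.unique(np.where(straight == 0)[1]))
```

The returned array can be passed to `tess.image_to_string` in the same way as `img_arr`. Row shifting was chosen here because it needs only NumPy; an affine warp from an image library would do the same job with interpolation.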
