Binarize an image and extract text where the background is black and the text to extract is red

Question

I'm trying to extract text from a screenshot, in memory, using cv2 and pytesseract.

When the text is white on a black background it works, but when the text is red the result is always empty.

import sys
from PIL import ImageGrab
import cv2
import numpy as np
from pytesseract import pytesseract

pytesseract.tesseract_cmd = r'C:\site-packages\Tesseract-OCR\tesseract.exe'

def Extract(tupleCoordenates):
    # Grab the screen region as a PIL image and convert it to a NumPy array
    pic = ImageGrab.grab(bbox=tupleCoordenates)
    img = np.array(pic)
    # Convert to grayscale, then apply an inverted Otsu threshold
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    limiar, imgThreash = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Run OCR on the binarized image
    text = pytesseract.image_to_string(imgThreash)
    return text

PS: the code below works when I save the picture to disk and load it with cv2.imread("myimage.jpeg", 0):

myImg = cv2.imread("myimage.jpeg", 0)
limiar, imgThreash = cv2.threshold(myImg, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imwrite(args[2], imgThreash)

I believe the problem is in the conversion from pic -> img -> grayscale.
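One way to probe that suspicion: an array built with np.array(ImageGrab.grab(...)) typically holds its channels in R, G, B order, while cv2.COLOR_BGR2GRAY assumes B, G, R (cv2.COLOR_RGB2GRAY is the flag that matches RGB data). The NumPy-only sketch below mimics the standard luma formula to show how much darker pure red becomes when the channels are read in the wrong order. The helper names gray_from_rgb and gray_as_if_bgr are hypothetical, and this is only an illustration of the channel-order effect, not a confirmed diagnosis of the failure above.

```python
import numpy as np

# ITU-R BT.601 luma weights (Y = 0.299 R + 0.587 G + 0.114 B)
W_R, W_G, W_B = 0.299, 0.587, 0.114

def gray_from_rgb(img):
    """Grayscale of an H x W x 3 uint8 array whose channels are in R, G, B order."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return np.rint(W_R * r + W_G * g + W_B * b).astype(np.uint8)

def gray_as_if_bgr(img):
    """Same formula, but reading the channels as B, G, R (the mix-up under test)."""
    return gray_from_rgb(img[..., ::-1])

# Synthetic stand-in for a screenshot: black background with one pure-red pixel
red_on_black = np.zeros((2, 2, 3), dtype=np.uint8)
red_on_black[0, 0] = (255, 0, 0)

correct = gray_from_rgb(red_on_black)   # red weighted by 0.299
swapped = gray_as_if_bgr(red_on_black)  # red weighted by 0.114

print(correct[0, 0], swapped[0, 0])
```

With the channels swapped, red text ends up much closer to the black background in the grayscale image, which leaves the thresholding step less contrast to work with.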

Answer 1

Score: 0

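I finally found a solution to my problem.

Here is the code:

import cv2
import numpy as np
from PIL import Image, ImageGrab

def ThresholdFromScreenShot(tupleCoordenates):
    # Grab the screen region and convert it to a NumPy array
    pixels = np.array(ImageGrab.grab(bbox=tupleCoordenates))

    # Let Pillow do the grayscale conversion instead of cv2.cvtColor
    gray_f = np.array(Image.fromarray(pixels).convert('L'))

    # First pass: inverted Otsu threshold (with THRESH_OTSU the 127 is ignored)
    limiar, imgThreash = cv2.threshold(gray_f, 127, 255,
                                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # The image is already single-channel, so this conversion leaves it unchanged
    gray_s = np.array(Image.fromarray(imgThreash).convert('L'))

    # Smooth with a 3x3 box blur, then binarize again with a high cut-off
    blur = cv2.blur(gray_s, (3, 3))
    limiar, thresh = cv2.threshold(blur, 240, 255, cv2.THRESH_BINARY)

    return thresh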
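As an alternative that is not part of the answer above, red text on a black background can also be isolated directly from the color channels, skipping the grayscale step. The sketch below is NumPy-only and assumes an array in R, G, B order; the function name binarize_red_text and the cut-offs min_red and max_other are illustrative assumptions that would need tuning on real screenshots.

```python
import numpy as np

def binarize_red_text(rgb, min_red=128, max_other=100):
    """Return a uint8 image with black text (0) on a white background (255).

    rgb: H x W x 3 (or 4) uint8 array in R, G, B order.
    min_red, max_other: illustrative cut-offs, not values from the original post.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # A pixel counts as text when it is strongly red and weak in green and blue
    is_text = (r >= min_red) & (g <= max_other) & (b <= max_other)
    return np.where(is_text, 0, 255).astype(np.uint8)

# Synthetic stand-in for a screenshot: black background,
# one red "text" pixel and one dim gray pixel that should be ignored
sample = np.zeros((2, 3, 3), dtype=np.uint8)
sample[0, 1] = (255, 0, 0)
sample[1, 2] = (40, 40, 40)

binary = binarize_red_text(sample)
print(binary)
```

The result has the same shape and dtype as the thresholded images above, so it could be passed to pytesseract.image_to_string in the same way.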
Posted by huangapple on August 4, 2023, 02:56:20
When reposting, please keep the link to this article: https://go.coder-hub.com/76830929.html