How do I binarize an image and extract text when the background is black and the text to extract is red?
Question
I'm trying to extract text from a screenshot in memory using cv2 and pytesseract.
When the text is white on a black background it works, but when the text is red the result is always empty.
import sys
from PIL import ImageGrab
import cv2
import numpy as np
from pytesseract import pytesseract
pytesseract.tesseract_cmd = r'C:\site-packages\Tesseract-OCR\tesseract.exe'
def Extract(tupleCoordenates):
    pic = ImageGrab.grab(bbox=tupleCoordenates)
    img = np.array(pic)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    limiar, imgThreash = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(imgThreash)
    return text
PS: The code below works when I save the picture to disk and read it back with cv2.imread("myimage.jpeg", 0):
myImg = cv2.imread("myimage.jpeg", 0)
limiar, imgThreash = cv2.threshold(myImg, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imwrite(args[2], imgThreash)
I believe the problem is in the conversion from pic -> img -> grayscale.
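One thing that may be worth checking in that conversion is channel order: PIL's ImageGrab typically returns pixels in RGB order, while cv2.COLOR_BGR2GRAY assumes BGR. The sketch below is only an illustration in plain Python, using the standard 0.299/0.587/0.114 luma weights; the helper names are hypothetical and it does not reproduce OpenCV's exact rounding. It shows how much darker a pure red pixel becomes when its channels are read in the wrong order.

```python
# Standard luma weights for converting a color pixel to gray
W_R, W_G, W_B = 0.299, 0.587, 0.114

def gray_from_rgb(pixel):
    """Gray level when the (r, g, b) tuple is interpreted correctly."""
    r, g, b = pixel
    return round(W_R * r + W_G * g + W_B * b)

def gray_if_misread_as_bgr(pixel):
    """Gray level when an RGB tuple is treated as (b, g, r) by mistake."""
    b, g, r = pixel  # first channel is wrongly taken as blue
    return round(W_R * r + W_G * g + W_B * b)

red = (255, 0, 0)
white = (255, 255, 255)

print(gray_from_rgb(red), gray_if_misread_as_bgr(red))      # 76 29
print(gray_from_rgb(white), gray_if_misread_as_bgr(white))  # 255 255
```

White text is unaffected by the swap, while red text ends up much closer to the black background, which would be consistent with the white case working and the red case being harder to separate.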
Answer 1
Score: 0
I finally found a solution to my problem.
Here is the code:
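import cv2
import numpy as np
from PIL import Image, ImageGrab

def ThresholdFromScreenShot(tupleCoordenates):
    # Grab the screen region and let PIL do the grayscale conversion
    pixels = np.array(ImageGrab.grab(bbox=tupleCoordenates))
    gray_f = np.array(Image.fromarray(pixels).convert('L'))
    # Otsu picks the threshold; INV turns light text on a dark background into dark text on white
    limiar, imgThreash = cv2.threshold(gray_f, 127, 255,
                                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    gray_s = np.array(Image.fromarray(imgThreash).convert('L'))
    # After a 3x3 blur, pixels next to a stroke fall below 240, so strokes thicken slightly
    blur = cv2.blur(gray_s, (3, 3))
    limiar, thresh = cv2.threshold(blur, 240, 255, cv2.THRESH_BINARY)
    return thresh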
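As a possible alternative, and not the method used in the answer above, colored text on a black background can be binarized from the per-pixel maximum over the color channels, so that a saturated red pixel counts as just as bright as a white one. This is a minimal sketch: the function name binarize_bright_text and the default threshold of 127 are assumptions chosen for illustration.

```python
import numpy as np

def binarize_bright_text(rgb, threshold=127):
    """Return a uint8 image with dark text (0) on a white background (255).

    `rgb` is an H x W x 3 (or 4) uint8 array. Channel order does not matter,
    because only the per-pixel maximum over the color channels is used.
    """
    value = rgb[..., :3].max(axis=2)  # brightness = max(R, G, B)
    return np.where(value > threshold, 0, 255).astype(np.uint8)

# Tiny synthetic example: one red and one white pixel on a black background
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (255, 0, 0)
img[1, 2] = (255, 255, 255)
mask = binarize_bright_text(img)
print(mask.tolist())  # [[255, 0, 255], [255, 255, 0]]
```

The resulting array could then be passed to pytesseract.image_to_string in the same way as imgThreash in the question.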