Get rectangle bounds for each letter in an image

Question

So I'm trying to fill an ArrayList<Rectangle> with the bounds of each letter of an image file.

For example, given this .png image:

[Image: the example .png]

I want to fill an ArrayList<Rectangle> with 14 Rectangle objects (one rectangle for each letter).

We can assume that the image will contain only 2 colors, one for the background and one for the letters. In this case, pixels will be either white or red.

At first, I thought I could search for white columns in between the letters. Whenever I found a completely white column, I could take the red pixels seen since the previous white column and compute their bounds, for example the width from the x-coordinates of the leftmost and rightmost red pixels (width = maxX - minX), and so on:

x = minX;
y = minY;
w = maxX - minX;
h = maxY - minY;

letterBounds.add(new Rectangle(x, y, w, h));
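The bounds computation for one group of columns can be sketched as follows. This is only a minimal sketch, assuming the image has already been converted to a boolean grid where `ink[y][x]` is true for a letter (red) pixel; the class name `InkBounds` and method `boundsOf` are hypothetical. Note that when `minX` and `maxX` are inclusive pixel coordinates, the width is `maxX - minX + 1`, which differs by one pixel from the snippet above.

```java
import java.awt.Rectangle;

final class InkBounds {
    /**
     * Bounding box of all ink pixels in columns fromX..toX (inclusive).
     * Returns null if that column range contains no ink at all.
     */
    static Rectangle boundsOf(boolean[][] ink, int fromX, int toX) {
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = -1, maxY = -1;
        for (int y = 0; y < ink.length; y++) {
            for (int x = fromX; x <= toX; x++) {
                if (ink[y][x]) {
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) {
            return null; // no ink pixel was found
        }
        // Inclusive coordinates, so width and height need the +1.
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```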

The problem is that there's no space in between the letters, not even 1 pixel:

[Image: close-up of neighboring letters with no white column between them]

My next idea was that for each red pixel I find, I could look for a neighboring pixel that hasn't been seen yet. Then, if I can't find a neighbor, I would have all the pixels to define the bounds of that letter. However, with this approach, I might get 2 rectangles for letters like "i". I could then write an algorithm to merge those rectangles, but I'm uncertain about how well that would work with other multipart letters. Before I attempt that, I wanted to ask for more ideas here.
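The neighbor-search idea is essentially connected-component labeling, and it could look roughly like the sketch below. It assumes the same kind of boolean grid (`ink[y][x]` is true for a red pixel) and uses hypothetical names throughout. The merge rule is also an assumption, not an established method: two boxes are merged when their horizontal overlap covers at least half of the narrower box, which joins the dot and stem of an "i" while leaving neighbors that overlap only slightly untouched.

```java
import java.awt.Rectangle;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

final class ComponentBounds {
    /** One bounding rectangle per 8-connected group of ink pixels. */
    static List<Rectangle> components(boolean[][] ink) {
        int h = ink.length, w = h == 0 ? 0 : ink[0].length;
        boolean[][] seen = new boolean[h][w];
        List<Rectangle> out = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!ink[y][x] || seen[y][x]) continue;
                // Flood fill from (x, y), growing the box with every visited pixel.
                Rectangle box = new Rectangle(x, y, 1, 1);
                Deque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {x, y});
                seen[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    box.add(new Rectangle(p[0], p[1], 1, 1));
                    for (int dy = -1; dy <= 1; dy++) {
                        for (int dx = -1; dx <= 1; dx++) {
                            int nx = p[0] + dx, ny = p[1] + dy;
                            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                            if (ink[ny][nx] && !seen[ny][nx]) {
                                seen[ny][nx] = true;
                                queue.add(new int[] {nx, ny});
                            }
                        }
                    }
                }
                out.add(box);
            }
        }
        return out;
    }

    /** Merges boxes that are mostly stacked on top of each other (assumed heuristic). */
    static List<Rectangle> mergeStacked(List<Rectangle> boxes) {
        List<Rectangle> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingInt((Rectangle r) -> r.x));
        List<Rectangle> merged = new ArrayList<>();
        for (Rectangle r : sorted) {
            if (!merged.isEmpty()) {
                Rectangle last = merged.get(merged.size() - 1);
                int overlap = Math.min(last.x + last.width, r.x + r.width)
                        - Math.max(last.x, r.x);
                // Merge when the overlap covers at least half of the narrower box.
                if (2 * overlap >= Math.min(last.width, r.width)) {
                    merged.set(merged.size() - 1, last.union(r));
                    continue;
                }
            }
            merged.add(new Rectangle(r));
        }
        return merged;
    }
}
```

Letters whose pixels actually touch would still come out as a single component, so this sketch only helps when neighboring letters are separated by at least a diagonal gap.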

Do you have any suggestions?

Answer 1

Score: 1

You can use the OpenCV cv2.findContours() function. Instead of drawing the contours with cv2.drawContours(), which highlights the outline of each letter, you can draw a rectangle on the image with cv2.rectangle(), using coordinates extracted from the contours that cv2.findContours() returns (for example with cv2.boundingRect()).

Answer 2

Score: 1

I think a two-step algorithm is enough to solve the problem without using a library such as OpenCV.

  1. histogram
  2. seam calculation

1. histogram

C.....C..C...
.C.C.C...C...
..C.C....CCCC
1111111003111
  • A dot (.) is the background color (white).
  • C is any color other than the background color (in your case, red).

Counting the non-background pixels in each column gives the histogram (the bottom row above).

         *
         *
*******..****
0123456789ABC

The histogram is empty at columns 7 and 8, so a boundary lies there.
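The histogram step might be sketched like this. It is a minimal illustration, assuming a boolean grid where `ink[y][x]` is true for a non-background pixel; `ColumnHistogram`, `columnCounts`, and `segments` are hypothetical names. Each run of non-empty columns becomes one candidate letter (or group of letters).

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnHistogram {
    /** counts[x] = number of non-background pixels in column x. */
    static int[] columnCounts(boolean[][] ink) {
        int w = ink.length == 0 ? 0 : ink[0].length;
        int[] counts = new int[w];
        for (boolean[] row : ink) {
            for (int x = 0; x < w; x++) {
                if (row[x]) counts[x]++;
            }
        }
        return counts;
    }

    /** Returns {startX, endX} (both inclusive) for each run of non-empty columns. */
    static List<int[]> segments(int[] counts) {
        List<int[]> runs = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a run"
        for (int x = 0; x <= counts.length; x++) {
            boolean filled = x < counts.length && counts[x] > 0;
            if (filled && start < 0) {
                start = x;                         // a run begins here
            } else if (!filled && start >= 0) {
                runs.add(new int[] {start, x - 1}); // the run ended at x - 1
                start = -1;
            }
        }
        return runs;
    }
}
```

Each segment still needs a vertical scan to find its top and bottom rows before it can be turned into a Rectangle.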

2. seam calculation

Some cases, such as "We", cannot be solved by the histogram because there are no empty vertical lines at all.

The seam carving algorithm gives us some hints.

A more detailed implementation is found at

Energy calculation for a pixel

[Image: pixel grid annotated with energy values]

The red numbers are not color values for pixels, but energy values calculated from adjacent pixels.

The vertical paths with minimal energy give us the boundary of each character.
[Image: minimal-energy vertical paths between the characters]
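A minimal sketch of the seam search, using the standard dynamic program from seam carving: the path visits one pixel per row and may shift by at most one column between rows. The answer does not preserve its energy formula, so this sketch simply accepts any precomputed non-negative `energy[y][x]` grid; one simple, assumed choice is 1 for letter pixels and 0 for background. The class and method names are hypothetical.

```java
final class VerticalSeam {
    /** Returns seam[y] = column of the minimal-energy top-to-bottom path in row y. */
    static int[] minSeam(int[][] energy) {
        int h = energy.length, w = energy[0].length;
        // cost[y][x] = cheapest total energy of any path from row 0 ending at (x, y).
        long[][] cost = new long[h][w];
        for (int x = 0; x < w; x++) {
            cost[0][x] = energy[0][x];
        }
        for (int y = 1; y < h; y++) {
            for (int x = 0; x < w; x++) {
                long best = cost[y - 1][x];
                if (x > 0) best = Math.min(best, cost[y - 1][x - 1]);
                if (x < w - 1) best = Math.min(best, cost[y - 1][x + 1]);
                cost[y][x] = energy[y][x] + best;
            }
        }
        // Start from the cheapest cell in the last row and walk back up.
        int[] seam = new int[h];
        int bestX = 0;
        for (int x = 1; x < w; x++) {
            if (cost[h - 1][x] < cost[h - 1][bestX]) bestX = x;
        }
        seam[h - 1] = bestX;
        for (int y = h - 1; y > 0; y--) {
            int x = seam[y];
            int prev = x;
            if (x > 0 && cost[y - 1][x - 1] < cost[y - 1][prev]) prev = x - 1;
            if (x < w - 1 && cost[y - 1][x + 1] < cost[y - 1][prev]) prev = x + 1;
            seam[y - 1] = prev;
        }
        return seam;
    }
}
```

To split one group of touching letters, the energy grid passed in would presumably cover only the columns of that group; otherwise the cheapest seam may simply run through the empty margin.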

3. One more thing...

Statistical data is required to determine whether to apply seam carving or not.

  • Max and min width of characters

Even if the histogram gives us vertical boundaries, it is not clear whether a group contains two or more characters.

huangapple
  • Posted on August 24, 2020 at 12:20:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63554671.html