How to remove convexity defects in a Sudoku square?
Problem description
I was doing a fun project: solving a Sudoku from an input image using OpenCV (as in Google Goggles etc.). I have completed the task, but at the end I found a little problem, for which I came here.
I did the programming using the Python API of OpenCV 2.3.1.
Here is what I did:
- Read the image
- Find the contours
- Select the one with maximum area (and also somewhat equivalent to a square)
- Find the corner points
An example is given below:
(Note that here the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check out the next image.)
- Warp the image to a perfect square
For example, this image:
- Perform OCR (for which I used the method given in Simple Digit Recognition OCR in OpenCV-Python)
And the method worked well.
Problem:
Check out this image.
Performing step 4 on this image gives the result below:
The red line drawn is the original contour, which is the true outline of the Sudoku boundary.
The green line drawn is the approximated contour, which will be the outline of the warped image.
Of course, there is a difference between the green line and the red line at the top edge of the Sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My question:
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line? Or how can I remove the difference between the red line and the green line? Is there any method for this in OpenCV?
Solution
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel by the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component that's got the largest convex area:
components =
  ComponentMeasurements[
    ColorNegate@Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All, 2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use the caliper length to select only the grid-line components. Sorting them by position, I get 2×10 mask images, one for each vertical/horizontal grid line in the image:
verticalGridLineMasks =
  SortBy[ComponentMeasurements[lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 2]],
    #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
  SortBy[ComponentMeasurements[lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 2]],
    #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] :=
  ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
  Table[centerOfGravity[
      ImageData[Dilation[Image[h], DiskMatrix[2]]]*
      ImageData[Dilation[Image[v], DiskMatrix[2]]]],
    {h, horizontalGridLineMasks}, {v, verticalGridLineMasks}];
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
  ImageTransformation[srcAdjusted,
    {fnX @@ Reverse[#], fnY @@ Reverse[#]} &,
    {9*50, 9*50}, PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing functions, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it. Using the perspective transformation you use now on each individual cell will probably give good enough results.