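In the previous post, Halcon Study Notes (8): OCR Basics, Template-Based Recognition and Generating Training Files, we focused on the main example programs for template-based OCR and for building your own training file. This post looks at how to handle characters that are arranged in a circle or printed with a slant.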
Lecture 3: OCR of Characters Arranged in a Circle or Slanted
The ocr_cd_print_polar_trans example: characters arranged in a circle
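As shown in the figure:
How do we handle an image like this, in which the digits are arranged in a circle?
This example shows how to deal with symbols that are not printed along a straight line (here, along a ring): a polar transformation unwraps the ring into a rectangular image in which the characters lie on a horizontal line.
The two parts of this example most worth analyzing are the threshold segmentation and the coordinate transformation.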
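Step 1: threshold segmentation. The idea is to smooth the image with mean_image to obtain ImageMean, then compare the original Image against ImageMean with the local threshold operator dyn_threshold. With the parameters used here, the mean is taken over a 211 x 211 mask, and 'dark' with an offset of 15 selects the pixels that are at least 15 gray values darker than their local mean.
I also tried thresholding directly with the gray histogram tool, but the gray values of the region of interest and the background are very close, so a global threshold gives poor results. When a region of interest blends into the background like this, the approach in this example is worth trying.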
mean_image (Image, ImageMean, 211, 211)
dyn_threshold (Image, ImageMean, RegionDynThresh, 15, 'dark')
This yields the segmented region.
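The local thresholding idea can be sketched in a few lines. The block below is plain Python rather than HDevelop so that it runs without HALCON, and it is only an illustration of the principle: the helper names `mean_filter` and `dyn_threshold_dark` are made up for this sketch, and the border handling (clipping the window) is a simplification that need not match what mean_image does.

```python
def mean_filter(img, half):
    """Box mean over a (2*half+1) x (2*half+1) window, clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - half), min(h, r + half + 1))
                    for cc in range(max(0, c - half), min(w, c + half + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


def dyn_threshold_dark(img, mean, offset):
    """Select pixels that are at least `offset` darker than the local mean."""
    return [[img[r][c] <= mean[r][c] - offset for c in range(len(img[0]))]
            for r in range(len(img))]


# Brightness ramp from left to right with one dark pixel in the middle.
image = [[100 + 10 * c for c in range(5)] for _ in range(5)]
image[2][2] = 60
mask = dyn_threshold_dark(image, mean_filter(image, 1), 15)
```

Because each pixel is compared with its own neighborhood instead of one global value, the brightness ramp does not end up in the mask; only the pixel that is clearly darker than its surroundings is selected.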
Next comes the usual step of splitting the region into its connected components:
connection (RegionDynThresh, ConnectedRegions)
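After that we select a region. select_shape_std selects regions by a standard shape; the possible values are 'max_area', 'rectangle1' and 'rectangle2'. Here we pick the region with the largest area.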
select_shape_std (ConnectedRegions, SelectedRegions, 'max_area', 0)
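The largest region, shown below, is our region of interest.
Next, gen_contour_region_xld generates the border contour of the selected region, and fit_circle_contour_xld fits a circle to that contour, returning the center (Row, Column) and the radius Radius: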
gen_contour_region_xld (SelectedRegions, Contours, 'border')
fit_circle_contour_xld (Contours, 'ahuber', -1, 0, 0, 3, 2, Row, Column, Radius, StartPhi, EndPhi, PointOrder)
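To make the circle fit less of a black box, here is a minimal Python sketch of an algebraic least-squares circle fit (often called the Kåsa fit). This is a swapped-in technique for illustration only: the example itself uses the robust 'ahuber' mode of fit_circle_contour_xld, which is not reproduced here, and the function names `det3` and `fit_circle_kasa` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_circle_kasa(points):
    """Fit row^2 + col^2 + D*row + E*col + F = 0 in the least-squares sense."""
    s = [[0.0] * 3 for _ in range(3)]   # normal matrix A^T A
    t = [0.0] * 3                       # right-hand side A^T b
    for row, col in points:
        v = (row, col, 1.0)
        rhs = -(row * row + col * col)
        for i in range(3):
            t[i] += v[i] * rhs
            for j in range(3):
                s[i][j] += v[i] * v[j]
    d = det3(s)
    sol = []
    for k in range(3):                  # Cramer's rule, one unknown at a time
        m = [r[:] for r in s]
        for i in range(3):
            m[i][k] = t[i]
        sol.append(det3(m) / d)
    big_d, big_e, big_f = sol
    center_row, center_col = -big_d / 2, -big_e / 2
    radius = math.sqrt(center_row ** 2 + center_col ** 2 - big_f)
    return center_row, center_col, radius


# Eight points on a circle with center (10, 20) and radius 5.
pts = [(10 + 5 * math.cos(k * math.pi / 4), 20 + 5 * math.sin(k * math.pi / 4))
       for k in range(8)]
fit_row, fit_col, fit_radius = fit_circle_kasa(pts)
```

A plain least-squares fit like this is sensitive to outliers on the contour, which is presumably why the example picks a robust weighting instead.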
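From this center we generate two concentric circles, an outer one with radius Radius - 5 and an inner one with radius Radius - 30, and take their difference to obtain the ring that contains the characters.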
gen_circle (CircleO, Row, Column, Radius - 5)
gen_circle (CircleI, Row, Column, Radius - 30)
difference (CircleO, CircleI, Ring)
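Step 2: coordinate transformation
The operator polar_trans_image_ext unwraps an annular region of the image into a rectangular image. Its inputs are the center of the polar coordinate system, the start and end angles, the start and end radii, and the size of the output image: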
polar_trans_image_ext (Image, ImagePolar, Row, Column, 0, rad(360), Radius - 30, Radius - 5, WidthP, HeightP, 'bilinear')
This gives the transformed image.
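The unwrapping can be pictured as follows: every output column stands for an angle, every output row for a radius, and each output pixel is looked up at the corresponding position on the ring. The Python sketch below illustrates that mapping under several assumptions that are mine, not taken from the example: angles are counted counterclockwise with the row axis pointing down, the angle range is divided evenly over the columns, and nearest-neighbor lookup is used instead of the 'bilinear' interpolation of the real call. `polar_unwrap` is a hypothetical helper name.

```python
import math


def polar_unwrap(img, center_row, center_col, angle_start, angle_end,
                 radius_start, radius_end, width, height):
    """Unwrap an annulus into a height x width strip (nearest neighbor)."""
    h, w = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for i in range(height):
        # Output row i corresponds to one radius between start and end.
        radius = radius_start + (radius_end - radius_start) * i / max(height - 1, 1)
        for j in range(width):
            # Output column j corresponds to one angle between start and end.
            angle = angle_start + (angle_end - angle_start) * j / width
            r = int(round(center_row - radius * math.sin(angle)))
            c = int(round(center_col + radius * math.cos(angle)))
            if 0 <= r < h and 0 <= c < w:
                out[i][j] = img[r][c]
    return out


# Test image: each pixel stores its rounded distance from the center (20, 20).
size = 41
image = [[round(math.hypot(r - 20, c - 20)) for c in range(size)]
         for r in range(size)]
strip = polar_unwrap(image, 20, 20, 0.0, 2 * math.pi, 5, 15, 36, 11)
```

In the unwrapped strip every row holds (up to rounding) a single distance value, which is the sense in which the ring has been straightened into horizontal lines.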
Rotating the result by 180° gives the image we need:
dev_open_window (0, 0, WidthP, HeightP, 'black', WindowHandle2)
rotate_image (ImagePolar, ImageRotate, 180, 'constant')
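WidthP and HeightP are defined at the top of the program (900 and 20); they are the width and height of the unwrapped region of interest.
The last step is to segment the digits and read them with an OCR classifier.
The first half of the digit segmentation is similar to step 1, and the second half is the usual intersection pattern. The program then loads the pretrained classifier 'Industrial_0-9A-Z_NoRej' with read_ocr_class_mlp and applies it to the sorted regions, which we will not go into here.
Complete program: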
*
* This example demonstrates how to perform OCR
* of symbols printed along a non-linear pattern.
* In particular, this example reads the characters
* printed on a CD
*
dev_update_off ()
dev_close_window ()
WidthP := 900
HeightP := 20
read_image (Image, 'ocr/cd_print')
get_image_size (Image, Width, Height)
dev_open_window (HeightP + 60, 0, Width * 2 / 3, Height * 2 / 3, 'black', WindowHandle)
set_display_font (WindowHandle, 16, 'mono', 'true', 'false')
*
* Show original image
dev_display (Image)
disp_message (WindowHandle, 'Read the number on the outer ring', 'window', 12, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
*
* Segment disc in which the characters have been printed
mean_image (Image, ImageMean, 211, 211)
dyn_threshold (Image, ImageMean, RegionDynThresh, 15, 'dark')
connection (RegionDynThresh, ConnectedRegions)
select_shape_std (ConnectedRegions, SelectedRegions, 'max_area', 0)
gen_contour_region_xld (SelectedRegions, Contours, 'border')
fit_circle_contour_xld (Contours, 'ahuber', -1, 0, 0, 3, 2, Row, Column, Radius, StartPhi, EndPhi, PointOrder)
gen_circle (CircleO, Row, Column, Radius - 5)
gen_circle (CircleI, Row, Column, Radius - 30)
difference (CircleO, CircleI, Ring)
*
dev_set_draw ('margin')
dev_set_color ('green')
dev_set_line_width (3)
dev_display (Ring)
Message := '1. Segment ring'
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
*
* Rectify the region through a polar transformation
* so that the characters now are aligned along an
* horizontal line
polar_trans_image_ext (Image, ImagePolar, Row, Column, 0, rad(360), Radius - 30, Radius - 5, WidthP, HeightP, 'bilinear')
dev_open_window (0, 0, WidthP, HeightP, 'black', WindowHandle2)
rotate_image (ImagePolar, ImageRotate, 180, 'constant')
*
* Segment the characters
mean_image (ImageRotate, ImageMeanRotate, 51, 9)
dyn_threshold (ImageRotate, ImageMeanRotate, RegionDynThreshChar, 5, 'dark')
connection (RegionDynThreshChar, ConnectedRegions1)
select_shape (ConnectedRegions1, SelectedRegions, ['area','width'], 'and', [30,4], [150,10])
sort_region (SelectedRegions, SortedRegions, 'character', 'false', 'column')
* Remove distractors which happen to have similar dimensions to the characters.
* From all the candidate regions pickup those consisting of dark regions
* on light background
threshold (ImageMeanRotate, Region, 90, 255)
intersection (SelectedRegions, Region, RegionIntersection)
* Filter out resulting empty regions
area_center (RegionIntersection, Area, Row1, Column1)
select_mask_obj (RegionIntersection, Characters, Area [>] 0)
*
dev_display (ImageRotate)
Message := [Message,'2. Calculate polar transform']
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
*
* Read out
read_ocr_class_mlp ('Industrial_0-9A-Z_NoRej', OCRHandle)
sort_region (Characters, SortedRegions, 'character', 'true', 'row')
do_ocr_multi_class_mlp (SortedRegions, ImageRotate, OCRHandle, Class, Confidence)
* Correct zeros that are mistaken as capital O's.
* In a more general situation one may use a more
* complex regular expression or else the operator
* do_ocr_word_mlp()
tuple_regexp_replace (sum(Class), 'O', '0', Result)
*
dev_set_colored (6)
dev_set_draw ('fill')
dev_display (RegionIntersection)
Message := [Message,'3. Segment and read text']
disp_message (WindowHandle, Message, 'window', 36, 12, 'black', 'true')
disp_message (WindowHandle, Result, 'image', Height / 2 - 20, Width / 2 - 150, 'black', 'true')
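A note on the zero correction in the program above. As far as I know, tuple_regexp_replace replaces only the first match unless the option 'replace_all' is added to the expression, so a result containing several misread O's would be only partly corrected. The Python sketch below (standard `re` module, with a hypothetical helper `fix_zeros`) shows the difference between the two behaviors; it also assumes the text is purely numeric, otherwise a genuine letter O would be turned into a zero as well.

```python
import re


def fix_zeros(text, replace_all=True):
    """Replace capital O by the digit 0, either everywhere or only once."""
    return re.sub('O', '0', text, count=0 if replace_all else 1)


all_fixed = fix_zeros('1O2O')            # every O is replaced
first_fixed = fix_zeros('1O2O', False)   # only the first O is replaced
```

For text that mixes letters and digits, the comment in the program already points to the safer route: a more specific regular expression or do_ocr_word_mlp.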
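The text_line_slant example: slanted characters
As shown in the figure, how do we run OCR on slanted characters?
In Halcon Study Notes (5): Geometric Localization + Affine Transformation + License Plate Recognition we straightened tilted characters with an affine transformation built from a rotation matrix. The approach here is similar; the difference is that the operator text_line_slant is used to measure the slant angle SlantAngle of the characters in the region. Let's see how it works.
Step 1: affine transformation
First, determine the slant angle. Here 50 is the approximate character height, and the angle is searched between -45° and 45°:
text_line_slant (Image, Image, 50, rad(-45), rad(45), SlantAngle)
Then start from the homogeneous identity matrix (hom_mat2d_identity) and add a slant with hom_mat2d_slant. Note that this is a slant (shear) rather than a rotation: applying a slant of -SlantAngle along the 'x' axis makes the slanted characters upright.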
hom_mat2d_identity (HomMat2DIdentity)
hom_mat2d_slant (HomMat2DIdentity, -SlantAngle, 'x', 0, 0, HomMat2DSlant)
affine_trans_image (Image, ImageRectified, HomMat2DSlant, 'constant', 'true')
This gives the rectified image.
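To see why a slant matrix straightens the strokes, here is a small Python sketch. It assumes the matrix form that, as I recall it, the HALCON reference gives for Axis 'x' with the fixed point at (0, 0), and it assumes that (row, column) play the role of (x, y); both are assumptions of this sketch, and `slant_x` and `transform` are hypothetical helper names.

```python
import math


def slant_x(theta):
    """Assumed slant matrix for axis 'x' with fixed point (0, 0)."""
    return [[math.cos(theta), 0.0, 0.0],
            [math.sin(theta), 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def transform(mat, row, col):
    """Apply a homogeneous 2D matrix to the point (row, col)."""
    return (mat[0][0] * row + mat[0][1] * col + mat[0][2],
            mat[1][0] * row + mat[1][1] * col + mat[1][2])


theta = math.radians(20)
# A slanted stroke: the column offset grows with the row as row * sin(theta).
stroke = [(row, 50 + row * math.sin(theta)) for row in (0, 10, 20)]
corrected = [transform(slant_x(-theta), row, col) for row, col in stroke]
```

Under this matrix every point is shifted horizontally in proportion to its row, so the stroke above ends up at a single column; the rows are also scaled by cos(theta), which a pure rotation would not do.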
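Step 2: segment the characters by thresholding
First separate the characters from the background. The contrast is high here, so a global threshold chosen with the gray histogram tool is enough:
threshold (ImageRectified, Region, 0, 100)
This gives the character region.
The last three characters are stuck together, so we break them apart with an erosion:
erosion_circle (Region, RegionErosion, 3)
After the erosion, a dilation with a rectangle 1 pixel wide and 20 pixels high grows the region vertically only, which restores the vertical extent of the characters without merging neighboring characters horizontally:
dilation_rectangle1 (RegionErosion, RegionDilation, 1, 20)
The usual step of splitting into connected components:
connection (RegionDilation, ConnectedRegions)
The usual step of intersecting with the thresholded region:
intersection (ConnectedRegions, Region, RegionIntersection)
At this point some characters are still broken into several connected components. To deal with that, partition_dynamic splits RegionDilation into pieces of roughly fixed width (Distance 100, with the split positions allowed to shift by up to 20 percent):
partition_dynamic (RegionDilation, Characters, 100, 20)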
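The idea behind this kind of partitioning can be sketched in Python: place evenly spaced nominal split positions, then move each one, within a tolerance, to the column where the region is thinnest. This is a simplified sketch based on my reading of the operator's description, not its actual implementation, and `split_positions` and its column-height input are hypothetical.

```python
import math


def split_positions(col_heights, distance, percent):
    """Pick split columns near evenly spaced positions, preferring thin columns.

    col_heights[c] is the number of region pixels in column c.
    """
    width = len(col_heights)
    n = max(1, int(width / distance + 0.5))   # number of parts
    if n == 1:
        return []
    step = width / n                           # nominal part width
    max_shift = step * percent / 100.0         # allowed shift per split
    cuts = []
    for k in range(1, n):
        nominal = k * step
        lo = max(0, math.ceil(nominal - max_shift))
        hi = min(width - 1, math.floor(nominal + max_shift))
        cuts.append(min(range(lo, hi + 1), key=lambda c: col_heights[c]))
    return cuts


# A 30-column region that is thin at columns 11 and 19.
heights = [9] * 30
heights[11] = 1
heights[19] = 0
cuts = split_positions(heights, 10, 20)
```

With a nominal width of 10 and a tolerance of 20 percent, the splits may move by up to two columns, so they land on the thin columns 11 and 19 instead of the nominal positions 10 and 20.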
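Now every character is a single region.
What remains is routine: sort the regions, read the OCR classifier and run the recognition.
We will not go into detail here.
Complete code: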
dev_update_off ()
read_image (Image, 'dot_print_slanted')
get_image_size (Image, Width, Height)
dev_close_window ()
dev_open_window (0, 0, Width , Height , 'black', WindowHandle)
dev_set_draw ('margin')
dev_set_colored (12)
dev_set_line_width (3)
* Correct slant
text_line_slant (Image, Image, 50, rad(-45), rad(45), SlantAngle)
hom_mat2d_identity (HomMat2DIdentity)
hom_mat2d_slant (HomMat2DIdentity, -SlantAngle, 'x', 0, 0, HomMat2DSlant)
affine_trans_image (Image, ImageRectified, HomMat2DSlant, 'constant', 'true')
threshold (ImageRectified, Region, 0, 100)
erosion_circle (Region, RegionErosion, 3)
dilation_rectangle1 (RegionErosion, RegionDilation, 1, 20)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)
partition_dynamic (RegionDilation, Characters, 100, 20)
sort_region (Characters, SortedRegions, 'character', 'true', 'row')
read_ocr_class_mlp ('Industrial_0-9A-Z_NoRej', OCRHandle)
do_ocr_multi_class_mlp (SortedRegions, ImageRectified, OCRHandle, Class, Confidence)