Академический Документы
Профессиональный Документы
Культура Документы
2007
Abstract
This paper presents a novel connected component
based method for locating text in complex background using
support vector machine (SVM). Our method is composed of
two stages. In the first stage, the cascade of threshold
classifiers and support vector machine are used to identify
characters. In the second stage, the identified characters are
combined into texts, and then text features are extracted and
used to identify text region. Two kinds of features which are
character features and text features are utilized to locate
text region. Character features are used to discriminate
character connected components (CCs) from other objects
in complex background. Text features describe the
characteristics that characters in the same text have same
size, color and font. The cascade of threshold classifiers can
discard most non-character object, and improve the
efficiency of character feature extraction. SVM is used to
identify characters which the cascade of threshold
classifiers can not identify. Experimental results
demonstrate that the proposed approach is robust with
respect to different character sizes, colors and languages,
and achieves high precision which measured on the ICDAR
2003 test database.
Keywords: Text Location; support vector machine; cascade
of classifiers; text features
1.
Introduction
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007
that the CCs are neighbor each other and have approximate
attributes such as stroke width, height. Text features are
extracted to verify candidate text regions.
Input image
Image Decomposition
Character Identification
Image Decomposition
Segment Re sult ( x , y ) = 0
100
I ( x , y )> T+ ( x , y )
I ( x , y ) <T ( x , y )
other
(1)
T ( x , y ) = Mean ( x , y , W B ) offset
Character Identification
1419
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007
is based on the principle of structural risk minimization
which different from the idea of minimizing the training
error. From the implementation point of view, training a
SVM is equivalent to solving a linearly constrained
quadratic programming problem in a number of variables
equal to the number of data points. Given a training set (xi,
yi), where xiRn is input pattern and yi{1,1} is the
corresponding desired label, i = 1,,L. SVM requires the
solution of the following optimization problem:
= a rg m in
Subject to
1
2
i j y i y j K ( x i , y j )
i=1
(2)
i =1
y i = 0, 0 i C
x iS V s
i y i [ K ( x r ,x i ) + K ( x s ,x i ) ]
, xr
x iSV s
StrokeWidth_Size_Ratio =
M axStrokeWidth_Size_Ratio =
(4)
Stroke_Density =
(5)
Stroke Width
M ax(height,width)
StrokeWidthVariance_Ratio =
a re a
(p e rim e te r* p e rim e te r)
Num _ RoughPixels
height + width
a re a
(h e ig h t* w id th )
m a x (w id th , h e ig h t)
m in (w id th , h e ig h t)
i =1
Where b = - 1
(6)
1420
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007
The thresholds of classifiers are crucial to identify
character CCs. In our approach, the cascade of threshold
classifiers is used to remove those distinct non-character
CCs and improve the speed of text locating. So we consider
the maximum and minimum of every feature of character
CCs in the train image database as the low and upper
threshold of corresponding feature. These thresholds which
are relax constraint can ensure that all character CCs can be
recalled in the train database and have high recall rate on the
test database. The ICDAR 2003 train database is used as our
train database. The result of the cascade of threshold
classifiers is shown in Fig. 4(a).
3.3. SVM based Characters Identification
| x xj |
2
(7)
(8)
(9)
1421
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007
CCs set are identified whether they are in the same text
region. Then all CCs that are in the same text region are
merged into a text region.
Five features of text region are extracted in our
method. They describe the texture characteristics of text
region. Four features are the variance of stroke width, gray
value, width and height which can be obtained by
calculating the variances of the corresponding character
features of CCs in the candidate text region. If the candidate
text region is a text region, the four variances are small.
Furthermore, the ratio of average width and height of CCs
which is in the same text region is regarded as a feature of
text region. We use five experimental thresholds to examine
whether a candidate text region is a text region or not.
5.
Experiment Results
m( r , T )
e
p=
re E
|E|
Our Method
r=
R
0.67
0.60
0.60
f
0.62
0.58
0.61
t(s)
14.4
0.35
1.1
m( r , E )
t
P
0.62
0.60
0.64
rt T
|T |
6.
Conclusions
1422
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007
distinct non-character CCs and improve the efficiency of our
system, and SVM can identify character CCs which
threshold classifiers can not classify. Furthermore, we
propose new text features to classify text regions. Future
work will focus on new image segmentation methods for
text.
[3]
[4]
Acknowledgements
This research is supported in part by Research and
Development of Modern Services Infrastructure Services
(No.2006BAH02A01) under Chinese National Key
Technology R&D Program.
[5]
[6]
[7]
References
[1] J. Ohya, A. Shio and S. Akamatsu, Recognizing
Characters in Scene Images, IEEE Trans. On PAMI,
Vol. 16, No.2, pp.214-224, 1994
[2] K. I. Kim, K. Jung, H. J. Kim, Texture-based
approach for text detection in images using support
vector machines and continuously adaptive mean shift
[8]
1423