In many cases, appearance does not give us enough information to classify single strokes, and we need some contextual information. The relative position of a stroke with respect to a reference stroke is the most intuitive kind. Bouteruche et al. [2] addressed this problem directly and proposed a fuzzy relative positioning method. The method evaluates the relative position of strokes based on how well pairs of strokes fulfil a set of relations, such as “the second stroke is on the right of the first stroke”, through defined fuzzy landscapes. They used this method to solve a prepared task in which pairs of reference and argument strokes are given and the argument strokes have to be classified into 18 classes corresponding to several types of accentuation or punctuation. The information about the appearance and the relative position of the argument stroke with respect to the reference stroke must be combined to achieve a good recognition rate. This task adequately demonstrates the need for a relative positioning system. They used Radial Basis Function Networks (RBFN) as a classifier. The method was further improved by Delaye et al. [7] with a better definition of fuzzy landscapes and the use of an SVM. Although fuzzy relative positioning is a powerful method, useful for more complex tasks such as the recognition of structured handwritten symbols (Chinese characters) [6], it gives poor results when applied to arrow head detection.

Our work brings two contributions. First, we define arrow head detection as a classification of possible arrow head strokes based on relative positioning. We used this arrow head classifier to significantly improve the proposed arrow detector. Second, we propose a new method for evaluation of the relative position of strokes, which exploits simple low-level features and uses a Bidirectional Long Short Term Memory (BLSTM) Recurrent Neural Network (RNN) as a classifier. The BLSTM RNN proved to be a good tool for classification of individual strokes [13].

The rest of the paper is organized as follows. Section 2 describes the proposed arrow detector and the way the relative positioning is exploited to determine which strokes represent the head of the arrow. Section 3 introduces our method for evaluation of the relative position. Experiments and their results are described in Section 4. Finally, we conclude in Section 5.

2. Arrow detector

Arrows are symbols with a non-rigid body. They consist of two parts: shaft and head. The head defines the orientation of the arrow. However, an arrow’s appearance can change arbitrarily according to the given domain. Arrows can have various shapes, lengths, heads, and directions. Therefore, it is difficult to detect arrows with ordinary classifiers based on symbol appearance. However, each arrow connects two other symbols with a rigid body (see Figure 1). It is beneficial to detect these symbols first and leave the arrow detection to another classifier detecting arrows between pairs of these symbols. This new classifier must perform the following two steps:

1. Find a shaft of the arrow connecting the given two symbols. This shaft is just a sequence of strokes leading from the vicinity of the first symbol to the vicinity of the second symbol, and it is undirected.

2. Find a head of the arrow, which is located around one of the end-points of the shaft. The head defines the orientation of the arrow (whether it heads from the first symbol to the second symbol or vice versa).

The detection of an arrow’s shaft can be done iteratively by simply adding strokes to a sequence such that the first stroke starts in the vicinity of the first symbol and the last stroke ends in the vicinity of the second symbol. A new stroke is added to the sequence only if the distance between the end-point of the last stroke and the end-point of the new stroke is smaller than a threshold. The algorithm must consider all possible combinations of strokes creating a valid connection between the given two symbols.
[Figure 2 diagram, stage labels: Pairs of symbols; Detection of arrow shaft; Shaft; Extraction of reference strokes and points; Ref. point A / Ref. point B; Query strokes search and classification; Head A / Head B; Selection of the best arrow head; Arrow]

Figure 2. Arrow recognition pipeline. The recognition process is illustrated on a simple example of two symbols from FC domain.
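The iterative shaft search described in Section 2 can be sketched as a bounded depth-first enumeration. This is only an illustrative sketch, not the authors' implementation: it assumes that symbols are given as axis-aligned bounding boxes and that strokes are lists of (x, y) points, and the names `find_shafts`, `near`, `gap` and `max_strokes` are hypothetical.

```python
import math

def dist_to_box(p, box):
    """Distance from point p to an axis-aligned box (xmin, ymin, xmax, ymax)."""
    dx = max(box[0] - p[0], 0.0, p[0] - box[2])
    dy = max(box[1] - p[1], 0.0, p[1] - box[3])
    return math.hypot(dx, dy)

def find_shafts(strokes, box_a, box_b, near, gap, max_strokes):
    """Enumerate stroke sequences leading from the vicinity of box_a to the
    vicinity of box_b. Each result is a list of (stroke_index, reversed)
    pairs, because a stroke may have been drawn in either direction.
    All thresholds are assumed parameters, not values from the paper."""
    results = []

    def extend(seq, used, tail):
        # A sequence is a candidate shaft once its tail is near box_b.
        if dist_to_box(tail, box_b) <= near:
            results.append(list(seq))
            return
        # Cap the sequence length to keep the search space small.
        if len(seq) == max_strokes:
            return
        for i, s in enumerate(strokes):
            if i in used:
                continue
            for rev in (False, True):
                first, last = (s[-1], s[0]) if rev else (s[0], s[-1])
                # Add a stroke only if it starts close to the current tail.
                if math.dist(tail, first) < gap:
                    seq.append((i, rev))
                    used.add(i)
                    extend(seq, used, last)
                    seq.pop()
                    used.remove(i)

    for i, s in enumerate(strokes):
        for rev in (False, True):
            first, last = (s[-1], s[0]) if rev else (s[0], s[-1])
            if dist_to_box(first, box_a) <= near:
                extend([(i, rev)], {i}, last)
    return results
```

Every returned sequence is one candidate shaft; ranking or pruning conflicting candidates is a separate step that this sketch does not cover.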
The search space can be reasonably reduced by setting a maximal number of strokes in the sequence. This number depends on the domain and on how many strokes users use to draw arrow shafts. Typically, it is four for flowcharts and two for finite automata. We can immediately remove some shafts that are in conflict with other shafts and keep those with the smallest sum of the following distances: a) the distance between the first symbol and the first stroke of the shaft, b) the distance between the second symbol and the last stroke of the shaft, c) the distances between individual strokes of the shaft.

Since we do not know the orientation of the arrow yet and the shaft is undirected, we have to consider both end-points of the shaft and try to find two heads (one in the vicinity of each end-point). Ideally, we will find just one head. In practice, it can happen that we find two heads and have to decide which one is better. The detection of an arrow’s head is not a trivial task, because there might be a lot of interfering strokes around the end-points of the shaft: heads of other arrows or text. Deciding which strokes represent the true head of the arrow and which do not is a task where stroke positioning can be used to advantage. First, we define a reference stroke (a sub-stroke of the shaft) and a reference point (the end-point of the shaft), which are used to express the relative position of query strokes (details follow in Section 2.1). Second, this information about relative position is given to a classifier, which makes the decision. The query strokes are all strokes in the vicinity of a given end-point of the shaft that are part of neither the shaft itself nor the two given symbols. We classify them into two classes: head and not-head. The evaluation of the relative position of strokes and the classification are explained in Section 3. Let us just note that the classifier returns the class into which the query stroke is classified along with a potential. We use this potential to decide which head is of better quality when we find two. We simply compute the sum of potentials of all strokes in each head and choose the head with the bigger value. This slightly favours heads consisting of a higher number of strokes, which is desirable in most cases. Pseudocode for the algorithm just described is divided into two procedures and presented in the supplementary material as Algorithm 1 and Algorithm 2. The arrow recognition pipeline is depicted in Figure 2.

It happens quite often that the user draws the shaft and the head of an arrow with one stroke. Our algorithm would fail in that case. Therefore, we take one important step before we try to find the arrow’s head: we segment the last stroke of the shaft into smaller sub-strokes in such a way that the head is split from the shaft. The created sub-strokes are divided into two groups. One group is used to complete the shaft so that it reaches the symbol again. Sub-strokes of the second group are put into the set of query strokes possibly forming the head. Our splitting algorithm is described in Section 2.2. If the shaft and the head are not drawn with one stroke, the algorithm will ideally perform no segmentation and this step can be skipped.

2.1. Reference stroke and reference point

It is necessary to define a reference stroke. The position of all query strokes will be evaluated relative to it. Naturally, it seems that the arrow’s shaft should be the reference stroke. However, it is better to use just a sub-stroke of the shaft for this purpose. The reason is that the shaft might be arbitrarily curved or refracted, the whole arrow might be arbitrarily rotated, and we want to normalize the input in such a way that the reference stroke always has more or less the same appearance and the query strokes always have more or less the same relative position. Therefore, we create a sub-stroke beginning at the end-point of the shaft with the shape of a line segment. It is done iteratively by adding points to the newly created stroke until the value of a criterion expressing how similar the stroke is to a line is bigger than a threshold. The criterion is the ratio of the distance between the end-points of the stroke to the path length of the stroke (the sum of distances between neighbouring points). We set the threshold empirically to 0.95. Another condition is that the distance between the end-points of the stroke must be bigger than a threshold empirically derived from the average length of strokes, because the possible presence of so-called hooks at the ends of strokes would cause a small value of the criterion for short strokes. Figure 3 illustrates how the reference stroke is determined as a sub-stroke of the shaft. Then we rotate the reference stroke and all query strokes by such an angle that the vector given by the end-points of the reference stroke points in the direction of the x-axis. In other words, the true arrow heads should point from left to right.
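The reference stroke construction and the rotation in Section 2.1 can be illustrated with a short sketch. It follows one literal reading of the text and rests on several assumptions that the paper does not state: the shaft is a list of (x, y) points whose first point is the shaft end-point, the whole shaft is returned when no prefix qualifies, the rotation is taken about the shaft end-point, and the vector is directed from the far end of the reference stroke towards the shaft end-point. All function names and the `min_length` parameter are hypothetical.

```python
import math

def straightness(points):
    """Ratio of end-point distance to path length (1.0 for a straight line)."""
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path == 0.0:
        return 0.0
    return math.dist(points[0], points[-1]) / path

def reference_stroke(shaft_points, min_length, threshold=0.95):
    """Grow a sub-stroke from the shaft end-point (shaft_points[0]) until it
    is both longer than min_length and straighter than the threshold."""
    for k in range(2, len(shaft_points) + 1):
        sub = shaft_points[:k]
        long_enough = math.dist(sub[0], sub[-1]) > min_length
        if long_enough and straightness(sub) > threshold:
            return sub
    return shaft_points  # assumed fallback: no prefix met both conditions

def rotate_to_x_axis(ref, strokes):
    """Rotate strokes about ref[0] so that the vector from ref[-1] to ref[0]
    points along the positive x-axis (assumed direction convention)."""
    ox, oy = ref[0]
    angle = math.atan2(oy - ref[-1][1], ox - ref[-1][0])
    c, s = math.cos(-angle), math.sin(-angle)

    def rot(p):
        dx, dy = p[0] - ox, p[1] - oy
        return (ox + c * dx - s * dy, oy + s * dx + c * dy)

    return [[rot(p) for p in stroke] for stroke in strokes]
```

With this convention the reference stroke ends up on the negative x-axis, ending at the shaft end-point, so a true head points from left to right regardless of how the arrow was originally oriented.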
For purposes of our method for evaluation of the relative position of strokes (described in Section 3), we have to define a reference point. Obviously, it is the end-point of the shaft.

The common approach is to find tentative splitting points with high curvature and low speed. The best subset of these points is selected according to an error function that fits the points of each segment to selected primitives. The most common primitives are line segments and arcs [8, 15]. It is also possible to use machine learning to train a classifier detecting the splitting points [9, 11].

The presented algorithms are sophisticated and make it possible to find segments fitting predefined primitives. However, using any of these methods seems to be overkill for our task. We do not need to split a stroke at any precisely defined point, nor to create segments with particular geometrical properties (line segments or arcs). All we need is to split the arrow’s head from its body, and it does not matter if both the body and the head are further split into several segments. Therefore, we suggest using a much simpler algorithm for stroke segmentation. Its description follows.

We compute a value AA, which we call “accumulated angle”, associated with each point of the stroke S = {p_1, p_2, ..., p_n} according to the following equation:
created 1252 positive and 1876 negative examples this way.
For our method, we used LSTM and BLSTM RNNs im-

[Plot: precision versus number of nodes in the hidden layer [−] (2, 4, 8, 16, 32, 64)]

[Plot: Time needed to classify one sample; axis: time [ms]; series: Basic LSTM, Basic BLSTM, Extended LSTM, Extended BLSTM]

References

[1] A.-M. Awal, G. Feng, H. Mouchere, and C. Viard-Gaudin. First experiments on a new online handwritten flowchart database. In DRR 2011, pages 1–10, 2011.
[2] F. Bouteruche, S. Macé, and E. Anquetil. Fuzzy relative positioning for on-line handwritten stroke analysis. In Proceedings of IWFHR 2006, pages 391–396, 2006.
[3] M. Bresler, D. Průša, and V. Hlaváč. Modeling flowchart structure recognition as a max-sum problem. In Proceedings of ICDAR 2013, pages 1247–1251, August 2013.
[4] M. Bresler, T. V. Phan, D. Průša, M. Nakagawa, and V. Hlaváč. Recognition system for on-line sketched dia-