
Form Reader Project Report

Andree Ang Kisjanto Surya, ICT Batch 2007 - Image Processing Assignment

1 Introduction

This document is submitted to fulfil the grading requirement of the Image Processing subject. It serves
as a personal report on the Form Reader Project given to the ICT 2007 students in that subject. The
objective of the Form Reader Project is to design and implement a form reader application (OCR and
OMR) using image processing techniques written from scratch. This document describes the design of
the application in detail, presents the test results of the implemented solution, lists the known issues,
and proposes possible solutions to the existing problems.

2 Application Design

The design of the application is presented in four parts: form design, form preprocessing, Optical
Character Recognition (OCR), and Optical Mark Reader (OMR).

2.1 Form Design

In this project, the form to be read is designed as a simplified version of the official answer sheet of
the national final examination (UAN) in Indonesia, with added OCR capabilities. The form design is
shown in Illustration 1.

Illustration 1: Form design

In this design, the name is written in the provided name boxes and is extracted by the application
using OCR. The answers are marked by darkening the chosen option (A, B, C, D, or E) for each number
and are extracted using OMR.
To handle location shifting in the raw input image, three corners (upper left, upper right, and
bottom left) are marked with cross signs that indicate the area to be read by the application.

2.2 Form Preprocessing

The form preprocessing step covers the image enhancement and preparation required to transform the
raw input image into a form suitable for further processing (i.e. OCR and OMR). The key tasks
involved in this step are shown in Illustration 2.

Illustration 2: Form preprocessing procedures

2.2.1 Binarization

Binarization is performed using a simple global thresholding scheme (Gonzales and Woods 2010, p. 760)
with a static threshold value of 200 (the best value obtained experimentally). This scheme has several
weaknesses and can be improved; see section 4 for details.

2.2.2 Inversion

It is common to treat the value 1 as "exists" (foreground) and the value 0 as "does not exist"
(background), so this convention is used in this application. The purpose of inversion is to convert
the input image so that it adheres to this convention.
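
The two steps above can be sketched in a few lines. This is only a minimal illustration, assuming an 8-bit grayscale image stored as a list of rows; the function names `binarize` and `invert` and the handling of pixels exactly equal to the threshold are assumptions, not taken from the actual implementation.

```python
THRESHOLD = 200  # static threshold value reported in section 2.2.1


def binarize(gray, threshold=THRESHOLD):
    """Global thresholding: pixels brighter than the threshold become 1, the rest 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


def invert(binary):
    """Swap 0 and 1 so that dark ink becomes foreground (1) and paper becomes background (0)."""
    return [[1 - pixel for pixel in row] for row in binary]


# Hypothetical 2 x 3 grayscale patch: bright paper with two dark ink pixels.
sample = [
    [255, 250, 30],
    [240, 10, 220],
]
foreground = invert(binarize(sample))
```

In this sketch the two dark pixels (30 and 10) end up as 1 and every other pixel as 0.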

2.2.3 Find Corners and Remove Border

To remove the unnecessary border in the raw input image, the coordinates of the corner marks have
to be extracted. This is done by analyzing the horizontal and vertical histograms of the corners of
the raw input image. In this application, each corner region to be analyzed is 50 x 50 pixels.

Illustration 3: Corners of the raw input image with size 50x50 (pixel).
(a) upper left, (b) upper right, (c) bottom left.

Through the histogram analysis of the corners, we obtain the coordinates of the corner marks, which
allows us to remove the unnecessary border. The result of this operation can be seen in
Illustration 4.

Illustration 4: Form after preprocessing (binarized, inverted, border removed)

The current process for finding the corner mark coordinates has several weaknesses and can be improved.
Once found, the coordinates could also be used to make the application more robust against varied
input conditions (e.g. skew, rotation); this is currently not implemented. These possibilities are
discussed in section 4.
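
The histogram analysis of a corner region can be sketched as follows. The report does not state the exact decision rule, so this sketch assumes that the centre of a cross mark is the row and the column with the highest foreground count; `find_cross_center` and the 7 x 7 test region are hypothetical.

```python
def projections(region):
    """Return the horizontal (per-row) and vertical (per-column) foreground counts."""
    row_hist = [sum(row) for row in region]
    col_hist = [sum(col) for col in zip(*region)]
    return row_hist, col_hist


def find_cross_center(region):
    """Locate a cross mark as the (x, y) of the fullest column and fullest row.

    Returns None when the region contains no foreground pixel at all.
    """
    row_hist, col_hist = projections(region)
    if max(row_hist) == 0:
        return None
    y = row_hist.index(max(row_hist))
    x = col_hist.index(max(col_hist))
    return x, y


# Hypothetical 7 x 7 corner region with a cross whose bars meet at x = 3, y = 4.
region = [[0] * 7 for _ in range(7)]
for x in range(1, 6):
    region[4][x] = 1  # horizontal bar
for y in range(2, 7):
    region[y][3] = 1  # vertical bar

center = find_cross_center(region)
```

With the three corner centres known, everything outside the rectangle they span can be cropped away. The rule used here is sensitive to noise and to form content entering the region, which is the weakness discussed in section 4.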

2.3 Optical Character Recognition (OCR)

The OCR capability of this application is used to extract the name written in the name boxes. The
process is outlined in Illustration 5.

Illustration 5: "Read name" action flowchart

2.3.1 Boundary Removal

After obtaining a name box N using a predefined coordinate, width, and height, we have to remove
unwanted components from the name box in order to optimize the character recognition process. One
such component is the boundary of the name box (see Illustration 6(a)). To remove the boundary, a
technique called extraction of connected components (Gonzales and Woods 2010, p. 667) is used.
We start with a mask M, an image with the same dimensions as N in which all pixel values are 0
except on the boundary, where they are 1 (see Illustration 6(b)). The intersection of N and M,
referred to as $X_0$ (Illustration 6(c)), is the starting point of this technique. The following
operation is then performed iteratively until $X_k = X_{k-1}$, at which point $X_k$ contains all
components connected with the border of N (Illustration 6(e)).

$$X_k = (X_{k-1} \oplus S) \cap N, \qquad k = 1, 2, 3, \ldots$$

Here $\oplus$ denotes dilation and S is the structuring element, whose form, size, and center point
can be seen in Illustration 6(d). Once the boundary has been found, removing it is as simple as
subtracting $X_k$ from N. The result can be seen in Illustration 6(f).

Illustration 6: Boundary removal process.
(a) Name box N, (b) mask M, (c) starting point X0, (d) structuring element S, (e) Xk, (f) N - Xk.

This technique removes all components in N that are connected with the border of N. However, it
cannot tell whether a removed component is part of the boundary or part of the character itself.
More robust approaches to removing the unwanted boundary exist; see section 4 for details.
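
The iteration described in this section can be sketched as below. The structuring element S is only given in Illustration 6(d), so this sketch assumes a 3 x 3 square centred on the pixel and a mask M that is one pixel thick; both are assumptions, as are the function names.

```python
def dilate(image):
    """Dilate a binary image with an assumed 3 x 3 square structuring element."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if image[r][c]:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < height and 0 <= cc < width:
                            out[rr][cc] = 1
    return out


def remove_boundary(n):
    """Return N minus every component of N that touches the outer edge of N."""
    height, width = len(n), len(n[0])
    # X0 = N intersected with M, where M is 1 only on the outermost rows and columns.
    current = [[n[r][c] if r in (0, height - 1) or c in (0, width - 1) else 0
                for c in range(width)] for r in range(height)]
    while True:
        grown = dilate(current)
        # X_k = (X_{k-1} dilated by S) intersected with N
        nxt = [[grown[r][c] & n[r][c] for c in range(width)] for r in range(height)]
        if nxt == current:  # X_k equals X_{k-1}: converged
            break
        current = nxt
    # N - X_k
    return [[n[r][c] - current[r][c] for c in range(width)] for r in range(height)]


# Hypothetical 9 x 9 name box: a 2-pixel-thick border and one character pixel in the middle.
box = [[1 if min(r, c, 8 - r, 8 - c) < 2 else 0 for c in range(9)] for r in range(9)]
box[4][4] = 1
cleaned = remove_boundary(box)
```

In this example only the centre pixel survives. A character stroke that touched the border would be removed along with it, which is the limitation noted above.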

2.3.2 Trimming

Trimming is the process of removing the unwanted space between the actual character and the boundary
of N. The process assumes that N contains no foreground pixels other than the character itself, so
the boundary removal has to be carried out beforehand. Trimming is done using a simple histogram
analysis, as shown in Illustration 7.

Illustration 7: Trimming process
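
A possible reading of the trimming step, assuming that "histogram analysis" means cropping to the first and last non-empty row and column of the projection histograms; the function name `trim` and the sample box are hypothetical.

```python
def trim(image):
    """Crop a binary image to the bounding box of its foreground pixels."""
    row_hist = [sum(row) for row in image]
    col_hist = [sum(col) for col in zip(*image)]
    rows = [i for i, count in enumerate(row_hist) if count > 0]
    cols = [i for i, count in enumerate(col_hist) if count > 0]
    if not rows:
        return []  # nothing to trim: the box is empty
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4 x 5 name box after boundary removal.
box = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
trimmed = trim(box)
```

Any leftover foreground pixel outside the character would enlarge the bounding box, which is why boundary removal must come first.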

2.3.3 Scaling

After trimming, the resulting image has to be rescaled to a standardized size for use in the matching
process. Scaling is done using a simple proportional mapping with nearest neighbor interpolation
(Gonzales and Woods 2010, p. 87).
Let W and H be the width and height of the input image, (x, y) be a pixel location in the input image,
and (v, w) be the corresponding pixel location in the output image (after mapping). The pixel mapping
is given by the following formulas.

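$$v = \left[\frac{W_{new}}{W}\,x\right], \qquad w = \left[\frac{H_{new}}{H}\,y\right]$$

$$f(v, w) = f(x, y)$$

Here $W_{new}$ and $H_{new}$ are the width and height of the output image. In the last equation, the
left-hand side refers to the output image and the right-hand side to the input image. The square
brackets in the preceding formulas denote the nearest-integer (rounding) function.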
This approach, known as forward mapping (Gonzales and Woods 2010, p. 109), has several problems. For
example, two or more pixel locations in the input image can be mapped to the same location in the
output image, and a pixel location in the output image may not be assigned a value at all. Because of
these problems, inverse mapping is more commonly used in practice. The idea of inverse mapping is to
scan the pixel locations of the output image instead: for each output location (v, w), compute the
corresponding location (x, y) in the input image. The formula for inverse mapping, which is the
inverse of the preceding formula, is as follows.

$$x = \left[\frac{W}{W_{new}}\,v\right], \qquad y = \left[\frac{H}{H_{new}}\,w\right]$$

$$f(v, w) = f(x, y)$$

Note that the nearest neighbor interpolation is carried out implicitly by the nearest-integer
(rounding) function. An example of the scaling result can be seen in Illustration 8.

Illustration 8: Scaling process of F character. (a) 8 x 13 pixels, (b) 12 x 12 pixels.
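
The inverse mapping can be sketched as follows, using 0-based indices. Two details are assumptions that the formulas leave open: rounding is done half-up with `int(value + 0.5)`, and the computed source index is clamped to the last row or column, because with strong upscaling the rounded value can fall just outside the input image.

```python
def scale_nearest(image, new_width, new_height):
    """Rescale a binary image (list of rows) by inverse mapping with nearest neighbor lookup."""
    height, width = len(image), len(image[0])
    output = []
    for w in range(new_height):
        # y = [H / H_new * w], rounded half-up and clamped to the last row
        y = min(height - 1, int(height / new_height * w + 0.5))
        row = []
        for v in range(new_width):
            # x = [W / W_new * v], rounded half-up and clamped to the last column
            x = min(width - 1, int(width / new_width * v + 0.5))
            row.append(image[y][x])
        output.append(row)
    return output


# Dummy 8 x 13 checkerboard standing in for a trimmed character (width 8, height 13).
letter = [[(x + y) % 2 for x in range(8)] for y in range(13)]
scaled = scale_nearest(letter, 12, 12)
```

Because the loop runs over output locations, every output pixel receives exactly one value, which avoids the holes and collisions of forward mapping.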

2.3.4 Matching

After the character in the name box has been prepared, the next step is to match it against the
known set of characters in the database. In the current implementation, matching is performed using a
minimum distance classifier that computes the Euclidean distance (Tan, Steinbach and Kumar 2006)
between the input image and each database record.
Suppose that f(k) is the pixel value at position k of the input image, f'(k) is the pixel value at
position k of the database record, and n is the total number of pixels in the input image. The
Euclidean distance d between these two images is computed using the following formula.

$$d = \sqrt{\sum_{k=1}^{n} \bigl(f(k) - f'(k)\bigr)^2}$$

Each character is compared against every database record. The database record with the smallest
distance to the character is considered the matching record.
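
A minimal sketch of the minimum distance classifier follows. The database is modelled as a dictionary from label to template image; this layout, the function names, and the tiny 3 x 3 templates are assumptions made for illustration only.

```python
import math


def flatten(image):
    """Turn a 2-D image (list of rows) into a flat list of pixel values."""
    return [pixel for row in image for pixel in row]


def euclidean_distance(a, b):
    """Euclidean distance between two equally sized flat pixel lists."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(image, database):
    """Return the label of the database record closest to the input image."""
    pixels = flatten(image)
    return min(database, key=lambda label: euclidean_distance(pixels, flatten(database[label])))


# Hypothetical 3 x 3 templates; the report uses 12 x 12 images.
database = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
sample = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one missing pixel
best = classify(sample, database)
```

Here the sample differs from the "L" template in one pixel and from the "I" template in five, so "L" is selected.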

2.4 Optical Mark Reader (OMR)

The OMR capability of the application is used to extract the filled-in answer (A, B, C, D, or E) for
each number (from 1 to 50). The answer extraction process is summarized in Illustration 9.

Illustration 9: "Get answer" action flowchart

The horizontal division of an answer region N into 5 sub-regions can be seen in Illustration 10. The
variable THRESHOLD is the minimum value of S(k) for a sub-region to be considered a valid choice. If
the value of S(k) is less than THRESHOLD, the answer is considered invalid, e.g. the form filler did
not fill in any answer for the corresponding number. THRESHOLD is currently set to 250, a value
obtained experimentally.

Illustration 10: Division of an answer region into 5 sub-regions
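
Since the flowchart itself is not reproduced here, the following is only one plausible reading of the answer extraction. It assumes that S(k) is the number of foreground pixels in sub-region k and that the sub-region with the largest S(k) is chosen; the function name `read_answer` and the 100 x 20 region size are hypothetical.

```python
THRESHOLD = 250   # minimum S(k) for a valid choice, as reported above
CHOICES = "ABCDE"


def read_answer(region, threshold=THRESHOLD):
    """Return the marked choice for one answer region, or None if no mark is dark enough."""
    width = len(region[0])
    step = width // len(CHOICES)
    scores = []
    for k in range(len(CHOICES)):
        left = k * step
        # The last sub-region absorbs any columns left over by the integer division.
        right = width if k == len(CHOICES) - 1 else left + step
        scores.append(sum(sum(row[left:right]) for row in region))  # S(k)
    best = max(range(len(CHOICES)), key=lambda k: scores[k])
    return CHOICES[best] if scores[best] >= threshold else None


# Hypothetical answer region, 100 pixels wide and 20 pixels high.
blank = [[0] * 100 for _ in range(20)]
marked = [row[:] for row in blank]
for row in marked:
    for x in range(40, 60):  # darken the third sub-region (choice C)
        row[x] = 1

answer = read_answer(marked)
```

In this example S(3) is 400, which exceeds the threshold, so the answer is read as C, while the blank region yields no answer.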

3 Test Results

The test was conducted on two input images. The first is a scanned empty form that was filled in
with the aid of image manipulation software (GIMP). The second is a scanned form filled in manually
using a 2B pencil. Compared to the first input image, the content of the second input image is
shifted by about 20 pixels to the right and 10 pixels downward.

3.1 Test Result 1

The conditions of the first input image:

- Properly scanned, without rotation or location shifting.
- Filled in with the aid of image manipulation software (GIMP).
- Filled in using a pitch-black fill and font color.
- Filled in using the same font as the one used in the database records (Arial).

Table 1: Matching results for input image 1

Character Best Match

Euclidean Distance with Database Records

(A) 4.36, (B) 9.17, (C) 9.33, (D) 9.38, (E) 9.06, (F) 8.83, (G) 9.17, (H) 9.06, (I) 8.72, (J)
8.49, (K) 8.00, (L) 9.17, (M) 9.38, (N) 9.06, (O) 9.54, (P) 8.89, (Q) 9.06, (R) 8.66, (S)
8.54, (T) 8.72, (U) 10.15, (V) 8.89, (W) 7.81, (X) 7.07, (Y) 8.83, (Z) 8.12

(A) 8.89, (B) 4.90, (C) 8.06, (D) 5.48, (E) 5.66, (F) 7.48, (G) 7.62, (H) 7.62, (I) 7.35, (J)
7.62, (K) 8.12, (L) 7.48, (M) 7.87, (N) 8.25, (O) 7.55, (P) 7.42, (Q) 8.49, (R) 7.00, (S)
6.40, (T) 8.60, (U) 7.55, (V) 9.33, (W) 8.77, (X) 9.17, (Y) 9.27, (Z) 7.87

(A) 9.22, (B) 7.75, (C) 3.87, (D) 7.62, (E) 6.78, (F) 7.48, (G) 5.83, (H) 8.49, (I) 9.06, (J)
8.25, (K) 8.60, (L) 6.48, (M) 8.83, (N) 8.94, (O) 6.40, (P) 8.77, (Q) 7.21, (R) 8.54, (S)
8.19, (T) 8.49, (U) 7.14, (V) 8.89, (W) 9.11, (X) 8.94, (Y) 8.83, (Z) 7.62

(A) 10.15, (B) 5.83, (C) 7.55, (D) 4.00, (E) 6.78, (F) 8.00, (G) 7.75, (H) 7.75, (I) 7.87, (J)
7.87, (K) 8.60, (L) 7.21, (M) 8.00, (N) 8.00, (O) 6.24, (P) 7.68, (Q) 7.35, (R) 7.42, (S)
7.55, (T) 8.72, (U) 6.24, (V) 9.95, (W) 8.66, (X) 9.70, (Y) 9.49, (Z) 8.37

(A) 9.06, (B) 6.24, (C) 6.63, (D) 6.86, (E) 4.36, (F) 5.20, (G) 7.14, (H) 7.28, (I) 8.31, (J)
8.66, (K) 7.28, (L) 5.20, (M) 8.19, (N) 8.31, (O) 7.48, (P) 7.48, (Q) 8.31, (R) 7.07, (S)
7.35, (T) 8.31, (U) 7.21, (V) 8.94, (W) 8.83, (X) 9.11, (Y) 9.00, (Z) 7.14

(A) 8.94, (B) 7.14, (C) 7.62, (D) 7.55, (E) 5.92, (F) 2.65, (G) 8.19, (H) 7.28, (I) 8.77, (J)
9.00, (K) 7.00, (L) 6.56, (M) 8.66, (N) 8.54, (O) 8.12, (P) 6.00, (Q) 8.54, (R) 6.93, (S)
7.75, (T) 7.81, (U) 8.25, (V) 8.72, (W) 9.06, (X) 9.11, (Y) 8.31, (Z) 8.19

(A) 9.49, (B) 7.14, (C) 5.10, (D) 7.00, (E) 7.00, (F) 7.55, (G) 4.80, (H) 7.68, (I) 8.19, (J)
8.06, (K) 9.00, (L) 7.28, (M) 8.19, (N) 8.31, (O) 6.16, (P) 8.83, (Q) 7.28, (R) 8.60, (S)
8.37, (T) 8.89, (U) 6.48, (V) 9.17, (W) 8.60, (X) 9.33, (Y) 9.22, (Z) 8.19

(A) 9.75, (B) 8.25, (C) 9.00, (D) 7.87, (E) 7.75, (F) 7.21, (G) 8.83, (H) 5.83, (I) 9.27, (J)
9.17, (K) 8.00, (L) 7.21, (M) 7.62, (N) 7.35, (O) 9.00, (P) 7.28, (Q) 9.06, (R) 8.19, (S)
9.64, (T) 9.38, (U) 7.00, (V) 9.33, (W) 8.77, (X) 10.00, (Y) 9.06, (Z) 9.49

(A) 8.12, (B) 6.86, (C) 8.72, (D) 7.42, (E) 8.43, (F) 8.54, (G) 7.68, (H) 7.55, (I) 1.73, (J)
8.77, (K) 8.06, (L) 9.64, (M) 6.08, (N) 6.56, (O) 8.37, (P) 8.25, (Q) 8.43, (R) 7.48, (S)
7.48, (T) 9.33, (U) 7.75, (V) 8.49, (W) 7.21, (X) 8.19, (Y) 9.43, (Z) 8.06

(A) 8.49, (B) 7.68, (C) 8.49, (D) 7.68, (E) 8.89, (F) 9.75, (G) 7.94, (H) 9.64, (I) 8.66, (J)
6.08, (K) 9.00, (L) 9.43, (M) 8.66, (N) 9.33, (O) 8.12, (P) 9.27, (Q) 7.81, (R) 8.49, (S)
8.12, (T) 7.55, (U) 8.94, (V) 8.00, (W) 9.17, (X) 8.19, (Y) 8.19, (Z) 8.31

(A) 7.87, (B) 8.54, (C) 8.94, (D) 9.11, (E) 7.68, (F) 8.06, (G) 9.95, (H) 8.89, (I) 8.31, (J)
9.33, (K) 5.39, (L) 7.68, (M) 8.54, (N) 8.06, (O) 9.70, (P) 8.00, (Q) 9.00, (R) 6.78, (S)
8.12, (T) 8.31, (U) 9.17, (V) 8.72, (W) 8.72, (X) 7.55, (Y) 8.31, (Z) 8.43

(A) 9.11, (B) 7.62, (C) 7.14, (D) 7.35, (E) 5.83, (F) 7.07, (G) 8.25, (H) 7.75, (I) 9.80, (J)
8.37, (K) 7.35, (L) 0.00, (M) 8.60, (N) 8.49, (O) 7.81, (P) 8.31, (Q) 8.60, (R) 8.31, (S)
8.06, (T) 9.17, (U) 6.71, (V) 8.77, (W) 8.89, (X) 9.27, (Y) 9.06, (Z) 8.25

(A) 9.54, (B) 8.12, (C) 9.43, (D) 8.49, (E) 8.37, (F) 7.62, (G) 8.94, (H) 6.32, (I) 7.48, (J)
9.49, (K) 7.87, (L) 8.72, (M) 5.48, (N) 6.32, (O) 9.43, (P) 6.86, (Q) 9.27, (R) 7.14, (S)
9.00, (T) 9.06, (U) 7.81, (V) 8.43, (W) 8.54, (X) 8.37, (Y) 7.75, (Z) 8.60

(A) 9.11, (B) 8.12, (C) 9.11, (D) 8.25, (E) 8.49, (F) 8.37, (G) 8.60, (H) 6.63, (I) 7.75, (J)
9.59, (K) 7.07, (L) 8.00, (M) 7.07, (N) 4.90, (O) 9.11, (P) 8.43, (Q) 8.60, (R) 7.81, (S)
8.54, (T) 9.49, (U) 7.28, (V) 9.22, (W) 8.43, (X) 8.37, (Y) 8.83, (Z) 9.59

(A) 9.85, (B) 6.93, (C) 5.92, (D) 5.48, (E) 7.87, (F) 8.37, (G) 6.16, (H) 8.25, (I) 8.00, (J)
7.87, (K) 9.27, (L) 7.62, (M) 8.49, (N) 8.49, (O) 4.58, (P) 8.54, (Q) 6.00, (R) 8.31, (S)
8.06, (T) 9.27, (U) 6.40, (V) 9.54, (W) 8.43, (X) 9.38, (Y) 9.38, (Z) 8.83

(A) 9.17, (B) 7.42, (C) 8.00, (D) 6.71, (E) 7.42, (F) 5.57, (G) 7.94, (H) 7.28, (I) 7.94, (J)
8.89, (K) 7.94, (L) 7.55, (M) 8.06, (N) 7.94, (O) 7.62, (P) 5.10, (Q) 8.06, (R) 6.48, (S)
8.00, (T) 8.66, (U) 7.75, (V) 8.83, (W) 8.49, (X) 9.54, (Y) 9.00, (Z) 8.54

(A) 9.59, (B) 8.06, (C) 6.63, (D) 6.86, (E) 8.66, (F) 8.31, (G) 7.00, (H) 8.31, (I) 8.54, (J)
8.43, (K) 8.77, (L) 8.19, (M) 8.77, (N) 8.43, (O) 5.83, (P) 8.25, (Q) 5.57, (R) 7.75, (S)
8.72, (T) 9.00, (U) 7.35, (V) 9.27, (W) 8.37, (X) 9.43, (Y) 9.54, (Z) 9.22

(A) 7.75, (B) 7.00, (C) 8.83, (D) 6.86, (E) 7.00, (F) 7.42, (G) 8.89, (H) 8.43, (I) 8.19, (J)
8.31, (K) 6.71, (L) 7.94, (M) 8.66, (N) 7.94, (O) 8.72, (P) 6.00, (Q) 8.06, (R) 4.69, (S)
7.21, (T) 8.06, (U) 8.83, (V) 9.06, (W) 8.72, (X) 8.06, (Y) 9.11, (Z) 8.19

(A) 8.83, (B) 6.08, (C) 7.21, (D) 6.71, (E) 6.56, (F) 7.94, (G) 6.56, (H) 8.19, (I) 7.68, (J)
7.81, (K) 8.31, (L) 7.68, (M) 8.43, (N) 8.06, (O) 7.21, (P) 8.25, (Q) 8.06, (R) 8.00, (S)
5.48, (T) 8.66, (U) 7.75, (V) 9.17, (W) 8.83, (X) 8.66, (Y) 9.00, (Z) 8.06

(A) 9.22, (B) 9.06, (C) 8.31, (D) 9.17, (E) 8.00, (F) 7.48, (G) 8.94, (H) 9.90, (I) 9.80, (J)
9.17, (K) 8.49, (L) 8.83, (M) 9.80, (N) 9.90, (O) 8.89, (P) 8.19, (Q) 8.83, (R) 8.66, (S)
8.19, (T) 3.16, (U) 9.75, (V) 8.06, (W) 9.54, (X) 8.37, (Y) 5.66, (Z) 7.21

(A) 10.20, (B) 7.81, (C) 7.48, (D) 6.40, (E) 8.19, (F) 8.77, (G) 7.81, (H) 7.28, (I) 8.89, (J)
7.68, (K) 8.43, (L) 6.40, (M) 7.94, (N) 7.81, (O) 7.48, (P) 9.27, (Q) 8.31, (R) 9.17, (S)
8.94, (T) 9.54, (U) 4.90, (V) 9.27, (W) 8.00, (X) 9.64, (Y) 9.75, (Z) 9.11

(A) 9.22, (B) 9.06, (C) 8.89, (D) 9.17, (E) 9.06, (F) 8.60, (G) 8.94, (H) 9.06, (I) 8.94, (J)
8.37, (K) 8.60, (L) 8.72, (M) 7.35, (N) 8.83, (O) 9.11, (P) 8.06, (Q) 8.49, (R) 8.19, (S)
8.43, (T) 7.62, (U) 8.89, (V) 4.36, (W) 8.77, (X) 8.12, (Y) 6.93, (Z) 8.72

(A) 7.21, (B) 8.19, (C) 8.49, (D) 8.54, (E) 9.00, (F) 8.31, (G) 8.19, (H) 8.43, (I) 7.68, (J)
8.31, (K) 8.31, (L) 9.54, (M) 9.64, (N) 8.54, (O) 8.49, (P) 8.83, (Q) 8.43, (R) 8.60, (S)
8.12, (T) 8.54, (U) 9.06, (V) 8.49, (W) 6.00, (X) 8.31, (Y) 9.43, (Z) 8.77

(A) 6.78, (B) 8.89, (C) 8.94, (D) 9.43, (E) 9.11, (F) 8.89, (G) 9.11, (H) 8.54, (I) 8.19, (J)
8.19, (K) 7.68, (L) 9.43, (M) 8.31, (N) 7.94, (O) 9.70, (P) 8.83, (Q) 9.11, (R) 8.25, (S)
8.12, (T) 8.43, (U) 9.59, (V) 7.75, (W) 8.37, (X) 4.36, (Y) 7.14, (Z) 7.00

(A) 9.38, (B) 8.89, (C) 8.94, (D) 9.54, (E) 8.66, (F) 8.06, (G) 9.43, (H) 9.11, (I) 9.64, (J)
8.54, (K) 8.66, (L) 8.77, (M) 8.54, (N) 9.11, (O) 9.17, (P) 8.60, (Q) 9.11, (R) 8.72, (S)
8.49, (T) 6.08, (U) 9.38, (V) 6.32, (W) 9.49, (X) 7.28, (Y) 2.65, (Z) 7.55

(A) 8.19, (B) 7.35, (C) 7.28, (D) 8.00, (E) 6.63, (F) 7.62, (G) 8.37, (H) 8.49, (I) 8.37, (J)
7.62, (K) 8.49, (L) 8.00, (M) 8.60, (N) 9.49, (O) 8.31, (P) 7.94, (Q) 9.06, (R) 7.68, (S)
7.94, (T) 7.35, (U) 8.66, (V) 8.19, (W) 8.31, (X) 7.21, (Y) 8.00, (Z) 3.74

The test shows that all 26 characters and all 50 marks in the input image are detected correctly.

3.2 Test Result 2

The conditions of the second input image:

- Content is shifted by about 20 pixels to the right and 10 pixels downward.
- Filled in manually using a 2B pencil.

The handwriting style used in the second input image can be seen in Illustration 11.

Illustration 11: Handwriting style used in the second input image

Table 2: Matching results for input image 2

Character Best Match

Euclidean Distance with Database Records

(A) 6.32, (B) 7.81, (C) 8.37, (D) 8.43, (E) 8.19, (F) 8.19, (G) 8.43, (H) 8.19, (I) 8.54, (J)
8.54, (K) 8.06, (L) 8.66, (M) 9.43, (N) 8.77, (O) 8.37, (P) 7.87, (Q) 8.19, (R) 7.62, (S)
8.49, (T) 9.64, (U) 9.06, (V) 9.17, (W) 8.25, (X) 8.31, (Y) 9.22, (Z) 8.54

(A) 7.87, (B) 8.19, (C) 7.87, (D) 8.06, (E) 7.94, (F) 8.06, (G) 8.54, (H) 8.54, (I) 8.06, (J)
8.66, (K) 7.42, (L) 7.55, (M) 8.43, (N) 8.66, (O) 8.00, (P) 8.37, (Q) 8.54, (R) 8.00, (S)
7.87, (T) 9.33, (U) 8.37, (V) 9.06, (W) 8.00, (X) 8.43, (Y) 9.43, (Z) 8.19

(A) 8.31, (B) 8.49, (C) 6.56, (D) 8.37, (E) 8.25, (F) 8.37, (G) 7.35, (H) 9.17, (I) 9.17, (J)
7.21, (K) 8.72, (L) 7.75, (M) 9.49, (N) 9.80, (O) 6.86, (P) 9.22, (Q) 8.25, (R) 9.43, (S)
8.31, (T) 8.37, (U) 8.06, (V) 8.66, (W) 8.77, (X) 8.37, (Y) 8.25, (Z) 7.87

(A) 10.15, (B) 7.48, (C) 8.06, (D) 6.16, (E) 7.75, (F) 7.87, (G) 7.87, (H) 7.87, (I) 8.00, (J)
7.87, (K) 8.83, (L) 7.75, (M) 7.62, (N) 7.62, (O) 7.14, (P) 7.81, (Q) 7.35, (R) 7.94, (S)
8.54, (T) 8.94, (U) 6.56, (V) 9.22, (W) 9.43, (X) 10.10, (Y) 9.38, (Z) 9.17

(A) 9.75, (B) 7.62, (C) 8.54, (D) 6.93, (E) 7.75, (F) 7.21, (G) 8.37, (H) 6.48, (I) 7.62, (J)
8.37, (K) 8.49, (L) 8.00, (M) 7.35, (N) 7.75, (O) 8.31, (P) 7.68, (Q) 8.72, (R) 8.19, (S)
9.00, (T) 9.06, (U) 6.40, (V) 9.22, (W) 8.31, (X) 10.00, (Y) 9.59, (Z) 8.94

(A) 9.11, (B) 8.60, (C) 8.31, (D) 8.12, (E) 8.60, (F) 7.35, (G) 7.75, (H) 7.48, (I) 8.25, (J)
9.49, (K) 8.49, (L) 8.60, (M) 8.12, (N) 7.62, (O) 7.94, (P) 6.71, (Q) 8.37, (R) 7.81, (S)
9.22, (T) 9.17, (U) 7.68, (V) 9.54, (W) 8.54, (X) 9.59, (Y) 8.94, (Z) 9.59

(A) 9.33, (B) 8.94, (C) 9.22, (D) 8.12, (E) 8.60, (F) 8.37, (G) 8.49, (H) 8.49, (I) 9.27, (J)
8.72, (K) 9.17, (L) 7.48, (M) 8.49, (N) 8.00, (O) 8.89, (P) 7.81, (Q) 9.06, (R) 9.11, (S)
8.31, (T) 8.25, (U) 7.94, (V) 9.00, (W) 9.11, (X) 9.90, (Y) 8.60, (Z) 9.17

(A) 9.85, (B) 7.48, (C) 8.66, (D) 7.87, (E) 8.00, (F) 7.87, (G) 8.60, (H) 7.75, (I) 8.60, (J)
7.75, (K) 8.00, (L) 8.37, (M) 7.75, (N) 7.87, (O) 8.66, (P) 7.94, (Q) 8.49, (R) 7.55, (S)
8.43, (T) 8.72, (U) 8.06, (V) 8.06, (W) 8.19, (X) 8.72, (Y) 8.49, (Z) 9.17

(A) 28.77, (B) 29.72, (C) 29.53, (D) 30.41, (E) 28.97, (F) 30.71, (G) 29.82, (H) 28.72, (I)
29.55, (J) 30.68, (K) 28.86, (L) 29.34, (M) 28.97, (N) 28.48, (O) 30.56, (P) 30.63, (Q)
29.44, (R) 28.71, (S) 30.40, (T) 30.94, (U) 29.93, (V) 30.66, (W) 29.22, (X) 28.86, (Y)
30.97, (Z) 28.86

(A) 9.43, (B) 8.49, (C) 7.94, (D) 7.75, (E) 9.27, (F) 9.38, (G) 8.25, (H) 8.72, (I) 9.90, (J)
6.48, (K) 9.27, (L) 8.37, (M) 9.38, (N) 8.94, (O) 7.14, (P) 9.64, (Q) 7.21, (R) 9.22, (S)
8.89, (T) 8.37, (U) 7.28, (V) 8.43, (W) 9.64, (X) 8.72, (Y) 8.25, (Z) 9.06

(A) 9.43, (B) 8.60, (C) 9.43, (D) 8.94, (E) 8.25, (F) 7.21, (G) 9.70, (H) 8.60, (I) 9.27, (J)
9.17, (K) 6.93, (L) 7.62, (M) 8.60, (N) 8.37, (O) 9.64, (P) 7.55, (Q) 9.27, (R) 7.81, (S)
8.06, (T) 7.07, (U) 9.22, (V) 7.68, (W) 9.22, (X) 8.25, (Y) 6.78, (Z) 9.17

(A) 8.00, (B) 8.77, (C) 7.62, (D) 8.66, (E) 8.54, (F) 8.19, (G) 8.31, (H) 8.31, (I) 8.89, (J)
8.54, (K) 8.66, (L) 7.42, (M) 8.66, (N) 8.89, (O) 7.75, (P) 9.06, (Q) 8.31, (R) 9.38, (S)
8.25, (T) 9.54, (U) 7.87, (V) 8.25, (W) 8.49, (X) 8.06, (Y) 8.66, (Z) 8.54

(A) 8.06, (B) 9.49, (C) 8.43, (D) 9.27, (E) 9.49, (F) 8.25, (G) 8.00, (H) 8.60, (I) 9.17, (J)
8.94, (K) 9.06, (L) 9.06, (M) 8.72, (N) 8.60, (O) 8.31, (P) 8.19, (Q) 8.25, (R) 9.00, (S)
9.43, (T) 7.87, (U) 9.11, (V) 8.31, (W) 9.11, (X) 8.72, (Y) 8.00, (Z) 9.06
N

(A) 9.00, (B) 9.17, (C) 8.06, (D) 9.06, (E) 9.17, (F) 8.49, (G) 7.62, (H) 7.87, (I) 8.49, (J)
9.06, (K) 8.12, (L) 8.12, (M) 7.87, (N) 6.93, (O) 7.94, (P) 9.00, (Q) 8.25, (R) 8.89, (S)
8.54, (T) 8.83, (U) 7.81, (V) 8.19, (W) 8.19, (X) 8.37, (Y) 8.12, (Z) 9.06

(A) 9.59, (B) 7.94, (C) 8.25, (D) 7.14, (E) 8.06, (F) 7.55, (G) 7.68, (H) 8.19, (I) 8.77, (J)
7.68, (K) 9.22, (L) 8.19, (M) 8.19, (N) 8.31, (O) 7.48, (P) 7.21, (Q) 7.68, (R) 8.25, (S)
8.12, (T) 8.66, (U) 8.12, (V) 8.60, (W) 8.94, (X) 9.43, (Y) 8.89, (Z) 9.22

(A) 8.00, (B) 8.43, (C) 7.75, (D) 9.22, (E) 7.94, (F) 6.24, (G) 8.31, (H) 7.81, (I) 9.00, (J)
8.77, (K) 8.06, (L) 8.31, (M) 8.43, (N) 8.89, (O) 8.12, (P) 7.21, (Q) 8.54, (R) 8.12, (S)
8.60, (T) 8.89, (U) 8.83, (V) 7.48, (W) 8.72, (X) 8.19, (Y) 7.55, (Z) 8.19

(A) 7.68, (B) 8.49, (C) 8.54, (D) 8.25, (E) 8.25, (F) 8.60, (G) 8.60, (H) 9.59, (I) 9.38, (J)
7.87, (K) 8.60, (L) 8.12, (M) 9.80, (N) 8.83, (O) 8.31, (P) 8.54, (Q) 7.87, (R) 8.19, (S)
8.06, (T) 8.49, (U) 9.22, (V) 8.89, (W) 9.11, (X) 8.72, (Y) 8.49, (Z) 9.06

(A) 8.43, (B) 8.94, (C) 8.19, (D) 8.94, (E) 8.49, (F) 8.00, (G) 8.72, (H) 8.94, (I) 8.60, (J)
8.94, (K) 8.12, (L) 8.00, (M) 8.49, (N) 9.06, (O) 8.31, (P) 7.68, (Q) 8.49, (R) 7.94, (S)
8.66, (T) 8.72, (U) 8.77, (V) 7.55, (W) 8.31, (X) 8.25, (Y) 8.00, (Z) 8.72

(A) 9.54, (B) 8.60, (C) 7.00, (D) 8.60, (E) 7.75, (F) 8.25, (G) 8.00, (H) 9.70, (I) 10.00, (J)
8.12, (K) 8.60, (L) 6.93, (M) 9.38, (N) 9.38, (O) 7.68, (P) 8.54, (Q) 8.12, (R) 8.77, (S)
7.14, (T) 7.21, (U) 8.31, (V) 7.81, (W) 9.11, (X) 9.17, (Y) 8.00, (Z) 8.37

(A) 8.72, (B) 8.31, (C) 7.87, (D) 8.19, (E) 8.06, (F) 7.68, (G) 8.43, (H) 10.05, (I) 10.05,
(J) 7.14, (K) 8.89, (L) 8.89, (M) 9.64, (N) 10.05, (O) 8.49, (P) 7.87, (Q) 8.43, (R) 7.87,
(S) 7.75, (T) 6.08, (U) 9.80, (V) 8.00, (W) 9.06, (X) 8.31, (Y) 7.81, (Z) 8.06

(A) 9.80, (B) 8.66, (C) 8.49, (D) 7.55, (E) 8.54, (F) 8.66, (G) 8.77, (H) 8.06, (I) 9.64, (J)
7.00, (K) 9.22, (L) 7.55, (M) 8.43, (N) 8.31, (O) 8.25, (P) 8.72, (Q) 8.54, (R) 9.27, (S)
8.94, (T) 8.54, (U) 6.93, (V) 8.37, (W) 9.59, (X) 9.33, (Y) 8.66, (Z) 8.77

(A) 8.77, (B) 9.49, (C) 8.89, (D) 9.80, (E) 8.94, (F) 8.72, (G) 9.27, (H) 9.27, (I) 9.38, (J)
8.37, (K) 8.49, (L) 8.60, (M) 8.00, (N) 9.38, (O) 9.11, (P) 8.54, (Q) 8.83, (R) 8.89, (S)
8.19, (T) 7.48, (U) 9.33, (V) 5.57, (W) 9.00, (X) 6.93, (Y) 6.00, (Z) 8.00

(A) 7.94, (B) 8.94, (C) 8.54, (D) 8.83, (E) 8.83, (F) 8.49, (G) 8.49, (H) 9.06, (I) 9.17, (J)
8.37, (K) 7.87, (L) 8.25, (M) 9.06, (N) 8.72, (O) 8.31, (P) 8.77, (Q) 8.49, (R) 8.43, (S)
8.66, (T) 7.62, (U) 8.54, (V) 8.06, (W) 8.31, (X) 8.25, (Y) 8.00, (Z) 8.72

(A) 7.81, (B) 8.60, (C) 8.77, (D) 9.06, (E) 8.94, (F) 8.00, (G) 8.94, (H) 8.00, (I) 8.72, (J)
8.37, (K) 7.75, (L) 8.94, (M) 8.83, (N) 8.60, (O) 9.22, (P) 8.54, (Q) 9.17, (R) 8.43, (S)
8.31, (T) 8.00, (U) 9.11, (V) 7.94, (W) 7.94, (X) 6.63, (Y) 6.93, (Z) 7.75

(A) 8.60, (B) 9.22, (C) 8.49, (D) 9.64, (E) 8.66, (F) 7.55, (G) 9.54, (H) 8.89, (I) 9.95, (J)
8.43, (K) 7.81, (L) 8.31, (M) 9.22, (N) 9.75, (O) 9.17, (P) 8.49, (Q) 9.33, (R) 9.06, (S)
8.72, (T) 7.68, (U) 9.27, (V) 7.48, (W) 8.60, (X) 7.28, (Y) 6.86, (Z) 7.81

(A) 8.83, (B) 8.06, (C) 7.48, (D) 8.06, (E) 7.68, (F) 7.14, (G) 8.43, (H) 8.77, (I) 9.43, (J)
6.71, (K) 8.66, (L) 7.81, (M) 9.00, (N) 9.64, (O) 8.00, (P) 8.12, (Q) 8.43, (R) 8.49, (S)
8.37, (T) 7.28, (U) 8.72, (V) 7.87, (W) 9.27, (X) 7.42, (Y) 6.86, (Z) 6.40

The test shows that only 46.15% (12 out of 26) of the characters are detected correctly by the system.
The OMR module, on the other hand, detects all 50 answers correctly.
There is also an anomaly in the matching of the character I in the OCR module. All of the Euclidean
distances from the character I to the database records are greater than 20. This should not be
possible, since the Euclidean distance between two binary images of 144 pixels each can never
exceed 12. The root cause of the anomaly has not yet been found.
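
The bound of 12 follows directly from the distance formula of section 2.3.4. Writing d for the Euclidean distance, and assuming that both images contain only the values 0 and 1 and that n = 144 (the 12 x 12 size shown in Illustration 8), each squared difference is at most 1:

```latex
d = \sqrt{\sum_{k=1}^{n} \bigl(f(k) - f'(k)\bigr)^2}
  \le \sqrt{\sum_{k=1}^{144} 1^2}
  = \sqrt{144}
  = 12
```

A distance above 12 therefore implies that at least one of these assumptions did not hold for this character, for example pixel values outside {0, 1} or a comparison over more than 144 pixels. Which of these occurred cannot be determined from the reported figures alone.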

4 Known Issues and Possible Solutions

The current implementation of the form reader is not yet practical for real-world use. Its
performance, especially in the OCR module, needs improvement. Moreover, the application is not
robust enough to handle the high variability of user input and the problems that may occur during
scanning (e.g. rotation, position shift, wrong orientation). This section presents the known issues
along with suggestions on how to solve each problem.
1. The binarization process uses a simple global thresholding scheme with a static threshold
value. If the brightness level of the input image differs, unwanted behavior may occur (e.g. an
object of interest may be lost because its intensity is not as strong as under normal
conditions). A more robust approach would be a global thresholding scheme with a dynamic
threshold value, e.g. Otsu's method (Gonzales and Woods 2010, p. 765).
2. The application makes no provision for noisy input images. Noise is problematic because it
can be mistaken for the actual object of interest. This causes problems especially in corner
mark detection and OCR. To handle noisy input images, an image smoothing operation should be
performed in the form preprocessing step, for instance using averaging filters, one of the most
common image smoothing techniques (Gonzales and Woods 2010, p. 174).
3. The process for locating the corner marks does not handle location shifting in the input
image well. The corner regions searched for corner marks are too small (50 x 50 pixels), so if
the image is shifted too far the corner marks cannot be found. Enlarging the corner regions
increases the risk of overlapping the actual form content, which complicates the subsequent
histogram analysis (Illustration 12). This problem can be solved by assigning a different color
to the corner marks in the form design. Another approach is to increase the space between the
corner marks and the actual content of the form.

Illustration 12: Corner regions with larger size (100 x 100 pixels)

4. Currently, the corner marks in the form are only used to remove the unnecessary border in the
raw input image. To make the application more robust against geometric distortion (e.g.
resolution mismatch, location shifting, rotation, shear, wrong orientation), the locations of
these corner marks can be used as tie points in an image registration process (Gonzales and
Woods 2010, p. 111). Image registration is the process of aligning two or more images of the
same scene. Based on the locations of these tie points, image registration can be carried out
using an affine transformation (Gonzales and Woods 2010, p. 109).
5. In the OCR module, the boundary removal process for the name boxes is not robust. For
example, if a character written by the form filler touches the border (which is not uncommon),
the character is also removed by the operation, resulting in a meaningless image. A more robust
approach would be to use a simple histogram analysis to differentiate between the name box
border and the character.

6. The current OCR design described in section 2.3 has proven not effective enough to handle
the large variability of handwriting styles. Several improvements can be made to enhance the
character recognition performance. Image smoothing using an averaging filter can make the
matching more tolerant of slight shape shifts in a letter. A morphological operation, thinning
(Gonzales and Woods 2010, p. 674), can be used to obtain a skeletonized representation of the
letter in order to avoid the problems caused by differences in the width of the letter strokes.
The number of database records should also be increased to include various handwriting samples.
An Artificial Neural Network (Tan, Steinbach and Kumar 2006, p. 246) or a Support Vector
Machine (Tan, Steinbach and Kumar 2006, p. 256) may also be used as the classifier.
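
As an illustration of the first suggestion, Otsu's method can be sketched as below. This is a minimal version, assuming 8-bit grayscale input stored as a list of rows; it is not part of the current implementation, and the sample values are hypothetical.

```python
def otsu_threshold(gray):
    """Return the threshold t (0..255) that maximises the between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    """
    hist = [0] * 256
    for row in gray:
        for pixel in row:
            hist[pixel] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no split is defined here
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical scan patch: three dark ink pixels and three bright paper pixels.
scan = [
    [50, 52, 220],
    [48, 225, 230],
]
t = otsu_threshold(scan)
```

The returned threshold separates the dark and bright groups of this patch and could replace the static value of 200 in the binarization step, so that the threshold adapts to the brightness of each scan.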

References

Gonzales, RC and Woods, RE 2010, Digital Image Processing, 3rd edn, Pearson, United States of
America.
Tan, PM, Steinbach, M and Kumar, V 2006, Introduction to Data Mining, Pearson, United States of
America.
