
Computers in Biology and Medicine 41 (2011) 726–735. doi:10.1016/j.compbiomed.2011.06.009


Classification of benign and malignant masses based on Zernike moments


Amir Tahmasbi*, Fatemeh Saki, Shahriar B. Shokouhi
Department of Electrical Engineering, Iran University of Science and Technology (IUST), Narmak, Tehran, Iran
* Corresponding author. E-mail addresses: a.tahmasbi@ieee.org, a_tahmasbi@elec.iust.ac.ir (A. Tahmasbi), f_saki@elec.iust.ac.ir (F. Saki), bshokouhi@iust.ac.ir (S.B. Shokouhi).

Article info

Article history:
Received 31 August 2010
Accepted 14 June 2011

Keywords:
Computer aided diagnosis
Mammography
Neural network
Opposition-based learning
Zernike moments

Abstract

In mammography diagnosis systems, a high False Negative Rate (FNR) has always been a significant problem, since a false negative answer may lead to a patient's death. This paper is directed towards the development of a novel Computer-aided Diagnosis (CADx) system for the diagnosis of breast masses. It aims at intensifying the performance of CADx algorithms as well as reducing the FNR by utilizing Zernike moments as descriptors of shape and margin characteristics. The input Regions of Interest (ROIs) are segmented manually and further subjected to a number of preprocessing stages. The outcomes of the preprocessing stage are two processed images containing co-scaled, translated masses; one of these images represents the shape characteristics of the mass, while the other describes the margin characteristics. Two groups of Zernike moments have been extracted from the preprocessed images and applied to the feature selection stage. Each group includes 32 moments with different orders and iterations. Considering the performance of the overall CADx system, the most effective moments have been chosen and applied to a Multi-layer Perceptron (MLP) classifier, employing both generic Back Propagation (BP) and Opposition-based Learning (OBL) algorithms. The Receiver Operating Characteristic (ROC) curve and the performance of the resulting CADx systems are analyzed for each group of features. The designed systems yield Az = 0.976, representing fair sensitivity, and Az = 0.975, demonstrating fair specificity. The best achieved FNR and FPR are 0.0% and 5.5%, respectively.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction
1.1. Background
The American Cancer Society estimated 192,370 new cases of invasive breast cancer diagnosed among women and 62,280 additional cases of in-situ breast cancer in the year 2009. Furthermore, approximately 40,170 women are expected not to survive breast cancer [1]. In the United States, only lung cancer accounts for more cancer deaths in women [1,2]. Moreover, women have about a 1 in 8 lifetime risk of developing invasive breast cancer [2]. It can be inferred that, despite advances of technology in the fields of mammography [2], thermography [3], optical tomography [4] and other anticancer methodologies in the last 20 years, breast cancer is still a prominent problem.
Early detection of breast cancer increases the survival rate as well as the treatment options [2,5]. Mammography has been one of the most reliable methods for early detection and diagnosis of breast cancer [6–8]; it has reduced breast cancer mortality rates by 30–70% [5]. However, mammography is not perfect; mammograms
are difficult to interpret [3]. The sensitivity of screening mammography is affected by the image quality and the radiologist's level of expertise [2]. Fortunately, CADx technology can improve the performance of radiologists [5].
Radiologists visually search mammograms for specific abnormalities. A number of important signs of breast cancer, which radiologists look for, are clusters of microcalcifications, masses, and architectural distortions [2]. A mass is defined as a space-occupying lesion seen in at least two different projections [9]. Although a vast number of algorithms, including segmentation, feature extraction and classification approaches, have been developed in order to diagnose the masses in mammography images, more research is still needed in this area. The current research is directed towards the development of a CADx system for the diagnosis of breast masses in mammography images, focusing on the feature extraction stage. Our proposed features, Zernike moments, have been utilized for extracting the shape and margin properties of masses.
Zernike moments are the mapping of an image onto a set of complex Zernike polynomials [10]. Since Zernike polynomials are orthogonal to each other, Zernike moments can represent the properties of an image with no redundancy or overlap of information between the moments [10]. Because of these important characteristics, Zernike moments have been used widely in different types of applications [11–15]. For instance, they have been utilized in shape-based image retrieval [12], in edge detection [13]
and as a feature set in pattern recognition [14]. However, they are not perfect; the most important drawback of Zernike moments is their high computational complexity, making them unsuitable for real-time applications. Fortunately, many researchers have been developing approaches to the fast computation of Zernike moments [10,16,17].
In this paper, a novel CADx system has been developed for mass diagnosis in mammography images. The aim is to increase the diagnostic performance as well as to reduce the FNR by taking the most effective moments into account and discarding low-performance features.
1.2. Motivation
According to BI-RADS, expert radiologists focus on three distinct types of features, namely shape, margin, and density, while visually searching the mammograms [2,5]. The mass shape can be round, oval, lobulated, or irregular [18]. Irregular shapes have a higher likelihood of malignancy [18,19]. The mass margin can be circumscribed, microlobulated, indistinct, or spiculated [18]. Spiculated and indistinct margins have a higher likelihood of malignancy [18,19]. Density features represent the mass density relative to normal fibroglandular breast tissue [19]. In addition, BI-RADS categorizes the breast density into four groups: high density, isodense, low density, and radiolucent [19]. Masses with higher density usually have a higher likelihood of malignancy. Fig. 1 illustrates various types of mass shapes, margins and densities, as well as their likelihood of malignancy.
These features are exactly the benchmarks to which an expert radiologist pays attention in the diagnosis process. Hence, if one extracts them by an appropriate method, the resulting CADx system may achieve better performance than traditional approaches.
As a matter of fact, Zernike moments impart two prominent characteristics:
(1) Although they are strongly dependent on the scaling and translation of the object, their magnitude is independent of its rotation angle [11]. Hence, one can utilize them to describe the shape characteristics of breast masses and be sure that the rotation of the mass will not affect the diagnosis process.
(2) The Zernike polynomials are orthogonal [10]. Thus, one can extract the Zernike moments from an ROI regardless of the shape of the mass. The extracted moments can describe the mass margin characteristics comprehensively.


In order to extract shape characteristics, the input ROI has been subjected to some preprocessing stages such as segmentation, co-scaling and translation, and the mass shape was obtained in a binary format. Then, a suitable number of Zernike moments were extracted, which can completely describe the mass shape. Additionally, in order to extract margin characteristics, the input ROI has been subjected to other preprocessing stages such as histogram equalization, translation and co-scaling, and the mass margin has been obtained in a gray-scale format. Then, the Zernike moments were extracted as the descriptors of the mass margin.
According to the literature, most breast masses are either high-density or isodense [20]. Therefore, the density feature can be neglected, considering the low likelihood of the presence of low-density and radiolucent masses. In fact, it does not provide enough useful information for the diagnosis process.
Finally, each group of these features, or a combination of them, has been applied to an MLP classifier, which is trained with both BP and OBL learning rules [21]. The results and performances have been analyzed. Moreover, in order to analyze the effect of the orders of Zernike moments on the performance of the overall system, a group of high-order as well as a group of low-order Zernike moments have been separately extracted from the ROIs. Then, the ROC curve and the performance of each group have been analyzed.
1.3. Database

In this paper, the Mammographic Image Analysis Society (MIAS) database has been utilized to provide the mammography images [22]. It contains 322 digital mammograms belonging to the right and left breasts of 161 different women. Each mammogram has been digitized at a resolution of 200 μm per pixel edge, and then stored as a digital image of size 1024 × 1024 pixels. In fact, the digital mammograms are gray-scale images with an 8 bit pixel depth.
These images have been categorized into three different classes from the viewpoint of breast tissue: fatty, fatty-glandular, and dense-glandular. Moreover, from the viewpoint of lesion type, they have been categorized into normal, mass-containing, microcalcification clusters, asymmetry and architectural distortions. The MIAS database includes 209 normal breasts, 67 ROIs with benign lesions and 54 ROIs with malignant lesions [22]. Note that some breasts may have more than one lesion.
In the next section, the proposed approach is discussed in detail. In Section 3 the experiments and results are reported, and finally, the last section concludes the paper.

2. Methodology

Fig. 1. Different types of (a) mass shapes, (b) mass margins, (c) mass densities,
and their likelihood of malignancy according to BI-RADS.

Fig. 2 represents the flowchart of the proposed approach. The input data is an ROI, which is a suspected part of a mammogram and contains only one mass. In this research, each ROI is a square-shaped region with the same number of rows and columns (N × N). However, different ROIs might have different diameters depending on the size of the masses.
The ROIs have been segmented in the left-hand path, as illustrated in Fig. 2. Then, the Normalized Radial Length (NRL) vector and its average value, which is indeed the average radius of the mass, are calculated [6]. By filling the mass, finally there will be a binary image that represents the mass shape. In the right-hand path, the input ROI is subjected to histogram equalization. The outcome is a gray-scale image which contains mass margin information.
There will be the same preprocessing stages on both the right-hand and left-hand paths hereafter. First, both images are subjected to a translation stage which translates the centroid of the mass into the center of the ROI; v is the proposed translation vector. This stage resolves the problem of the dependency of Zernike moments on translation. Furthermore, using the NRL average value (k), the masses should be co-scaled. In fact, the ROIs should be scaled in such a manner that equalizes the radii of all masses. Not to mention the fact that, after the translation and co-scaling procedures, a number of rows and columns might become unusable. Indeed, the new values of those rows and columns might carry no useful information. Hence, in the last step of the preprocessing stage, each ROI has been cropped in order to discard the proposed rows and columns.
Now the input ROIs are preprocessed and ready to be applied to the feature extraction section. Fig. 2 illustrates the feature extraction blocks. The Zernike moments have been extracted from both the binary and gray-scale images, which represent the shape and margin characteristics, respectively. Moreover, in order to analyze the effect of the orders of Zernike moments on the performance of the overall system, a group of high-order Zernike moments and also a group of low-order Zernike moments have been separately extracted from the ROIs. In the feature selection stage, either the descriptors of margin or the descriptors of shape, or a symmetric or an asymmetric combination of them, may be chosen. Finally, the selected features are applied to an MLP classifier which is trained and tested with different groups of input patterns.
In the rest of the paper, each stage is discussed in detail.

Fig. 2. The flowchart of the proposed approach including segmentation, preprocessing, feature extraction, feature selection and classification stages.

2.1. Segmentation

In this paper, manual segmentation has been utilized. In fact, each ROI has been segmented by two different expert radiologists and the final boundary of the mass has been calculated using radial averaging. Fig. 3 depicts the basic idea of radial averaging and its equation is given as follows:

r(\theta) = \frac{\overline{PO}(\theta) + \overline{QO}(\theta)}{2}, \qquad \forall\, 0 < \theta < 2\pi \qquad (1)

where r(θ) is a point on the final mass boundary for a specific angle θ. Moreover, O denotes the mass centroid, and PO and QO are the radial lengths obtained for a specific θ by the first and the second radiologist, respectively.

Fig. 3. Segmentation of masses based on manual segmentation and the radial averaging method.

2.2. Preprocessing

The operations in the preprocessing stage are divided into two branches. The first branch includes those functions which are applied on the gray-scale image to extract the Zernike moments as the descriptors of mass margins. The second branch includes those functions that are applied on the binary image to extract the Zernike moments as the descriptors of mass shapes. The first and second branches are depicted in the right-hand and the left-hand paths in Fig. 2, respectively. Fig. 4 illustrates two breast masses in different steps of the preprocessing stage.

Fig. 4. (a) Input ROI that contains only one mass, (b) manual segmentation of the mass by an expert radiologist, (c) input ROI after histogram equalization, (d) mass boundary after segmentation in binary format and (e) mass shape in binary format.

The input ROI is subjected to histogram equalization at first. This increases the contrast of the ROI, so the mass margins will be more visible. The outcome of this stage is illustrated in Fig. 4c.
Zernike moments are dependent on the translation and scaling of masses in ROIs. In other words, the Zernike moments of two similar objects that are not equally scaled and translated are different. Thus, this shortcoming had better be compensated in the preprocessing stage. Two steps have been employed to resolve the dependency problems. Firstly, the centroid of each mass is translated into the center of the corresponding ROI. This step removes the dependency of Zernike moments on object translation. The centroid of the masses has been calculated by the following equations [23]:

c_0 = \frac{\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} c\, f(c,r)}{\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} f(c,r)} \qquad (2)

r_0 = \frac{\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} r\, f(c,r)}{\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} f(c,r)} \qquad (3)

where c0 and r0 denote the column number and row number of the centroid, respectively. The pair (c, r), which is equivalent to (x, y), denotes the coordinates of the image, while f(c, r) is the image function. The size of the image is N × N.
Then, the translation vector is computed as follows:

\vec{v}_j = \left(\operatorname{round}\!\left(\frac{N}{2}\right) - c_0\right)\hat{c} + \left(\operatorname{round}\!\left(\frac{N}{2}\right) - r_0\right)\hat{r} \qquad (4)

where \vec{v}_j is the translation vector. In addition, considering that the size of the ROI is N × N, N/2 denotes its center. Besides, j is the index of the input ROI. Note that after the translation, the centroid of the mass coincides with the center of the ROI.
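For illustration, a minimal NumPy sketch of the centroid and translation computations of Eqs. (2)-(4) is given below. It is not the authors' original code; the np.roll-based shift is only one convenient way to realize the translation, and the subsequent cropping described in the text is omitted.

```python
import numpy as np

def centroid(f):
    """Mass centroid (c0, r0) of an N x N image f, following Eqs. (2)-(3)."""
    N = f.shape[0]
    cols, rows = np.meshgrid(np.arange(N), np.arange(N))  # cols[r, c] = c, rows[r, c] = r
    total = f.sum()
    return (cols * f).sum() / total, (rows * f).sum() / total

def translate_to_center(f):
    """Shift the mass so that its centroid coincides with the ROI center, as in Eq. (4)."""
    N = f.shape[0]
    c0, r0 = centroid(f)
    dc = round(N / 2) - round(c0)   # column component of the translation vector
    dr = round(N / 2) - round(r0)   # row component of the translation vector
    return np.roll(np.roll(f, dr, axis=0), dc, axis=1), (dc, dr)
```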
In the second step, the masses should be co-scaled. In fact, the ROIs should be scaled in such a manner that equalizes the radii of all masses. Initially, the average radius of each mass is found using the NRL vector, which has been employed as a descriptor of the mass margin in many applications [24–26]. Fig. 5 illustrates how to extract this vector. Each member of d is the Euclidean distance of the mass margin from its centroid for angle θ. Vector d can be calculated using the following relation [24]:

d_j(\theta) = \sqrt{\left(c_\theta^{\,j} - c_0^{\,j}\right)^2 + \left(r_\theta^{\,j} - r_0^{\,j}\right)^2}, \qquad \forall\, 0 \le \theta < 2\pi \qquad (5)

where (c0, r0) denotes the coordinates of the centroid of the mass, which has already been made to coincide with the center of the ROI. Besides, (cθ, rθ) denotes the coordinates of the mass margin for angle θ. Note that θ does not change continuously and has M discrete values. Furthermore, j is the index of the input ROI. Vector d is depicted in Fig. 5b. Eventually, the following equation is used to calculate the scaling coefficient:

k_j = \frac{R}{(1/M)\sum_{i=1}^{M} d_j(i)} \qquad (6)

where k_j denotes the suitable scaling coefficient for the jth ROI and M is the length of the NRL vector. Moreover, R is the desired final radius of the mass, which is set to 50 pixels in this research. Now, each mass can be scaled properly using the suitable k_j.
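A short sketch of the NRL vector and scaling coefficient of Eqs. (5)-(6) follows. The boundary samples and the target radius R = 50 are taken from the text; the actual resampling of the ROI (e.g. with scipy.ndimage.zoom) is not shown and the array names are illustrative.

```python
import numpy as np

def nrl_and_scale(boundary_c, boundary_r, c0, r0, R=50):
    """NRL vector d (Eq. (5)) and scaling coefficient k (Eq. (6)).

    boundary_c, boundary_r: column/row coordinates of M boundary points sampled over theta.
    (c0, r0): mass centroid, already moved to the ROI center.  R: desired final radius in pixels.
    """
    d = np.sqrt((np.asarray(boundary_c) - c0) ** 2 + (np.asarray(boundary_r) - r0) ** 2)  # Eq. (5)
    k = R / d.mean()                                                                       # Eq. (6)
    return d, k
```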
Fig. 5. (a) Mass boundary in a binary image and (b) the NRL vector of the mass and its mean value.

After the translation and co-scaling procedures, some rows and columns might become unusable. In fact, the new values of those rows and columns might carry no useful information. Therefore, in the last step of the preprocessing stage, each ROI has been cropped to discard the proposed rows and columns. In the crop procedure, the same number of rows is discarded from the top and the bottom of the ROI; likewise, the same number of columns is discarded from the left side and the right side of the ROI. This keeps the centroid of the mass in the center of the ROI after the crop procedure. The numbers of discarded rows and columns are selected in such a manner that preserves the essential information of the ROI and keeps the size of the final image (N) an even number.
The outputs of the preprocessing stage are two images: one of them is a binary image, which contains the shape characteristics, and the other is a gray-scale image, which contains the margin characteristics. In the rest of the paper, we call the former the Mass Shape Image (MSI) and the latter the Mass Margin Image (MMI).
2.3. Feature extraction and selection

The computation of Zernike moments from an input image includes three steps: computation of the radial polynomials, computation of the Zernike basis functions, and computation of the Zernike moments by projecting the image onto the Zernike basis functions [10–13].
The procedure for obtaining Zernike moments from an input image starts with the computation of the Zernike radial polynomials [11]. The real-valued 1-D radial polynomial R_{n,m} is defined as

R_{n,m}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^{s}\, \frac{(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\; \rho^{\,n-2s} \qquad (7)

where n is a non-negative integer representing the order of the radial polynomial; m is a positive or negative integer, satisfying the constraints n − |m| even and |m| ≤ n, which represents the repetition of the azimuthal angle; and ρ is the length of the vector from the origin to the pixel (x, y) [10,27].
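The radial polynomial of Eq. (7) translates directly into code. The sketch below is a straightforward (not fast) implementation, accepting either a scalar or a NumPy array for ρ; it is offered only as an illustration of the formula, not as the authors' implementation.

```python
from math import factorial

def radial_poly(n, m, rho):
    """Real-valued Zernike radial polynomial R_{n,m}(rho) of Eq. (7); requires |m| <= n and n - |m| even."""
    m = abs(m)
    value = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = (-1) ** s * factorial(n - s) / (
            factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        value = value + coeff * rho ** (n - 2 * s)
    return value
```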
Using (7), the complex-valued 2-D Zernike basis functions, which are defined within a unit circle, are formed by

V_{n,m}(\rho, \theta) = R_{n,m}(\rho)\, e^{jm\theta}, \qquad |\rho| \le 1 \qquad (8)

The complex Zernike polynomials satisfy the orthogonality condition [10–13]

\int_0^{2\pi}\!\!\int_0^1 V^{*}_{n,m}(\rho,\theta)\, V_{p,q}(\rho,\theta)\, \rho\, \mathrm{d}\rho\, \mathrm{d}\theta = \begin{cases} \dfrac{\pi}{n+1}, & n = p,\ m = q \\ 0, & \text{otherwise} \end{cases} \qquad (9)

where * denotes the complex conjugate. As mentioned before, the orthogonality implies no redundancy or overlap of information between the moments with different orders and repetitions. This property enables the contribution of each moment to be unique and independent of the information in an image [10]. In fact, they can be fair features.
Complex Zernike moments of order n with repetition m are defined as [12]

Z_{n,m} = \frac{n+1}{\pi}\int_0^{2\pi}\!\!\int_0^1 f(\rho,\theta)\, V^{*}_{n,m}(\rho,\theta)\, \rho\, \mathrm{d}\rho\, \mathrm{d}\theta \qquad (10)

where f(c, r) is the image function. For digital images, the integrals in (10) can be replaced by summations. Moreover, the coordinates of the image must be normalized into [0,1] by a mapping transform. Fig. 6 depicts a general case of the mapping transform. Note that in this case, the pixels located outside the circle are not involved in the computation of the Zernike moments. Eventually, the discrete form of the Zernike moments for an image of size N × N is expressed as follows [27]:

Z_{n,m} = \frac{n+1}{\lambda_N}\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} f(c,r)\, V^{*}_{n,m}(c,r) = \frac{n+1}{\lambda_N}\sum_{c=0}^{N-1}\sum_{r=0}^{N-1} f(c,r)\, R_{n,m}(\rho_{cr})\, e^{-jm\theta_{cr}} \qquad (11)

where 0 ≤ ρ_cr ≤ 1 and λ_N is a normalization factor. In the discrete implementation of Zernike moments, the normalization factor must be the number of pixels located in the unit circle by the mapping transform, which corresponds to the area of the unit circle, π, in the continuous domain [10]. The transformed distance, ρ_cr, and the phase, θ_cr, at pixel (c, r) are given by the following equations. Note that c and r denote the column and row number of the image, respectively:

\rho_{cr} = \frac{\sqrt{(2c-N+1)^2 + (2r-N+1)^2}}{N}, \qquad \theta_{cr} = \tan^{-1}\!\left(\frac{N-1-2r}{2c-N+1}\right) \qquad (12)

Using (11) and (12), a final equation can be obtained which is only a function of c, r, m and n. Using this equation, we can simply calculate Zernike moments for every image without having difficulty with mapping functions.
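Combining Eqs. (11) and (12) as described above yields a direct, if slow, implementation. The sketch below reuses the hypothetical radial_poly function from the previous sketch and uses np.arctan2 for the phase; it is a minimal illustration rather than the authors' code.

```python
import numpy as np

def zernike_moment(f, n, m):
    """Discrete Zernike moment Z_{n,m} of an N x N image f, following Eqs. (11)-(12)."""
    N = f.shape[0]
    c, r = np.meshgrid(np.arange(N), np.arange(N))                    # column / row indices
    rho = np.sqrt((2 * c - N + 1) ** 2 + (2 * r - N + 1) ** 2) / N     # Eq. (12)
    theta = np.arctan2(N - 1 - 2 * r, 2 * c - N + 1)                   # Eq. (12), via arctan2
    inside = rho <= 1.0                                                # pixels mapped into the unit circle
    lam = inside.sum()                                                 # normalization factor lambda_N
    basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)      # conjugate basis V*_{n,m}(c, r)
    return (n + 1) / lam * np.sum(f[inside] * basis_conj[inside])

# The feature actually used in this paper is the rotation-invariant magnitude, e.g. abs(zernike_moment(msi, n, m)).
```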
As a matter of fact, if the size of the ROI is odd, it will have a center pixel with the following coordinates:

(c, r) = \left(\frac{N-1}{2},\ \frac{N-1}{2}\right) \qquad (13)

Combining (12) and (13) yields the following result:

\rho_{cr} = 0, \qquad \theta_{cr} = \tan^{-1}\!\left(\frac{0}{0}\right) = \mathrm{NaN} \qquad (14)

In fact, the phase value will be undefined. The best way to resolve the problem is to select the size of the image to be an even number. In this way the image will not have a single center pixel and the problem is resolved without any redundancy.
It can be shown that rotating the object around the Z-axis does not influence the magnitude response of the Zernike moments [11,28]. Indeed, the rotation only influences the phase response of the Zernike moments. In other words,

Z^{r}_{n,m} = Z_{n,m}\, e^{-jm\alpha}, \qquad |Z^{r}_{n,m}| = |Z_{n,m}| \qquad (15)

where Z_{n,m} and Z^{r}_{n,m} are the moments extracted from the object and the rotated object, respectively, and α is the rotation angle. The magnitude of the Zernike moment of the object and that of the rotated object are equal. Thus, our proposed features are the magnitudes of the Zernike moments, which are proper descriptors of shape characteristics. Fig. 7 illustrates the magnitude plots of some low-order Zernike moments in the unit disk.
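As a quick numerical illustration of Eq. (15), and assuming the zernike_moment sketch above, rotating a toy mass image changes the phase of Z_{n,m} but leaves its magnitude essentially unchanged, up to interpolation error:

```python
import numpy as np
from scipy.ndimage import rotate

img = np.zeros((100, 100))
img[30:70, 40:60] = 1.0                                   # a toy, centred rectangular "mass"
rotated = rotate(img, angle=30, reshape=False, order=1)   # rotate about the image centre

z0 = zernike_moment(img, 4, 2)
z1 = zernike_moment(rotated, 4, 2)
print(abs(z0), abs(z1))   # magnitudes nearly equal; the phases differ by roughly m * alpha
```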
Not to mention the fact that high-order Zernike moments not only have a high computational complexity [10,16,17], but also exhibit a high sensitivity to noise [29]. In fact, they may diminish the performance of the system if they are not selected precisely. However, they might be better descriptors of shape and margin characteristics than low-order Zernike moments. Therefore, in order to analyze the effect of the orders of Zernike moments on the performance of the overall system, a group of high-order Zernike moments and also a group of low-order Zernike moments have been extracted from the MSI and MMI images, separately. These two groups of Zernike moments are tabulated in Table 1. The first group includes 32 low-order moments which satisfy the following conditions [27]:

\text{Group 1} = \{Z_{n,m}\} \quad \forall \begin{cases} 3 \le n \le 10 \\ |m| \le n \\ n - |m| = 2k \\ k \in \mathbb{N} \end{cases} \qquad (16)

Moreover, the second group includes 32 high-order moments which satisfy the following conditions:

\text{Group 2} = \{Z_{n,m}\} \quad \forall \begin{cases} 10 \le n \le 17 \\ |m| \le n \\ n - |m| = 4k \\ k \in \mathbb{N} \end{cases} \qquad (17)
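As a sketch, the two groups can be enumerated as below. Only non-negative repetitions are listed, since |Z_{n,-m}| = |Z_{n,m}|, and n − m = 0 (i.e. m = n) is included, which is what makes each group contain exactly 32 moments as in Table 1.

```python
# Group 1 (Eq. (16)): low orders 3..10 with n - m even.
group1 = [(n, m) for n in range(3, 11) for m in range(n + 1) if (n - m) % 2 == 0]
# Group 2 (Eq. (17)): high orders 10..17 with n - m a multiple of 4.
group2 = [(n, m) for n in range(10, 18) for m in range(n + 1) if (n - m) % 4 == 0]
assert len(group1) == len(group2) == 32
```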
Hence, in the feature selection stage four different groups of features can be selected, which are:

(1) The magnitude of the low-order Zernike moments, which are extracted from MSI pictures (LMSI).
(2) The magnitude of the low-order Zernike moments, which are extracted from MMI pictures (LMMI).
(3) The magnitude of the high-order Zernike moments, which are extracted from MSI pictures (HMSI).
(4) The magnitude of the high-order Zernike moments, which are extracted from MMI pictures (HMMI).

Fig. 6. (a) N × N digital image with function f(c, r) and (b) the image mapped onto the unit circle.

Fig. 7. Plots of the magnitude of low-order Zernike basis functions in the unit disk.

Table 1
The proposed Zernike moments.

Group 1 (low-order)                 Group 2 (high-order)
Order (n) | Repetitions (m)         Order (n) | Repetitions (m)
3         | 1, 3                    10        | 2, 6, 10
4         | 0, 2, 4                 11        | 3, 7, 11
5         | 1, 3, 5                 12        | 0, 4, 8, 12
6         | 0, 2, 4, 6              13        | 1, 5, 9, 13
7         | 1, 3, 5, 7              14        | 2, 6, 10, 14
8         | 0, 2, 4, 6, 8           15        | 3, 7, 11, 15
9         | 1, 3, 5, 7, 9           16        | 0, 4, 8, 12, 16
10        | 0, 2, 4, 6, 8, 10       17        | 1, 5, 9, 13, 17
Number of moments: 32               Number of moments: 32
Each group of the introduced features, or a symmetric or an asymmetric combination of them, can be chosen and applied to the classification stage. Considering the performance of the overall system, the best group of features will be found. We are interested in knowing how the orders and iterations of the Zernike moments influence the overall CADx system's performance. Thus, the total number of moments applied to the classifier has always been kept constant, namely 32 in this research, while their orders and iterations have been changed; this reveals the behavior of the system as a function of the orders and iterations of the moments.
2.4. Classification

Artificial Neural Networks (ANNs) have been used widely as good classifiers in the medical diagnosis of mammography images. For instance, they have been utilized in the diagnosis of microcalcifications [30], detection of microcalcification clusters [31,32], diagnosis of masses [33], and detection of suspicious masses [34]. According to the literature, the MLP has always been a good classifier in mammography image processing applications [30,31]. Therefore, in this research, an MLP classifier has been employed.
OBL can improve the convergence rate of machine intelligence algorithms [21,35]. It has been used in some evolutionary algorithms and in Reinforcement Learning (RL), yielding interesting results [36]. Moreover, it has been utilized in neural network learning rules such as resilient back propagation with opposite weights [21] and back propagation with opposite activation functions [37]. In this research, the opposition-based idea is utilized in order to improve the convergence rate of the generic back propagation learning rule as well as to speed up the training [35].
Generally speaking, when a dataset with a large number of instances is available, the input patterns are usually partitioned into three equal-sized groups called the training, validation and testing sets. In fact, each group gets approximately 33% of the total input patterns [33]. However, in this research, due to the limited number of instances available in the MIAS database, 40% of the input patterns are dedicated to the training set, 30% to the validation set, and the remaining 30% to the testing set.
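A minimal sketch of this 40/30/30 split is given below; the array names and the random seed are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
idx = rng.permutation(n_patterns)       # n_patterns: total number of ROIs with extracted features (assumed)
n_train = int(0.4 * n_patterns)
n_val = int(0.3 * n_patterns)
train_idx = idx[:n_train]               # 40% training
val_idx = idx[n_train:n_train + n_val]  # 30% validation
test_idx = idx[n_train + n_val:]        # remaining 30% testing
```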
The proposed MLP classifier has been trained using the training set so that suitable weights are found. Moreover, the number of hidden layers and their nodes are changed until the best network topology, which yields the best accuracy as well as a small FNR, is found. Not to mention the fact that, in order to find the best topology and avoid over-training, the accuracy on the validation set has been evaluated every 1000 training epochs.
After finding the best topology and the appropriate number of training epochs with the aid of the validation set, the testing patterns, which are unseen by the trained classifier, are applied to the classifier and the performance has been evaluated. Note that the Accuracy measure reported in the experiments and results section represents the ratio of correctly classified patterns to the total number of patterns in the testing set (i.e. unseen data).
The input layer has 32 nodes, which equals the number of utilized features. The sigmoid function has been utilized as the activation function of all internal nodes. Moreover, the output node uses a linear activation function. This increases the dynamic range of the output node, so the ROC curve can be computed more precisely [38].
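The paper's own BP and OBL implementations are not reproduced here. As a rough stand-in, an MLP with sigmoid hidden units and a linear output node can be sketched with scikit-learn; the (16, 16) topology is only a placeholder (the paper searches over 1-3 hidden layers), and X_train, y_train, X_val are assumed to hold the selected 32 moment magnitudes and the benign/malignant labels.

```python
from sklearn.neural_network import MLPRegressor

mlp = MLPRegressor(hidden_layer_sizes=(16, 16),  # placeholder topology
                   activation="logistic",        # sigmoid hidden nodes
                   solver="sgd",                 # plain gradient-descent back propagation
                   learning_rate_init=0.01,
                   max_iter=10000,
                   random_state=0)
mlp.fit(X_train, y_train)          # y: 0 = benign, 1 = malignant (assumed labels)
scores = mlp.predict(X_val)        # linear outputs, later thresholded for the ROC analysis
```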

3. Experiments and results


The following terms are benchmark functions of a CADx system:

\mathrm{Accuracy} = \frac{TP + TN}{TN + FN + TP + FP} \qquad (18)

\mathrm{FNR} = \frac{FN}{FN + TP} \qquad (19)

\mathrm{Sensitivity} = \mathrm{TPR} = \frac{TP}{TP + FN} \qquad (20)

\mathrm{Specificity} = 1 - \mathrm{FPR} = \frac{TN}{TN + FP} \qquad (21)

where FP, FN, TP and TN denote false-positive, false-negative, true-positive and true-negative answers, respectively. Moreover, FPR and TPR denote the false-positive rate and true-positive rate, respectively. The most common way of reporting the accuracy of a binary prediction is using the true (or false) positives and true (or false) negatives separately [38]. This recognizes that a false negative prediction may have different consequences from a false positive one. In this research, a false-positive answer is less important than a false-negative answer, due to the fact that a false-negative answer may lead to the death of a patient. On the other hand, a false-positive answer leads to an unnecessary biopsy, which brings discomfort for women. A plot of Sensitivity (TPR) vs. 1 − Specificity (FPR) obtained by changing the decision threshold is called the Receiver Operating Characteristic (ROC) [39]. The area under the ROC curve, Az, shows how successful the classification is [39].
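For reference, the benchmark measures of Eqs. (18)-(21) and the area under the ROC curve can be computed as follows; the Az estimate is a simple threshold-sweep approximation, not necessarily the authors' exact procedure.

```python
import numpy as np

def cadx_metrics(tp, tn, fp, fn):
    """Accuracy, FNR, Sensitivity and Specificity of Eqs. (18)-(21)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fnr = fn / (fn + tp)
    sensitivity = tp / (tp + fn)      # TPR
    specificity = tn / (tn + fp)      # 1 - FPR
    return accuracy, fnr, sensitivity, specificity

def area_under_roc(scores, labels):
    """Az: sweep the decision threshold over the classifier scores and integrate TPR over FPR."""
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]            # 1 = malignant (positive), 0 = benign
    tpr = np.concatenate(([0.0], np.cumsum(labels) / labels.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - labels) / (1 - labels).sum()))
    return np.trapz(tpr, fpr)
```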
In this research, each of the HMMI, HMSI, LMMI, and LMSI features, or a combination of them, has been applied to MLP classifiers with different structures and different learning rules, and the results have been reported. Eventually, those features and classifiers which yield the best performance and the smallest FNR have been chosen as the final system.

Table 2
Diagnosis performance for HMSI features.

Name | No. of hidden layers | FPR (%) | FNR (%) | Accuracy (%) | Az
N1   | 1                    | 11.11   | 40      | 78.5         | 0.755
N2   | 2                    | 16.66   | 10      | 85.7         | 0.94
N3   | 3                    | 11.13   | 0       | 92.8         | 0.975

Fig. 8. ROC plot for N1, N2, and N3. The dashed line illustrates the ROC plot of a random decision-making system.
The HMSI features are the high-order Zernike moments which are extracted from the MSI images and contain shape characteristics. The utilized MLP has 32 input nodes and one output node. The number of hidden layers and their nodes are changed until the best structure is found. The linear function is used as the activation function of the output node, which increases the dynamic range of that node. In order to evaluate the performance of the HMSI features, we have employed MLPs with 1, 2, and 3 hidden layers. Table 2 shows those structures which yield the best diagnosis performances. Although MLPs with 4 hidden layers need more training time, they do not yield the desired performances, so their results are omitted for brevity. In order to find the FPR, FNR and accuracy in the table, the decision threshold is selected in such a manner as to yield the smallest number of wrong answers. However, the FNR, FPR and accuracy parameters are not sufficient measures for a CADx system. Therefore, the ROC plots of the systems which are tabulated in Table 2 are illustrated in Fig. 8.
The process explained in the previous paragraph has also been applied to the other groups of features, namely HMMI, LMSI, LMMI, and their combinations. The results are reported in Table 3. The first column is the name of each system, while the second column lists the features utilized in each of them. The third column shows the name of the feature group employed in each system. The 4th and 5th columns report the utilized learning rule and the number of training epochs of each system. Column 6 shows the employed orders in each system. Column 7 represents the number of hidden layers. Eventually, the last four columns of the table report the performances of the developed systems.
The first four rows of the table are those systems which utilize the high-order Zernike moments and the BP learning rule. It is shown that S1, which uses the HMSI features, represents an acceptable accuracy, a fair Az and an FNR which is almost equal to zero. On the other hand, M1, which uses the HMMI features, has not yielded a fair accuracy and its FNR is significantly higher than that of S1. However, a symmetric combination of these two groups of features leads to the SMH1 system, which represents a smaller FNR than M1 while its FPR is almost unchanged. With an increasing number of HMSI features in the complex system, the FNR decreases while the FPR stays almost constant. It can be inferred that the Zernike moments which are extracted from the margin information are not suitable descriptors of the malignancy of masses. Indeed, the best performance belongs to the system that utilizes the shape descriptors. The ROC plots of the explained systems are illustrated in Fig. 9. It is obvious that S1 has the best Az.
The middle four rows in Table 3 belong to those systems which use the low-order Zernike moments and the BP learning rule. These systems show the same behavior as those described in the previous paragraph.
Table 3
Final results.

Name | Selected features     | Feature group | Learning rule | No. of training epochs | Order of moments (n) | No. of hidden layers | FPR (%) | FNR (%) | Accuracy (%) | Az
S1   | 32 shape              | HMSI          | BP            | 10,379                  | 10–17                | 3                    | 11.13   | 0.0     | 92.8         | 0.975
M1   | 32 margin             | HMMI          | BP            | 12,534                  | 10–17                | 3                    | 22.2    | 51.1    | 67.8         | 0.547
SMH1 | 16 shape + 16 margin  | Complex       | BP            | 11,687                  | 10–17                | 1                    | 27.7    | 39.8    | 67.85        | 0.595
SMH2 | 20 shape + 12 margin  | Complex       | BP            | 9851                    | 10–17                | 2                    | 22.2    | 20.2    | 78.57        | 0.816
S2   | 32 shape              | LMSI          | BP            | 9173                    | 3–10                 | 2                    | 5.5     | 0.0     | 96.43        | 0.976
M2   | 32 margin             | LMMI          | BP            | 13,647                  | 3–10                 | 2                    | 27.78   | 39.8    | 67.86        | 0.531
SML1 | 16 shape + 16 margin  | Complex       | BP            | 12,118                  | 3–10                 | 1                    | 16.67   | 50.3    | 71.43        | 0.642
SML2 | 20 shape + 12 margin  | Complex       | BP            | 10,226                  | 3–10                 | 1                    | 27.78   | 29.9    | 71.43        | 0.795
SO1  | 32 shape              | HMSI          | OBL           | 4945                    | 10–17                | 2                    | 16.66   | 9.9     | 85.71        | 0.872
MO1  | 32 margin             | HMMI          | OBL           | 6127                    | 10–17                | 2                    | 16.66   | 49.9    | 71.42        | 0.692
SO2  | 32 shape              | LMSI          | OBL           | 5326                    | 3–10                 | 2                    | 11.11   | 9.8     | 89.28        | 0.88
MO2  | 32 margin             | LMMI          | OBL           | 5792                    | 3–10                 | 2                    | 38.8    | 20.1    | 67.86        | 0.588


The ROC plots of these systems are illustrated in Fig. 10. It is obvious that S2, which utilizes the shape descriptors, has the best ROC curve.
Note that those systems which use the low-order Zernike moments need a smaller number of hidden layers. In other words, they have a smaller computational complexity in the feature extraction and classification stages.
The last four rows in Table 3 belong to those systems which use the OBL learning rule in their classification stage. Although the FNR, FPR, Accuracy and Az of the systems which utilize the OBL algorithm are somewhat inferior to those employing the BP learning rule, the number of training epochs for the OBL algorithm is significantly smaller. Therefore, utilizing the OBL learning rule in the classification stage increases the convergence rate and decreases the training time. The ROC plots of these systems are illustrated in Fig. 11.
It can be inferred that S1 and S2 represent the best performances among the systems which employed high-order and low-order Zernike moments, respectively. Fig. 12 compares the ROC curves of these two systems. It can be seen that they have almost the same Az. Moreover, S1 represents a higher specificity while S2 represents a higher sensitivity. Thus, considering the proposed clinical application, either of them can be used. However, S2 not only has a smaller computational complexity than S1 but also represents a somewhat better performance.

Fig. 9. ROC plot for S1, M1, SMH1, and SMH2. All of them utilize high-order Zernike moments.

Fig. 10. ROC plot for S2, M2, SML1, and SML2. All of them utilize low-order Zernike moments.

Fig. 11. ROC plot for SO1, SO2, MO1, and MO2. All of them employ the OBL learning rule.

Fig. 12. ROC plot of S1 vs. S2; S2 yields a higher sensitivity than S1 while S1 yields a higher specificity.
In addition, a statistical analysis has been performed to evaluate the level of uncertainty of the proposed system. In fact, the experiments have been conducted using 10-fold cross-validation [40]. To perform the test, the data is first partitioned into 10 equally sized segments or folds. Then, 10 iterations of training and validation are performed; a different fold of the data is held out in each iteration for validation while the remaining 9 folds are used for training [40]. Table 4 presents the average values of the performance measures over the 10 folds.

Table 4
Performance measures using 10-fold cross-validation.

System                     | FPR (%) | FNR (%) | Accuracy (%)
Average on 10 folds for S1 | 10.55   | 5       | 91.4
Average on 10 folds for S2 | 8.2     | 3.2     | 93.6
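A sketch of this 10-fold protocol is given below, using scikit-learn's StratifiedKFold and the hypothetical MLP stand-in from Section 2.4; X and y are the assumed feature matrix and label vector, and the metric shown is only the per-fold accuracy.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
fold_accuracy = []
for train_idx, test_idx in skf.split(X, y):
    mlp.fit(X[train_idx], y[train_idx])                       # nine folds for training
    predicted = (mlp.predict(X[test_idx]) > 0.5).astype(int)  # one held-out fold for validation
    fold_accuracy.append((predicted == y[test_idx]).mean())
print(np.mean(fold_accuracy))                                 # average performance over the 10 folds
```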
Finally, a brief comparison between our proposed approach and other recently developed CADx algorithms is presented in Table 5. We have tried to evaluate those systems that utilize geometric features as the descriptors of mass shape and margin. Not to mention the fact that CADx systems that only employ such features are noticeably rare, as geometric features are usually used alongside other features such as density and texture. Moreover, a precise comparison of the systems is a difficult task, since different researchers have employed different mammography databases. Besides, most performance benchmarks are not reported.
According to Table 5, S2, which employs 32 low-order Zernike moments as shape descriptors, represents a fair Az which is comparable with the other reported systems. In addition, its accuracy is noticeably higher than that of the systems which utilized the MIAS database. As mentioned before, one of our objectives was reducing the FNR. Unfortunately, most researchers have not reported the FNR of their CADx systems. However, our achieved FNR is almost equal to zero and is better than that of [25].
The second row of the table shows the results of S2 based on the 10-fold cross-validation approach. Although the performance measures are slightly lower than those in the first row of the table, they are statistically validated and are more certain. Fortunately, these results are also comparable with the results of the other reported systems.

Table 5
The proposed CADx system in comparison with other CADx systems.

Reference | Year | Feature extraction technique | Database | FPR (%) | FNR (%) | Accuracy (%) | Az
This research (S2) | 2011 | Zernike moments as shape descriptors | MIAS | 5.5 | 0.0 | 96.43 | 0.976
This research (S2, 10-fold cross-validation) | 2011 | Zernike moments as shape descriptors | MIAS | 8.2 | 3.2 | 93.6 | -
Tahmasbi et al. [25] | 2010 | FTRD features | MIAS | 5.56 | 9.9 | 92.8 | 0.98
Boujelben et al. [41] | 2009 | Boundary vector of three derives of the radial distance measure, convexity and index angle | DDSM | - | - | 95.98 | -
Rojas et al. [42] | 2009 | Spiculation measure based on relative gradient orientation, spiculation measure based on signature information, and two features which measure the fuzziness of mass margins | DDSM, MIAS | - | - | 81.0 | -
Mu et al. [43] | 2008 | Fourier factor, spiculation index, fractal dimension, and compactness | MIAS | - | - | - | 0.92
Rangayyan et al. [44] | 2007 | Fractal dimension, fractional concavity | MIAS | - | - | - | 0.82
Rangayyan et al. [45] | 2006 | Index of convexity from the turning angle function | MIAS | - | - | - | 0.93
Rangayyan et al. [46] | 2000 | Spiculation index, compactness and fractional concavity | MIAS | - | - | 81.5 | 0.79

4. Conclusion

In this paper, a novel CADx system has been introduced for the diagnosis of breast masses. The Zernike moments are utilized as the descriptors of the shape and margin characteristics of masses. The input ROIs are segmented and preprocessed; finally, two images which contain the shape and margin characteristics of the masses are applied to the feature extraction stage. Two groups of features, each containing 32 Zernike moments with different orders, are extracted from the images. Considering the performance of the overall system, the best 32 moments are selected and applied to a Multi-Layer Perceptron classifier which is trained with both BP and OBL learning rules. The two final systems which employ the shape descriptors yield Az values equal to 0.975 and 0.976, and also have a good specificity and sensitivity, respectively. The FNR is almost equal to zero in both of these systems. Besides, the best achieved FPR is 5.5%. Although the FNR, FPR, Accuracy and Az of the systems which utilize the OBL algorithm are somewhat inferior to those employing the BP learning rule, the number of training epochs for the OBL algorithm is significantly smaller. Therefore, utilizing the OBL learning rule in the classification stage increases the convergence rate and decreases the training time.
It is worth mentioning that the feature selection stage utilized in this research is manual. Researchers are advised to develop this stage into an autonomous algorithm which takes the FNR and FPR parameters and then finds the best combination of features automatically.

Conflict of interest statement


None declared.

References

[1] American Cancer Society, Breast Cancer Facts & Figures 2009–2010, Atlanta, 2009.
[2] A.C. Bovik, Handbook of Image and Video Processing, 2nd ed., Elsevier Academic Press, 2005, pp. 1195–1217.
[3] T.Z. Tan, C. Quek, G.S. Ng, E.Y.K. Ng, A novel cognitive interpretation of breast cancer thermography with complementary learning fuzzy neural memory structure, Expert Systems with Applications 33 (2007) 652–666.
[4] Q. Fang, S.A. Carp, J. Selb, G. Boverman, Q. Zhang, D.B. Kopans, R.H. Moore, E.L. Miller, D.H. Brooks, D.A. Boas, Combined optical imaging and mammography of the healthy breast: optical contrast derived from breast structure and compression, IEEE Transactions on Medical Imaging 1 (28) (2009) 30–42.
[5] R.M. Rangayyan, F.J. Ayres, J.E.L. Desautels, A review of computer-aided diagnosis of breast cancer: toward the detection of subtle signs, Journal of the Franklin Institute 344 (2007) 312–348.
[6] H.D. Cheng, X.J. Shi, R. Min, L.M. Hu, X.P. Cai, H.N. Du, Approaches for automated detection and classification of masses in mammograms, Pattern Recognition 39 (2006) 646–668.
[7] N. Szekely, N. Toth, B. Pataki, A hybrid system for detecting masses in mammographic images, IEEE Transactions on Instrumentation and Measurement 3 (55) (2006) 944–952.
[8] X. Zhang, X. Gao, Y. Wang, MCs detection with combined image features and twin support vector machines, Journal of Computers 3 (4) (2009) 215–221.
[9] American College of Radiology, ACR BI-RADS: Mammography, Ultrasound & Magnetic Resonance Imaging, 4th ed., American College of Radiology, Reston, VA, 2003.
[10] S.K. Hwang, W.Y. Kim, A novel approach to the fast computation of Zernike moments, Pattern Recognition 39 (2006) 2065–2076.
[11] W. Wang, J.E. Mottershead, C. Mares, Mode-shape recognition and finite element model updating using the Zernike moment descriptor, Mechanical Systems and Signal Processing 23 (2009) 2088–2112.
[12] Sh. Li, M.Ch. Lee, Ch.M. Pun, Complex Zernike moments features for shape-based image retrieval, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 1 (39) (2009) 227–237.
[13] X. Li, A. Song, A new edge detection method using Gaussian–Zernike moment operator, in: Proceedings of the IEEE 2nd International Asia Conference on Informatics in Control, Automation and Robotics, 2010, pp. 276–279.
[14] J. Haddadnia, M. Ahmadi, K. Faez, An efficient feature extraction method with pseudo-Zernike moment in RBF neural network-based human face recognition system, Journal of Applied Signal Processing 9 (2003) 890–901.
[15] Z. Chen, Sh.K. Sun, A Zernike moment phase-based descriptor for local image representation and matching, IEEE Transactions on Image Processing 1 (19) (2010) 205–219.
[16] Ch.Y. Wee, R. Paramesran, F. Takeda, Fast computation of Zernike moments for rice sorting system, in: Proceedings of the IEEE International Conference on Image Processing (ICIP), 2007, pp. VI-165–VI-168.
[17] B. Fu, J. Liu, X. Fan, Y. Quan, A hybrid algorithm of fast and accurate computing Zernike moments, in: Proceedings of the IEEE International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2007, pp. 268–272.
[18] T. Rui-ying, G. Wen, M. Li-hua, L. Ben-yao, X. Guang-wei, BI-RADS categorization and positive predictive value of mammographic features, Chinese Journal of Cancer Research 3 (13) (2001) 202–205.
[19] C. Balleyguier, et al., BIRADS classification in mammography, European Journal of Radiology 61 (2007) 192–194.
[20] N.F. Boyd, et al., Heritability of mammographic density, a risk factor for breast cancer, New England Journal of Medicine 12 (347) (2002) 886–894.
[21] H.R. Tizhoosh, Opposition-based learning: a new scheme for machine intelligence, in: Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation (CIMCA'05), Vienna, Austria, 2005.
[22] J. Suckling, et al., The Mammographic Image Analysis Society digital mammogram database, Excerpta Medica, International Congress Series 1069 (1994) 375–378.
[23] W.K. Pratt, Digital Image Processing: PIKS Inside, 3rd ed., John Wiley & Sons, 2001 (Chapter 18).
[24] L.M. Bruce, R.R. Adhami, Classifying mammographic mass shapes using the wavelet transform modulus-maxima method, IEEE Transactions on Medical Imaging 12 (18) (1999) 1170–1177.
[25] A. Tahmasbi, F. Saki, S.B. Shokouhi, Mass diagnosis in mammography images using novel FTRD features, in: Proceedings of the IEEE 17th Iranian Conference on Biomedical Engineering (ICBME2010), Isfahan, Iran, 2010, pp. 1–5.
[26] A. Tahmasbi, F. Saki, S.M. Seyedzadeh, S.B. Shokouhi, Classification of breast masses based on cognitive resonance, in: Proceedings of the IEEE 3rd International Conference on Signal Acquisition and Processing (ICSAP2011), Singapore, 2011, pp. V1-97–V1-101.
[27] A. Tahmasbi, F. Saki, S.B. Shokouhi, An effective breast mass diagnosis system using Zernike moments, in: Proceedings of the IEEE 17th Iranian Conference on Biomedical Engineering (ICBME2010), Isfahan, Iran, 2010, pp. 1–4.
[28] A. Khotanzad, Y.H. Hong, Invariant image recognition by Zernike moments, IEEE Transactions on Pattern Analysis and Machine Intelligence 5 (12) (1990) 489–497.
[29] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Transactions on Pattern Analysis and Machine Intelligence 4 (10) (1988) 496–513.
[30] L. Wei, Y. Yang, R.M. Nishikawa, Y. Jiang, A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications, IEEE Transactions on Medical Imaging 3 (24) (2005) 371–380.
[31] S. Halkiotis, T. Botsis, M. Rangoussi, Automatic detection of clustered microcalcifications in digital mammograms using mathematical morphology and neural networks, Signal Processing 87 (2007) 1559–1568.
[32] F. Dehghan, H. Abrishami-Moghaddam, Comparison of SVM and neural network classifiers in automatic detection of clustered microcalcifications in digitized mammograms, in: Proceedings of the IEEE 7th International Conference on Machine Learning and Cybernetics, Kunming, China, 2008, pp. 756–761.
[33] B. Verma, P. McLeod, A. Klevansky, Classification of benign and malignant patterns in digital mammograms for the diagnosis of breast cancer, Expert Systems with Applications 37 (2010) 3344–3351.
[34] K. Bovis, S. Singh, J. Fieldsend, C. Pinder, Identification of masses in digital mammograms with MLP and RBF nets, in: Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 1, 2000, pp. 342–347.
[35] F. Saki, A. Tahmasbi, S.B. Shokouhi, A novel opposition-based classifier for mass diagnosis in mammography images, in: Proceedings of the IEEE 17th Iranian Conference on Biomedical Engineering (ICBME2010), Isfahan, Iran, 2010, pp. 1–4.
[36] H.R. Tizhoosh, M. Ventresca, Oppositional Concepts in Computational Intelligence, Springer, 2008.
[37] M. Ventresca, H.R. Tizhoosh, Improving the convergence of backpropagation by opposite transfer functions, in: Proceedings of the International Joint Conference on Neural Networks, 2006, pp. 4777–4784.
[38] M. Gönen, et al., Receiver operating characteristic (ROC) curves, paper 210, in: Proceedings of the SUGI, vol. 31, 2003.
[39] J. Bozek, M. Mustra, K. Delac, M. Grgic, A survey of image processing algorithms in digital mammography, Recent Advances in Multimedia Signal Processing and Communications 231 (2009) 631–657.
[40] P. Refaeilzadeh, L. Tang, H. Liu, Cross-validation, in: Encyclopedia of Database Systems, 2009, pp. 532–538.
[41] A. Boujelben, A.Ch. Chaabani, H. Tmar, M. Abid, Feature extraction from contours shape for tumor analyzing in mammographic images, in: Proceedings of the IEEE International Conference on Digital Image Computing: Techniques and Applications, 2009, pp. 395–399.
[42] A. Rojas-Dominguez, A.K. Nandi, Development of tolerant features for characterization of masses in mammograms, Computers in Biology and Medicine 39 (2009) 678–688.
[43] T. Mu, A.K. Nandi, R.M. Rangayyan, Classification of breast masses using selected shape, edge-sharpness, and texture features with linear and kernel-based classifiers, Journal of Digital Imaging 2 (21) (2008) 153–169.
[44] R.M. Rangayyan, T.M. Nguyen, Fractal analysis of contours of breast masses in mammograms, Journal of Digital Imaging 3 (20) (2007) 223–237.
[45] R.M. Rangayyan, D. Guliato, J. Cavalho, S. Santiago, Feature extraction from the turning angle function for the classification of contours of breast tumors, Journal of Digital Imaging (2007).
[46] R.M. Rangayyan, N.R. Mudigonda, J.E.L. Desautels, Boundary modeling and shape analysis methods for classification of mammographic masses, Medical and Biological Engineering and Computing 38 (2000) 487–496.

Amir Tahmasbi received his B.Sc. degree in Electronics Engineering from the Department of Electrical Engineering, Shiraz University of Technology, Shiraz, Iran, in 2008. Currently, he is pursuing his M.Sc. degree in Electronics Engineering at the Department of Electrical Engineering, Iran University of Science and Technology (IUST), Tehran, Iran. His research interests lie in the fields of image processing, biomedical image analysis, real-time digital signal processing and pattern recognition. Moreover, he is interested in implementing digital filters as well as image processing algorithms on TMS320C6xxx platforms.

Fatemeh Saki received her B.Sc. degree in Electronics Engineering from the Department of Electrical Engineering, Chamran University, Ahwaz, Iran, in 2008. She is currently studying for an M.Sc. degree in Electronics Engineering at the Department of Electrical Engineering, Iran University of Science and Technology (IUST), Tehran, Iran. Her research interests include CADx system design for the diagnosis of breast cancer and neural networks.

Shahriar B. Shokouhi received his B.Sc. and M.Sc. degrees in Electronics in 1986 and 1989, both from the Department of Electrical Engineering, Iran University of Science & Technology (IUST). He received his Ph.D. in Electrical Engineering in 1999 from the School of Electrical Engineering, University of Bath, England. Since 2000, he has been an assistant professor in the Department of Electrical Engineering, IUST. His research interests include signal and image processing, machine vision, pattern recognition and intelligent systems design.
