Cluster Computing
The Journal of Networks, Software Tools
and Applications
ISSN 1386-7857
Cluster Comput
DOI 10.1007/s10586-018-1914-8
Abstract
This work develops an algorithm that optimizes the available feature set for identifying tumors in brain MRI images. An initial set of features is built from texture features. From this large set, the relevant features are selected using a wrapper approach, and an optimized subset of the relevant features is then generated with a genetic algorithm. A support vector machine (SVM) is used to detect and segment tumors in the acquired brain MRI image. The superiority of the algorithm is established by comparing it with state-of-the-art algorithms such as the level set method and fuzzy-based methods. The claim is validated with performance measures that include comparison against manual segmentation and volume-based measures.
Correspondence: S. U. Aswathy, noorulaswathi98@rediffmail.com
1 Department of Computer Science, Noorul Islam University, Kanyakumari, India
2 Electronics and Instrumentation, Vimal Jyothi Engineering College, Kannur, Kerala, India
3 Electronics and Instrumentation, Noorul Islam University, Kanyakumari, India

1 Introduction

In the current scenario, tumors are the second leading cause of cancer-related death in the world for both men and women [1]. This fact increases the importance of research in tumor detection, which helps doctors detect the disease early and take the necessary action [2]. Nowadays a large variety of image processing techniques is available that can detect features of tumors such as shape, size, border and calcification [3]. These extracted features help to make the segmentation process more accurate and precise. In medical terminology, a tumor is also called a neoplasm [4]. A tumor can be defined as an abnormal growth of normal body tissue that can be differentiated from the surrounding tissue by its shape and structure. In the beginning stage all tumors are very small, and they grow with time [5]. As they grow, they become more conspicuous and start showing their characteristics [6]. A person with a tumor usually shows certain symptoms, and these bring the person to a physician. Physicians can then detect the smallest possible symptoms of malignant (cancerous) and benign (non-cancerous) tumors in the screening process itself [7].

2 Related works

Different methods exist for image segmentation and processing. These methods can be supervised, semi-supervised or unsupervised [8]. Zhang et al. [3] proposed a method known as the hidden Markov random field (HMRF) model. This model can encode both the spatial and the statistical properties of a given image. Compared with existing methods, it is more flexible for image modelling. The limitation of this system is that its preliminary estimation is based on a threshold value that is purely heuristic, and it is time consuming. It gives inaccurate results most of the time. Tolba et al. [4] proposed a method for MRI brain image segmentation known as the Gaussian Multi-Resolution Expectation Maximization algorithm [9]. This algorithm is based on the EM algorithm and the multiresolution analysis of the given image [10].
The limitation of this technique is that pixels lying on the edges of boundaries are misclassified. Taheria et al. [5] introduced a threshold-based model that uses level set methods for 3D brain tumor segmentation. In their model, the speed function of the level set was designed using a general threshold [11]. Zadeh et al. [3] describe a number of fuzzy clustering methods based on fuzzy set theory. The fuzzy c-means (FCM) algorithm assigns every pixel to a cluster without any label; however, it fails to fully segment images that contain noise, intensity differences or artefacts [12]. To overcome these drawbacks, a modified FCM algorithm was proposed. Selvathi and Anitha [13] proposed a method to overcome intensity inhomogeneity. Although this algorithm works well in medical image segmentation, it has limitations: segmentation takes longer because of the complexity of its spatial and additional terms.

This paper proposes a fully automatic and effective tumor segmentation method based on an SVM classifier and a wrapper-based genetic algorithm [14]. The proposed method is a block-wise process integrated with the classifier to segment the tumor region from MRI [15]. The truth image of the input is taken, and texture features and histogram-based features are extracted [16]. Feature selection is performed with the genetic algorithm and feature subset selection with the wrapper method; the selected features are used to train the classifier [17]. The sum of the block outputs gives the classified region.

The rest of this paper is organized as follows [18]. Section 3 gives an overview of the method (Fig. 1). Section 4 compares the method with existing methods, Sect. 5 presents the experimental analysis against the existing methods and manual delineation, and Sect. 6 concludes the paper.

3 Method overview

In the proposed system, the image is pre-processed for background separation so as to obtain the ROI. A set of features is extracted using first-order histogram techniques and the co-occurrence matrix. The relevant features are selected from the available feature set by the wrapper method [19]. The genetic algorithm is incorporated as a wrapper-based model in the proposed algorithm. In each generation the population is evaluated and tested against the termination criterion [20]. If the criterion is not met, the crossover, mutation and fitness computation steps are repeated. Finally, each pixel is classified as tumor or non-tumor by the SVM classifier.

[Fig. 1: diagram with the blocks Input image, Pre-processing, Thresholding, Morphological operations, Co-occurrence matrix, Feature subset (wrapper based), Crossover, Mutation, Fitness function, Is termination, SVM classifier, Validation]

3.1 Pre-processing

The image is pre-processed to improve its fine detail and to remove unwanted noise. In the proposed algorithm, background separation is performed to obtain the region of interest. As shown in Fig. 2, the major steps of the pre-processing stage are thresholding, region filling and morphological operations.

3.2 Feature extraction

Representing an image requires a large amount of data, which in turn requires more memory and time. Feature extraction is used to reduce this data, memory and time. The extracted features contain the relevant information of an image and are taken to describe the properties of the whole image. They also carry the information that is relevant for solving the computational task of a specific application. The purpose of feature extraction is to reduce the original dataset by measuring certain features. The extracted features are fed as input to the classifier for image classification and segmentation. There are various types of feature extraction, such as intensity-based, shape-based and texture-based feature extraction. The proposed system uses texture-based feature extraction.

3.2.1 Texture based feature extraction

Texture refers to the surface characteristics and appearance of an object given by the size, shape, density, arrangement and proportion of its elementary parts. The basic stage of collecting such features through texture analysis is called texture feature extraction. There are four categories of methods for extracting textural features: (1) statistical methods, (2) structural methods, (3) model-based methods and (4) transform-based methods. Of these, the statistical method is used here for feature extraction.

3.2.1.1 Histogram based feature extraction Histogram-based feature extraction depends only on individual pixel values and not on the interaction or co-occurrence of neighbouring pixel values. The histogram-based features used in the proposed system are first-order statistics: mean, variance, skewness and kurtosis. Let z be a random variable denoting image gray levels and p(z_i), i = 0, 1, 2, 3, ..., L - 1, be the corresponding histogram, where L is the number of distinct gray levels. The features are calculated from this histogram.

3.2.1.2 Co-occurrence matrix based features Texture features computed from histograms carry no information about the relative position of the pixels with respect to each other; this limitation can be overcome with co-occurrence-based feature extraction. The second-order gray-level probability distribution of a texture image can be calculated by considering the gray levels of pixels in pairs. A second-order probability is often called a gray-level co-occurrence (GLC) probability. Texture feature calculations use the contents of the gray-level co-occurrence matrix (GLCM) to give a measure of the variation in intensity at the pixel of interest. The co-occurrence matrix is computed from two parameters: the relative distance d between the pixel pair, measured in pixels, and their relative orientation θ. Normally, θ is quantized in four directions (e.g., 0°, 45°, 90° and 135°), although other combinations are possible. The features that can be calculated from the co-occurrence matrices include contrast, absolute value, inverse difference, energy and entropy.

3.3 Feature selection by wrapper based genetic algorithm

The probability of obtaining an optimal feature subset for classification is high when GA is employed for feature selection with suitable fitness functions and possible considerations. Taking these advantages, a feature selection scheme based on the genetic algorithm is proposed. The proposed wrapper-based feature selection method uses the supervised learning algorithm to evaluate the significance of a feature subset and employs the genetic algorithm to optimize the search for features in the feature selection process.

The wrapper method is a random search technique for selecting the relevant feature subset from the available feature set. As shown in Fig. 3, the input feature set contains redundant data. A set of features is selected based on relevance. Feature selection is treated as a search problem: different combinations of features are prepared, evaluated and compared with other combinations. Generating and evaluating new feature subsets can stop when adding or removing features brings no further performance improvement, i.e. when one of the following holds:

(i) A predefined number of features is selected.
(ii) A predefined number of iterations is reached.
(iii) Adding or deleting a feature fails to produce a better subset.
(iv) An optimal subset is obtained according to the evaluation criteria.

The genetic algorithm is incorporated as a wrapper-based model in the proposed algorithm. In each generation the population is evaluated and tested against the termination criterion. If the criterion is not met, the crossover, mutation and fitness computation steps are repeated.

Fig. 3 Block diagram showing validation of wrapper based GA (blocks: Input feature, Feature subset, Feature evaluation, Termination with Yes/No branches, Validation)

3.3.1 Proposed feature selection algorithm

A genetic algorithm (GA) is a method for moving from an existing population of chromosomes to a new population using a natural selection method. It has two operators, crossover and mutation. Crossover exchanges subparts of two chromosomes, that is, it performs recombination between two single chromosomes. Mutation randomly changes the values at some locations in the chromosome. GA evaluates the fitness of every individual, so the quality of a result is measured through a fitness function. A fitter chromosome has a higher probability of being chosen to form the next generation.

3.3.1.1 Genetic algorithm GA is an optimization approach that minimizes an objective function, based on Darwin's theory of evolution, by mimicking the processes of natural selection, crossover and mutation [Holland]. The best subset of features is prepared through GA by avoiding redundancies. GA is a stochastic optimization method based on metaheuristic search procedures that starts with a population matrix of candidate solutions. In Table 1, each row of this matrix represents an individual (gene) that is generated randomly, and each gene represents a solution of the objective function. The fitness of the individuals is computed with the objective function. The population is improved by the crossover function, which combines genetic information from different members of the population. Another population improvement method is mutation, where some individuals of the population are mutated according to the mutation rate of the population. The pseudocode of GA is shown in Algorithm 1.
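The wrapper-based GA loop described in Sect. 3.3 (random population, fitness evaluation, crossover, mutation, repeat until termination) can be illustrated with a minimal sketch. This is not the authors' Algorithm 1, which is not reproduced in this text. The function name `ga_feature_selection`, the tournament selection, the single-point crossover, the elitism step, the parameter defaults and the toy fitness function are all assumptions made for illustration. In a real wrapper, `fitness` would train and validate the classifier on the feature columns switched on by the mask; here a hypothetical stand-in is used so the example runs on its own. Termination uses a fixed number of iterations, i.e. stopping condition (ii).

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Each chromosome is a list of 0/1 flags, one per candidate feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # stopping condition (ii): fixed iteration count
        scored = [(fitness(ch), ch) for ch in population]

        def tournament():
            # Assumed selection scheme: fitter of two randomly drawn chromosomes.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(mask):
    # Hypothetical stand-in for classifier accuracy: features 0 and 3 are
    # useful, and every selected feature carries a small cost.
    useful = {0, 3}
    hits = sum(1 for i, g in enumerate(mask) if g and i in useful)
    return hits - 0.1 * sum(mask)


mask = ga_feature_selection(8, toy_fitness)
print(mask, toy_fitness(mask))
```

Because the fitness is passed in as a callable, the same loop works with any classifier-based evaluation; only `toy_fitness` would need to be replaced.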
The proposed method for generating the feature subset can reduce the curse of dimensionality to a great extent.

3.4 SVM based segmentation

SVM is a generalized statistical learning method used to classify a set of inputs based on supervised learning theory. It separates the set of inputs into two classes with class labels -1 and 1. The classifier separates the classes by a hyperplane defined by $f(x) = b^{T}x + a$, where $b \in \mathbb{R}^{n}$ is orthogonal to the hyperplane and $a \in \mathbb{R}$ is a constant. The training feature set is defined in the form $\{(a_i, b_i) \mid a_i \in \mathbb{R}^{m},\ b_i \in \{0, 1\}\},\ i = 1, \ldots, n$. Here $a_i$ is the input vector and $b_i$ is the target, with two classes labelled 0 and 1. SVM training maximizes the distance between the classes by defining the hyperplanes with a small empirical risk. The hyperplanes for the classes can be represented as $b_i\,(b^{T}a_i + a) \ge 1$. The objective function for optimizing the distance between the classes defined by the hyperplanes can be represented as: minimize $P(b, a, e) = 0.5\,\|b\| + C \sum_{i=1}^{n} e_i$, where $e_i$ are the slack variables and C is the cost assigned to each slack variable.

The proposed SVM-based segmentation consists of two parts: (1) testing and (2) training. First, the image is divided into blocks, and the features extracted from each block are arranged in the feature matrix. A truth vector is formed from the corresponding truth image. This matrix is used to train the classifier. SVM classifies data by finding the best hyperplane that separates all data points of one class from those of the other. Finding the hyperplane and the corresponding support vectors is called training. For segmentation, the test image undergoes feature extraction. Feature selection is done with the help of the wrapper-based GA. The selected features are used for SVM classifier training. Here the SVM classifier is trained to separate the tumor (abnormal) region from the non-tumor region (Fig. 4). Classification takes place block-wise, and the sum of the classified blocks gives the classified region.

Fig. 4 Flowchart for training and testing of SVM based segmentation method (blocks: Index, Data Base, Pre-processing and Block Division, Feature Extractor, Features, Machine Learning Algorithm)

4 Comparative analysis with existing methods and with manual delineation

The proposed method is compared with two existing methods: expectation maximization using level set and the advanced fuzzy c-means method (Fig. 5).

4.1 Expectation maximization with level set method

The EM algorithm finds maximum likelihood estimates of the parameters in statistical models. It has two steps. The expectation (E) step generates a function for the expected log-likelihood, calculated using the current estimate of the parameters. The maximization (M) step evaluates the parameters that maximize the expected log-likelihood found in the E step.

Expectation maximization is performed on an image I with a number of classes K. Based on the histogram of the image I and the number of classes K, A(0) is initially estimated. The E-step and M-step are repeated until convergence: at each iteration the E-step computes the class probability of each pixel based on the current estimate A(t), and the M-step determines the new estimate A(t + 1) from the values of the E-step. A classification matrix C is generated as a result. Finally, the segmented image is generated from the classification matrix C.

4.2 Level set method

The level set method is a numerical technique for tracking interfaces and shapes. It is a general method that is made application-specific by the use of a priori information. Here the boundaries are the curves that separate the regions or objects. A curve is represented implicitly as the zero level set of a time-dependent function $\phi: \Omega \to \mathbb{R}$, such that $C = \{x(t) \in \Omega : \phi(x, t) = 0\}$ (Fig. 6). The curve moves under the influence of a speed law. A special case is when the motion is restricted to the normal direction of the curve, i.e. $dx/dt = F\,n$, where n is the normal to the curve and the scalar function F is termed the speed function.
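The E-step and M-step iteration of Sect. 4.1 can be made concrete with a small sketch. It is only an illustration under stated assumptions: pixel intensities are modelled as a one-dimensional Gaussian mixture, the initial estimate A(0) is taken from equal-sized chunks of the sorted intensities rather than from the image histogram, a fixed iteration count replaces a convergence test, and the name `em_gaussian_mixture` is hypothetical. The level set stage that follows EM in the compared method is not included.

```python
import math


def em_gaussian_mixture(values, k=2, iterations=50):
    """Fit a k-component 1-D Gaussian mixture to values; return (means, labels)."""
    data = sorted(values)
    n = len(data)
    # Assumed initial estimate A(0): split the sorted intensities into k chunks.
    chunks = [data[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), 1e-6)
                 for c, m in zip(chunks, means)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: class probability of every value under the current estimate A(t).
        resp = []
        for x in values:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: new estimate A(t + 1) from the E-step probabilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            variances[j] = max(sum(r[j] * (x - means[j]) ** 2
                                   for r, x in zip(resp, values)) / nj, 1e-6)
            weights[j] = nj / len(values)
    # Classification: assign each value to its most probable class.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels


# Toy intensities: a dark group and a bright group.
means, labels = em_gaussian_mixture([10, 12, 11, 9, 200, 205, 198, 202], k=2)
print(means, labels)
```

The returned labels play the role of the classification matrix C for a flattened image; reshaping them to the image dimensions would give the segmented image.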
5 Experimental analysis with the existing method and with manual delineation

For the performance evaluation, a series of 30 images of both FLAIR and T2 (23 abnormal and 7 normal) was considered; of these, 95 objects were used in the training phase and 30 objects in the testing phase. The images are segmented with a common threshold value, in association with the radiologist-segmented image. The four parameters true positive (TP), false positive (FP), true negative (TN) and false negative (FN) are evaluated by logically ANDing the ground truth and the segmented image. For the numerical analysis, the numbers of FN and FP are calculated based on the number of pixels in the region of interest (ROI). The proposed system has been implemented using MATLAB software with a computer segmentation. The results are evaluated by comparing them with manually segmented data produced by medical experts. The degree of similarity between the segmented results gives the accuracy of the segmented image. The results are reported in terms of specificity, accuracy, absolute volume measurement error (AVME) and figure of merit (e), which help to demonstrate the validity of the method.

1. Specificity = true negative/(true negative + false positive): d = TN/(TN + FP).
2. Accuracy = (true negative + true positive)/total samples: A = (TN + TP)/(TN + TP + FN + FP).
3. Absolute volume measurement error: $\mathrm{AVME} = \left| \frac{V_{\mathrm{automatic}}}{V_{\mathrm{manual}}} - 1 \right| \times 100\%$
4. Figure of merit: $e = 1 - \frac{\left| V_{\mathrm{manual}} - V_{\mathrm{automatic}} \right|}{V_{\mathrm{manual}}}$
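The four measures listed above can be computed directly from a ground-truth mask and a segmented mask. The sketch below is illustrative only and is not the authors' MATLAB code. It assumes flattened binary masks (1 = tumor pixel), and it assumes that the two volume formulas, which are damaged in the extracted text, read AVME = |V_automatic / V_manual - 1| x 100% and e = 1 - |V_manual - V_automatic| / V_manual. All function names are hypothetical.

```python
def confusion_counts(truth, predicted):
    """Count TP, FP, TN, FN over two equally long binary sequences."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1
        elif p and not t:
            fp += 1
        elif t and not p:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def specificity(tp, fp, tn, fn):
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    return (tn + tp) / (tn + tp + fn + fp)


def avme(v_manual, v_automatic):
    # Assumed form: |V_automatic / V_manual - 1| * 100 (percent).
    return abs(v_automatic / v_manual - 1) * 100


def figure_of_merit(v_manual, v_automatic):
    # Assumed form: 1 - |V_manual - V_automatic| / V_manual.
    return 1 - abs(v_manual - v_automatic) / v_manual


# Toy masks: 3 tumor pixels in the ground truth, 3 in the segmentation.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
segmented = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(truth, segmented)
print(counts, specificity(*counts), accuracy(*counts))
print(avme(100.0, 120.0), figure_of_merit(100.0, 120.0))
```

Under this reading, the figure of merit becomes negative whenever the automatic volume is more than twice the manual volume, which is consistent with the negative entries reported for EM + level set in Table 6.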
Table 5 Comparative analysis of volume measurement for EM + level set, FCM and SVM
Columns: Data | Volume measurement of automatic segmentation (EM + level set, FCM, SVM) | Volume measurement of absolute segmentation (EM + level set, FCM, SVM)

Table 6 Statistical comparison of input images in terms of figure of merit

Data   SVM        FCM        EM + level set
1      0.788978   0.82651    -1.51585
2      0.755297   0.799067   -4.0706
3      0.751111   0.978903   0.835434
4      0.817083   0.787337   -96.9291
5      0.307714   0.232455   -8.23518

methods. The statistical comparison and the tumor volume calculation in Table 5 show the efficiency of the proposed algorithm. The time required to process each image for 10 patients, for T2 and FLAIR, is plotted in Fig. 7. Figure 8 shows the statistical evaluation in terms of specificity and accuracy for multimodality images. The performance of the classifier is evaluated from the confusion matrix, specificity and accuracy by testing it with the features extracted from the feature set (Figs. 9, 10, and 11).

Table 7 Volumetric analysis of manual and automatic segmentation for multimodality image
Columns: Image type | Data | Manual segmentation | Automatic segmentation | Absolute volume measurement | Figure of merit
Fig. 11 a, b Analysis of manual and automatic segmentation for FLAIR for EM – Level Set, FCM and SVM for FLAIR image. c, d Analysis of manual and automatic segmentation for T2 for EM – Level Set, FCM and SVM based technique for T2 image. e, i Input images T2 and FLAIR. f, g, h Output for EM and level set, FCM and SVM for T2 input. j, k, l Output for EM and level set, FCM and SVM for FLAIR input
extraction of features which may include combinations of multiple factors, including intensity, energy with contrast, etc.

References

1. Gonzalez, R.A., Woods, R.E.: Digital Image Processing, 2nd edn. Prentice Hall, Upper Saddle River (2002)
2. Dubey, R.B., Hanmandlu, M., Vasikarla, S.: Evaluation of three methods for MRI brain tumor segmentation. In: 2011 Eighth International Conference on Information Technology: New Generations, IEEE (2011)
3. Zhang, Y., Brady, M., Smith, S.: Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. In: Proceedings of the IEEE transaction on Medical Images (2001)
4. Tolba, M.F., Mostafa, M.G., Gharib, T.F., Salem, M.A.: MR-brain image segmentation using gaussian multi resolution analysis and the EM algorithm. ICEIS 2, 165–170 (2003)
5. Karnan, M., Logeswari, T.: An improved implementation of brain tumor detection using soft computing. Int. J. Comput. Netw. Secur. 2(1), 6–10 (2010)
6. Kavitha, A.R., Chellamuthu, C.: Detection of brain tumour from MRI image using modified region growing and neural network. Imaging Sci. J. 61(7), 556–567 (2012)
7. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning, 3rd edn, pp. 60–68. Addison Wesley Longman Pvt. Ltd., Boston (2000)
8. Sieno, D.D.: Adding a Conscience to Competitive Learning. In: Proceeding of IEEE the Second International Conference on Neural networks (ICNN88), pp. 117–124 (1988)
9. Sieno, D.D.: Adding a Conscience to Competitive Learning. In: Proceeding of IEEE the Second International Conference on Neural networks (ICNN88), pp. 117–124 (1988)
10. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning, 3rd edn, pp. 60–68. Addison Wesley Longman Pvt. Ltd., Boston (2000)
11. Jiang, L., Yang, W.: A modified fuzzy C-means algorithm for segmentation of magnetic resonance images. In: Proc. VIIth Digital Image Computing: Techniques and Applications, Sydney (2003)
12. Farmer, M.E., Jain, A.K.: A wrapper-based approach to image segmentation and classification. IEEE Trans. J. Mag. 14(12), 2060–2072 (2005)
13. Selvathi, D., Anitha, J.: Effective fuzzy clustering algorithm for abnormal MR brain image segmentation. In: International/Advance Computing Conference (IACC2009), IEEE (2009)
14. Kaur, M., Banga, V.K.: Thresholding and level set based brain tumor detection using bounding box as seed. Int. J. Eng. Res. Technol. 4, 2503–2507 (2013)
15. Wu, J., Ye, F., Ma, J.L., Sun, X.P., Xu, J., Cui, Z.M.: The segmentation and visualization of human organs based on adaptive region growing method. In: IEEE 8th International Conference on Computer and Information Technology Workshops 978-0-7695-3242-4/08, IEEE (2008)
16. Eschrich, S., Ke, J., Hall, L.O.: Fast accurate fuzzy clustering through data reduction. IEEE Trans. Fuzzy Syst. 11(2), 262–270 (2003)
17. Menze, B.H., Van Leemput, K., Lashkari, D., Weber, M.-A., Ayache, N., Golland, P.: Segmenting glioma in multi-modal images using a generative model for brain lesion segmentation. In: Proc. MICCAIBRATS, pp. 1–8 (2012)
18. Zikic, D. et al.: Context-sensitive classification forests for segmentation of brain tumor tissues. In: Proc. MICCAI-BRATS, pp. 22–30 (2012)
19. Bauer, S., et al.: Segmentation of brain tumor images based on integrated hierarchical classification. In: MICCAI BraTS Workshop. Nice: Miccai Society (2012)
20. Aswathy, S.U., et al.: Quick detection of brain tumor using a combination of EM and level set method. Indian J. Sci. Technol. 8(34), 74–82 (2015). https://doi.org/10.17485/ijst/2015/v8i34/85361

S. U. Aswathy is with the Computer Science and Engineering Department, Noorul Islam University, Tamil Nadu, India. She completed her M.Tech. (Computer and Information Technology) at MS University in 2009 and her B.Tech. (Electronics and Instrumentation) at Noorul Islam University in 2005. She is presently working as Associate Professor and HOD in charge at Muslim Association College of Engineering, Trivandrum, Kerala. She has a total of 12.5 years of teaching experience. She has published eight articles in various international journals and Scopus-indexed journals. Her main areas of interest are medical image processing and analysis and biomedical engineering.

G. Glan DevaDhas is presently working as a Professor in the Department of Electronics and Instrumentation at Vimal Jyothi Engineering College, Kannur, Kerala, India. He received his B.E. degree in Instrumentation and Control Engineering, M.E. degree in Process Control and Instrumentation and Ph.D. in Intelligent Controller Design in 1998, 2001 and 2013 respectively. He has more than 15 years of experience in teaching and research as Lecturer, Assistant Professor, Associate Professor, HOD, Board of Studies chairman, Academic Council member, Research Supervisor, Doctoral Committee member and Professor in various engineering colleges and universities. He is a reviewer for many journals, including International Journal of Naval Architecture and Ocean Engineering, Journal of Water Process Engineering, Int. J. of Modeling, Simulation and Scientific Comp, etc. He has published 30 articles in various indexed journals. His main areas of interest are soft computing, intelligent controller design, system identification, adaptive systems and smart systems.

S. S. Kumar is presently working as Associate Professor and HOD in the Department of Electronics & Instrumentation at Noorul Islam University, Thuckalay, Tamil Nadu, India. He received his B.E. degree in Electronics and Instrumentation Engineering, M.E. degree in Applied Electronics and Ph.D. in Information and Communication Engineering, Intelligent Controller Design in 2002, 2004 and 2013 respectively. He has more than 13 years of experience in teaching and research as Lecturer, Assistant Professor, Associate Professor and HOD. He is a reviewer for various international journals and has published over 20 articles in various indexed journals. His main areas of interest are medical image processing, satellite image processing, control systems, etc.