
2010 11th Int. Conf. Control, Automation, Robotics and Vision
Singapore, 7-10th December 2010

Liver Cancer Identification Based on PSO-SVM Model

Huiyan Jiang, Fengzhen Tang, Xiyue Zhang
Software College, Northeastern University, Shenyang, China
hyjiang@mail.neu.edu.cn, tangzhengliu@126.com, Zhangxiyue82@163.com

Abstract—This paper proposes a novel liver cancer identification method based on PSO-SVM. First, the region of interest (ROI) is determined by Lazy-Snapping, and various texture features are extracted from the ROI. Afterwards, the F-score algorithm is applied to select relevant features, based on which a liver cancer classifier is designed by combining a parallel Support Vector Machine (SVM) with the Particle Swarm Optimization (PSO) algorithm. PSO is used to choose the SVM parameters automatically, which makes the choice of parameters more objective and avoids the randomness and subjectivity of the traditional SVM, whose parameters are decided through trial and error. Experimental results on real-world datasets show that the proposed parallel PSO-SVM training algorithm improves the prediction accuracy of liver cancer.

Keywords—feature extraction; feature selection; PSO algorithm; parallel SVM

I. INTRODUCTION
Primary carcinoma of the liver (liver cancer), whose incidence, mortality rate and annual growth rate rank among the highest in the world, is one of the most serious malignant tumors in China. At the early stage of liver cancer diagnosis, CT imaging is commonly used because of its high resolution, minor injury to patients and accurate localization of lesions [1]. Computer Aided Diagnosis (CAD) based on CT has therefore received considerable attention.

In the field of liver cancer CAD, textures of liver CT images, including the gray level co-occurrence matrix, the gray level run-length matrix and the gray level gradient co-occurrence matrix, have been used to distinguish between two kinds of liver disease [2]. Gray level co-occurrence matrix features extracted from liver CT images have also been fed into a probabilistic neural network to classify liver cancer and liver hemangioma [3]. But a single feature can hardly represent the image information completely. An improved SVM, which constructs a non-linear classifier, has likewise been applied to cancer diagnosis [4]. Using SVM with a binary tree method to classify liver fibrosis in CT images yields a recognition rate higher than K-nearest neighbors and BP artificial neural networks [5].

In order to improve the recognition rate of liver cancer and to avoid the randomness and subjectivity of the traditional SVM, whose parameters are decided by experience, we propose the PSO-SVM method. This approach consists of four steps: (i) identifying the ROIs by Lazy-Snapping, (ii) extracting texture features, (iii) selecting features by the F-score algorithm, and (iv) training a parallel PSO-SVM with the selected features to identify the classes of liver cancer.

II. PARALLEL PSO-SVM MODEL

Figure 1. The whole structure of liver cancer recognition

Figure 2. Definition of the ROI: a) original image; b) given seed; c) ROI

The architecture of the CAD system (Fig. 1) comprises four main modules: (i) definition of the region(s) of interest (ROI), (ii) extraction of features, (iii) selection of features, and (iv) classification of the selected ROI based on the PSO-SVM method.

A. Feature Extraction and Selection

TABLE I. FEATURE SET

Group 1: First-order histogram features, including average gray level, variance, entropy, coefficient of variation, skewness and kurtosis [7]
Group 2: GLCM (Gray Level Co-occurrence Matrix) [7, 8]
Group 3: Gray-Gradient Co-occurrence Matrix
Group 4: Fractal [9]
Group 5: Gabor

The ROI is determined using the Lazy-Snapping method, an interactive image cutout tool [6]. As shown in Fig. 2, the seed is first manually selected (b), and the region (ROI) with the same texture or gray level as the seed is then automatically cut out (c).

Features are used to identify liver cancer. They are quantitative measurements of medical images regarding the pathology of the tissue: different diseases produce different textures. In this paper, the features listed in Table I are extracted from the ROIs of abdominal CT images.

Feature selection is used to select the optimal subset from the original feature set. Through feature selection, redundant, irrelevant or insignificant features are deleted.

It is generally acknowledged that a classifier built upon a compact feature subset tends to generalize well. In this paper, the F-score method is used for feature selection [10]. The algorithm is executed as follows (a sketch is given after the steps):

Step 1: Calculate the F-score value of each feature.

Step 2: Remove the greatest and smallest F-score values, and keep the middle F-score values as candidate thresholds.

Step 3: For every threshold, do the following:

(i) Remove every feature whose F-score value is smaller than the threshold;

(ii) Randomly split the training set into X_train and X_valid;

(iii) Use X_train as the new training subset and X_valid as the testing subset to get an identification result;

(iv) Repeat (ii) and (iii) five times, and calculate the mean error over the five runs.

Step 4: Choose the threshold with the smallest mean error as the final threshold.

Step 5: Remove the features with F-scores smaller than the final threshold.
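As an illustration, the following is a minimal numpy sketch of the F-score criterion of [10] and of the threshold search above, assuming a binary labeling y in {+1, -1} (the paper's four-class task would apply it one class against the rest). The 80/20 split ratio and the eval_error callback, which stands in for steps (ii)-(iii), are our assumptions, not the paper's specification.

```python
import numpy as np

def f_score(X, y):
    """F-score of each feature (Step 1), after Chen and Lin [10]:
    between-class scatter divided by within-class scatter."""
    pos, neg = X[y == 1], X[y == -1]
    num = (pos.mean(0) - X.mean(0)) ** 2 + (neg.mean(0) - X.mean(0)) ** 2
    den = pos.var(0, ddof=1) + neg.var(0, ddof=1)
    return num / den

def select_by_f_score(X, y, eval_error, n_runs=5):
    """Steps 2-5: keep the threshold with the smallest mean validation error."""
    scores = f_score(X, y)
    thresholds = np.sort(scores)[1:-1]          # Step 2: drop the extremes
    rng = np.random.default_rng(0)
    best_thr, best_err = None, np.inf
    for thr in thresholds:                      # Step 3
        keep = scores >= thr                    # (i) drop low-scoring features
        errs = []
        for _ in range(n_runs):                 # (ii)-(iv) five random splits
            idx = rng.permutation(len(y))
            cut = int(0.8 * len(y))             # assumed 80/20 split
            tr, va = idx[:cut], idx[cut:]
            errs.append(eval_error(X[tr][:, keep], y[tr], X[va][:, keep], y[va]))
        if np.mean(errs) < best_err:
            best_err, best_thr = np.mean(errs), thr
    return scores >= best_thr                   # Steps 4-5: final feature mask
```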
B. Parallel PSO-SVM Model

SVM (support vector machine) is a classification technique based on the structural risk minimization principle [11]. It is an efficient learning method able to deal with problems such as small sample size, nonlinearity, over-fitting and high dimensionality.

To use an SVM we only need to choose a kernel function and the relevant parameters. In this paper, we choose the RBF kernel, (1) and (2), as the default kernel function rather than the polynomial kernel, primarily for three reasons. First, the RBF kernel is able to map the non-linear boundaries of the input space into a higher-dimensional feature space. Second, it has only two parameters (C, \gamma), while the polynomial kernel has more hyperparameters; here C is a constant controlling the classification boundary when the data are non-separable, also called the penalty factor. Third, it makes computation easier, because the RBF kernel values lie between zero and one, while polynomial kernel values may go to infinity or zero when the degree is large.

K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \quad \gamma > 0.    (1)

K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j).    (2)
The other important factor is parameter selection. The traditional SVM uses cross-validation and grid search. This is simple and easy, but because the ranges of C and \gamma are determined by the experience of the user, it is not satisfactory on real data. In this paper, inertia-weight PSO [12] is used to determine the parameters C and \gamma. The quadratic optimization problem of the SVM is now written in the form:

\max W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j),    (3)

s.t. \quad \alpha_i \ge 0, \quad i = 1, \dots, l,    (4)

\sum_{i=1}^{l} \alpha_i y_i = 0.    (5)

The Lagrange multiplier \alpha to be solved for is an l-dimensional vector. A sketch of the kernel and of the dual objective is given below.
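Following (1)-(5), here is a small numpy sketch of the RBF Gram matrix and of the dual objective W(\alpha), which later serves as the PSO fitness of a particle; the function names are ours, not the paper's.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2), eq. (1)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def dual_objective(alpha, y, K):
    """W(alpha) of eq. (3): sum(alpha) - 0.5 * (alpha*y)^T K (alpha*y)."""
    u = alpha * y
    return alpha.sum() - 0.5 * u @ K @ u
```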
It naturally occurred to us to use the PSO algorithm to find the best "particle" in the search space, so we adapt PSO to the SVM. Each particle is treated as a point in a D-dimensional space. The location of the i-th particle is represented as \alpha_i = (\alpha_{i1}, \dots, \alpha_{iD}). The best position the i-th particle has achieved is recorded as p_i = (p_{i1}, \dots, p_{iD}), and p_g = (p_{g1}, \dots, p_{gD}) records the best position achieved among the neighbors of the i-th particle. The particle update formulas are (6) and (7):

temp\_v_{i,d}^{(t+1)} = \omega v_{i,d}^{(t)} + c_1 r_1 (p_{i,d} - \alpha_{i,d}^{(t)}) + c_2 r_2 (p_{g,d} - \alpha_{i,d}^{(t)}),    (6)

v_{i,d}^{(t+1)} = \begin{cases} C - \alpha_{i,d}^{(t)}, & \alpha_{i,d}^{(t)} + temp\_v_{i,d}^{(t+1)} > C \\ -\alpha_{i,d}^{(t)}, & \alpha_{i,d}^{(t)} + temp\_v_{i,d}^{(t+1)} < 0 \\ temp\_v_{i,d}^{(t+1)}, & \text{otherwise,} \end{cases}    (7)

where c_1 and c_2 are two positive constants, r_1 and r_2 are two random numbers in the range [0, 1], C is a constant, and \omega is the inertia weight. Thus \alpha_{i,d}^{(t+1)} = \alpha_{i,d}^{(t)} + v_{i,d}^{(t+1)} always satisfies the box constraint 0 \le \alpha_{i,d}^{(t+1)} \le C of (4). If after t+1 iterations \alpha^{(t+1)} does not meet (5), in other words

\sum_{d=1}^{l} \alpha_{i,d}^{(t+1)} y_d = \sum_{d=1}^{l} \alpha_{i,d}^{(t)} y_d + \sum_{d=1}^{l} v_{i,d}^{(t+1)} y_d = \sum_{v_{i,d}^{(t+1)} y_d > 0} v_{i,d}^{(t+1)} y_d - \sum_{v_{i,d}^{(t+1)} y_d < 0} (-v_{i,d}^{(t+1)} y_d) = sumV^{+} - sumV^{-} \ne 0,    (8)

then, in order to make (8) equal to 0, the speed vector v_{i,d}^{(t+1)} is rescaled as follows.

If sumV^{+} > sumV^{-}:

v_{i,d}^{(t+1)} = \begin{cases} \frac{sumV^{-}}{sumV^{+}} v_{i,d}^{(t+1)}, & \text{if } v_{i,d}^{(t+1)} y_d > 0 \\ v_{i,d}^{(t+1)}, & \text{otherwise.} \end{cases}    (9)

If sumV^{+} \le sumV^{-}:

v_{i,d}^{(t+1)} = \begin{cases} \frac{sumV^{+}}{sumV^{-}} v_{i,d}^{(t+1)}, & \text{if } v_{i,d}^{(t+1)} y_d < 0 \\ v_{i,d}^{(t+1)}, & \text{otherwise,} \end{cases}    (10)

where d = 1, 2, \dots, l.
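Continuing the numpy sketch above, one constrained particle update following our reconstruction of (6)-(10) could look as follows; the variable names are ours.

```python
def pso_step(alpha, v, p_i, p_g, y, C, w, c1, c2, rng):
    """One constrained PSO update of a particle's multipliers, eqs. (6)-(10)."""
    r1, r2 = rng.random(alpha.shape), rng.random(alpha.shape)
    temp_v = w * v + c1 * r1 * (p_i - alpha) + c2 * r2 * (p_g - alpha)  # eq. (6)
    # eq. (7): clip the step so that 0 <= alpha + v <= C componentwise
    v_new = np.where(alpha + temp_v > C, C - alpha,
                     np.where(alpha + temp_v < 0.0, -alpha, temp_v))
    # eqs. (8)-(10): rescale one side so that sum(v * y) = 0, restoring (5)
    vy = v_new * y
    s_pos, s_neg = vy[vy > 0].sum(), -vy[vy < 0].sum()  # sumV+, sumV-, both >= 0
    if s_pos > s_neg:
        v_new[vy > 0] *= s_neg / s_pos                  # eq. (9)
    elif s_neg > 0:
        v_new[vy < 0] *= s_pos / s_neg                  # eq. (10)
    return alpha + v_new, v_new
```

Note that the rescaling only shrinks velocities toward zero, so the box constraint enforced by (7) is preserved while the equality constraint (5) is restored.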
The concrete process of optimizing the SVM multipliers with PSO is as follows (an end-to-end sketch is given after the steps):

1) Initializing

a) Initialize the constants m, C, c_1, c_2, iter_max, v_max, w_min, w_max and the random seed.

b) Let t = 0 and seed the random number generator from the clock.

c) Under constraint (4), randomly initialize the value of each particle \alpha_i^{(0)} \in R^l, i = 1, 2, \dots, m, where l is the number of samples, which is also the dimension of the Lagrange multiplier vector to be solved.

d) If \alpha_i^{(0)} does not satisfy constraint (5), adjust it.

e) Randomly initialize the speed vector v_{i,d}^{(0)} of each particle so that -v_max \le v_{i,d}^{(0)} \le v_max, i = 1, \dots, m.

f) Calculate the fitness \phi(\alpha_i^{(0)}), i = 1, \dots, m, of the particles by formula (3).

g) Let \phi_i^{pbest} = \phi(\alpha_i^{(0)}), p_i = \alpha_i^{(0)}, i = 1, 2, \dots, m, and \phi^{gbest} = \max_i \{\phi_i^{pbest}\}, and record the global optimum p_g and the location of the best particle \alpha_{best}^{(0)}.

2) Optimizing

a) Update the inertia weight \omega.

b) Update the speed vector v_i^{(t+1)} by (6) and (7), and update the location vector \alpha_i^{(t+1)}; if \alpha_i^{(t+1)} does not meet constraint (5), update v_i^{(t+1)} by (9) and (10), and then update \alpha_i^{(t+1)} again.

c) Calculate the fitness \phi(\alpha_i^{(t+1)}) of the particles.

d) For each particle, if \phi(\alpha_i^{(t+1)}) > \phi_i^{pbest}, let \phi_i^{pbest} = \phi(\alpha_i^{(t+1)}) and p_i = \alpha_i^{(t+1)}.

e) For i = 1, 2, \dots, m, if \phi(\alpha_i^{(t+1)}) > \phi^{gbest}, let \phi^{gbest} = \phi(\alpha_i^{(t+1)}) and p_g = \alpha_i^{(t+1)}.

f) If p_g meets the KKT conditions

p_{g,i} = 0 \Rightarrow y_i f(x_i) \ge 1; \quad 0 < p_{g,i} < C \Rightarrow y_i f(x_i) = 1; \quad p_{g,i} = C \Rightarrow y_i f(x_i) \le 1,    (11)

output the optimum and stop the algorithm.

g) If the iteration count t \ge iter_max, output the second-best solution p_g and stop the algorithm.

h) Increase t by 1 and return to step a).
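Putting the pieces together, here is a compact sketch of steps 1)-2) built on the functions above. The default constants, the linearly decreasing inertia schedule, the initial-velocity scale, and the projection used to enforce (5) at initialization are our assumptions; the KKT-based early stop of step f) is omitted for brevity, so the loop always runs to iter_max.

```python
def optimize_alpha(y, K, m=20, C=10.0, iter_max=100,
                   w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, seed=0):
    """PSO search for the SVM multipliers (steps 1 and 2), a sketch."""
    rng = np.random.default_rng(seed)
    l = len(y)
    alphas = rng.uniform(0.0, C, size=(m, l))         # step 1c: init under (4)
    alphas -= y * (alphas @ y)[:, None] / l           # step 1d: rough projection onto (5)
    np.clip(alphas, 0.0, C, out=alphas)
    vs = rng.uniform(-0.1 * C, 0.1 * C, size=(m, l))  # step 1e: initial speeds
    p_i = alphas.copy()                               # step 1f-g: personal bests
    fit_i = np.array([dual_objective(a, y, K) for a in alphas])
    p_g = p_i[fit_i.argmax()].copy()                  # step 1g: global best
    for t in range(iter_max):                         # step 2g: stop at iter_max
        w = w_max - (w_max - w_min) * t / iter_max    # step 2a: inertia schedule
        for i in range(m):                            # step 2b: constrained update
            alphas[i], vs[i] = pso_step(alphas[i], vs[i], p_i[i], p_g,
                                        y, C, w, c1, c2, rng)
            f = dual_objective(alphas[i], y, K)       # step 2c: fitness
            if f > fit_i[i]:                          # step 2d: personal best
                fit_i[i], p_i[i] = f, alphas[i].copy()
        p_g = p_i[fit_i.argmax()].copy()              # step 2e: global best
    return p_g
```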
SVM is good at dealing with small-sample, non-linear and high-dimensional pattern recognition problems. But when there are many data, a large amount of memory and computing time is demanded. In order to speed up the training process of the SVM, a cascade SVM has been proposed [13]. In this paper, we combine a feedback parallel SVM with the PSO parameter selection method.

Figure 3. Cascade SVM structure

As shown in Fig. 3, the parallel PSO-SVM structure proposed in this paper contains seven SVMs in three layers. The parameters of each SVM are selected by the PSO method. Given that the selected feature set is decomposed into four training subsets (TD1, TD2, TD3 and TD4), the process is executed as follows (a sketch follows the steps):

Step 1: Train the four subsets of the selected features, TD1, TD2, TD3 and TD4, with the first-layer SVMs and get the support vector sets SV1, SV2, SV3 and SV4, respectively.

Step 2: Combine SV1 with SV2, and SV3 with SV4, respectively, and pass them to the second layer.

Step 3: Train on the datasets obtained in Step 2 to get SV5 and SV6.

Step 4: Add SV5 to TD3 and TD4, and SV6 to TD1 and TD2.

Step 5: Repeat Step 1 until SV5' and SV6' are obtained.

Step 6: Combine SV5' and SV6' to train the final-layer SVM; if the differences between SV5 and SV5', and between SV6 and SV6', are less than a given number e, stop the algorithm.

Step 7: Return to Step 5 to start the training process again if the conditions in Step 6 are not satisfied.
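A minimal sketch of this feedback cascade, representing each training subset as a Python set of hashable samples; fit_svm, a callable that trains one PSO-tuned SVM and returns its support-vector set, is our placeholder, not an API from the paper.

```python
def train_cascade(TD1, TD2, TD3, TD4, fit_svm, e=0):
    """Feedback cascade of seven SVMs in three layers (Fig. 3), a sketch."""
    SV5_prev = SV6_prev = None
    while True:
        # Step 1: first layer, one SVM per (possibly augmented) subset
        SV1, SV2, SV3, SV4 = (fit_svm(d) for d in (TD1, TD2, TD3, TD4))
        # Steps 2-3: second layer on the combined support-vector sets
        SV5, SV6 = fit_svm(SV1 | SV2), fit_svm(SV3 | SV4)
        # Steps 6-7: stop once the second-layer sets have stabilized
        if SV5_prev is not None and \
           len(SV5 ^ SV5_prev) <= e and len(SV6 ^ SV6_prev) <= e:
            return fit_svm(SV5 | SV6)        # final-layer SVM
        SV5_prev, SV6_prev = SV5, SV6
        # Step 4: feed the second-layer support vectors back
        TD3, TD4 = TD3 | SV5, TD4 | SV5
        TD1, TD2 = TD1 | SV6, TD2 | SV6
```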
III. EXPERIMENT AND ANALYSIS

A. Dataset

The dataset consists of abdominal CT images of 80 patients (38 normal, 15 cysts, 24 hemangiomas and 7 HCCs) taken with a Philips CT scanner at Zhongshan hospital of Anshan city, China. Under the guidance of experienced physicians, 187 ROIs were selected. Among them, 62 are normal (C1), 40 are cysts (C2), 56 are hemangiomas (C3) and 29 are HCCs (C4). The details are shown in Table II.

TABLE II. STATISTICS OF HEALTHY CONTROLS AND PATIENTS

Class | Age max | Age min | Age averaged | Gender (male/female)
Healthy | 70 | 35 | 55.5 | 20/18
Cyst | 70 | 44 | 58.3 | 7/8
Hemangioma | 63 | 45 | 52.7 | 11/13
Liver cancer | 78 | 51 | 60.2 | 5/2
B. Feature Extraction and Selection

From each ROI, 116 features are extracted as described in Table I. The first-order histogram feature set includes 6 features; the gray level co-occurrence matrix feature set has 80 features (d = 1, 2, 4, 8; \theta = 0°, 45°, 90°, 135°); the gray-gradient co-occurrence matrix feature set contains 14 features; the fractal feature is represented by the box-counting dimension; and the Gabor feature set has 15 features.

After the F-score method, only the 20 features shown in Table III are left.

TABLE III. THE SELECTED FEATURES

Gray co-occurrence | Gray gradient | Gabor
Moment of inertia (d=1) | Energy | 0° entropy
Entropy (d=1) | Small gradient advantage | 0° variance
Correlation (d=1) | Inertia | 0° mean
Contrast (d=2) | Gray entropy | 45° mean
Moment of inertia (d=2) | Uniformity | 90° mean
Contrast (d=4) | | 135° mean
Contrast (d=8) | | Binary entropy
Contrast (d=12) | |

C. Recognition Results

The feature sets are put into the parallel PSO-SVM as input respectively. The recognition rate, also called accuracy, is generally defined as (12):

accuracy = \frac{N_{tp} + N_{tn}}{N_{tp} + N_{tn} + N_{fp} + N_{fn}},    (12)

where N_{tp} is the number of true positives, N_{fp} the number of false positives, N_{tn} the number of true negatives and N_{fn} the number of false negatives. In this paper, the recognition rate of class C_i is redefined as (13):

accuracy_{C_i} = \frac{N_{TC_i}}{N_{C_i}}, \quad i = 1, 2, 3, 4.    (13)

Table IV gives the recognition rate of each class C_i for each feature set. As Table IV shows, the recognition rate of normal tissue (C1) is generally high for every feature set, because the normal regions of the liver in the CT images are smooth and regular; compared with the other lesions, the normal regions are much easier to recognize. The highest recognition rate for HCC, 85.3%, is obtained with the fractal feature.

The final recognition rate is defined as (14):

accuracy = \frac{N_{TC_1} + N_{TC_2} + N_{TC_3} + N_{TC_4}}{N_{C_1} + N_{C_2} + N_{C_3} + N_{C_4}}.    (14)

Table V gives the comparison among different classifiers: Bayesian, traditional SVM, PSO-SVM and parallel PSO-SVM. From Table V, our proposed method clearly has obvious advantages. (A sketch of the rate computations follows.)
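Continuing the numpy sketches, the per-class rates of (13) and the overall rate of (14) can be computed as follows; the integer class ids 1-4 are an assumed label convention.

```python
def recognition_rates(y_true, y_pred, classes=(1, 2, 3, 4)):
    """Per-class recognition rates, eq. (13), and the overall rate, eq. (14)."""
    per_class = {c: float(np.mean(y_pred[y_true == c] == c)) for c in classes}
    overall = float(np.mean(y_pred == y_true))   # eq. (14)
    return per_class, overall
```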
TABLE IV. THE RECOGNITION RATE (%) OF EACH CLASS FOR DIFFERENT FEATURE SETS

Feature set | First order | Gray co-occurrence | Gray gradient | Fractal | Gabor
Train set C1 | 100 | 100 | 100 | 97.5 | 97.5
Train set C2 | 100 | 100 | 100 | 96.3 | 97.2
Train set C3 | 87.5 | 81.2 | 71.8 | 87.5 | 88.2
Train set C4 | 87.5 | 81.2 | 71.8 | 87.5 | 88.2
Validation set C1 | 90.6 | 71.8 | 65.6 | 86.2 | 90.6
Validation set C2 | 90.5 | 71.5 | 65.6 | 86.2 | 90.5
Validation set C3 | 85.5 | 66.5 | 64.1 | 82 | 85.5
Validation set C4 | 85.3 | 62.4 | 63 | 80.4 | 85.3
Test set C1 | 87.5 | 62.5 | 53.1 | 81.5 | 87.5
Test set C2 | 87.5 | 62.5 | 53.1 | 82.3 | 87.5
Test set C3 | 72.5 | 60.1 | 52.2 | 80.2 | 72.5
Test set C4 | 72.5 | 60.1 | 52.2 | 85.3 | 72.5

TABLE V. THE RECOGNITION RATE OF DIFFERENT CLASSIFIERS

Classifier | Recognition rate
Bayesian | 66.5%
Traditional SVM | 70.5%
PSO-SVM | 73.2%
Parallel PSO-SVM | 76.7%
IV. CONCLUSION

In this study, we have proposed a new algorithm for liver cancer diagnosis. The new algorithm employs multiple features, including first-order gray-scale features, the gray level co-occurrence matrix, the gray-gradient co-occurrence matrix, Gabor features and the fractal dimension, extracted by statistical, structural and spectral methods. The experimental results have shown that these features are considerably effective for depicting abdominal CT images. In addition, we use the PSO method to optimize the parameters of the SVM. A three-layer parallel PSO-SVM is designed to classify liver cancer, and the experiments show that the improved method raises the recognition rate to some degree.

ACKNOWLEDGEMENT

This research is supported by the National Natural Science Foundation of China (No. 60973071) and the Liaoning Province Natural Science Foundation (No. 20092004).

REFERENCES

[1] H. M. Taylor and P. R. Ros, "Hepatic imaging: an overview," Radiologic Clinics of North America, 1998, 36(2): 237-245.
[2] A. H. Mir, M. Hanmandlu and S. N. Tandon, "Texture analysis of CT images," IEEE Engineering in Medicine and Biology Magazine, 1995, 14(6): 781-786.
[3] M. Kretowski, J. Bezy-Wendling and D. Duda, "Classification of hepatic metastasis in enhanced CT images by dipolar decision tree," Proc. of XIX GRETSI Conference, 2003, 327-330.
[4] Jing Wang and Jimao Wei, "An improved SVM applied in cancer diagnosis," Computer Applications, 2006, 26(2): 212-220.
[5] Ling Li and Ming Sun, "Liver fibrosis CT image classification based on SVM," Peking Biomedical Engineering, 2007, 26(1): 200-212.
[6] Yin Li, Jian Sun, Chi-Keung Tang and Heung-Yeung Shum, "Lazy Snapping," ACM Transactions on Graphics, 2004, 23(3): 303-308 (Proceedings of ACM SIGGRAPH 2004).
[7] R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artificial Intelligence, 1997, 97: 273-324.
[8] C.-F. Lin and S.-D. Wang, "Fuzzy support vector machines," IEEE Transactions on Neural Networks, 2002, 13(2): 464-471.
[9] J. J. Gagnepain and C. Roques-Carmes, "Fractal approach to two-dimensional and three-dimensional surface roughness," Wear, 1986, 109: 119-126.
[10] Yi-Wei Chen and Chih-Jen Lin, "Combining SVMs with various feature selection strategies," in Feature Extraction: Foundations and Applications, Springer, 2006.
[11] V. N. Vapnik, Statistical Learning Theory, New York: Wiley, 1998.
[12] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," Proc. IEEE Int. Conf. on Evolutionary Computation, Piscataway, NJ: IEEE Service Center, 1998: 69-73.
[13] Jian-Pei Zhang, Zhong-Wei Li and Jing Yang, "A parallel SVM training algorithm on large-scale classification problems," Proc. International Conference on Machine Learning and Cybernetics (ICMLC), 2005.

