
An Application of The Empirical Mode Decomposition to Brain Magnetic Resonance Images Classification

Salim Lahmiri
Department of Computer Science, University of Quebec at Montreal, Montreal, Canada
lahmiri.salim@courrier.uqam.ca

Mounir Boukadoum
Department of Computer Science, University of Quebec at Montreal, Montreal, Canada
boukadoum.mounir.@uqam.ca

Abstract: A new approach to distinguish normal from abnormal brain magnetic resonance (MR) images is presented. First, the empirical mode decomposition (EMD) is applied to brain MR images to obtain high-frequency intrinsic mode functions (IMFs) from which features are extracted. Then, an entropy-based selection process is used to identify the most informative and non-redundant features from each IMF before classification by support vector machines (SVM). Validation of the approach on an MR image database covering Alzheimer's disease, glioma, herpes encephalitis, metastatic bronchogenic carcinoma, multiple sclerosis, and normal condition shows its effectiveness, as well as slightly better classification accuracy than discrete wavelet transform-based alternatives. However, the EMD approach is substantially more time consuming.

I. INTRODUCTION

Brain magnetic resonance (MR) imaging has become the primary imaging modality for the early diagnosis and monitoring of brain pathologies. However, existing MR imaging systems cannot yet perform automatic diagnosis and thereby relieve physicians of part of their diagnostic and treatment work. As a result, several studies have been conducted to automate the classification of two-dimensional brain MR images. The typical approach starts by filtering the image to remove unwanted components. Then, textural information is extracted from the filtered image to form a feature vector. Finally, the latter is processed by a classifier to label the MR image. Currently, the various techniques reported in the literature appear to converge on using the discrete wavelet transform (DWT) as the first step, prior to feature extraction [1]-[4].

For instance, Chaplot et al. [1] used wavelets as input to a support vector machine (SVM) to classify brain MR images as either normal or abnormal. The discrete wavelet transform was applied to each MR image to obtain level-2 Daub4 wavelet approximation coefficients, which were fed to the SVM classifier. The approach was validated on a dataset of fifty-two brain MR images, of which six were normal and forty-six were of brains affected by Alzheimer's disease. Four normal images and six abnormal images were randomly chosen for training the SVM, and forty-two images were used for testing. The obtained correct classification rate was 98% when using a polynomial or Gaussian kernel for the SVM.

El-Dahshan et al. [2] used the Haar wavelet to extract third-level approximation and detail coefficients from MR brain images. Then, principal component analysis (PCA) was employed to reduce the number of features to seven, and the ensuing vector was fed to the classifier. The approach was validated on a dataset that included ten normal images and sixty abnormal images of various brain diseases. The sizes of the learning and test sets were not indicated. The experimental results showed 97% classification accuracy when using the backpropagation artificial neural network (BPNN) classifier trained with the Levenberg-Marquardt numerical algorithm, and 98% when using the k-nearest-neighbour algorithm.

Zhang et al. [3] also applied Haar wavelets and third-level decomposition to extract the low-frequency coefficients from the MR images. Then, PCA was used to reduce the dimension of the feature space to 19 principal components. The reduced feature set was then fed to a feed-forward neural network whose parameters were optimized using adaptive chaotic particle swarm optimization. The validation procedure consisted of k-fold stratified cross validation. The proposed system achieved a 98.75% correct classification rate.

More recently, Zhang et al. [4] employed a feed-forward backpropagation artificial neural network trained with the scaled conjugate gradient algorithm to classify MR brain images as normal or abnormal. The PCA analysis was the same, leading again to a 19-component feature vector. The validation dataset consisted of 18 normal and 48 abnormal images of several brain pathologies.
The data were randomly divided into two equal-sized learning and test sets, and the obtained correct classification rates on both training and test images were 100%. However, an attempt to reproduce these results independently yielded lower accuracy: in [5], Lahmiri and Boukadoum presented a methodology based on edge extraction and subsequent analysis by means of fractal dimension and high-order statistics of the spectral energy distribution. Using leave-one-out cross validation on the same data used by [4], the classification accuracy obtained by support vector machines with a quadratic kernel was 91.78% ± 0.01. In comparison, applying the feature extraction technique based on the DWT and PCA yielded 82.69% ± 0.08 accuracy.

Although useful for image decomposition, the DWT also has drawbacks [6][7]. For instance, wavelet analysis uses a pre-determined set of filters derived from a mother wavelet, and the extracted features are based on the shifting and scaling of this specific mother wavelet. Also, the number of components and their temporal resolutions are defined a priori. These features of the DWT may lead to suboptimal feature vectors at the input of the classifier, as no formal rules exist on how to choose a mother wavelet. Using wavelet packets may help circumvent these drawbacks, but at the expense of increased algorithmic complexity and with no guaranteed result.

This paper presents an alternative approach that replaces the DWT with the empirical mode decomposition (EMD) technique [8]. The EMD is a multi-resolution decomposition technique introduced by Huang et al. [8] to perform joint space-spatial frequency signal representation by successive removal of elemental signals called intrinsic mode functions (IMFs). The EMD is adaptive (a fully data-driven method) and is suitable for non-linear, non-stationary data analysis [9]. Contrary to the DWT, the EMD does not use any pre-determined filter or wavelet functions [10][11]; instead, it uses the signal itself to derive the basis functions. There appears to be no previous work that has considered using the EMD for feature extraction from brain MR images. In this paper, the EMD-extracted and DWT-extracted features are compared in terms of the ensuing classification accuracy of normal and abnormal brain MR images.

The paper is organized as follows. The methodology is presented in Section II. Simulation results and discussion are provided in Section III. Finally, we conclude in Section IV.

978-1-4673-4900-0/13/$31.00 2013 IEEE

II. SYSTEM ARCHITECTURE

The overall methodology is based on four stages. First, the gray-level brain MR image is converted to a double-precision image before being analyzed with the EMD. Second, statistical features are computed from the obtained IMFs. Third, the most discriminant features are selected based on an entropy statistic. Finally, support vector machines are employed to classify normal versus abnormal images. The overall methodology is described in more detail next.

A. The empirical mode decomposition

The key feature of the EMD is to decompose a signal into a sum of functions such that the following two conditions are satisfied for each one of them: 1) it has the same number of zero crossings and extrema; 2) it is symmetric with respect to its local mean. The two conditions allow computing the so-called intrinsic mode functions (IMFs). The IMFs are found at scales that range from fine to coarse by an iterative procedure referred to as the sifting algorithm. For a signal s(t), the EMD decomposition is performed as follows [10]:

(a) Find all the local maxima, M_i, i = 1, 2, ..., and minima, m_k, k = 1, 2, ..., in s(t).
(b) Compute by interpolation (for instance, a cubic spline) the upper and lower envelopes of the signal: M(t) = f_M(M_i, t) and m(t) = f_m(m_k, t).
(c) Compute the envelope mean e(t) as the average of the upper and lower envelopes: e(t) = (M(t) + m(t)) / 2.
(d) Compute the details as d(t) = s(t) - e(t).
(e) Check the properties of d(t):
(e.1) If d(t) meets the conditions on the number of extrema and symmetry stated previously, compute the ith IMF as IMF(t) = d(t) and replace s(t) with the residual r(t) = s(t) - IMF(t).
(e.2) If d(t) is not an IMF, then replace s(t) with the detail: s(t) = d(t).
(f) Iterate steps (a) to (e) until the residual r(t) satisfies a given stopping criterion.

In the end, s(t) is expressed as follows:

$$ s(t) = \sum_{j=1}^{N} \mathrm{IMF}_j(t) + r_N(t) \tag{1} $$

where N is the number of IMFs, which are nearly orthogonal to each other and all have nearly zero means, and r_N(t) is the final residue, which is the low-frequency trend of the signal s(t). Usually, the standard deviation (SD) computed from two consecutive sifting results is used as the criterion to stop the sifting process by limiting the SD size [7][9] such that:

$$ SD(k) = \frac{\sum_{t=0}^{T} \left| d_{k-1}(t) - d_k(t) \right|^2}{\sum_{t=0}^{T} d_{k-1}^2(t)} < \varepsilon \tag{2} $$

where k is the index of the kth difference between the signal s(t) and the envelope mean e(t). The term ε is a predetermined stopping value.

The 2D signal EMD follows the same process as the 1D signal EMD, and 2D IMFs are defined in the same manner. However, to analyze a brain MR image, the image is transformed into a one-dimensional signal to reduce the processing time. For instance, let an image I be a two-dimensional n × n array of pixels. The corresponding image Inew is viewed as a vector with n^2 coordinates that result from the concatenation of successive rows of the image, from left to right and from top to bottom. Then, the EMD algorithm is applied to the vector Inew. Finally, the processed image is reconstructed from the latter vector. For instance, Fig. 1(a) shows an example of a double-precision abnormal brain MR image, and Fig. 1(b) shows its representation as a one-dimensional signal. Finally, Fig. 2 shows twelve IMFs related to the latter.

B. Features extraction and selection

Five statistical textural features are extracted from the first four intrinsic mode functions which account for most of the


for each IMF, only the two features with the lowest entropies are selected as inputs to the classifier. We also used the DWT as an alternative feature extraction method in order to assess the efficiency of our EMD-based approach. The Daubechies-4 wavelet was used as mother function, and the level of decomposition was set to two. The five features were extracted from the low frequency subband as in [1-4].
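The five statistical features referred to above (mean, standard deviation, smoothness, third moment, uniformity) can be sketched as follows. The paper defers to [12] for the definitions, so the exact normalization used by the authors is not known here. This sketch assumes sample moments, smoothness computed as 1 - 1/(1 + variance), and uniformity taken as the sum of squared histogram probabilities. The function name `texture_features` and the bin count `n_bins` are hypothetical.

```python
import math


def texture_features(values, n_bins=32):
    """Five statistical descriptors of a 1-D signal (e.g. one IMF).

    Assumed definitions (the paper defers to its reference [12]):
    smoothness = 1 - 1/(1 + variance); uniformity = sum of squared
    histogram probabilities over n_bins equal-width bins.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    smoothness = 1.0 - 1.0 / (1.0 + var)

    # Equal-width histogram; a constant signal falls entirely in bin 0.
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)
        counts[k] += 1
    uniformity = sum((c / n) ** 2 for c in counts)

    return {
        "mean": mean,
        "std": math.sqrt(var),
        "smoothness": smoothness,
        "third_moment": third,
        "uniformity": uniformity,
    }


# Toy signal, not data from the paper.
features = texture_features([0.0, 0.0, 1.0, 1.0], n_bins=2)
```

In the pipeline described in this section, such a function would be called once per retained IMF (or once on the DWT low-frequency subband), giving five candidate features per component before entropy-based selection.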

Fig.1 Example of a double precision abnormal image (a) and its one dimensional signal representation (b).
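The row-wise conversion between the image and its one-dimensional representation illustrated in Fig. 1 can be sketched as below. This is a minimal illustration on a toy 3 × 3 array, not the authors' Matlab code; the function names `flatten_rows` and `unflatten_rows` are hypothetical.

```python
def flatten_rows(image):
    """Concatenate rows left to right, top to bottom, into one list."""
    return [pixel for row in image for pixel in row]


def unflatten_rows(signal, n_cols):
    """Rebuild the 2-D image from its row-wise 1-D representation."""
    if n_cols <= 0 or len(signal) % n_cols != 0:
        raise ValueError("signal length must be a multiple of n_cols")
    return [signal[i:i + n_cols] for i in range(0, len(signal), n_cols)]


# Toy 3 x 3 "image" with double-precision pixel values.
image = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9]]
signal = flatten_rows(image)
restored = unflatten_rows(signal, n_cols=3)
```

Because the mapping is a pure reordering, any component extracted from the 1-D signal can be reshaped back to image form with the same `n_cols`.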

C. Classification and performance measurement

The support vector machine (SVM) [15] with the quadratic kernel function was adopted for classification. The leave-one-out cross-validation method (LOOM) was used to estimate the generalization capability of the proposed approach, and the average and standard deviation of the correct classification rate were computed.

III. VERIFICATION AND VALIDATION


A collection of 62 axial, T2-weighted MR brain images of size 256 × 256 was downloaded from the Harvard Medical School webpage [16]. The set included 10 images of normal brains and 52 of abnormal brains: 9 with Alzheimer's disease, 13 with glioma, 8 with herpes encephalitis, 8 with metastatic bronchogenic carcinoma, and 14 with multiple sclerosis. This database was also used by [1][2][5], which allows the results to be compared. An example of each category of MR image is shown in Fig. 4.


Fig.2 Example of extracted IMFs, in this case for the signal in Fig.1.b.
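The sifting procedure of Section II.A that produces IMFs such as those in Fig. 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation. Piecewise-linear interpolation is swapped in for the cubic spline to keep the sketch dependency-free, the envelopes are pinned to the first and last samples, and the sifting of each IMF stops on the SD criterion of Eq. (2) rather than on an explicit check of the two IMF conditions. The names `emd`, `sift`, `eps`, `max_iter`, and `max_imfs`, and their default values, are hypothetical.

```python
import math


def local_extrema(s):
    """Indices of strict interior local maxima and minima (step a)."""
    maxima, minima = [], []
    for i in range(1, len(s) - 1):
        if s[i - 1] < s[i] > s[i + 1]:
            maxima.append(i)
        elif s[i - 1] > s[i] < s[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(s, idx):
    """Piecewise-linear envelope through s at idx (step b, simplified).

    Assumption: both ends are pinned to the signal's end samples.
    """
    knots = [0] + idx + [len(s) - 1]
    env = [0.0] * len(s)
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * s[a] + w * s[b]
    return env


def sift(s, eps=0.2, max_iter=50):
    """Extract one IMF candidate by repeated sifting (steps a to e)."""
    d_prev = list(s)
    for _ in range(max_iter):
        maxima, minima = local_extrema(d_prev)
        if not maxima or not minima:
            break
        upper = envelope(d_prev, maxima)
        lower = envelope(d_prev, minima)
        # Steps c and d: subtract the envelope mean from the signal.
        d = [x - (u + l) / 2 for x, u, l in zip(d_prev, upper, lower)]
        # SD criterion between consecutive sifting results, as in Eq. (2).
        num = sum((a - b) ** 2 for a, b in zip(d_prev, d))
        den = sum(a * a for a in d_prev)
        d_prev = d
        if den == 0 or num / den < eps:
            break
    return d_prev


def emd(s, max_imfs=12, eps=0.2):
    """Return (list of IMFs, final residual) for the 1-D signal s."""
    residual = list(s)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if not maxima or not minima:  # no oscillation left (step f)
            break
        imf = sift(residual, eps)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Toy signal: fast and slow oscillations plus a linear trend.
signal = [math.sin(2 * math.pi * t / 8)
          + 0.5 * math.sin(2 * math.pi * t / 64)
          + 0.01 * t for t in range(256)]
imfs, residual = emd(signal)

# Eq. (1): the IMFs plus the residual add back up to the signal.
recon = list(residual)
for imf in imfs:
    recon = [a + b for a, b in zip(recon, imf)]
```

By construction the decomposition is exactly additive, so `recon` matches `signal` up to floating-point rounding regardless of the interpolation scheme; the choice of interpolation affects only how the energy is split among the IMFs.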

high frequency elemental signals in comparison with the remaining IMFs. We hypothesize that the high-frequency elemental signals capture sudden changes in the biological tissue that may serve to characterize it. The statistical features computed for each of the four IMFs are the mean, standard deviation (std.dev), smoothness, third moment, and uniformity (see [12] for definitions).

In order to select the relevant features to be fed to the classifiers, the computed statistics are ranked based on their respective class-conditional entropies. The latter measure the level of uncertainty of each feature as a class descriptor: the smaller the entropy, the more discriminatory the related feature. The conditional entropy is considered here because of its effectiveness in removing irrelevant and/or redundant inputs and its ability to overcome the issue of data sparseness [13]. Also, using it may be more efficient for feature selection than principal component analysis as used in [2][3][5]. This is because the conditional entropy does not make assumptions of linearity about the data. Moreover, PCA seeks only data directions that maximize variance (these do not necessarily maximize class information) and is sensitive to data scaling. The entropy of a feature x after observing the class (output) y as normal or abnormal is defined as [14]:

$$ H(x \mid y) = -\sum_{j} P(y_j) \sum_{i} P(x_i \mid y_j) \log P(x_i \mid y_j) \tag{3} $$
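Eq. (3) can be evaluated on discrete samples as sketched below. The paper does not state how the probabilities were estimated, so this sketch assumes the feature values have already been discretized into bins and uses relative frequencies with the natural logarithm. The function name `conditional_entropy` and the toy inputs are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(feature_bins, labels):
    """H(x|y) = -sum_y P(y) sum_x P(x|y) log P(x|y), from frequencies.

    feature_bins: discretized feature values, one per image (assumed).
    labels: class of each image, e.g. "normal" or "abnormal".
    """
    n = len(labels)
    by_class = defaultdict(list)
    for x, y in zip(feature_bins, labels):
        by_class[y].append(x)

    h = 0.0
    for xs in by_class.values():
        p_y = len(xs) / n  # prior probability of the class
        for count in Counter(xs).values():
            p_x_given_y = count / len(xs)
            h -= p_y * p_x_given_y * math.log(p_x_given_y)
    return h


labels = ["normal", "normal", "abnormal", "abnormal"]
# A feature whose bin is fixed within each class leaves no uncertainty.
h_separating = conditional_entropy([0, 0, 1, 1], labels)
# A feature spread evenly within each class is maximally uncertain.
h_spread = conditional_entropy([0, 1, 0, 1], labels)
```

Under this estimator the first toy feature gives zero entropy and the second gives log 2, which matches the reading that a lower conditional entropy marks a more discriminatory feature.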

Fig. 4. Example of brain MR images: normal, Alzheimer's disease, glioma, herpes encephalitis, metastatic bronchogenic carcinoma, and multiple sclerosis.

where P(y) is the prior probability of each class y (the output, normal or abnormal), and P(x|y) is the probability of each value x of the feature X given the class y. Once computed for all features, the conditional entropy values are sorted in ascending order and,

Based on the entropy statistics (Table 1), the selected features were uniformity and standard deviation for the IMF 1 image, and mean and third moment for the IMF 2, IMF 3, and IMF 4 images. As a result, these features formed the input to the SVMs. The classification results obtained with LOOM validation (Table 2) show that combining the selected features from all the IMFs leads to a higher classification accuracy than using those of a single IMF: the classification rate was 98.98% (± 0.01) in this case. This suggests that the information relevant to distinguishing normal from abnormal images is spread across several high-frequency IMFs. Based on the conditional entropy statistic used for feature selection, it also appears that the mean and third moment (skewness) are important distinctive features, since they are selected in IMF 2, 3, and 4. Since skewness relates to the asymmetry of the pixel distribution, this suggests that the latter plays an important role, a finding supported by Chaplot et al. [1], who also suggested that it is important to consider symmetry in axial brain MR images when classifying normal and abnormal brain images.

As for the DWT, Table 1 shows that uniformity, standard deviation, and the mean are associated with low conditional entropy values, while smoothness and the third moment are associated with higher ones. As a result, the former three are the features fed to the SVMs, leading to a 97.28% ± 0.03 correct classification rate when using the DWT as the first step of MR image processing.
Table 1. Conditional entropy statistics for different MR image features using IMFs and DWT; the lowest values (selected features) are marked with *

                IMF 1      IMF 2      IMF 3      IMF 4      DWT
Mean            0.7372     6.8958*    6.1045*    6.9172*    8.3117*
Std.dev         0.7173*    38.1726    34.6755    42.1231    8.1969*
Smoothness      0.7284     15.4080    15.6369    7.6838     43.8745
3rd moment      0.8157     6.5919*    5.8319*    7.2537*    48.7563
Uniformity      0.6334*    14.1315    12.5948    14.9499    0.3985*

Table 2. Obtained classification accuracy (normal vs. all abnormal) as a function of using the features from given IMFs

                IMF-1      IMF-2      IMF-3      IMF-4      All
Average         87.35%     91.31%     80.81%     84.83%     98.98%
Std. dev.       0.0451     0.0388     0.1461     0.0579     0.0121
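The selection rule applied to Table 1 (keep the two lowest-entropy features per IMF, and the three lowest for the DWT as reported in the text) can be checked directly against the tabulated values. The sketch below copies the entropies from Table 1; the helper name `select_lowest` and the dictionary layout are hypothetical.

```python
# Conditional entropies copied from Table 1 (rows: features, columns: source).
TABLE_1 = {
    "IMF 1": {"mean": 0.7372, "std": 0.7173, "smoothness": 0.7284,
              "third_moment": 0.8157, "uniformity": 0.6334},
    "IMF 2": {"mean": 6.8958, "std": 38.1726, "smoothness": 15.4080,
              "third_moment": 6.5919, "uniformity": 14.1315},
    "IMF 3": {"mean": 6.1045, "std": 34.6755, "smoothness": 15.6369,
              "third_moment": 5.8319, "uniformity": 12.5948},
    "IMF 4": {"mean": 6.9172, "std": 42.1231, "smoothness": 7.6838,
              "third_moment": 7.2537, "uniformity": 14.9499},
    "DWT": {"mean": 8.3117, "std": 8.1969, "smoothness": 43.8745,
            "third_moment": 48.7563, "uniformity": 0.3985},
}


def select_lowest(entropies, k):
    """Names of the k features with the smallest conditional entropy."""
    return sorted(entropies, key=entropies.get)[:k]


# Two features per IMF; three for the DWT, as stated in the text.
selected = {name: select_lowest(values, 3 if name == "DWT" else 2)
            for name, values in TABLE_1.items()}
```

Sorting in ascending order reproduces the choices reported in the text: uniformity and standard deviation for IMF 1, mean and third moment for IMF 2 to IMF 4, and uniformity, standard deviation, and mean for the DWT.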

normal versus abnormal brain magnetic resonance images. The ensuing computer-aided diagnosis system uses the EMD to decompose the brain image into components, an entropy statistic to select the most significant features in them, and support vector machines to perform classification. The experiments on a database of 10 images of normal brains and 52 of abnormal brains indicate that the EMD-based features are effective in characterizing brain images and lead to slightly better performance by an SVM classifier than their DWT-based counterparts. Overall, the proposed method can yield an accurate and robust classification system. However, it is substantially more time consuming than the DWT-based alternative. This weakness can be mitigated by optimizing the approximation stage of the EMD algorithm.

REFERENCES
[1] S. Chaplot, L.M. Patnaik, and N.R. Jagannathan, "Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network," Biomedical Signal Processing and Control, vol. 1, 86-92, 2006.
[2] E.-S.A. El-Dahshan, T. Hosny, and A.-B.M. Salem, "Hybrid intelligent techniques for MRI brain images classification," Digital Signal Processing, vol. 20, 433-441, 2010.
[3] Y. Zhang, S. Wang, and L. Wu, "A novel method for magnetic resonance brain image classification based on adaptive chaotic PSO," Progress In Electromagnetics Research, vol. 109, 325-343, 2010.
[4] Y. Zhang, Z. Dong, L. Wu, and S. Wang, "A hybrid method for MRI brain image classification," Expert Systems with Applications, vol. 38, 10049-10053, 2011.
[5] S. Lahmiri and M. Boukadoum, "Fractal dimension and high order statistics of spectral energy distributions as features for pathology detection in brain MR images," Proc. IEEE NEWCAS, 293-296, 2012.
[6] W. Liu, W. Xu, and L. Li, "Medical image retrieval based on bidimensional empirical mode decomposition," IEEE International Conference on Bioinformatics and Bioengineering, 641-646, 2007.
[7] Y. Chen, Y. Jiang, C. Wang, D. Wang, W. Li, and G. Zhai, "A novel multi-focus images fusion method based on bidimensional empirical mode decomposition," IEEE International Congress On Image and Signal Processing, 1-4, 2009.
[8] N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.Ch. Yen, C.C. Tung, and H.H. Liu, "The empirical mode decomposition and the Hilbert spectrum for non-linear and non-stationary time series analysis," Proc. R. Soc. Lond. A 454, 903-995, 1998.
[9] J.C. Nunes, Y. Bouaoune, E. Delechelle, O. Niang, and Ph. Bunel, "Image analysis by bidimensional empirical mode decomposition," Image and Vision Computing, vol. 21, 1019-1026, 2003.
[10] W. Liu, W. Xu, and L. Li, "Medical image retrieval based on bidimensional empirical mode decomposition," IEEE International Conference on Bioinformatics and Bioengineering, 641-646, 2007.
[11] Y. Chen, Y. Jiang, C. Wang, D. Wang, W. Li, and G. Zhai, "A novel multi-focus images fusion method based on bidimensional empirical mode decomposition," IEEE International Congress On Image and Signal Processing, 1-4, 2009.
[12] R.C. Gonzalez, R.E. Woods, and S.L. Eddins, Digital Image Processing Using Matlab, Pearson Prentice Hall, 2003.
[13] S. Zhu, D. Wang, K. Yu, T. Li, and Y. Gong, "Feature selection for gene expression using model-based entropy," IEEE Transactions on Computational Biology and Bioinformatics, vol. 7, 25-36, 2010.
[14] C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[15] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, 1995.
[16] http://med.harvard.edu/AANLIB/

In order to evaluate the efficiency of our approach further, Table 3 compares our classification accuracies with recent results [1][2][5] that used the same MR image database, and also with another study [3] that used a different database. The comparison shows that our system provides high classification accuracy in comparison with [1][2]. It also shows the great potential of the EMD processing approach in comparison with [3][5]. It should be noted that the leave-one-out validation method is used in our study, while [3] and [4] respectively employed five-fold cross validation and a fixed 50%-50% (training-testing) data partition.
Table 3. Comparison of the results with the literature

Approach     Processing                            Data partition             Accuracy (%)
[1]          DWT + SOM                             Fixed                      94
[1]          DWT + SVM                             Fixed                      98
[2]          DWT + PCA + neural networks           Fixed                      98
[2]          DWT + PCA + k-NN                      Fixed                      97
[3]          DWT + PCA + neural networks           5-fold cross validation    98.75
[4]          DWT + PCA + neural networks           Fixed                      100
[5]          Fractal + Spectral Energy + SVM       LOOM                       99.9 ± 0.006
This work    DWT + Entropy + SVM                   LOOM                       97.28 ± 0.0329
This work    EMD + Entropy + SVM                   LOOM                       98.98 ± 0.0121
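The leave-one-out scheme (LOOM) behind the "This work" rows can be sketched as a generic loop. Several assumptions apply. A nearest-centroid classifier stands in for the quadratic-kernel SVM used in the paper, purely to keep the sketch dependency-free. The standard deviation is taken over the per-fold hit indicators, since the paper does not state how its reported standard deviation was aggregated. The feature vectors are toy values, not data from the study, and all function names are hypothetical.

```python
import math


def train_centroids(xs, ys):
    """Stand-in classifier (NOT the paper's SVM): one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Label of the closest centroid in squared Euclidean distance."""
    return min(model,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))


def leave_one_out(samples, labels, train, predict):
    """Hold out each sample once; return (mean, std) of the hit rate."""
    hits = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        hits.append(1.0 if predict(model, samples[i]) == labels[i] else 0.0)
    mean = sum(hits) / len(hits)
    std = math.sqrt(sum((h - mean) ** 2 for h in hits) / len(hits))
    return mean, std


# Toy, well-separated feature vectors (illustrative only).
samples = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [1.0, 1.1], [0.9, 1.2], [1.1, 0.9]]
labels = ["normal"] * 3 + ["abnormal"] * 3
accuracy, spread = leave_one_out(samples, labels,
                                 train_centroids, predict_centroid)
```

Passing `train` and `predict` as arguments keeps the validation loop independent of the classifier, so an SVM implementation could be substituted without changing `leave_one_out`.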

Finally, we note that the average EMD-based processing time of a brain MR image is about 60 minutes using Matlab on a PC station with a 1.5 GHz Core Duo CPU. This is due to the computational complexity incurred by using cubic spline interpolation in the approximation stage (step (b) in the algorithm). The computational complexity is then O((M-1)^3), M being the signal size. In comparison, the computational complexity of the DWT is only O(M), leading to a processing time of less than a second [5].

IV. CONCLUSION


In this paper, we explored the potential of the empirical mode decomposition as a basis of feature extraction to classify
