Abstract—A content-based image retrieval (CBIR) system helps users retrieve relevant images based on their contents. A reliable content-based feature extraction technique is therefore required to effectively extract most of the information from the images. The important elements include the texture, colour, intensity or shape of the object inside an image. CBIR, when used in medical applications, can help medical experts in their diagnosis, for example by retrieving images of similar diseases and by monitoring a patient's progress. In this paper, several feature extraction techniques are explored to see their effectiveness in retrieving medical images. The techniques are Gabor Transform, Discrete Wavelet Frame, Hu Moment Invariants, Fourier Descriptor, Gray Level Histogram and Gray Level Coherence Vector. Experiments are conducted on 3,032 CT images of the human brain and promising results are reported.

I. INTRODUCTION
The advancement in computer technologies produces huge volumes of multimedia data, specifically image data. As a result, the study of content-based image retrieval (CBIR) has emerged and become an active research area. A CBIR system is used to find images based on their visual content, and the retrieved results have a visually similar appearance to the query image. In order to describe the image content, low-level arithmetic features are extracted from the image itself [3]. Numerous elements such as texture, motion, colour, intensity and shape have been proposed and used to quantitatively describe visual information [4]. The image features generated using specific algorithms are then stored and maintained in a separate database.
A number of previous works have addressed different techniques for the image elements used in image retrieval. In 2002, Nikolou et al. [8] proposed a fractal scanning technique for colour image retrieval with the Discrete Cosine Transform (DCT) and Fourier descriptors as feature extraction techniques. Qiang et al. [10] developed a CBIR framework based on global colour moments in the HSV colour space. Later, in 2006, a user concept pattern learning framework was presented by Chen et al. [9] for CBIR using HSV colour features and the Daubechies wavelet transformation. Work on CBIR for medical applications was rare in the past; however, it has recently been gaining a lot of attention due to the large number of medical images in digital format generated by medical institutions every day. In 2003, Zheng et al. [7] developed a content-based pathology image retrieval system based on the image feature types of colour histogram, texture representation by Gabor transform, Fourier coefficients, and wavelet coefficients. Recently, Rahman et al. [13] proposed a CBIR framework which consists of machine learning methods for image prefiltering, statistical similarity matching and a relevance feedback scheme for medical images. The features are extracted using a colour moment descriptor, a gray-level co-occurrence matrix as texture characteristics, and shape features based on Canny edge detection.
In this paper, a detailed comparison of the accuracy of different feature extraction techniques is discussed and tested on medical images. The motivation is to find the best technique for use in further medical image retrieval applications. The techniques cover the texture, colour and shape elements: the texture techniques are Gabor Transform and Discrete Wavelet Frame, the colour techniques are Gray Level Histogram and Gray Level Coherence Vector, and the shape techniques are Hu Moment Invariants and Fourier Descriptors.
This paper is organized as follows. The next section briefly describes the feature extraction techniques used in the comparison, followed by a review of the medical images used in the experiment in Section III. The experimental setup is discussed in Section IV, followed by the results and discussion in Section V. Finally, the conclusion is presented in Section VI.

II. REVIEW OF FEATURE EXTRACTION TECHNIQUES
A. Gabor Transform (texture)
The Gabor transform is a technique that extracts texture information from an image. The one used in this research is the two-dimensional Gabor function proposed by Manjunath and Ma [1]. Expanding the mother Gabor wavelet forms a complete but non-orthogonal basis set. The non-orthogonality implies that there will be redundant information between different resolutions in the output data. This redundancy is reduced in [1] with the following strategy: let Ul and Uh denote the lower and upper frequencies of interest, S be the total number of scales, and K be the total number of orientations (or translations) to be computed. The design strategy is then to ensure that the half-peak magnitude support of the filter
Authorized licensed use limited to: UNIVERSIDADE FEDERAL DO RIO GRANDE DO NORTE. Downloaded on May 23, 2009 at 09:01 from IEEE Xplore. Restrictions apply.
responses in the frequency spectrum touch each other as shown in Fig. 1, for S = 4 and K = 6. The Gabor transform is then defined by:

$$W_{mn}(x, y) = \int I(x_1, y_1)\, g_{mn}^{*}(x - x_1, y - y_1)\, dx_1\, dy_1 \qquad (1)$$

where * indicates the complex conjugate and m, n are integers, m = 1,2,…,S and n = 1,2,…,K. The Gabor transform therefore produces S×K output images, and the energy within each image is used as a feature, resulting in an S×K-dimensional feature vector where S=6 and K=4.

$$y = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} W(i, j) \qquad (2)$$

$$a(k) = \sum_{n=0}^{N-1} z(n) \exp\left[ \frac{-j 2 \pi k n}{N} \right], \quad 0 \le k \le N-1 \qquad (4)$$

The complex coefficients a(k) are called the Fourier Descriptors of the boundary. A 64-point Discrete Fourier Transform (DFT) is used, which results in a 64-dimensional feature vector.

E. Gray Level Histogram (intensity)
Colour histograms are the most common way of describing low-level colour properties of images. Since medical images are only available in grayscale, a simpler histogram called
gray level histogram (GLH) is used to describe the intensity of the gray level colour map. A GLH is represented by a set of bins, where each bin represents one or more levels of gray intensity. It is obtained by counting the number of pixels that fall into each bin based on their intensity [6]. Fig. 2 shows an example of GLH for different images using a 64-bin histogram.
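The bin-counting step described above can be illustrated with a minimal sketch. It assumes 8-bit gray levels (256 levels), so that each of the 64 bins covers four consecutive levels; the bit depth of the CT images is not stated here, and the function name and the toy pixel values are hypothetical.

```python
from typing import List, Sequence


def gray_level_histogram(pixels: Sequence[int], bins: int = 64,
                         levels: int = 256) -> List[int]:
    """Count the pixels falling into each of `bins` equal-width intensity bins.

    `levels` is the number of gray levels (assumed 256 here); each bin
    covers levels // bins consecutive gray levels.
    """
    if levels % bins != 0:
        raise ValueError("levels must be a multiple of bins")
    width = levels // bins
    hist = [0] * bins
    for p in pixels:
        if not 0 <= p < levels:
            raise ValueError(f"gray level {p} is outside [0, {levels})")
        hist[p // width] += 1
    return hist


# Toy image flattened row by row (illustrative values only).
image = [0, 3, 4, 7, 128, 129, 255, 255]
glh = gray_level_histogram(image)
```

The resulting list has one count per bin and can be used directly as the 64-dimensional GLH feature vector.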
III. MEDICAL IMAGES
The medical image collection used in this experiment is provided by Putrajaya Hospital, Malaysia. It consists of 3,032 computed tomography (CT) images of the human brain in the DICOM image format. The images have a resolution of 512x512 and were scanned from 95 patients, with each patient having between 15 and 56 scans. To quantitatively evaluate the performance of the texture- and intensity-based feature extraction techniques, the images are divided into 4 different classes according to visual similarity, called general image classification. The ability of the system to retrieve images from the same class as the query image indicates the accuracy of the feature extraction techniques. To evaluate the performance of the shape-based techniques, a different classification is used, called shape image classification. This classification is based on the head contour obtained by segmenting the head from its background using the fuzzy C-means clustering algorithm. Note that the shape-based feature extraction techniques employed only search for a similar shape of the head itself, and not the shapes of the different objects inside it. Visually, the shape of the head can also be classified into 4 different classes. Some examples of the images for both the general and shape classifications are shown in Tables I and II. For general image classification, 638 of the 3,032 images in the database belong to Class 1, 808 to Class 2, 1,134 to Class 3 and 452 to Class 4. For shape image classification, 293 belong to Class 1, 1,012 to Class 2, 981 to Class 3 and 746 to Class 4.

TABLE I
GENERAL IMAGE CLASSIFICATION

used (Table III). These vectors are stored in separate feature vector databases according to the technique. During the online stage, the feature vector of the query image is computed using one selected technique and is compared with all feature vectors in the feature vector database of that technique. A distance metric is used to compute the similarity between the feature vector of the query image and those of the database images. A small distance implies that the corresponding image is similar to the query image, and vice versa. Images are then retrieved in order of increasing distance. The flow of this process is shown in Fig. 4.

TABLE III
DIFFERENT LENGTH OF FEATURE VECTORS

Technique                      FV Length
Gabor Transform                24
Discrete Wavelet Frame         10
Hu Moment Invariants           7
Fourier Descriptor             64
Gray Level Histogram           64
Gray Level Coherence Vector    128

Fig. 3. Offline feature extraction stage
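The online stage can be sketched as follows: the query feature vector is compared with every stored feature vector and the database indices are returned in order of increasing distance. This is a minimal illustration, not the implementation used in the experiments. The function names are hypothetical, the population form of the standard deviation is an assumption, and replacing a zero deviation by 1.0 is an added safeguard. The normalized Euclidean variant here takes a square root, which does not change the ranking.

```python
import math
import statistics
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector, Vector, Optional[Vector]], float]


def feature_std(database: Sequence[Vector]) -> List[float]:
    """Per-feature standard deviation over the feature vector database."""
    columns = zip(*database)
    # A constant feature has zero deviation; use 1.0 to avoid division by zero.
    return [statistics.pstdev(col) or 1.0 for col in columns]


def manhattan(x: Vector, y: Vector, sigma: Optional[Vector] = None) -> float:
    """Manhattan distance; pass `sigma` for the normalized variant."""
    s = sigma if sigma is not None else [1.0] * len(x)
    return sum(abs(a - b) / k for a, b, k in zip(x, y, s))


def euclidean(x: Vector, y: Vector, sigma: Optional[Vector] = None) -> float:
    """Euclidean distance; pass `sigma` for the normalized variant."""
    s = sigma if sigma is not None else [1.0] * len(x)
    return math.sqrt(sum(((a - b) / k) ** 2 for a, b, k in zip(x, y, s)))


def rank_by_distance(query: Vector, database: Sequence[Vector],
                     metric: Metric = euclidean,
                     sigma: Optional[Vector] = None) -> List[int]:
    """Return database indices sorted by increasing distance to the query."""
    distances = [metric(query, fv, sigma) for fv in database]
    return sorted(range(len(database)), key=distances.__getitem__)


# Toy 2-D feature vectors (illustrative values only).
database = [[0, 0], [3, 4], [1, 1], [10, 0]]
query = [0, 0]
plain = rank_by_distance(query, database, metric=manhattan)
normalized = rank_by_distance(query, database, metric=manhattan,
                              sigma=feature_std(database))
```

In this toy example the plain and normalized rankings differ, which shows why the choice of metric has to be tested per technique.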
$$\text{Manhattan} = \sum_{k=1}^{n} \left| x_{ik} - x_{jk} \right| \qquad (6)$$

$$\text{Normalized Euclidean} = \sum_{k=1}^{n} \left( \frac{x_{ik} - x_{jk}}{\sigma_k} \right)^{2} \qquad (7)$$

$$\text{Normalized Manhattan} = \sum_{k=1}^{n} \left| \frac{x_{ik} - x_{jk}}{\sigma_k} \right| \qquad (8)$$

where σk is the standard deviation of the kth feature in the FV database.
After the performance of each individual technique is obtained, the best technique from each of the intensity, texture and shape features is chosen for a further experiment. The selected techniques are combined to see if the retrieval performance can be further improved. This is achieved by adding up the dissimilarity measures of the combined techniques without affecting the relative distances between the query image and the database images of each technique.

V. RESULTS AND DISCUSSIONS
In the initial setup of the experiment, eight images were selected (two from each class) to test all the techniques with all distance metrics, in order to find the most suitable metric for each technique. The results are summarized in Table IV. It was found that the feature extraction techniques perform differently under each distance metric. Gabor transform shows the best results using the normalized Manhattan metric, Discrete Wavelet Frame performs best using the normalized Euclidean metric, Fourier descriptor presents high accuracy using the Manhattan metric, while Hu moment invariants, gray level histogram and gray level coherence vector show high accuracy when using the Euclidean distance metric.

TABLE IV
RETRIEVAL ACCURACY FOR ALL FEATURE EXTRACTION TECHNIQUES TESTED USING ALL DISTANCE METRICS

                 Average retrieval accuracy of 8 query images for TOP 50 (%)
Technique        E        M        NE       NM
Gabor            58.25    58       59.25    59.5
DWF              43.25    44.5     67.75    66.25
Hu moment        62       55.75    51       45.25
Fourier Desc.    89.75    91.75    90.75    88
GLH              71.25    70.75    69.25    71.5
GLCV             74       73.25    25       25

E=Euclidean, M=Manhattan, NE=Normalized Euclidean and NM=Normalized Manhattan

To evaluate the performance of each feature extraction technique, all 3,032 CT brain images are used as queries one by one to check whether similar images from the same patient and class are retrieved successfully. It is easier to analyze the similarity per patient instead of over all images in the database because the total number of images to be considered is then smaller. This operation involves the hybrid-based image retrieval from our previous work in [12], where PatientID is used as input in a text query and is combined with CBIR. As an example, Patient14 (ID 156027) has 25 scans: 6 of them belong to Class 1, 5 to Class 2, 10 to Class 3 and 4 to Class 4. The first image from Class 1 is selected as the query image, the keyword ‘156027’ is used with the field ‘Patient ID’, and the system retrieves all 25 images of Patient14 in order of increasing distance. Perfect retrieval for this query image would be the retrieval of the other five images from Class 1 (excluding the query image itself) within the top 5 ranked images, followed by images from the other classes. The average recognition rate is used to evaluate the retrieval accuracy, and its calculation is shown in (9). For example, if there are N images in Class 1 for a particular patient, then the average recognition rate is computed from the number of images from the same class within the top N retrieved images.

$$\text{Average recognition rate} = \frac{\text{No. of images found from the same class within top } N \text{ retrieved images}}{\text{No. of images per class, } N} \qquad (9)$$

Retrieval is performed on all CT brain images from the 95 patients in the image database using the six feature extraction techniques, each with its suitable distance metric as discussed previously. Table V summarizes the retrieval accuracy per class for the texture and intensity techniques, and Table VI for the two shape techniques. From Table V, the recognition rate of all techniques is satisfactory for Class 1 and Class 3, and to some extent Class 4, but not for Class 2. The reason is that classification was done visually, based on human vision, and some ambiguity is present where images from Class 2 could also be classified as Class 1 or Class 3. Hence, it affects the overall accuracy of Class 2 images. Overall, the average recognition rate per patient is recorded to be above 70% for the texture and intensity extraction techniques. From Table VI, the accuracy of both shape techniques varies according to class. Retrieval for Class 3 recorded the highest accuracy. However, the accuracy is substantially lower compared to the other four techniques from Table V because shape features are represented based on the contour of the segmented object in the image, and this depends a lot on the segmentation accuracy itself. This problem can be addressed with better segmentation and shape extraction techniques to distinguish images of the human brain in CT scans.

TABLE V
PERCENTAGE OF RETRIEVAL ACCURACY FOR TEXTURE AND INTENSITY TECHNIQUES

             % Average for 95 patients per class        % Average
Technique    Class 1   Class 2   Class 3   Class 4      per patient
Gabor        78.45     59.48     84.43     76.78        74.51
DWF          80.72     64.91     84.44     78.98        77.05
GLH          86.98     55.31     88.26     60.88        72.02
GLCV         85.48     56.11     87.49     62.84        72.21

TABLE VI
PERCENTAGE OF RETRIEVAL ACCURACY FOR SHAPE TECHNIQUES

                      % Average for 95 patients per class        % Average
Technique             Class 1   Class 2   Class 3   Class 4      per patient
Hu moment             52.75     43.14     67.1      32.22        48.53
Fourier Descriptor    47.55     67.05     75.35     74.28        67.39
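The recognition rate in (9) can be sketched for a single query as follows. This is a minimal illustration under one assumption taken from the Patient14 example: the query image itself is excluded from the ranked list, so N is the number of other images of the same class. The function name and the example ranking are hypothetical and are not measured results.

```python
from typing import Sequence


def recognition_rate(ranked_labels: Sequence[int], query_label: int) -> float:
    """Fraction of the top-N retrieved images that share the query's class.

    `ranked_labels` holds the class labels of the retrieved images in order
    of increasing distance, with the query image itself excluded. N is the
    number of images of the query's class in that list.
    """
    n = sum(1 for label in ranked_labels if label == query_label)
    if n == 0:
        return 0.0
    hits = sum(1 for label in ranked_labels[:n] if label == query_label)
    return hits / n


# Hypothetical ranking for a Class 1 query of a patient with 25 scans
# (6 in Class 1, 5 in Class 2, 10 in Class 3, 4 in Class 4): the 24 other
# scans are listed by increasing distance, with one Class 3 image ranked
# ahead of a Class 1 image.
ranked = [1, 1, 3, 1, 1, 1] + [2] * 5 + [3] * 9 + [4] * 4
rate = recognition_rate(ranked, query_label=1)

# A perfect retrieval places all five other Class 1 images first.
perfect = recognition_rate([1] * 5 + [2] * 5 + [3] * 10 + [4] * 4, query_label=1)
```

Averaging this value over all queries of a class, and then over patients, would give per-class and per-patient figures of the kind reported in Tables V and VI.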
Another observation is that the average recognition rate varies from patient to patient, depending on the difficulty of visually classifying the images into a particular class. Certain patients recorded a low average recognition rate because some of their images can be visually classified into 2 classes, hence affecting the recognition measurement. The average recognition rate for all 95 patients using all techniques is presented in Fig. 5.
The experiments were conducted using Matlab 7.3 on an Intel Core Duo 2.0GHz processor with 1GB memory. The average time taken for each technique to complete the retrieval process is summarized in Table VII. For the texture element, the average time recorded for both techniques is the same, but DWF gives better retrieval accuracy. As for gray level intensity, the histogram technique performs the retrieval process much faster than the coherence vector technique, even though both techniques yield similar retrieval accuracy. Between the two tested shape features, the Hu moment technique can execute the retrieval up to three times faster than Fourier Descriptor (FD). However, considering the poor retrieval accuracy of the Hu moment technique, FD is chosen for further analysis in the experiments on combining feature extraction techniques.

TABLE VII
AVERAGE RETRIEVAL TIME FOR EACH TECHNIQUE

Technique                      Average time taken
Gabor Transform                5s – 6s
Discrete Wavelet Frame         5s – 6s
Gray Level Histogram           11s – 12s
Gray Level Coherence Vector    19s – 20s
Hu moment                      3s – 4s
Fourier Descriptor             10s – 11s

Since the combination of feature extraction techniques involves summing the techniques' dissimilarity measures, the distance of one technique must not be too dominant compared to the others, so a small modification is needed. For the gray level histogram (GLH), the distance is very large, ranging from 10^7 up to 10^9, because the feature vector consists of the number of pixels in each bin. To normalize the GLH features, the number of pixels in each bin is divided by the total number of pixels in all bins. For the FD technique, it was found that the Manhattan distance metric produces a very small measurement, hence Normalized Euclidean is used as a replacement because it gives the second highest accuracy in Table IV. There is no change to the DWF technique. Results for the combinations of techniques are shown in Table VIII. From the table, it can be seen that the combination of the DWF and FD techniques gives the highest average retrieval rate. The pattern of accuracy per class is similar to the one in Table V, where Class 1 and Class 3 give better results, as does Class 4, but Class 2 is somewhat low. Combining DWF with either GLH or FD performs retrieval faster than the other combinations. As expected, more time is needed to compute the combination of all three techniques. It is also interesting to note that combining all three techniques does not further improve the retrieval accuracy; in fact, it performs worse than all the combinations of two techniques. This shows that we cannot simply bundle together many feature extraction methods in order to get higher accuracy.

TABLE VIII
PERCENTAGE OF RETRIEVAL ACCURACY FOR MULTI-TECHNIQUES

                  % Average for 95 patients per class        % Average      Average
Techniques        Class 1   Class 2   Class 3   Class 4      per patient    time taken
DWF + GLH         88.04     57.93     88.19     66.37        77.81          10s – 11s
DWF + FD          82.03     64.94     84.46     78.98        80.60          11s – 12s
GLH + FD          86.97     55.31     88.26     60.89        75.34          15s – 16s
DWF + GLH + FD    69.17     61.37     87.96     72.93        74.01          18s – 19s

To ease the work of testing and analyzing the images, a graphical user interface (GUI) was developed in the Matlab environment. It consists of two main panels, the Query Panel (left side) and the Result Panel (right side). The system is meant to be a flexible hybrid retrieval system, so in the Query Panel the type of retrieval can be selected: content-based (CBIR), text-based (TBIR) or both (Hybrid). The accuracy of the system can be analyzed visually by looking at the Result Panel. Fig. 6 shows an example of retrieval results obtained by the DWF texture extraction technique. As can be seen, visually similar scans are retrieved accordingly.
Fig. 6. GUI for image retrieval system
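The combination step discussed in Section V, in which the GLH counts are normalized and the dissimilarity measures of the selected techniques are summed, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the distance values are made up, and each technique is assumed to have already produced one distance per database image.

```python
from typing import Dict, List, Sequence


def normalize_histogram(counts: Sequence[float]) -> List[float]:
    """Divide each bin count by the total number of pixels in all bins."""
    total = float(sum(counts))
    if total == 0:
        raise ValueError("histogram contains no pixels")
    return [c / total for c in counts]


def combined_ranking(distances: Dict[str, Sequence[float]]) -> List[int]:
    """Sum the per-technique distances of each database image and rank
    the images by increasing total distance."""
    lengths = {len(d) for d in distances.values()}
    if len(lengths) != 1:
        raise ValueError("every technique must score the same set of images")
    n = lengths.pop()
    totals = [sum(d[i] for d in distances.values()) for i in range(n)]
    return sorted(range(n), key=totals.__getitem__)


# Illustrative distances from one query to three database images.
per_technique = {
    "DWF": [0.2, 0.9, 0.4],
    "FD": [0.6, 0.1, 0.3],
}
ranking = combined_ranking(per_technique)
normalized_glh = normalize_histogram([2, 6, 2])
```

Because the sum weights every technique equally, the per-technique distances need comparable ranges before they are added, which is the reason for the GLH normalization.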