Abstract
High information redundancy and correlation in face images result in inefficiencies when such images are used directly for recognition. In this paper, discrete cosine transforms are used to reduce image information redundancy, because only a subset of the transform coefficients is necessary to preserve the most important facial features, such as hair outline, eyes and mouth. We demonstrate experimentally that when DCT coefficients are fed into a backpropagation neural network for classification, a high recognition rate can be achieved using a very small proportion of the transform coefficients. This makes DCT-based face recognition much faster than other approaches.
Key words: Face recognition, neural networks, feature extraction, discrete cosine transform.
1 Introduction
High information redundancy present in face images results in inefficiencies when these images are used directly for recognition, identification and classification. Typically, one builds a computational model to transform pixel images into face features, which should generally be robust to variations of illumination, scale and orientation, and then uses these features for recognition. Several techniques for facial feature extraction[3] have been proposed. They include methods based on geometrical features, statistical features[12], feature points[11, 14, 30] and neural networks[17, 18, 21, 31, 32].
The geometrical approach represents faces in terms of structural measures that include parameters such as ratios of distances, angles and areas between elementary features such as eyes, nose and mouth, or facial templates such as nose width and length, mouth position and chin type[1]. The features are then used to recognise unknown images, mainly by matching to the nearest neighbour in the stored database, although neural networks have also been used by some researchers as a nonlinear classifier[20]. The system performance depends on the normalisation used to determine the head location, translation, rotation and scale.
Statistical features are usually generated by algebraic methods such as principal components analysis (PCA)[15, 16, 27], the closely related Karhunen-Loeve transform[10, 28], or singular value decomposition[8]. These features take the form of a set of orthogonal bases such as principal components, or eigenvectors (referred to as eigenfaces). Once the eigenvectors are chosen, any image in the gallery (the set of training images) can be approximately reconstructed as a linear combination of eigenfaces, and its components are stored in memory. For an unknown image, the components are calculated by projecting it onto the face space (the space spanned by the eigenfaces) and looking for the closest match.
The well known method of Gabor jets with graph matching and the Dynamic Link Architecture (DLA) extracts feature points from a face image by finding the points with optimal Gabor filter responses. Recognition is carried out by graph matching to find the closest stored graph in the database[11, 14]. As the matching process has to compare an image with all the faces in the database, its recognition cost might be too expensive to be practical for large databases and real-time applications.
As for neural network recognition systems, due to the difficulty of selecting a representation that could capture features robustly, most approaches avoid the feature extraction procedure by feeding the pixel images directly to neural networks, making use of the ability of neural networks as an information processing tool[18]. Nevertheless, Lawrence et al.[13] applied a self-organising map (SOM) as a feature extractor and then exploited the generated features as the input of a convolutional neural network for recognition, an architecture very similar to the neocognitron[19]. Training either the SOM or the convolutional neural network is tremendously expensive computationally.
Recognition systems are mainly compared on the basis of their recognition rate, training time and recognition time. The recognition rate is the most important index of a recognition system. The shorter the training time, the more resources are available for other techniques that improve performance: for example, instead of one MLP one can apply ensemble or bootstrapping techniques to reduce the generalisation error, or let the system update itself after misclassifications. Real-time applications demand a short recognition time; for example, one cannot ask everybody to wait for minutes to pass through an access control system.
Of the face recognition techniques mentioned above, most have to build a database storing the features of the known faces, in order to compare the features extracted from an unknown image with those in the database, while others, like the convolutional neural network approach, are too expensive to train.
In this paper, we present a new approach to high speed face recognition using discrete cosine transforms (DCTs) as a means of information packing. Redundancy removal to facilitate data processing and image categorisation is not a new idea[2, 24]. For example, the Karhunen-Loeve transform (KLT) is widely applied to image analysis for dimensionality reduction[3, 9, 15, 28]. Although the KLT, not the DCT, is the optimal transform in an information packing sense[9], we apply discrete cosine transforms rather than KLTs to face images. This is because the KLT is data dependent and obtaining the KLT basis images is, in general, a nontrivial computational task, whereas fast algorithms exist to compute 2D discrete cosine transforms[4], which makes DCTs extremely competitive in terms of computational complexity.
2 Information Packing
2.1 Discrete cosine transform
The cosine transform, like the Fourier transform, uses sinusoidal basis functions. The difference is that the cosine transform basis functions are not complex; they use only cosine functions, without sine functions. The discrete cosine transform of an N x M image f(x, y) is defined by
C(u, v) = \frac{2}{\sqrt{NM}}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} f(x, y)\, \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2M}\right], \qquad (1)

for u = 0, 1, \ldots, N-1 and v = 0, 1, \ldots, M-1, where \alpha(w) = 1/\sqrt{2} for w = 0 and \alpha(w) = 1 otherwise.
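For concreteness, the following minimal Python sketch (illustrative code, not part of the original system) implements this definition directly; the library routine scipy.fft.dctn with norm='ortho' computes the same orthonormal 2D DCT-II far more efficiently and is used here as a cross-check.

```python
import numpy as np
from scipy.fft import dctn

def dct2(f):
    """Direct implementation of the 2D DCT definition above (O(N^2 M^2))."""
    N, M = f.shape
    # Basis matrices: Cx[u, x] = alpha(u) * sqrt(2/N) * cos((2x+1) u pi / (2N))
    Cx = np.sqrt(2.0 / N) * np.cos(
        np.pi * np.outer(np.arange(N), 2 * np.arange(N) + 1) / (2 * N))
    Cy = np.sqrt(2.0 / M) * np.cos(
        np.pi * np.outer(np.arange(M), 2 * np.arange(M) + 1) / (2 * M))
    Cx[0] /= np.sqrt(2.0)  # alpha(0) = 1/sqrt(2)
    Cy[0] /= np.sqrt(2.0)
    return Cx @ f @ Cy.T   # sums over x, then over y

img = np.random.rand(92, 112)                            # stand-in face image
assert np.allclose(dct2(img), dctn(img, norm='ortho'))   # fast equivalent
```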
The information packing properties of the DCT have proved to be of such practical value that it has become an international standard, adopted in the Joint Photographic Experts Group (JPEG) image compression method. In this standard, a two-dimensional DCT is applied to 8 x 8 blocks of pixels in the image, and the 64 (8 x 8 = 64) coefficients produced by the DCT are then quantized to provide the final compression. DCTs have also been successfully used to generate keys for image retrieval from a large database[26].
Compared to other transforms, the DCT has the advantages that it has been implemented in a single integrated circuit (because of its input independence), that it packs the most information into the fewest coefficients for most natural images, and that it minimizes the blocklike appearance, called blocking artifact, that results when the boundaries between subimages become visible. This last property is particularly important in comparisons with the other sinusoidal transforms.
Another advantage of the DCT is that, for most real world images, the DCT coefficients turn out to be very small in magnitude, especially as u and v approach the image/subimage width and height respectively. Truncating, or removing, these small coefficients from the representation introduces only small errors in the reconstructed image. For some image classes (e.g. faces), most of the information is carried by the coefficients with small u and v (i.e., the upper-left corner of Figure 1(b)). This characteristic simplifies optimum coefficient selection for applications such as image recognition.
As shown in Figure 1(b), the DCT coefficients with large magnitudes (lighter pixels) are mainly concentrated in the upper-left corner, corresponding to the low spatial frequency DCT components of the image. The remaining coefficients are very small (almost black). Figures 2(a), (b), (c), (d) and (e) show reconstructions of Figure 1(a) using 35, 100, 500, 1000 and 2500 DCT coefficients respectively (from a total of 92 x 112 = 10,304 available DCT coefficients). In each case, the coefficients were selected by starting at the top left and scanning the specified number of coefficients in the order illustrated in Figure 1(c). The reconstructed images were obtained by setting the remaining coefficients to zero before taking the inverse DCT. The errors of the reconstructed images with respect to the original image, and the percentages of DCT coefficients used, are given in Table 1. As Figure 2(a) illustrates, using only 35 coefficients, i.e. 0.34% of the full set, is sufficient to allow one to recognise the image as a face. The experiments in Section 5 demonstrate that these few coefficients in fact carry most of the information necessary for face recognition.
Figure 1: A 92 x 112 8-bit face image and the log magnitude of its discrete cosine transform.
Figure 2: Effect of increasing the number of coefficients on the reconstructed images: (a), (b), (c), (d) and (e) are the images reconstructed using 35, 100, 500, 1000 and 2500 coefficients of the discrete cosine transform of the image in Figure 1(a), respectively; (f), (g), (h), (i) and (j) are the corresponding scaled differences of the reconstructed images from the original image; the corresponding errors and the percentages of discrete cosine transform coefficients used are given in Table 1.
The choice of subimage size affects both the level of compression and the computational complexity. In general, the subimage size should be an integer power of 2, to simplify the computation of the subimage transform[4]. If the image dimensions are not divisible by the subimage size, the image may be zero-padded to the next multiple of that size.
Figures 3 and 4 illustrate graphically the impact of subimage size on reconstruction error. The data plotted were obtained by dividing the image of Figure 1(a) into subimages of size n x n, for n = 4, 8, 16, 32, 64, and then reconstructing the image using only 1/16 of the resulting coefficients. The corresponding numbers of blocks for these subimage sizes are 23 x 28, 12 x 14, 6 x 7, 3 x 4 and 2 x 2, respectively.
Generally, it is true that both the level of compression and the computational complexity increase as the subimage size increases[7]. However, in our case the reconstruction error reaches its optimum when the subimage size is 16 x 16. The reason is that our image size is not a power of 2, so some subimages have to be zero-padded, which may introduce reconstruction errors for large subimage sizes.
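A sketch of the subimage variant follows (illustrative; the paper selects coefficients within each block in the scan order of Figure 1(c), whereas for brevity this sketch keeps a square low-frequency corner of each block, which retains the same 1/16 fraction):

```python
import numpy as np
from scipy.fft import dctn, idctn

def blockwise_reconstruct(image, n=16, keep=1.0 / 16):
    """Zero-pad to a multiple of the n x n subimage size, DCT each block,
    keep a low-frequency corner holding the `keep` fraction of coefficients,
    and invert."""
    h, w = image.shape
    H, W = -(-h // n) * n, -(-w // n) * n      # next multiples of n
    padded = np.zeros((H, W))
    padded[:h, :w] = image
    k = max(1, int(round(n * np.sqrt(keep))))  # k x k corner, e.g. 4x4 of 16x16
    out = np.empty_like(padded)
    for i in range(0, H, n):
        for j in range(0, W, n):
            C = dctn(padded[i:i + n, j:j + n], norm='ortho')
            C[k:, :] = 0.0
            C[:, k:] = 0.0                     # truncate high frequencies
            out[i:i + n, j:j + n] = idctn(C, norm='ortho')
    return out[:h, :w]
```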
Figure 3: Illustration of the effect of the subimage size on the reconstructed images: (a), (b), (c), (d) and (e) are the images reconstructed by dividing the image in Figure 1(a) into subimages of size 4 x 4, 8 x 8, 16 x 16, 32 x 32 and 64 x 64, respectively, and then retaining 1/16 (6.25%) of the DCT coefficients; (f), (g), (h), (i) and (j) are the corresponding scaled differences of the reconstructed images from the original image; the corresponding errors are illustrated in Table 4.
Figure 4: Mean square error (mse) and peak signal-to-noise ratio (psnr) of the reconstructed image versus subimage size (4 x 4, 8 x 8, 16 x 16, 32 x 32, 64 x 64).
3 System description
The main idea of our approach is to apply the DCT to reduce the information redundancy and to use the packed information for classification. For a face image, the system first computes the DCT coefficients of the image or its subimages, then selects only a limited number of the coefficients and feeds them as input into a classifier, here a multi-layer perceptron (MLP). DCT computation and subimage division are performed in the manner described in Section 2. A diagrammatic description of our DCT-based system for face recognition is shown in Figure 5.
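The feature extraction stage of this pipeline reduces, in the full-image case, to a few lines (an illustrative sketch reusing the zigzag_indices helper from Section 2.1; the MLP stage follows in Section 5):

```python
import numpy as np
from scipy.fft import dctn

def dct_features(image, n_coeffs=35):
    """DCT the whole image and keep the first n_coeffs coefficients scanned
    from the top-left corner (zigzag_indices as sketched in Section 2.1)."""
    C = dctn(image.astype(float), norm='ortho')
    return np.array([C[u, v] for (u, v) in zigzag_indices(*C.shape)[:n_coeffs]])
```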
The retained coefficients are then scaled, using bounds extended slightly beyond the global maximum and minimum over the training set. In this way, even for unknown images, the coefficients are converted roughly into [-1, 1]. To formulate this idea, suppose x_1, x_2, \ldots, x_n are the coefficients retained by the coefficient selection procedure and \{(x_1^{(j)}, x_2^{(j)}, \ldots, x_n^{(j)}) : j = 1, 2, \ldots, p\} are the coefficients retained from the training images, where n is the number of DCT coefficients retained and p is the number of training images. Then the upper bounds (b_i) and lower bounds (a_i) can be determined by
b_i = \lambda \max_{1 \le j \le p} x_i^{(j)}, \qquad i = 1, 2, \ldots, n, \qquad (5)

a_i = \lambda \min_{1 \le j \le p} x_i^{(j)}, \qquad i = 1, 2, \ldots, n, \qquad (6)

where \lambda > 1 is a factor to extend the bounds. Then the input vectors \{(z_1^{(j)}, z_2^{(j)}, \ldots, z_n^{(j)}) : j = 1, 2, \ldots, p\} of the neural network are determined by
z_i^{(j)} = \frac{2\,(x_i^{(j)} - a_i)}{b_i - a_i} - 1, \qquad i = 1, 2, \ldots, n. \qquad (7)
For an unknown image, the scaling factors obtained from the training set are applied to the retained coefficients to obtain the input vector to the MLP.
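In code, the scaling step might look as follows (a minimal sketch; the exact form of Equations 5 and 6 was reconstructed above under the stated assumption b_i = \lambda max, a_i = \lambda min):

```python
import numpy as np

def fit_bounds(X_train, lam=1.1):
    """X_train: p x n matrix of coefficients retained from the p training
    images. Bounds extended by lambda (Equations 5 and 6, as reconstructed)."""
    return lam * X_train.min(axis=0), lam * X_train.max(axis=0)

def scale(X, a, b):
    """Equation 7: map each coefficient roughly into [-1, 1]; the bounds
    fitted on the training set are reused for unknown images."""
    return 2.0 * (X - a) / (b - a) - 1.0
```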
4 ORL database
The ORL database was built at the Olivetti Research Laboratory in Cambridge, UK and is available free of charge from http://www.cam-orl.co.uk/facedatabase.html. The database consists of 400 different images, 10 for each of 40 distinct subjects. There are 4 female and 36 male subjects. For some subjects, the images were taken at different times, varying the lighting, facial expression (open/closed eyes, smiling/not smiling) and facial details (glasses/no glasses). All the images were taken against a dark homogeneous background, with the subjects in an upright, frontal position, with tolerance for limited side movement and limited tilt up to about 20 degrees. The size of each image is 92 x 112 pixels, with 256 grey levels per pixel. Thumbnails of all images can be viewed at http://www.cam-orl.co.uk/facesataglance.html.
5 Simulations
5.1 Experimental Setup
In the following experiments, the weights and biases of the MLP are initialised to random values in [-0.5, 0.5]. The three learning parameters \epsilon_{max}, \epsilon_0 and the decay used in Quickprop[5, 25] are set to 0.02, 0.008 and 0.0001, respectively. The maximum number of training epochs is 1000. The multiplication factor \lambda in Equations 5 and 6 is set to 1.1. No attempt was made to optimise these parameters. To reduce the influence of the presentation order of the training samples, the training samples were shuffled once, randomly, in every training loop. The neural network used is a multi-layer perceptron with one hidden layer. For the ORL database, the number of outputs of the MLP is always 40, and a winner-take-all strategy is used for classification.
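A sketch of the classifier stage is given below. Quickprop is not available in common libraries, so this stand-in uses scikit-learn's MLPClassifier with its default optimiser; the one-hidden-layer topology, the 40 outputs and the winner-take-all decision (argmax over the outputs) follow the setup above, while the data are random placeholders for the scaled DCT coefficients:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Placeholders for the scaled coefficients: 200 training vectors of
# n = 35 coefficients, labels 0..39 (5 images per subject).
X_train = rng.uniform(-1.0, 1.0, (200, 35))
y_train = np.repeat(np.arange(40), 5)

clf = MLPClassifier(hidden_layer_sizes=(75,), max_iter=1000, shuffle=True,
                    random_state=0).fit(X_train, y_train)
print(clf.predict(X_train[:5]))  # winner-take-all label for 5 inputs
```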
To allow comparisons, the same training and test set sizes are used as in [13, 22, 23], i.e., the first 5 images of each subject are the training images and the remaining 5 images are used for testing. Hence there are 200 training images and 200 test images in total, and no overlap exists between the training and test images. Due to the small size of the available data, a validation set was not used, and the best-so-far recognition rate on the test images is reported as the test recognition rate.
In each of the following statistical results, 30 random runs are carried out with randomly initialised weights and biases for each MLP. The T-tests are based on the 0.05 level of significance, which means the T-test statistic has to exceed 1.645 for experimental results to be classified as statistically different from the reference case (35 DCT coefficients, 75 hidden neurons).
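For illustration, such a test can be computed as below (with made-up per-run rates whose means and spreads echo Table 2; the paper's exact T-test formula is not given, so SciPy's two-sample test stands in):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Illustrative recognition rates (%) over 30 runs for the reference case
# (35 coefficients, 75 hidden neurons) and one variant from Table 2.
reference = rng.normal(92.87, 0.96, 30)
variant = rng.normal(91.55, 1.11, 30)

t, p = stats.ttest_ind(reference, variant)
# Significant at the 0.05 level (one-sided) when t exceeds 1.645.
print(f"t = {t:.3f}, significant: {abs(t) > 1.645}")
```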
number of     number of       mean    s.d.    max    min    T-test       significance
coefficients  hidden neurons  (%)             (%)    (%)    statistic
20            60              87.67   0.0094  89.0   86.0   21.203       X
25            60              91.55   0.0111  93.5   89.0   4.929        X
30            60              92.52   0.0085  94.0   90.5   1.513
30            75              92.53   0.0092  94.0   90.5   1.388
35            60              92.57   0.0097  94.5   90.5   1.204
35            75              92.87   0.0096  94.5   91.0   (reference)
40            60              91.03   0.0126  93.5   88.0   6.354        X
40            75              91.67   0.0113  93.5   89.5   4.440        X
45            75              91.22   0.0121  93.0   88.0   5.868        X
50            60              91.65   0.0164  94.5   88.0   3.524        X
50            75              92.30   0.0119  94.0   90.0   2.046        X
60            60              89.32   0.0126  92.0   87.0   12.270       X
60            75              89.15   0.0117  91.5   87.0   13.475       X
70            60              88.60   0.0130  92.0   86.5   14.502       X
70            75              88.63   0.0163  92.0   85.5   12.272       X
80            75              86.93   0.0153  89.5   84.0   18.004       X
90            75              84.82   0.0137  87.0   82.5   26.398       X
100           75              84.47   0.0178  87.5   81.0   22.799       X

Table 2: Recognition performance on the test images versus number of DCT coefficients retained (X marks results statistically different from the reference case).
Figure 6: Evolution of the recognition rate with 35 DCT coefficients retained and 75 hidden neurons: mean, maximum and minimum of the best-so-far recognition rate over the 30 runs, versus training epoch (0 to 1000).
Table 4 lists recognition performance as a function of subimage size, number of DCT coefficients retained and number of hidden neurons in the MLP. For 8 x 8 subimages only the 3 best performances are listed; a more extensive listing is provided for 16 x 16 subimages. Note that, to achieve recognition rates comparable to the full-image case, a larger number of coefficients and twice as many hidden neurons are necessary. Thus, for a given recognition rate, the use of subimages does not reduce the computational load. The reason may be that our original face image size is not very large.
5.5 Comparison of different recognition approaches based on the ORL database
The ORL database has been used to test several face recognition approaches[13, 22, 23]. The recognition rates of the best models and their training/classification times (where available) are shown in Table 5. The classification time of the Hidden Markov Model (P2D-HMM)[23] is based on a model with parameters (3-6-6-6-3,12,8,9,6) and the classification of full-resolution images, while 4-times smaller images (reduced to a resolution of 23 x 28 pixels by averaging 4 x 4 windows) were used in the Convolutional Neural Network (CNN) approach[13].
subimage  number of     hidden    mean    s.d.    max    min    T-test     significance
size      coefficients  neurons   (%)             (%)    (%)    statistic
8 x 8     168           100       89.93   0.0168  92.5   87.5   8.331      X
8 x 8     168           150       89.83   0.0135  92.5   87.5   10.021     X
8 x 8     168           250       88.62   0.0164  92.0   85.0   12.239     X
16 x 16   42            60        91.80   0.0145  94.0   88.5   3.373      X
16 x 16   42            75        91.97   0.0095  93.5   90.0   3.671      X
16 x 16   42            100       92.22   0.0103  94.5   89.5   2.540      X
16 x 16   42            120       92.10   0.0074  94.0   91.0   3.487      X
16 x 16   42            150       92.65   0.0123  95.0   90.5   0.774
16 x 16   84            100       92.15   0.0114  95.0   89.5   2.648      X
16 x 16   126           100       88.88   0.0178  91.5   84.0   10.799     X
16 x 16   168           100       89.93   0.0168  92.5   87.5   8.331      X
16 x 16   672           100       74.60   0.0221  78.5   70.0   41.585     X

Table 4: Recognition performance on the test images versus subimage size, number of DCT coefficients retained and number of hidden neurons in the MLP.
For comparison, the performance of an MLP applied directly to the similarly reduced images is also shown. This MLP has one hidden layer with 60 hidden neurons; the numbers of input and output neurons are 644 and 40, respectively. The other learning parameters and the target output and input vectors for training the MLP are the same as in Section 5.1.
                          recognition rate
approach                  best     mean            training time  recognition time  relative speed
HMM[22]                   87%      --              --             --                --
eigenfaces (PCA)[23]      90%      --              --             --                --
P2D-HMM[23]               95%      --              --             4 minutes†        1/192
convolutional NN[13]      98.5%‡   96.2% (0.004)§  4 hours¶       < 0.5 seconds¶    1

Table 5: Performance comparison of different approaches to recognition applied to the ORL database.
As shown in Table 5, the recognition rate of our DCT-based system is comparable to the best reported results (the CNN and the P2D-HMM). However, the training and classification times of our DCT-based method are much shorter than those of the other approaches. It is difficult to compare the speed of algorithms executed on different computing platforms because of the interactions of a large number of factors such as CPU speed, memory and cache size, compiler efficiency and even the programmer's skill. The relative recognition speeds given in Table 5 are extrapolated from benchmark evaluations using the MATLAB benchmark utility and the published SPEC CPUfp92 data (available from http://www.spec.org). According to these benchmarks, the 450MHz Pentium used in our experiments is approximately 3-4 times faster than an SGI Indy MIPS R4400 100MHz system, and approximately 9-10 times faster than a Sparc II. The right hand column of Table 5 shows the relative recognition speed of the various methods, normalised to account for these differences in processor speed. Note that the classification time of our DCT-based method is around 600 times faster than that of the convolutional neural network approach. The classification speed of the convolutional neural network approach is itself about 200 times faster than that of the P2D-HMM approach (Lawrence et al.[13] report the CNN to be 500 times faster than the P2D-HMM, but their comparison ignores processor speed differences).
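The normalisation can be reproduced with a small calculation (our arithmetic, assuming the CNN timing comes from the SGI platform and the P2D-HMM timing from the Sparc, and taking midpoints of the benchmark ratios quoted above):

```python
pentium_vs_sgi = 3.5     # midpoint of the quoted 3-4x
pentium_vs_sparc = 9.5   # midpoint of the quoted 9-10x
sgi_vs_sparc = pentium_vs_sparc / pentium_vs_sgi  # ~2.7x

raw_cnn_vs_hmm = 500     # speedup reported by Lawrence et al.[13]
print(round(raw_cnn_vs_hmm / sgi_vs_sparc))       # ~184, i.e. about 200x
```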
The above speed comparison is conservative. For example, note from Figure 6 that the length of our training runs (in epochs) could be reduced to a quarter or less without significant loss of classification performance. Furthermore, for the above comparison, the input images to the CNN and the MLP were at a quarter of the full resolution. For N x N images, the computational cost of these approaches is proportional to O(N^2); for comparison, the computational complexity of the fast DCT (where N is a power of 2) is only O(N log(N)).
6 Conclusions
In this paper, we have presented a very fast and efficient approach to face recognition which combines image compression and neural network techniques. The compression is achieved by applying a discrete cosine transform to the face images and truncating the unimportant components. For face images, the high frequency DCT components are negligibly small and can be truncated without losing the most important facial features, such as the outline and location of the hair, eyes and mouth. In our approach, the compressed transform coefficients, rather than the pixel data, are used for neural network classification. The experiments reported above demonstrate that, for the ORL database, using only 0.34% of all the available DCT coefficients produces a recognition rate comparable to the best results reported to date, while the processing speed is more than 2 orders of magnitude faster.
References
1. R. Brunelli and T. Poggio, "Face recognition: Features versus templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042-1052, 1993.
2. B. Chalmond and S. Girard, "Nonlinear modeling of scattered multivariate data and its application to shape change," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 5, pp. 422-432, 1999.
3. R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: A survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705-740, 1995.
4. C. Christopoulos, J. Bormans, A. Skodras, and J. Cornelis, "Efficient computation of the two-dimensional fast cosine transform," in SPIE Hybrid Image and Signal Processing IV, (Orlando, Florida, USA), pp. 229-237, 1994.
5. S. E. Fahlman, "An empirical study of learning speed in back-propagation networks," Technical Report CMU-CS-88-162, Department of Computer Science, Carnegie Mellon University, September 1988. ftp://ftp.cs.cmu.edu/afs/cs/project/connect/tr/qp-tr.ps.Z.
6. L. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Englewood Cliffs, NJ: Prentice Hall, 1994.
7. R. Gonzalez and R. Woods, Digital Image Processing. Reading, MA: Addison-Wesley, 1992.
8. Z. Hong, "Algebraic feature extraction of image for recognition," Pattern Recognition, vol. 24, pp. 211-219, 1991.
9. J. Karhunen and J. Joutsensalo, "Generalization of principal component analysis, optimization problems and neural networks," Neural Networks, vol. 8, no. 4, pp. 549-562, 1995.
10. M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990.
11. M. Lades, J. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. Wurtz, and W. Konen, "Distortion invariant object recognition in the dynamic link architecture," IEEE Transactions on Computers, vol. 42, no. 3, pp. 300-311, 1993.
12. A. Lanitis, C. Taylor, and T. Cootes, "Automatic interpretation and coding of face images using flexible models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 743-756, 1997.
13. S. Lawrence, C. Lee Giles, A. Tsoi, and A. Back, "Face recognition: A convolutional neural network approach," IEEE Transactions on Neural Networks, vol. 8, no. 1, pp. 98-113, 1997.
14. T. Maurer and C. von der Malsburg, "Tracking and learning graphs and pose on image sequences of faces," in Proceedings of the 2nd International Conference on Automatic Face and Gesture Recognition, (Killington, USA), pp. 176-181, IEEE Computer Society Press, 1996.
15. B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, 1997.
16. B. Moghaddam, W. Wahid, and A. Pentland, "Beyond eigenfaces: Probabilistic matching for face recognition," in Proceedings of the International Conference on Automatic Face and Gesture Recognition, (Nara, Japan), Apr. 1998.
17. C. Nebauer, "Evaluation of convolutional neural networks for visual recognition," IEEE Transactions on Neural Networks, vol. 9, no. 4, pp. 685-696, 1998.
18. A. J. O'Toole, H. Abdi, and D. Valentin, "Face recognition," in Handbook of Brain Theory and Neural Networks (M. Arbib, ed.), pp. 388-390, Cambridge (MA): M.I.T. Press, 1995.
19. Z. Pan, T. Sabisch, R. Adams, and H. Bolouri, "Staged training of neocognitron by evolutionary algorithms," in Proc. of the IEEE Congress on Evolutionary Computation (CEC'99), (Washington D.C., USA), pp. 1965-1972, 1999.
20. M. Reinders, R. Koch, and J. Gerbrands, "Locating facial features in image sequences using neural networks," in Proceedings of the 2nd International Conference on Automatic Face and Gesture Recognition, (Killington, USA), pp. 230-235, IEEE Computer Society Press, 1996.
21. H. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, 1998.
22. F. Samaria and A. Harter, "Parameterisation of a stochastic model for human face identification," in Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, (Sarasota, Florida, USA), Dec. 1994.
23. F. Samaria, Face Recognition using Hidden Markov Models. PhD thesis, Trinity College, University of Cambridge, Cambridge, 1994.
24. E. Saund, "Dimensionality-reduction using connectionist networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 3, pp. 304-314, 1989.
25. W. Schiffmann, M. Joost, and R. Werner, "Optimization of the backpropagation algorithm for training multilayer perceptrons," Technical Report, Institute of Physics, University of Koblenz, ftp://ftp.cis.ohio-state.edu/pub/neuroprose/schiff.bp_speedup.ps.Z, 1994.
26. M. Shneier and M. Abdel-Mottaleb, "Exploiting the JPEG compression scheme for image retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 849-853, 1996.
27. M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, pp. 71-86, 1991.
28. M. Uenohara and T. Kanade, "Use of Fourier and Karhunen-Loeve decomposition for fast pattern matching with a large set of templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 8, pp. 891-898, 1997.
29. S. E. Umbaugh, Computer Vision and Image Processing: A Practical Approach Using CVIPtools. Prentice-Hall International, Inc., 1998.
30. A. Yuille, D. Cohen, and P. Hallinan, "Feature extraction from faces using deformable templates," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 104-109, 1989.
31. J. Zhang, Y. Yan, and M. Lades, "Face recognition: Eigenface, elastic matching, and neural nets," Proceedings of the IEEE, vol. 85, no. 9, pp. 1423-1435, 1997.
32. M. Zhang and J. Fulcher, "Face recognition using artificial neural network group-based adaptive tolerance (GAT) trees," IEEE Transactions on Neural Networks, vol. 7, no. 3, pp. 555-567, 1996.