
Imperial College of Science,

Technology and Medicine

(University of London)

Department of Computing

Real-time face recognition and identification using 3D surface model

By

Yuet Chung Kwok

Submitted in partial fulfilment

of the requirements for the MSc

Degree in Computing Science of the

University of London and for the

Diploma of Imperial College of

Science, Technology and Medicine.

September 2002
Abstract
One of the most widely used and effective keys to characterizing a person's identity is the person's face. However, building a system that can recognize and identify faces in 2D images is a very challenging task, because a face's pose, illumination and expression change constantly in real-life situations; this requires a large amount of data in order to build a face database.

This project aims to develop a face recognition system that uses 3D face models instead of 2D images. Different people showing different facial expressions are captured with a real-time surface reconstruction system to build the 3D face models; a statistical model-based approach is then used to construct a face database and to classify each individual face. In the face recognition stage, the database is searched for the closest match to an input face based on surface and texture features, so that the person can be identified and recognized using the classification from the face database.

Acknowledgements
I would like to thank my project supervisor, Dr Daniel Rueckert, for constantly providing help whenever I faced difficulties during the project. He also guided me towards the problems to focus on when I was not sure what should be done during implementation.

I would also like to thank the people behind the Visualization Toolkit (VTK), which provided a very good environment for the implementation of the final software product. It also gave me a very good understanding of 3D visualization.

Table of Contents
Abstract
Acknowledgements
1. Introduction
    1.1 Project Motivation
    1.2 Face Recognition Scenario
    1.3 Project Scope
    1.4 Report Structure
2. Background
    2.1 Overview
    2.2 The Basics of Face Recognition
    2.3 Head Tracking Using Texture-Mapped 3D Models
    2.4 Statistical Shape Models of Appearance
    2.5 Registration of 3D Shapes – ICP
    2.6 Principal Component Analysis
3. Designs
    3.1 Decision on Method
    3.2 Program Architecture
        3.2.1 Inputs
        3.2.2 Data Processing
        3.2.3 Output
    3.3 Development Environment
    3.4 Design Methodology
4. Implementation – Part 1
    4.1 Overview
    4.2 The "File" Class
    4.3 The "ICP" Class
    4.4 The "PCA" Class
        4.4.1 Statistical Shapes Model
        4.4.2 Implementation of "PCA" class
    4.5 The "FaceMatch" class
5. Implementation – Part 2
    5.1 Overview
    5.2 Document/View architecture
    5.3 Active Face Recognition – The Application
    5.4 The ICP GUI – "ICP_Handler" Class
    5.5 The PCA GUI – "PCA_Handler" Class
    5.6 Face Recognition GUI – "Face_Handler" Class
6. Evaluation of Active Face Recognition
    6.1 The Application – How stable is it?
    6.2 ICP – Does It Really Work?
    6.3 PCA – Is it running slow?
    6.4 Face Recognition – Can it find the match for a face?
7. Conclusion
    7.1 Achievements
    7.2 Limitation
    7.3 Future Work
Bibliography
Appendix A – Formulation of ICP
Appendix B – Applying PCA when there are fewer samples than dimension
Appendix C – "File" Class Declaration
Appendix D – "ICP" Class Declaration
Appendix E – "PCA" Class Declaration
Appendix F – "FaceMatch" Class Declaration
Appendix G – User Guide for Active Face Recognition
1. Introduction
Many features can hold a person's identity, such as the face, fingerprints, signature, and more. But the face is probably the most widely used feature when it comes to recognition and identification, since recognizing faces is the first way humans learn to identify a person, a task performed by the brain.

1.1 Project Motivation

The problem with recognizing and identifying faces from 2D images is that 2D approaches are very inefficient in terms of storage space, since a large amount of data is required to build the face database, and difficulties arise during analysis because the faces have different poses, expressions and illumination in real-life situations. Recognizing and identifying faces using 3D surface models can overcome these problems. Different poses and illumination can be dealt with by changing the orientation and illumination properties of the face model object when rendering in 3D. This means only the 3D model data sets of a person's face with different facial expressions need to be stored, so the system can identify and recognize people effectively without sacrificing storage space. This is always welcome, as the need for storage space is never satisfied, even though the price of storage devices has fallen dramatically over the past few years.

With the recent increase in video card power and the fall in prices of both high-speed memory and processors, it has become apparent that real-time analysis of a large amount of 3D data is feasible with office/home PC technology. This helps software development in the computer vision area and gives research in the area a very solid ground from which to pick up its pace.

1.2 Face Recognition Scenario

To illustrate the usefulness of face recognition, consider an example scenario in which a subject's face is scanned by a special scanner camera. Identification can then take place once the data has been collected. If the subject happens to be a known threat and exists in the face database, security can be alerted and the subject dealt with before it is too late.

This kind of system has a large area of usage, such as intelligent buildings, airports, banks, etc. It is better to place the scanner some distance away from the target, however, since identification may take some time.

1.3 Project Scope

The main objective of this project is to develop a program capable of performing real-time face recognition and identification using 3D face models. For convenience, it shall also have the ability to store the derived face database for later use.

The project can then be broken down into two main sections, which were completed in the following order:

• Data Preparation – develop algorithms that cast the data sets into a manageable form in which the features of the face are distinctive, in order to build the face database.
• Recognition Stage – having built the face database, provide a means of classifying individuals in it, and use it to perform face recognition.

1.4 Report Structure

The remainder of the report is structured as follows:

• Chapter Two gives an insight into different face recognition methods that are suitable for this project, along with some theoretical explanation of the methods that have been used.

• Chapter Three explains the conceptual design of this project, along with the choice of methodologies and the development environment used for this project.

• Chapter Four discusses the implementation of the algorithms as classes used in this project, and also explains the implementation of the data structure for the face model.

• Chapter Five discusses the implementation of the Graphical User Interface of the application produced for this project, together with an explanation of how the algorithm classes are integrated with the application.

• Chapter Six discusses some of the issues faced when testing the application, along with some results on face recognition.

• Chapter Seven gives the overall conclusion of this project and suggestions for future optimization of the application.

2. Background
2.1 Overview

This part of the report gives an overview of the methods that were considered during the early stages of the project. It also provides background information on some of the techniques that have been used throughout the project.

An interesting source describing facial data processing techniques is [1]; it is a very worthwhile read as it provides good information on facial data processing for use in face recognition. The brief description below of the basic idea of face recognition is sourced from [1].

2.2 The Basics of Face Recognition

The conventional method for face recognition is geometric feature-based matching. This uses a database with a model for each face, holding information about the size and position of the eyes, mouth and head shape, and the relationships between these features. For each face, the distances between all the features are calculated. The aim is to obtain a one-to-one correspondence between the subject and the stored faces in the database: a vector of features is calculated using vertical and horizontal gradients, and recognition is then performed with a nearest-neighbour classifier.

The detection of face features actually works better in 3D than in 2D. Because the features depend on the facial orientation, extracting them is easier when the 3D model can simply be manipulated in a 3D rendering space. However, this method has a drawback: automating the extraction of facial features is very difficult on some 3D face models, because not all the 3D face data coming out of the surface re-constructor is that "clean".

2.3 Head Tracking Using Texture-Mapped 3D Models


(source from [2])

This technique requires the use of an image sequence. The idea is to use a texture-mapped surface model that approximates the head shape and accounts for self-occlusions, and then to use an image registration technique in the texture map to fit the model to the data.

The technique works by assuming that the head is a cylinder with a 360-degree-wide area for texture mapping, onto which the image sequence showing different facial expressions is projected. In any frame, only a 180-degree-wide slice of the texture is visible, corresponding to the visible part of the image. If the initial position of the cylinder is known, it is possible to compute the texture map of the currently visible portion from the incoming image.

It is possible to use this technique in face recognition: as a new frame from the image sequence is acquired, the position and orientation of the cylinder can be estimated by finding the pose for which the texture from the incoming frame best matches the reference texture. That is, the 3D head parameters are estimated by performing image registration in the model's texture map.

However, real heads are not cylindrical objects, so the technique has to take this modelling error into account. Changes in lighting can also have a relevant effect and therefore must be corrected in some way when performing this technique.

A more detailed formulation of this technique is given in [2].

2.4 Statistical Shape Models of Appearance


(source from [3])

A 3D face model can be described as a set of points in 3D space. It is defined so that it is invariant under similarity transformations, that is, when the model is translated, rotated and scaled. Using this fact, any 3D face model is treated as a set of points, and formal statistical methods can be applied to sets of 3D face models in order to analyse shape differences and changes.

This technique uses a large set of 3D face data to construct a statistical model. All the data are aligned to a reference face model using the Iterative Closest Point algorithm (ICP, outlined in section 2.5) to obtain a common coordinate system; Principal Component Analysis (PCA, outlined in section 2.6) is then used to reduce the dimensionality of the data in order to model the shape variation. With this model, new examples similar to those in the training data set can be generated. This gives the data set a classification: the faces are described by their modes of variation together with the range over which the data vary.

Once the statistical face model database has been constructed, it can be used for face recognition by comparing the subject's shape parameters against the database to find the most closely matched face.

2.5 Registration of 3D Shapes – ICP


(source from [4])

The Iterative Closest Point (ICP) algorithm is a technique for global and local shape matching. It supports the following forms of geometric data:

• Point Sets
• Line Segment Sets (Polylines)
• Implicit Curves: g(x, y, z) = 0
• Parametric Curves: (x(u), y(u), z(u))
• Triangle Sets (Faceted Surfaces)
• Implicit Surfaces: g(x, y, z) = 0
• Parametric Surfaces: (x(u, v), y(u, v), z(u, v))

Given a model set of geometric data, the algorithm estimates an optimal rotation and translation for a data set of geometric data that may correspond to the model set, aligning (registering) the two shapes so that the distance between them is minimal. This allows the equivalence of the shapes to be determined via a mean-square distance metric, which is exactly what is needed when a statistical model-based approach requires all the face model data sets to be aligned in a common coordinate frame.

A detailed formulation of this algorithm is given in Appendix A. The general method for the registration is:

1. Compute the closest points: $Y_k = C(P_k, X)$
2. Compute the registration: $(\vec{q}_k, d_k) = Q(P_0, Y_k)$
3. Apply the registration: $P_{k+1} = \vec{q}_k(P_0)$
4. Terminate the iteration when the change in mean-square error falls below a preset threshold $\tau > 0$, i.e. $d_k - d_{k+1} < \tau$. Otherwise go back to step 1.

Here $Y$ is the set of closest points between the model point set and the data point set produced by the closest-point operator $C$; $P$ is the data set; $X$ is the model set; $\vec{q}$ is the registration vector; $d$ is the mean-square point matching error; and $Q$ is the registration function, which produces the optimal rotation and translation in the form of the registration vector. The point set $P$ contains $N_p$ points from the data shape, and the point set $X$ contains $N_x$ points from the model shape. The iteration is initialized with $P_0 = P$, the registration vector $\vec{q}_0 = [1, 0, 0, 0, 0, 0, 0]^T$ and index $k = 0$.

Below is an example of how the algorithm works. The images are extracted from http://www.robotic.dlr.de/vision/projects/PoseEst/, where there are some more examples of ICP.

[Figure: ICP convergence of the data shape onto the model shape, shown at the initial pose and after iterations 1, 5, 30, 70 and 110.]

2.6 Principal Component Analysis


(source from [5])

Principal Component Analysis (PCA) is a classical statistical method. This linear transform has been widely used in data analysis and compression, and it has been used successfully in face recognition techniques, as shown in [6, 7]. It extracts the main variants in a set of data, allowing only a few extracted features to be used to reconstruct the whole set. It therefore reduces the dimensionality of multivariate data without a significant loss of information, and can be used to examine the correlation between the sets of data.

Due to the similarity between 3D face models, the points of a face model are not randomly distributed; there is some correlation between them, so they can be described by a lower-dimensional subspace using PCA. Reducing the dimensionality of the data is important in face recognition because the matrices involved in the recognition can otherwise be vast and computationally expensive.

Figure 2.1: Example of PCA, source from [5]

Figure 2.1 shows the result after PCA is applied to a set of data: the eigenvectors (the two orthogonal lines in the diagram) form the new axes of the data, which means the data can now be described completely by the two "principal components" in this case.

The general formulation of PCA begins with the formation of the feature vector

$x = (x_1, x_2, \ldots, x_n)^T$   (2.1)

The average of this feature vector is denoted by

$\mu_x = E\{x\} = \frac{1}{n} \sum_{i=1}^{n} x_i$   (2.2)

and the covariance matrix of the data set is

$C = E\{(x - \mu_x)(x - \mu_x)^T\} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_x)(x_i - \mu_x)^T$   (2.3)

This matrix is the basis for computing the eigenvectors and eigenvalues. The eigenvectors $e_i$ and the corresponding eigenvalues $\lambda_i$ are the solutions of the equation

$C e_i = \lambda_i e_i, \quad i = 1, \ldots, n$   (2.4)

The eigenvector corresponding to the largest eigenvalue in the set of solutions corresponds to the largest variation in the original data and carries the most significant information. A typical way to choose the "useful" eigenvalues is to take them in order from largest to smallest until they add up to about 90% to 98% of the sum of all the eigenvalues. That way, the retained data contains the useful information, while the effect of noise carried by the eigenvectors of the smaller eigenvalues can be left out of the computation.
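A minimal sketch of this selection rule (hypothetical C++; assumes the eigenvalues are already sorted from largest to smallest):

    // Choose the smallest number of components whose eigenvalues account
    // for a target fraction (e.g. 0.95) of the total variance.
    int ChooseNumberOfComponents(const double *evalues, int n, double fraction)
    {
        double total = 0.0;
        for (int i = 0; i < n; ++i)
            total += evalues[i];

        double partial = 0.0;
        for (int t = 0; t < n; ++t) {
            partial += evalues[t];
            if (partial >= fraction * total)
                return t + 1;       // keep eigenvectors 0..t
        }
        return n;                   // degenerate case: keep everything
    }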

There is a very useful method for applying PCA in situations where the number of samples is fewer than the dimension of the feature vector; it is shown in Appendix B.
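As a sketch of that method (assuming it follows the standard small-sample construction, which matches what the PCA class in chapter 4 does): let A be the d × M matrix whose columns are the M mean-subtracted sample vectors, with M much smaller than d. Instead of eigen-decomposing the huge d × d covariance matrix, the small M × M matrix L = AᵀA is decomposed; if v is an eigenvector of AᵀA, then Av is an eigenvector of AAᵀ with the same eigenvalue. In hypothetical C++, using VTK's vtkMath::JacobiN (the same eigen-solver the PCA class uses, assumed here in its double-precision form):

    #include "vtkMath.h"

    // A is d x M (columns = mean-subtracted samples, M << d).
    // On return, evalues holds M eigenvalues and the columns of U (d x M)
    // hold the corresponding eigenvectors of A A^T (unnormalized).
    void SmallSamplePCA(double **A, int d, int M, double *evalues, double **U)
    {
        // Build the small matrix L = A^T A (M x M).
        double **L = new double*[M];
        double **V = new double*[M];
        for (int i = 0; i < M; ++i) {
            L[i] = new double[M];
            V[i] = new double[M];
            for (int j = 0; j < M; ++j) {
                double s = 0.0;
                for (int k = 0; k < d; ++k)
                    s += A[k][i] * A[k][j];
                L[i][j] = s;
            }
        }

        // Eigen-decompose the small matrix; eigenvectors end up in the
        // columns of V, sorted by decreasing eigenvalue.
        vtkMath::JacobiN(L, M, evalues, V);

        // Map each small eigenvector back into the full space: column i
        // of U is A * (column i of V); the columns can then be normalized.
        for (int i = 0; i < M; ++i)
            for (int k = 0; k < d; ++k) {
                double s = 0.0;
                for (int j = 0; j < M; ++j)
                    s += A[k][j] * V[j][i];
                U[k][i] = s;
            }

        for (int i = 0; i < M; ++i) { delete[] L[i]; delete[] V[i]; }
        delete[] L; delete[] V;
    }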

3. Designs
3.1 Decision on Method

The method chosen was the statistical model-based approach by Cootes et al [3], outlined in section 2.4. The reason for choosing it is that, unlike the head tracking approach (outlined in section 2.3), it is not limited by the use of texture. The data collected from the surface model re-constructor is already in the form of a point set, and there is no reason to change the data into a cylindrical representation of the head. Because of the difficulty of automatic feature extraction from the face, the conventional method for face recognition was also ruled out when choosing the implementation.

A more detailed description and formulation of the chosen method is given in a later chapter.

3.2 Program Architecture

To get a better understanding of the program architecture, first consider how the face data sets flow from raw data to the recognition stage.

[Figure 3.1: Data Flow Diagram – collections of raw face model data from the surface model re-constructor → pick a reference data set and align all the other data sets to the reference model using ICP → construct the face database using PCA → face recognition by matching an input against each face model in the face database.]

With this schematic of the data flow in mind, the program can be constructed from three parts representing the different stages the data goes through. The parts are input, data processing and output; each part contains a number of modules used for storing, processing and displaying the data, depending on the nature of the module.

The overall program architecture can be seen in figure 3.2, which is shown below:

[Figure 3.2: Conceptual design – the Input stage (Face Data, Transform and PCA Face Database files) feeds the Data Processing stage (File Handler, Data Alignment using ICP, Build Face Database using PCA, and Face Recognition), whose results are presented by the Output stage (Graphical Display: display face data, show alignment, show database, show result).]

Each module has its own purpose; a detailed description of each is given in the following subsections.

3.2.1 Inputs
These are the data stored on the storage device, in the simplest case the hard disk.

Face Data – This represents the sets of raw face data captured from the surface model re-constructor; each face model is stored independently on the storage device.

Transform – These files contain the transformation matrices obtained after the data have been aligned to the reference data model using ICP. The transformation matrices are used later, when the data sets go through PCA.

PCA Face Database – This file contains all the important information of the face database. It is stored so that PCA does not have to run every time face recognition takes place.

3.2.2 Data Processing

File Handler – The most central component of the program: it handles the reading and writing of files. Not only that, it also provides the data structure for the face data going through ICP, PCA and face recognition, and for display in the render window.

Data Alignment – Uses ICP to align each data set to the common coordinate system of the reference model, which enables PCA to extract the face "features" from the data sets to construct the database. The transformation matrix is stored in a file so that ICP does not need to be run again when PCA takes place; the transform only has to be applied to the face data when needed.

Build Face Database – Runs PCA on the whole set of data, producing a statistical model with different modes corresponding to the different eigenvalues, and with ranges corresponding to the positive and negative standard deviations given by the eigenvalues. This set of data is the face database, and it can be stored in a file so that it is not necessary to run PCA again when face recognition takes place.

Face Recognition – Takes an input face, matches it against each face model in the face database to find the most closely matched model, and then classifies the input by the mode and range in the face database.

3.2.3 Output

Graphical Display – Produces face model(s) in the render window. It can be used by ICP to show the result of data alignment; to display the face data in the face database after PCA has run or when a PCA profile is loaded from file; to show the result of face recognition; or simply to display face data from a file.

3.3 Development Environment

This project was developed using Microsoft Visual C++ 6.0 and was run and tested on an AMD Athlon XP 1800+ with 256MB DDR RAM; please keep this configuration in mind when the performance of the program is quoted later.

Visual C++, together with the Microsoft Foundation Classes (MFC), offers the possibility of producing professional applications in the Windows environment through the use of the Windows API.

The Visualization Toolkit (VTK) was used to provide a good rendering and interaction environment for the face models; it also comes with three helper classes that work with MFC's document/view (D/V) architecture.

The only disadvantage of using Visual C++ is that MFC applications are not platform independent: they only work on the Microsoft Windows platform and not on any other operating system. Although this is a slight drawback, it does not affect the project in any way.

3.4 Design Methodology

One of the reasons for choosing Visual C++ for this project is its object-oriented capabilities. Each algorithm is implemented as a separate class; this draws a clear line around the functionality each class handles, and keeps future extension simple, so that extra functions can easily be added to a class.

4. Implementation – Part 1
4.1 Overview

This chapter discusses the implementation of the algorithms used during the project. It also provides a detailed description of the file handler, which supplies the data structure used by the algorithms. The next chapter will discuss the Windows framework and the related parts of the implementation.

4.2 The “File” Class

This is the class which handles the reading and writing of the face data; it also provides a very good data structure for use in the algorithms. The header file of this class is shown in Appendix C.

The face data captured from the surface model re-constructor is in the Wavefront file format. This file format gives the face data in the following form:

• The positions of the x, y and z components of the vertices in the x, y and z coordinate system.
• The texture coordinates in the texture map for the corresponding vertices.
• The polygonal data, which indicates which vertices map onto which polygon.

With this in mind, the data structure of this class is divided into a minimum of three types: the double pointers vertice_data, texture_data and polygon_data, as declared in the header file. To get the data from a file, the program first counts the number of records of each type, then constructs each double pointer into a 2D array using the new operator, and reads the file again to obtain the data. An alternative approach would be a linked list, adding links while reading the file for the first time; but then manipulating the data at a later stage would carry a performance penalty, because the data in a linked list is not necessarily in a consecutive memory block, forcing the program to search through memory for the data. The first approach was therefore implemented, sacrificing some time at the data loading stage.
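A rough sketch of this two-pass approach, reduced to the vertex records only (hypothetical, simplified code; the real File class handles the texture coordinate and polygon records the same way):

    #include <cstdio>

    // Pass 1: count the "v" records; pass 2: fill a 2D array allocated with new.
    double **ReadVertices(const char *filename, int &count)
    {
        char line[256];
        count = 0;

        FILE *fp = fopen(filename, "r");
        if (!fp) return 0;

        while (fgets(line, sizeof(line), fp))
            if (line[0] == 'v' && line[1] == ' ')
                ++count;

        double **vertice_data = new double*[count];
        for (int i = 0; i < count; ++i)
            vertice_data[i] = new double[3];

        rewind(fp);
        int i = 0;
        while (fgets(line, sizeof(line), fp))
            if (line[0] == 'v' && line[1] == ' ') {
                sscanf(line + 2, "%lf %lf %lf",
                       &vertice_data[i][0], &vertice_data[i][1], &vertice_data[i][2]);
                ++i;
            }

        fclose(fp);
        return vertice_data;
    }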

The File class also has a member variable called scalar_data, to be used after the data has gone through PCA to display the facial texture as a substitute for the texture map. The reason will be revealed in a later section in the discussion of PCA. The program can therefore render both texture-mapped models and scalar-rendered models.

A few member variables in this class are used exclusively by ICP. The reason is that the initial implementation of the ICP algorithm supported taking several data sets as inputs to be aligned to one model set, so each "File" object had to carry its own data for calculating the registration vector and so on. The later implementation of the ICP algorithm only supports one data set as input, for a reason that will be revealed in a later section. Those member variables are therefore still kept in the File class; they could instead be part of the class belonging to the ICP algorithm, but either way this does not affect the program in general.

To display the face data using VTK, this class also provides a member function which transforms the data into VTK format so that the rendering can be done. There is also a member function to transform the data from VTK format back into this data structure; this is important because the data may need to be written to file after going through operations in VTK that change it.

The member variable texture belongs to the class BMPData, which provides a data structure to hold the texture map data; it also has a member function to convert the data to VTK format in order to display the texture map in VTK render windows.

4.3 The “ICP” Class

This is the class that performs the registration of 3D shapes using the ICP algorithm. The header file of this class is shown in Appendix D.

The most central function of this class is Run(), which performs the ICP algorithm. Once the data set and the model set have been initialized, i.e. when both sets have obtained their data from file, the algorithm can perform the necessary operations. The function body of Run() is shown below:

bool ICP::Run()
{
    // Both sets must have been loaded from file before the algorithm can run.
    if (!this->model_set->have_data_already || !this->data_set->have_data_already)
    {
        AfxMessageBox("Data Are Not Prepare To Run The Algorithm");
        return true;
    }

    // Nothing to do if the data set is already registered.
    if (this->data_set->matched == true)
    {
        return true;
    }

    this->data_set->old_mean_dist = this->data_set->mean_dist;

    // One ICP step: closest points, error, registration, transformation.
    this->close_matching();
    this->CalculateMeanDist();
    this->Registration();
    this->Transform_Data();

    // Change in mean square error, used for the termination test.
    this->delta = this->data_set->old_mean_dist - this->data_set->mean_dist;

    if ((this->delta < threshold && this->delta >= 0.0) || store_iteration == 0)
    {
        this->data_set->matched = true;
        return true;
    }

    this->store_iteration = this->store_iteration - 1;
    return false;
}

It is the caller's job to make sure that this function is run iteratively, because it only performs one step each time it is called. The return value indicates whether the iteration has finished. The calls to close_matching(), CalculateMeanDist(), Registration() and Transform_Data() correspond to the ICP operations outlined in section 2.5.
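A caller might therefore drive the algorithm with a loop of the following form (a sketch; icp here stands for a hypothetical pointer to an initialized ICP object):

    // Run() performs one ICP step and returns true once the termination
    // condition has been reached, so the caller simply loops on it.
    while (!icp->Run())
    {
        // optionally redraw the render window here to show each step
    }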

The closest-point operator in this class was implemented by matching the point sets point by point: for each point of the data set, the operator finds the closest point on the model set. This is the most basic and simplest way to implement the operator; however, as discussed later in the report when evaluating the performance of the program, it turns out not to be the best way to implement this operation.
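A minimal sketch of such a brute-force operator (hypothetical names, working on plain coordinate arrays rather than the File structure); its cost of Np × Nx distance evaluations per iteration is what the performance discussion in chapter 6 comes back to:

    #include <cfloat>

    // For each data point, scan every model point and record the closest one.
    void ClosestPoints(double **data, int np, double **model, int nx,
                       double **closest)
    {
        for (int i = 0; i < np; ++i) {
            double best = DBL_MAX;
            int bestIdx = 0;
            for (int j = 0; j < nx; ++j) {
                double dx = data[i][0] - model[j][0];
                double dy = data[i][1] - model[j][1];
                double dz = data[i][2] - model[j][2];
                double d = dx * dx + dy * dy + dz * dz;  // squared distance suffices
                if (d < best) { best = d; bestIdx = j; }
            }
            closest[i][0] = model[bestIdx][0];
            closest[i][1] = model[bestIdx][1];
            closest[i][2] = model[bestIdx][2];
        }
    }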

After the iteration has finished, this class provides the member function void WriteTransformMatrixToFile(const char *name) to write the transformation matrix derived by ICP to a file; the argument is the name of the file it is going to be stored in. This way, the data set can be transformed into the model set's coordinate system whenever required when running PCA.

There is also another place to put the transformation matrix: a VTK-type matrix held by this class. One purpose of this is to test the algorithm by using the same face data as both the model set and the data set in the render window: apply some transformation to the data set using the interactor in the render window, then run the algorithm and watch how it converges to align the data set onto the model set.

4.4 The “PCA” Class

This is the class that performs PCA and provides the output data classified into different modes and ranges as described in [3]. The header file of this class is shown in Appendix E. But first, let us look at the relationship between PCA and statistical shape models.

4.4.1 Statistical Shapes Model

As mentioned earlier, PCA is a classical statistical method. When PCA is applied to shape models, however, it can be used to model shape variation, with the set of shapes as the components of the feature vector. That means equation (2.1) becomes:

$\Gamma = (\Gamma_1, \Gamma_2, \ldots, \Gamma_M)^T$   (4.1)

All of the Γ in the feature vector should be in a common coordinate frame before PCA can take place. Moreover, each Γ should have the same data length as the reference Γ; that is, all the other Γ should be cast so that each point in them has a corresponding point in the reference model. Suppose the reference data is in three dimensions and has N vertices; then each Γ is a one-dimensional vector of length 3N. The data is arranged so that the first N values correspond to the x-components of the vertices, the second N values to the y-components, and the last N values to the z-components. With the use of scalar value rendering, however, the scalar values form a fourth block of N values in the feature vector, so the overall length of the feature vector is 4N. The reason for scalar rendering is that the amount of texture map data is large and cannot go through PCA with only N vertices. The initial implementation did try to use the texture map data with N landmark points, but this only made N "holes" in the texture map and did not change the texture map in any way, so scalar rendering is used instead.
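A sketch of how one face would be packed into its 4N-long feature vector under this layout (hypothetical names, following the vertice_data/scalar_data structure of the File class):

    // Pack N vertices plus N scalar values into a 4N-long vector:
    // all x-components first, then all y, then all z, then the scalars.
    void PackFeatureVector(double **vertice_data, const double *scalar_data,
                           int N, double *gamma)
    {
        for (int i = 0; i < N; ++i) {
            gamma[0 * N + i] = vertice_data[i][0];   // x-components
            gamma[1 * N + i] = vertice_data[i][1];   // y-components
            gamma[2 * N + i] = vertice_data[i][2];   // z-components
            gamma[3 * N + i] = scalar_data[i];       // scalar (texture substitute)
        }
    }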

After the feature vector is built, equations (2.2) to (2.4) can be applied to equation (4.1) in the usual way. When the set of eigenvectors (denoted Φ) and the set of corresponding eigenvalues have been calculated, any of the data in the training set can be approximated by:

$\Gamma \approx \bar{\Gamma} + \Phi b$   (4.2)

where $\bar{\Gamma}$ is the average data from equation (2.2), $\Phi = (e_1 | e_2 | \ldots | e_t)$, $t$ is the number of eigenvectors that have non-zero eigenvalues, and $b$ is a $t$-dimensional vector given by:

$b = \Phi^T (\Gamma - \bar{\Gamma})$   (4.3)

This vector defines the set of parameters of a deformable model. In general it is used so that the shape varies according to the elements of this vector, as in equation (4.2). The variance of the $i$th parameter $b_i$ across the training set is given by $\lambda_i$. Limits of $\pm 3\sqrt{\lambda_i}$ are applied to the parameter $b_i$ to ensure that the generated shape is similar to those in the training set.

With that, the shape database is built by using different parameter vectors b in equation (4.2).
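Generating one entry of the database then amounts to evaluating equation (4.2) with a clamped parameter vector, roughly as follows (a hypothetical sketch, not the project's actual code):

    #include <cmath>

    // gamma = mean + Phi * b, with each b_i clamped to +/- 3*sqrt(lambda_i).
    // Phi is dim x t, with the eigenvectors as columns.
    void GenerateFace(const double *mean, double **Phi, const double *evalues,
                      double *b, int t, int dim, double *gamma)
    {
        for (int i = 0; i < t; ++i) {
            double limit = 3.0 * sqrt(evalues[i]);
            if (b[i] >  limit) b[i] =  limit;
            if (b[i] < -limit) b[i] = -limit;
        }

        for (int k = 0; k < dim; ++k) {
            double s = mean[k];
            for (int i = 0; i < t; ++i)
                s += Phi[k][i] * b[i];
            gamma[k] = s;
        }
    }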

4.4.2 Implementation of “PCA” class

A typical face model contains around 1000 vertices, so the vector that goes into the feature vector is around 4000 values long. Usually at most a few hundred face models are used to build the face database, so the PCA implementation uses the algorithm shown in Appendix B, because there are fewer samples than the dimension of the vector. This reduces the computation time dramatically.

Due to the large amount of face data to handle, all the input face data except the reference face data are deleted when the class internally calls the GetData() function to build the feature vector. This ensures the system has enough memory and does not have to use virtual memory on disk to execute the operations in PCA; otherwise the performance hit would be dramatic, with the possibility of crashing the OS when memory runs low.

Just like the ICP implementation, this class has a central function to run the PCA: after the data has been initialized, void Execute() can be called to execute PCA. Part of the function body is shown below:

void PCA::Execute()
{
    if (!this->reference->have_data_already)
        this->GetReferenceData();

    this->GetData();                  // build the feature vector, equation (4.1)
    this->CalculateAverageSet();      // average set, equation (2.2)
    this->Subtracted_Set();
    // ...
    float **M_CM;
    // ...
    this->ConstructCovarianceMatrix(M_CM);   // covariance matrix, equation (2.3)

    JacobiN(M_CM, this->M, this->M_evalues, this->M_evectorsData);   // equation (2.4)
    // ...
}

The functions GetData(), CalculateAverageSet(), ConstructCovarianceMatrix() and JacobiN() correspond to equations (4.1), (2.2), (2.3) and (2.4) respectively. The function JacobiN( , , , ) calculates the eigenvalues and eigenvectors of the matrix passed as its first argument, in this case the covariance matrix of all the face data. Once these eigenvectors have been obtained, the resultant eigenvectors are computed following Appendix B and stored in the member variable OneDim_result_set. This member variable corresponds to Φ in equation (4.2).

Once Φ has been calculated, the function void PutDataToFileFormat(File *data, int set, int range) computes a face from the face database with the specified mode (second parameter) and range (third parameter); its operation corresponds to equation (4.2). By varying the set and range parameters, all the faces in the face database can be generated. The face database can be saved and loaded with the functions void WriteData(const char* filename) and void ReadData(const char *filename) respectively, so there is no need to run PCA again once the database has been saved to file.

4.5 The “FaceMatch” class

This is the class that performs the face recognition: it takes two face data sets and matches them to get the mean distance between them. The algorithm for matching the face data is the same point matching used in the ICP class. However, when face recognition is done with scalar values, the difference in scalar values is incorporated into the mean distance as well. The general formula for the mean distance is shown in equation (4.4). The header file of this class is shown in Appendix F.

$D = \sum_{i}^{n} (p_i - n_i)^2 + \lambda \sum_{i}^{n} (\mathrm{scalar}(p_i) - \mathrm{scalar}(n_i))^2$   (4.4)
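A direct transcription of equation (4.4) into code might look like this (a sketch with hypothetical names; p and q hold the already-matched 3D point pairs):

    // Squared point-to-point distances plus the lambda-weighted squared
    // scalar differences, summed over nPts matched pairs (equation 4.4).
    double MatchDistance(double **p, double **q, const double *scalarP,
                         const double *scalarQ, int nPts, double lambda)
    {
        double D = 0.0;
        for (int i = 0; i < nPts; ++i) {
            for (int c = 0; c < 3; ++c) {
                double d = p[i][c] - q[i][c];
                D += d * d;
            }
            double ds = scalarP[i] - scalarQ[i];
            D += lambda * ds * ds;
        }
        return D;
    }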

Notice that the function void Execute(int mode, int range) takes two parameters. This function was built to couple with the PCA class: the caller calls it iteratively while changing the mode and range in the face database, and the two parameters are stored in the member variables whenever they give the smallest mean distance so far.
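A sketch of such a calling loop (hypothetical names and bounds; the FaceMatch object itself keeps the best mode and range found so far):

    // Try every mode/range combination in the database; Execute() records
    // the combination that yields the smallest mean distance.
    for (int mode = 0; mode < number_of_modes; ++mode)
        for (int range = min_range; range <= max_range; ++range)
            matcher->Execute(mode, range);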

5. Implementation – Part 2

5.1 Overview

This chapter discusses the implementation of the Graphical User Interface (GUI) in the Windows environment and shows how the algorithm classes discussed in chapter 4 are integrated into each of the GUI components. It begins with an overview of the MFC D/V architecture, which forms the basis of this project. A detailed description of general Windows programming is not given in this report; please consult [8] for more detail.

5.2 Document/View architecture

The D/V architecture aims to simplify the development of Windows applications by providing a framework that separates the application's data management from the visual representation of the data on screen. The document class provides the internal data management, and the view class provides the visual representation of the data of its associated document.

Figure 5.1 SDI document view architecture, extracted from [8]

The following description of the document and view is extracted from [9]. The document conceptually contains the data upon which the application operates; it is responsible for storing and retrieving its data to a persistent medium, and it acts as a synchronization agent for the various views that may be opened on the same document object. The view is essentially an input-output device connecting the document to the user: it extracts the relevant data from the document and draws, or renders, it in its window, but it can also intercept the user's actions on the view to update the document accordingly.

The three helper classes from VTK provide the bridge between the VTK rendering engine and the D/V architecture. One of the classes inherits from MFC's CDocument class and the other two inherit from MFC's CView class. By deriving the application's document and view classes from the VTK document and view classes, the application gains the ability to use VTK to render the data handled in the document class with minimal effort. VTK also provides the render window with an interactor, so that the objects rendered in the render window can be manipulated via the mouse and keyboard.

5.3 Active Face Recognition – The Application

The application is divided into several components; this section describes the basic structure of the application. The "Function" part, which consists of the GUIs for ICP, PCA and Face Recognition, is explained in later sections of this chapter.

The best way to start is to look at the member variables of the application's document class, because that is the core of data management for a Windows application:

protected:

File *ObjReader;
vtkPolyDataMapper *PolyMapper;
vtkActor *Actor;
vtkTexture *Texture;
vtkImageData *Idata;
vtkLinearSubdivisionFilter *Filter;
vtkPolyData *Poly_Data;
bool have_texture, have_scalar, have_data;
PCA_Handler *PCApanel;
ICP_Handler *ICPpanel;

The member variables from ObjReader down to Poly_Data form the basis of the application's document class; they are all required for rendering the object held in the member variable ObjReader. That means that once data is obtained by whatever means and cast into the File class format, it can be copied into ObjReader, and from there the object can be rendered using VTK. The variables whose types begin with "vtk" belong to VTK and are used for rendering the data; for more information on VTK and the usage of its classes, please consult [10, 11]. The have_texture and have_scalar member variables record whether the object model has the option of being rendered with a texture map or with scalar values.

Figure 5.2: Main frame of the application and system functions.

The New… function in the file menu produces an empty render window: it creates a document class object and attaches a view class object, with the document's member variables initialized according to the constructor.

The Open… function in the file menu produces a model in the render window by taking in a Wavefront format file. The document object is created, its member variables are updated by the function associated with Open…, and a view object is then attached to render the data it contains. A fragment of the Open… function is shown below:
this->ObjReader->SetFileName(lpszPathName);
this->ObjReader->Getdata();                     // read the Wavefront file

if (!this->ObjReader->have_data_already)
    return FALSE;

this->ObjReader->ChangeToVTKFormat(this->Poly_Data);

if (this->ObjReader->have_texture)
{
    // convert the texture map for VTK and enable interpolation
    this->ObjReader->texture->ChangeToVTKFormat(Idata);
    this->Texture->SetInput(Idata);
    this->Texture->InterpolateOn();
}

this->have_scalar = this->ObjReader->have_scalar;
this->have_texture = this->ObjReader->have_texture;
this->have_data = this->ObjReader->have_data_already;

this->ObjReader->high_res_call = true;

this->ObjReader->Delete();
this->Actor->VisibilityOn();
this->Actor->PickableOn();

lpszPathName is the string containing the path and file name of the file the function is going to open. The application also provides a Save function, which is not shown in figure 5.2; the reason is that there are two types of main frame menu, and the other type appears when a document object is active in the application. The Save function saves the face data in the Wavefront file format. Later, when there are two face models in the render window, this function can only save the face model that is "greyish" white in colour.

Figure 5.3: Active Face Recognition with a face data file opened

When the application has an active document object, the main frame changes and provides another set of functions. Of particular interest is the Render option, pointed to by an arrow in figure 5.3. It provides functions to render using texture maps, to render using scalar values, and to render the object with 4 times the polygon count of the original model. This set of functions is closely related to the document object and is part of the CMainFrame class, which controls the activity of the main frame. The code fragment for the texture mapping function is shown below.

if (!CDoc->have_texture)
    CDoc->Texture->SetInput(NULL);
else
{
    if (CDoc->Texture->GetInput() && CDoc->have_texture)
    {
        // texture currently on: switch it off and fall back to scalars
        CDoc->Texture->SetInput(NULL);
        if (CDoc->have_scalar)
            CDoc->PolyMapper->ScalarVisibilityOn();
    }
    else
    {
        // texture currently off: switch it on and hide the scalars
        CDoc->Texture->SetInput(CDoc->Idata);
        if (CDoc->have_scalar)
            CDoc->PolyMapper->ScalarVisibilityOff();
    }

    CDoc->UpdateAllViews(NULL);
}

The variable CDoc is an object of the application's document class. As can be seen from the first if statement, when the data contained in CDoc has no texture map, the function makes sure the Texture variable of CDoc has no input. Otherwise, when the data does have a texture map, pressing this function toggles the rendering: if the data is not rendered with the texture map, pressing it turns texture mapping on, and vice versa. The function is deliberately implemented so that rendering with the texture map and rendering with scalar values swap with each press; this makes it possible to compare the two methods of rendering.

The implementation of the scalar rendering function is different. Although this function is similar in nature to the texture mapping function, scalar rendering is required when doing face recognition, when there are two face models in the render window. When there is more than one "actor" (face model) in the render window, VTK provides a method to extract all the actors in it, so the function has to loop through all the actors, enabling or disabling scalar rendering for each face model. Also, enabling and disabling this function has no effect on texture map rendering, unlike the texture mapping function.

The implementation of the high resolution model function is of the same nature as the texture mapping function, but like the scalar rendering function it has no effect on the other rendering functions. It also has no effect on the ICP, PCA and Face Recognition rendering windows; this is to prevent anything unexpected from happening.

The CMainFrame class holds the functions that create the GUIs for ICP, PCA and Face Recognition. All three GUIs derive from MFC's CDialog class and are created as "modeless" dialogs, which means they are created via the new operator and must be "cleaned up" when the object is destroyed. The advantage is that the application keeps its handle, so it can still interact with the user while the GUI is open. The function that creates an ICP GUI from CMainFrame is shown below:

void CMainFrame::OnFunctionIcp()
{
// TODO: Add your command handler code here
ICP_PANEL = new ICP_Handler();
ICP_PANEL->Create(IDD_ICP_PANEL, this);
ICP_PANEL->ShowWindow(SW_SHOW);
}

where ICP_PANEL is a member variable of CMainFrame.

5.4 The ICP GUI – “ICP_Handler” Class

This GUI provides an easy-to-use interface for aligning face models using the ICP algorithm implemented in the ICP class. The GUI class, ICP_Handler, inherits from MFC's CDialog class and contains an ICP object as a member variable. The panel is shown in figure 5.4 below:

Figure 5.4, ICP GUI

This panel can be brought up by pressing the "ICP" icon on the toolbar or by choosing it from the main window frame under "Function".

Both "Browse" buttons create a CFileDialog object and show the file panel when pressed. This gives the user the ability to choose a file from the CFileDialog object, just like using the panel for opening a document in Microsoft Word. The CFileDialog object is created as a "modal" dialog, which means the application's handle is held by the CFileDialog object: the application cannot interact with the user while the panel is open, but gets its handle back when the panel is closed. The code fragment below shows part of the function run by pressing a "Browse" button:

CString Temp;

static char BASED_CODE szFilter[] = "Face Files (*.obj)|*.obj|All Files (*.*)|*.*||";

// create and show the modal file-open dialog
CFileDialog CFDialog(TRUE, NULL, NULL,
                     OFN_HIDEREADONLY | OFN_OVERWRITEPROMPT, szFilter, NULL);
CFDialog.DoModal();
Temp = CFDialog.GetPathName();    // full path of the chosen file

m_Model.SetWindowText(Temp);      // display it in the edit box

Once the file is chosen, its full path and file name are shown in the edit box under the label "Source File – Model Set" or "Source File – Data Set", depending on which "Browse" button was pressed. These are important when the ICP algorithm needs the data: they give the paths from which the ICP class member variables model_set and data_set obtain their data.

The "Show Image" button associates the application's document class with the ICP_Handler class in order to display the face data contained in the ICP object in a render window.

The "Step ICP" button calls the Run() function of the ICP class once and then displays the resulting transformation of the data set in the render window. The "Run ICP" button has a similar effect, but the Run() function is called iteratively until the algorithm reaches its termination condition, with the display showing each transformation step of the data set in the render window.

The "Save Result" button calls a CFileDialog object similar to the "Browse" buttons, but with a different parameter in the CFileDialog constructor to indicate that the panel is used for saving a file. Once the file name is chosen, the function calls the void WriteTransformMatrixToFile(const char *name) member function of the ICP class to save the transformation matrix to file.

The buttons labelled "OK", "Cancel" and "X" on the frame window are used to close the panel; they provide some "clean up" operations to make sure there is no memory leakage when the GUI is destroyed.

5.5 The PCA GUI – “PCA_Handler” Class

This GUI provides an interface to the PCA implemented in the PCA class, and also provides functions to change and display the models in the model (face) database once it has been constructed after PCA has finished. Figure 5.5 shows the GUI.

Figure 5.5, PCA GUI

This panel can be brought up by pressing the "PCA" icon on the toolbar or by choosing it from the main window frame under "Function".

The "Add Obj" and "Add Trans" buttons have the same effect as the "Browse" buttons in the ICP GUI, but the CFileDialog object created here allows multiple files to be selected, so it contains multiple strings holding the path and name of each selected file. The paths and names of the model data files are put in the left list box, and those of the transformations in the right list box. The "Remove Obj" and "Remove Trans" buttons remove the selected name from the left and right list boxes respectively.

When the files are prepared, PCA can be run by pressing the "Run PCA" button. Let us look at part of this function:

if (number_of_obj_files == number_of_tra_files && finish != true)
{
    // ...
    File *data = new File[number_of_obj_files];
    CString Temp;

    // fill one File object per entry in the list boxes
    for (int x = 0; x < number_of_obj_files; x++)
    {
        m_ObjList.GetText(x, Temp);
        data[x].SetFileName(Temp);

        if (x != 0)   // the reference model (entry 0) needs no transform
        {
            m_TransList.GetText(x, Temp);
            data[x].GetTransFromFiles(Temp);
        }
    }

    // hand everything over to the PCA object
    pca->SetRenderMode(this->use_high_res);
    pca->SetReferenceFile(&data[0]);
    pca->SetNumberOfFiles(number_of_obj_files);
    pca->SetDataSet(data);
    // ...
}

pca is a member variable of the PCA_Handler class and an object of the PCA class. The function first creates an array of type File to hold the files listed in the list box, then performs all the initializations shown, so that pca is ready for the call to its member function void Execute() later in this function. Once pca has finished executing, the function associated with the "Show Image" button is called to display the average model of all the data sets in the model database. The edit boxes on top of each slider bar indicate which mode and range of the database model the rendering window is displaying.

The slider bars control the values shown in the edit boxes above them. When a slider bar is changed, the application pumps a Windows message that calls a function to update the value in the associated edit box. That value is then passed to the function associated with the "Show Image" button, where pca calls the member function void PutDataToFileFormat(File *data, int set, int range), with the value of the top edit box as the second parameter and the value of the bottom edit box as the third. The resulting data computed from the model database is then rendered in the render window.

The "Using High Res" check box tells pca to obtain the data as high resolution models when checked and then perform PCA. This computes all the data in high resolution and gives a better scalar value rendering for the models in the database once it is constructed.

The "Save PCA" and "Load PCA" buttons call the pca member functions void WriteData(const char* filename) and void ReadData(const char *filename) respectively to save and load the file named via CFileDialog.

5.6 Face Recognition GUI – "Face_Handler" Class

This GUI can only be opened when a "PCA" model database is available, either by loading a profile from file or after performing PCA. The GUI can be opened by pressing the "Face" icon on the toolbar, by choosing it from the main window frame under "Function", or from within the PCA GUI. The GUI is shown in figure 5.6.

Figure 5.6, Face Recognition GUI

This GUI class is actually a member variable of the PCA_Handler class, which gives a strong bond between the two classes. Most of the implementation of this GUI is similar to the two GUIs already described.

When the input model has been obtained and displayed in the render window, it is best, before doing face recognition, to align the input model with the average model from the database, either using ICP via the "Align Data" button or manually using the interactor in the render window (the reason for aligning manually is discussed in a later chapter).

The "Start Process" button begins face recognition by calling the member function void Execute(int mode, int range) of a FaceMatch object iteratively until all the models in the database have been matched. When the "With Scalar" check box is checked, the matching algorithm matches the scalar values as well. The most closely matched model from the database and the input model are then displayed in the render window, and the slider bars in the PCA GUI are set to the mode and range of that model in the database. With the values of the mode and range, the input face can be classified within the model database.

6. Evaluation of Active Face Recognition
This chapter discusses the performance and stability issues of the application, especially the performance of the ICP, PCA and face matching algorithms used in this project.

6.1 The Application – How stable is it?

Throughout the testing phase of the application, a lot of problems came up, and most of them were solved. However, two major problems remain unsolved and can cause disastrous results for the application.

The first problem relates to the filter that transforms a model into a high polygon
model. When high polygon rendering is repeatedly switched on and off, the application
suddenly stops working and the OS reports a memory access violation error, then asks
to close the application immediately; however, this only happens with some data
models, not all. The same error also occurs occasionally when, after performing PCA
with high polygon models, the model is changed with the slider bars: the application
terminates with the same message some of the time but not always.

The second problem relates to memory leakage. When a document object is created,
either by opening a model from files or by creating an empty render window (a new
document), the memory usage of the application increases, which is normal. However,
when the render window is closed, the memory usage decreases but not back to the
level before the render window was opened. The cause has not been traced at the time
of writing, but if too many render windows were opened the accumulated leakage could
in principle crash the OS. This is unlikely on the machine used, since the number of
render windows that would have to be opened before the OS runs low on memory is
estimated at a few thousand. One trick to work around the problem is to minimize the
application to the taskbar: each time this is done, its memory usage returns to
normal, which keeps the OS running healthily.
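One plausible (but unconfirmed) source of such leaks in VTK applications is
forgetting to release reference-counted objects; anything created with New() must
eventually be matched by a Delete():

#include "vtkPolyData.h"

// A common VTK leak pattern — an assumption about the cause, not a diagnosis.
vtkPolyData* poly = vtkPolyData::New();  // reference count starts at 1
// ... hand poly to a mapper, which takes its own reference ...
poly->Delete();  // without this call, the object outlives its render window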

6.2 ICP – Does It Really Work?

This section discusses the performance of the "chosen" implementation of the ICP
algorithm in the ICP class.

To test the effectiveness of the algorithm, the best starting point is to use the
same data set as both inputs of the ICP algorithm but from different starting
positions, i.e. to transform the data set away from the model set; this is easily
done in the render window, as shown in figure 6.1. Figure 6.2 shows the result: the
data set maps exactly onto the model set, as expected, after twenty-two iterations,
completing in a few seconds.

Figure 6.1, example of ICP with the same set of data initialized in different
starting positions

Figure 6.2, example of ICP after running to completion, with the data set correctly
mapped onto the model set

However, consider what happens if the data set has an awkward starting position, as
shown in figure 6.3; the result is shown in figure 6.4.

Figure 6.3, example of ICP with the same set of data, with the data set starting in
an awkward position

Figure 6.4, example of a failed alignment of the two model sets

As can be seen in figure 6.4, the model set and the data set did not align. This
behaviour is noted in [4], which states that the initial position of the data set is
important for the registration.

With this in mind, and given that some of the captured face model data have garbage
polygons hanging around them, running ICP on two different data sets produces the
result shown in figure 6.5. The two faces misalign completely, and this is due to the
polygons circled in the figure.

Figure 6.5, another example of ICP failure

To compensate for this problem, the application allows the interactor in the render
window to be used to give the data a better initial position and/or to align the face
data manually. The "Step ICP" button is also useful to stop the ICP algorithm from
overshooting in some situations (when the face data contain garbage polygons or the
pose of the face is at an extreme angle). The transformation matrix can be saved to
file just as when running full ICP.

The conclusion of this report includes a section on future improvements of the
application, which shows what can be done to make the ICP algorithm more robust and
more "automatic" in aligning the data.

6.3 PCA – Is it running slowly?

This section discusses the performance of running PCA. Its correctness, however, can
only be tested with synthetic data of basic model shapes, such as cubes, spheres,
etc. By constructing a set of data models of different sizes and stretched along
different axes and running them through PCA, one can produce a result in which each
type of shape variation belongs to one principal component, and so on. On the
assumption that it works for synthetic data, it should also work for complex shape
models such as the face models used in this project, as long as the face models are
in a common coordinate frame. Figure 6.6 below shows a typical PCA result; this
result set was produced by running PCA on six faces from the same person to construct
the face database. The best way to distinguish the faces is probably to look at the
eye movements.

Figure 6.6, main modes of variation of the face data through PCA. From top to bottom:
mode 1, mode 2, mode 3 and mode 4, with ranges from left to right of -1, 0 and +1
times the corresponding eigenvalue of each mode.

With fifty face data sets, PCA takes about thirty seconds to complete. This is not
bad considering that each feature vector has about 4800 points. With high polygon
models and fifty face data sets, however, it takes about 3 minutes and 30 seconds to
complete. Interestingly, the time does not increase linearly; this may be due to the
extra computation needed to produce the high polygon model and to re-map the scalar
values from the texture map for each face model.

The problem with rendering scalar values is that the resolution on the face model is
low compared to texture mapping, which is the reason for using the high polygon
model. The difference can be seen in figure 6.7 below:

Figure 6.7, comparison of different rendering models. From left to right: normal face
model with texture mapping, high polygon model with scalar rendering, and normal face
model with scalar rendering.

This is why PCA uses the high polygon model. If, in a real-life situation, the face
database must be built using high polygon models, then under ten minutes is
acceptable for the amount of data to process. Since the application can store the
database, PCA only needs to be run once whenever a new data set becomes available.

6.4 Face Recognition – Can it find the match for a face?

This section describes the performance and the effectiveness of the face matching
algorithm.

The performance of matching an input face to faces in the database matters, since the
system is meant to run in real time: the faster an input face can be identified from
the face database, the better. The high polygon model produces a clear and crisp face
image under scalar rendering and should therefore be the better choice for face
recognition, because it is easier to identify a person from a more accurate-looking
face model.

Two tests were carried out on the face matching algorithm. Since the algorithm uses a
"dumb" search approach, only six face data sets were placed in the face database, and
the input face was one of the faces in the database; this also tests the
effectiveness of the face recognition itself. All the faces were from the same
person.

The first test used high polygon models, since these should favour a good result. But
even with only six face models in the database, the matching took around 2 minutes 23
seconds to find the appropriate face. That is too slow for an application meant to
run in real time. The second test used the same face database and input face but with
the normal models: it took only five seconds to run through the face database and
find the appropriate face, and both tests gave the same classification.

This is an important issue, as it undermines the robustness of the application. The
application was designed to recognize face models in real time, so performance is as
important as the accuracy of the face matching algorithm. However, high polygon
models and normal models give the same classification for an input face, and
classification is how a face is found in the database. It should therefore be
possible to run face recognition on normal polygon models but inspect the resulting
classification with the high polygon model. This was put to further testing with
eighteen face data sets in the database, all from the same person, and it was indeed
the case.

Further testing is still required to be sure that the application can work this way,
since the data sets so far are all from the same person.

Another issue with the face matching algorithm is whether it is capable of
recognizing a face model from the face database. The results are shown in figure 6.8
and figure 6.9. The tests were run with six faces from the face database using high
polygon models, each test with a different input face. On direct comparison, the
application is able to match the most similar face model from the database to an
input face model. In terms of face recognition, therefore, the application is up to
the job.

The conclusion of this report includes a section on future improvements of the
application, which shows what can be done to make the face matching algorithm search
faster.

Figure 6.8, face recognition: the left model is the input face, the right model is
the matched face from the database.

Figure 6.9, face recognition: the left model is the input face, the right model is
the matched face from the database.

7. Conclusion
This chapter gives an overview of the project with its achievements, limitations and
future improvements.

7.1 Achievements

The aim of this project was to develop a system capable of real-time face recognition
using 3D face models. A statistical model based approach was the chosen method of
implementation.

In the early data-processing stage, the Iterative Closest Point algorithm provides a
good way to automatically align one face model to another. Alignment of the data sets
into a common coordinate frame is required before Principal Component Analysis can
take place.

Principal Component Analysis offered a good way to represent the main variations in
the face model data, while offering the advantage of reduced dimensionality. This is
evident once the face database is constructed: changing only the modes and ranges is
enough to represent all the data in the training set, and to generate some new data
sets as well.

Face recognition is effective: the system can recognize and identify an input face
from the set of data used to construct the face database.

7.2 Limitations

Not all areas of the project were successful. ICP can fail if the initial position of
the input data is bad or the data contain too many garbage polygons. Face recognition
takes too long with high polygon models, while with normal models the resolution of
the face model is poor. Stability is still an issue with the program.

7.3 Future Work

A number of improvements could be applied to the application in the future,
including:

• Improved Point Search Algorithm – The "dumb" search algorithm is too slow for
point matching. More advanced search algorithms could be implemented to improve
performance; one suggestion is an octree search (see the sketch below). The ICP, PCA
and face matching algorithms would all benefit from this.
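As a sketch of what this might look like, VTK's own vtkPointLocator (a uniform-bin
locator rather than a true octree, but with the same effect of avoiding a full linear
scan) could replace the brute-force search; exact signatures vary between VTK
versions, so this is an assumption rather than tested code:

#include "vtkPointLocator.h"

// modelPolyData: the model set's vtkPolyData (assumed already loaded).
vtkPointLocator* locator = vtkPointLocator::New();
locator->SetDataSet(modelPolyData);   // the model set's points
locator->BuildLocator();              // built once, then queried many times

double p[3] = { 0.0, 0.0, 0.0 };      // a data-set point to look up
vtkIdType closest = locator->FindClosestPoint(p);
locator->Delete();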

• Different Implementation of ICP – Point-to-point matching for the closest point
operator is not the best approach; it was implemented for its simplicity. A more
robust approach would be a point-to-plane match for the closest point operator (one
standard formulation is given below). Data alignment should then work automatically
rather than semi-automatically.
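For reference, one standard point-to-plane error metric (this formulation comes from
the ICP literature, e.g. [18], rather than from this project) minimizes the residual
along the surface normal $\vec{n}_i$ at each closest point $\vec{y}_i$:

$$E = \sum_{i=1}^{N_P} \left( (R\,\vec{p}_i + \vec{t} - \vec{y}_i) \cdot \vec{n}_i \right)^2$$

where $R$ and $\vec{t}$ are the rotation and translation to be estimated.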

• Face Model Data Filter – This would help to filter the garbage polygons out of the
face data and so improve ICP.

• Code Optimization – During development, good design and functionality had a higher
priority than code optimization. Given the stability issues and the memory leakage
problem, code optimization could benefit the execution of functions and increase
performance.

• Further Testing – Obtain more data by capturing more people showing different
facial expressions. The data available so far is limited; there is not enough to
guarantee that face recognition is effective. A larger database would allow the
effectiveness of the application to be measured in general.

Bibliography

Here is a list of references that provided information for this report and were
consulted during implementation:

[1] General Facial Data Processing Techniques.
http://www.cis.upenn.edu/~hms/pelachaud/workshop_face/subsection3_7_2.html

[2] Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on
Registration of Texture-Mapped 3D Models. Marco La Cascia, Stan Sclaroff and Vassilis
Athitsos. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22,
No. 4, April 2000.

[3] Statistical Models of Appearance for Computer Vision. T.F. Cootes and C.J.
Taylor. http://www.isbe.man.ac.uk

[4] A Method for Registration of 3-D Shapes. Paul J. Besl and Neil D. McKay. IEEE
Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2, February
1992.

[5] Principal Component Analysis.
http://149.170.199.144/multivar/pca.htm

[6] Face Recognition Using Infra-red Images and Eigenfaces. Ross Cutler, 1996.
http://www.cs.umd.edu/~rgc/pub/ir_eigenface.pdf

[7] Face Recognition Using Active Appearance Models. G.J. Edwards, T.F. Cootes and
C.J. Taylor. http://www.isbe.man.ac.uk

[8] Programming Windows with MFC, Second Edition. Jeff Prosise. Microsoft Press.

[9] The MFC Answer Book – Solutions for Effective Visual C++ Applications. Eugène
Kain. Addison-Wesley.

[10] The Visualization Toolkit.
http://public.kitware.com/VTK/

[11] VTK Tutorial and Basics.
http://www.imaging.robarts.ca/~dgobbi/vtk/

The following are further advanced and background readings, and useful references not
cited in the body of the report:

[12] Microsoft Technical Library.
http://msdn.microsoft.com

[13] Mathematical Background on ICP.
http://www.ticam.utexas.edu/~skvinay/PointSet/math_background.htm

[14] Active Appearance Models. G.J. Edwards, T.F. Cootes and C.J. Taylor. IEEE
Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, June 2001.

[15] Interpreting Face Images Using Active Appearance Models. G.J. Edwards, T.F.
Cootes and C.J. Taylor. http://www.isbe.man.ac.uk

[16] Three Derivations of Principal Component Analysis. J.P. Lewis.

[17] The Parallel Iterative Closest Point Algorithm. Christian Langis, Michael
Greenspan and Guy Godin. Institute for Information Technology.

[18] Efficient Variants of the ICP Algorithm. Szymon Rusinkiewicz and Marc Levoy.
Stanford University.

[19] ICP Registration Using Invariant Features. Gregory C. Sharp, Sang W. Lee and
David K. Wehe. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.
24, No. 1, January 2002.

Appendix A – Formulation of ICP
(extracted from [4])

This appendix shows the formulation and chosen method of implementation of the ICP
algorithm currently working in the application.

The closest point operator uses basic point-by-point matching to find the closest
point. The distance between two points $\vec{r}_1 = (x_1, y_1, z_1)$ and
$\vec{r}_2 = (x_2, y_2, z_2)$ is the Euclidean distance

$$d(\vec{r}_1, \vec{r}_2) = \|\vec{r}_1 - \vec{r}_2\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

Let $A$ be a point set with $N_a$ points denoted $\vec{a}_i$; the distance between a
point $\vec{p}$ and the point set $A$ is then

$$d(\vec{p}, A) = \min_{i \in \{1, \ldots, N_a\}} d(\vec{p}, \vec{a}_i).$$

The closest point $\vec{a}_j$ of $A$ satisfies the relation
$d(\vec{p}, \vec{a}_j) = d(\vec{p}, A)$. For each point in the data set, every point
in the model set is checked in this way to determine the closest point; the set of
closest points is constructed accordingly.
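A minimal sketch of this brute-force search (variable names are illustrative, not
taken from the source):

#include <cfloat>

// For each data point, scan every model point and keep the nearest one:
// O(N_p * N_a) distance evaluations in total.
for (int i = 0; i < numDataPoints; i++) {
    float best = FLT_MAX;
    int bestIdx = -1;
    for (int j = 0; j < numModelPoints; j++) {
        float dx = model[j][0] - data[i][0];
        float dy = model[j][1] - data[i][1];
        float dz = model[j][2] - data[i][2];
        float d2 = dx*dx + dy*dy + dz*dz;  // compare squared distances (no sqrt needed)
        if (d2 < best) { best = d2; bestIdx = j; }
    }
    closest[i] = bestIdx;  // index of the closest model point
}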

The point set registration calculation uses a quaternion-based transformation. Once
the transformation is calculated, the quaternion is converted back to a
transformation matrix in the Euclidean coordinate system.

The unit rotation quaternion is a four-vector $\vec{q}_R = [q_0\ q_1\ q_2\ q_3]^t$,
where $q_0 \geq 0$ and $q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$. The rotation matrix in
the Euclidean coordinate system generated by the unit rotation quaternion is:

$$R = \begin{bmatrix} q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 2(q_1 q_2 + q_0 q_3) & q_0^2 + q_2^2 - q_1^2 - q_3^2 & 2(q_2 q_3 - q_0 q_1) \\ 2(q_1 q_3 - q_0 q_2) & 2(q_2 q_3 + q_0 q_1) & q_0^2 + q_3^2 - q_1^2 - q_2^2 \end{bmatrix}$$
Let $\vec{q}_T = [q_4\ q_5\ q_6]^t$ be the translation vector; the complete
registration state vector is denoted $\vec{q} = [\vec{q}_R \,|\, \vec{q}_T]^t$.
Suppose $P$ is the data point set to be aligned with a model point set, and let $Y$
denote the set of closest points of $P$ in the model set, where $N_Y = N_P$ and each
point in $P$ corresponds to the point in $Y$ with the same index. The mean square
objective function to be minimized is

$$d_{ms} = \frac{1}{N_P} \sum_{i=1}^{N_P} \left\| \vec{y}_i - \vec{q}(\vec{p}_i) \right\|^2$$
The "center of mass" $\vec{\mu}_P$ of point set $P$ and the center of mass
$\vec{\mu}_Y$ of point set $Y$ are given by:

$$\vec{\mu}_P = \frac{1}{N_P} \sum_{i=1}^{N_P} \vec{p}_i \qquad \text{and} \qquad \vec{\mu}_Y = \frac{1}{N_Y} \sum_{i=1}^{N_Y} \vec{y}_i$$

The cross-covariance matrix $\Sigma_{PY}$ of the sets $P$ and $Y$ is given by:

$$\Sigma_{PY} = \frac{1}{N_P} \sum_{i=1}^{N_P} \left[ (\vec{p}_i - \vec{\mu}_P)(\vec{y}_i - \vec{\mu}_Y)^t \right] = \frac{1}{N_P} \sum_{i=1}^{N_P} \left[ \vec{p}_i \vec{y}_i^{\,t} \right] - \vec{\mu}_P \vec{\mu}_Y^{\,t}$$

The cyclic components of the anti-symmetric matrix
$A_{ij} = (\Sigma_{PY} - \Sigma_{PY}^T)_{ij}$ are used to form the column vector
$\Delta = [A_{23}\ A_{31}\ A_{12}]^T$. This vector is then used to form the symmetric
4 by 4 matrix $Q(\Sigma_{PY})$:

$$Q(\Sigma_{PY}) = \begin{bmatrix} \mathrm{tr}(\Sigma_{PY}) & \Delta^T \\ \Delta & \Sigma_{PY} + \Sigma_{PY}^T - \mathrm{tr}(\Sigma_{PY})\, I_3 \end{bmatrix}$$

where $I_3$ is the 3 by 3 identity matrix. The unit eigenvector
$\vec{q}_R = [q_0\ q_1\ q_2\ q_3]^T$ corresponding to the maximum eigenvalue of the
matrix $Q(\Sigma_{PY})$ is selected as the optimal rotation. The optimal translation
vector is given by:

$$\vec{q}_T = \vec{\mu}_Y - R(\vec{q}_R)\, \vec{\mu}_P$$

The transformation matrix in the Euclidean coordinate system is thus constructed;
this registration vector is applied to the data set $P$ to obtain a new $P$, and the
process repeats until the change in the mean square point matching error from the
previous iteration falls below the pre-set threshold.
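A direct C++ transcription of the rotation matrix above (a sketch following the
Besl-McKay formulation; q must already be a unit quaternion):

// Convert a unit rotation quaternion [q0 q1 q2 q3] to a 3x3 rotation matrix.
void QuaternionToMatrix(const float q[4], float R[3][3])
{
    const float q0 = q[0], q1 = q[1], q2 = q[2], q3 = q[3];
    R[0][0] = q0*q0 + q1*q1 - q2*q2 - q3*q3;
    R[0][1] = 2.0f * (q1*q2 - q0*q3);
    R[0][2] = 2.0f * (q1*q3 + q0*q2);
    R[1][0] = 2.0f * (q1*q2 + q0*q3);
    R[1][1] = q0*q0 + q2*q2 - q1*q1 - q3*q3;
    R[1][2] = 2.0f * (q2*q3 - q0*q1);
    R[2][0] = 2.0f * (q1*q3 - q0*q2);
    R[2][1] = 2.0f * (q2*q3 + q0*q1);
    R[2][2] = q0*q0 + q3*q3 - q1*q1 - q2*q2;
}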

Appendix B – Applying PCA when there are fewer samples
than dimensions
The following is extracted from [3]. Consider applying PCA to a set of $s$
$n$-dimensional vectors $x_i$, where $s < n$. The covariance matrix constructed in
the usual way is $n$ by $n$, which can be large. However, its eigenvalues and
eigenvectors can be calculated from a smaller $s$ by $s$ matrix derived from the
data. Because the time taken for an eigenvector decomposition goes as the cube of the
size of the matrix, this reduction in dimensionality can give considerable savings.

Subtract the mean from each data vector and put them into the matrix $D$:

$$D = \left( (x_1 - \bar{x}) \,|\, \ldots \,|\, (x_s - \bar{x}) \right)$$

The covariance matrix can be written:

$$S = \frac{1}{s} D D^T$$

Let $T$ be the $s$ by $s$ matrix:

$$T = \frac{1}{s} D^T D$$

Let $e_i$ be the $s$ eigenvectors of $T$ with corresponding eigenvalues $\lambda_i$,
sorted in descending order. It can be shown that the $s$ vectors $D e_i$ are all
eigenvectors of $S$ with corresponding eigenvalues $\lambda_i$, and that all
remaining eigenvectors of $S$ have zero eigenvalues. Note that $D e_i$ is not
necessarily of unit length and therefore requires normalization.
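A sketch of this trick in code, assuming vtkMath::JacobiN for the small symmetric
eigenproblem (in VTK its eigenvectors come back in the columns of v, sorted by
decreasing eigenvalue; the double-precision overload is assumed available). D is an
n x s array of mean-subtracted data, and T, v (s x s) and u (length n) are
pre-allocated double arrays:

#include "vtkMath.h"

// Form the small s x s matrix T = (1/s) * D^T D.
for (int i = 0; i < s; i++)
    for (int j = 0; j < s; j++) {
        double sum = 0.0;
        for (int k = 0; k < n; k++)
            sum += D[k][i] * D[k][j];
        T[i][j] = sum / s;
    }

double* w = new double[s];        // eigenvalues of T
vtkMath::JacobiN(T, s, w, v);     // note: JacobiN modifies T in place

// Map the leading small eigenvector back to an eigenvector of S:
// u = D * e_1 (column 0 of v), which still needs normalizing afterwards.
for (int k = 0; k < n; k++) {
    u[k] = 0.0;
    for (int i = 0; i < s; i++)
        u[k] += D[k][i] * v[i][0];
}
delete[] w;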

Appendix C – “File” Class Declaration
// File Loader Header File

#ifndef FILELOADER_H
#define FILELOADER_H

#include "vtkFloatArray.h"
#include "vtkPoints.h"
#include "vtkPolyData.h"
#include "vtkCellArray.h"
#include "vtkTransform.h"
#include "MathTool.h"
#include "BMPData.h"
#include "vtkMatrix4x4.h"
#include "vtkUnsignedCharArray.h"

class BMPData;

class File{
public:
char filename[255];

float **vertice_data;
float **texture_data;
float **vertice_normal;
float **closest_data;
unsigned int **scalar_data;
float *dist_diff;
float *scalar_diff;

float Rotation[3][3];
float Translation[3];

int **polygon_data;

BMPData *texture;
bool have_texture, have_scalar, invalid_file;

int no_vert, no_poly, no_text;


float mean_dist, old_mean_dist;

bool matched, have_data_already, with_texture, high_res_call;

File();
~File();
File(char f[]);

void Getdata();
void CountData();
void WriteToFile(const char *newname);
void CalculateNormal();
void SetFileName(const char* name);
void ChangeToVTKFormat(vtkPolyData* polydata);
void GetTransFromFiles(const char* name);
void SetDataSize(int size, int poly);
void Delete();
void CopyData(File *data);
void VTKBackToFile(vtkPolyData* polydata);
void ChangeToTextureName();
void ApplyTransform();
// copies the transformation matrix
void PutTransformationFromVTK(vtkMatrix4x4 *matrix);
void MatchScalarFromTexture();
void ConvertToScalar();
// multiplies (accumulates) the transformation matrix
void ACCTransformation(vtkMatrix4x4 *matrix);
};

#endif

Appendix D – “ICP” Class Declaration
// ICP algorithm Header

#ifndef ICP_H
#define ICP_H

#include "fileloader.h"
#include "MathTool.h"
#include "vtkMatrix4x4.h"

class ICP{
public:
float Rotation[3][3];
float Translation[3];
float delta;
float threshold;
int Max_iteration, store_iteration;
File *model_set, *data_set;
float print_transform[4][4];

ICP();
~ICP();
bool Run();
void SetModel(File *m);
void SetData(File *d);
void ChangeThreshold(float num);
void WriteTransformMatrixToFile(const char *name);
int ShowIteration();
void Delete();
void PutTransformMatrix(vtkMatrix4x4 *matrix);
void GetMatrix(vtkMatrix4x4 *matrix);
void Reset();
void SetMaxIteration(int m);
private:
void close_matching();
void CalculateMeanDist();
void Registration();
void Transform_Data();
void GetTransformMatrix();
};

#endif
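
For reference, a plausible usage sequence based on the declaration above (the actual
calling code in the application may differ; file names are illustrative):

char modelName[] = "reference.obj";
char dataName[]  = "subject.obj";
File model(modelName), data(dataName);
model.Getdata();                       // load both meshes from disk
data.Getdata();

ICP icp;
icp.SetModel(&model);                  // the fixed reference model
icp.SetData(&data);                    // the set to be transformed
icp.SetMaxIteration(50);               // illustrative iteration cap
icp.ChangeThreshold(0.0001f);          // illustrative convergence threshold
if (icp.Run())                         // iterate until converged
    icp.WriteTransformMatrixToFile("subject.tra");  // ".tra" per Appendix G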

Appendix E – “PCA” Class Declaration
// PCA algorithm header file

#ifndef PCA_H
#define PCA_H

#include <fstream>
#include "fileloader.h"
#include "mathtool.h"
#include "vtkLinearSubdivisionFilter.h"
#include "vtkPolyData.h"

class PCA{
File *reference;
File *file;
int M; // number of data set
int length, size;
float *average_set;

float **OneDim_vertice_set; // 1st comp of the array determine the file index
// 2nd comp of the array is the actual data set;
// i.e. 1D_vertice_set[M][3*no_verts]

float **OneDim_result_set;
float *M_evalues; // evalues[M]
float **M_evectorsData; // edata[M][M]

bool high_res, use_scalar;

void GetData();
void MatchVector(File *data, float **closest_data, float **closest_texture);
void Subtracted_Set(); // change the 1D vertice set to subtract the average set
void ConstructCovarianceMatrix(float **C);
void CalculateAverageSet();
void GetReferenceData();
public:
PCA();
PCA(int n, File *r, bool mode);
void Execute();
void SetDataSet(File *f);
void SetNumberOfFiles(int n);
void SetReferenceFile(File *r);
void PutDataToFileFormat(File *data, int set, int range);
void PutAverageToFileFormat(File *data);
void Delete();
int GetNumberOfSample();
int GetMaxSet();

void SetRenderMode(bool mode);
void WriteData(const char* filename);
void ReadData(const char *filename);
};

#endif

Appendix F – “FaceMatch” Class Declaration
// Face Matching Header File

#ifndef FACEMATCHING_H
#define FACEMATCHING_H

#include "fileloader.h"

class FaceMatch{
File *face_in_database;
File *unknown_face;

int store_mode, store_range;


float small_dist;
bool with_scalar;

void close_matching();
void CalculateMeanDist();

public:
FaceMatch();
void SetDatabaseFace(File *r);
void SetUnknownFace(File *i);
void Execute(int mode, int range);
int GetMode();
int GetRange();
void SetMatchScalar(bool w);
void Reset();
};

#endif

Appendix G – User Guide for Active Face Recognition
This guide gives the user a step-by-step tutorial for operating the application up to
the face recognition stage.

G.1 Basic Operations

The application can open Wavefront files to display the data they contain. It also
provides options to render the model using texture mapping, scalar value rendering
and the high polygon model. Once a file is opened, the three options become
available, as shown in figure G.1 below:

Figure G.1, the application in action, with the render options circled

The render window provides an interactor so the user can interact with the model. The
file menu on the main frame bar provides options to open a Wavefront file, create an
empty render window, and save the model data shown in the render window. The function
menu lets the user operate the ICP panel, the PCA panel and the Face Recognition
panel. Figure G.2 shows the toolbar icons that also open these three panels when
pressed.

Figure G.2, Toolbar Icons

G.2 The ICP Panel Operations

Figure G.3, ICP in action

Before the database can be constructed, the data sets must be aligned into a common
coordinate frame, and the ICP panel is used for this. Pressing the "ICP" icon on the
toolbar or choosing it from the "Function" menu brings up the ICP panel, which helps
to align the data sets.

First, a reference model is needed as the model set source file for ICP; it can be
chosen by pressing the top "Browse" button, which brings up a file dialog box. The
same is done, using the bottom "Browse" button, for any other data to be aligned to
the model set. Once both files are set, the user can press the "Show Image" button to
see the models. In the render window, the blue model corresponds to the data set and
the "greyish" white model to the model set.

The user can now either step ICP through the iterations or run the iterations to
completion, by pressing the "Step ICP" button or the "Run ICP" button respectively.
If ICP does not align the models completely, the user can use the interactor to align
them manually. Once the user is satisfied with the result, the resultant
transformation can be saved by pressing the "Save Result" button; the transformation
is saved to a file with the ".tra" extension.

G.3 The PCA Panel Operations

Figure G.4, PCA in action

When the data sets are aligned to the common coordinate frame of the reference model,
they are ready to undergo PCA in order to construct the model database. The PCA panel
is activated either by the "PCA" icon on the toolbar or from the "Function" menu.

Before PCA can run, the files of the model data sets must be put into the left list
box and the corresponding transformation files into the right list box, by pressing
the "Add Obj" button and the "Add Trans" button respectively. File names can be
removed from either list box with the corresponding remove button, depending on which
item the user wants to remove. The check box can be ticked if the user wants the
analysis to use high polygon models; it only has an effect before PCA is run, not
after. Pressing the "Run PCA" button then runs PCA.

When PCA finishes running, a render window pops up rendering the model corresponding
to mode "1" and range "0" of the database. This is the average model; in fact, every
range "0" model corresponds to the average model. The user can then use the two
slider bars to change which model from the database is rendered in the render window.
The "Save PCA" and "Load PCA" buttons allow the user to save or load a PCA database
respectively, so there is no need to re-run PCA every time it is required.

G.4 The Face Recognition Panel Operations

Figure G.5, Face Recognition in Action

The Face Recognition panel can be brought up by the "Face" icon on the toolbar, from
the "Function" menu, or with the "Face Recognition" button on the PCA panel, but only
when the PCA panel has an active database, either from having just run PCA or from
loading a database from file; figure G.5 shows the latter case.

The "Get Input" button obtains the input model to be recognized against the model
database; the input model is rendered in a reddish colour. Before recognition takes
place, it is better to align the two models, since the matching algorithm is driven
by the closest distances between them; it is also better to align against the average
model. There are two options for alignment: pressing the "Align Data" button, or
doing it manually with the interactor. The user also has the option of running the
recognition with scalar values, and can choose the maximum mode to evaluate. Face
recognition is then started by pressing the "Start Process" button. When it finishes,
the matching model from the database is shown in the render window, with its mode and
range set on the slider bars of the PCA panel.
