ACADEMY OF TECHNOLOGY
Certificate
This is to certify that the project report entitled Face Recognition using Deep
Learning, submitted to the Department of Electronics and Communication Engineering,
Academy of Technology, in partial fulfillment for 6th semester SEMINAR PRESENTATION
[EC-681] of Bachelor of Technology in Electronics and Communication Engineering, is a
record of bona fide work carried out by Rik Mitra, Roll No-16900316074 and Pritam
Sengupta, Roll No-16900316082, under my supervision and guidance.
All help received by them from various sources has been duly acknowledged.
No part of this report has been submitted elsewhere for award of any other degree.
(Sahadeb Santra)
Assistant Professor
Seminar Guide
Place: Adisaptagram, Hooghly.
Date:
Acknowledgement
We are thankful to our guide Prof. Sahadeb Santra, whose personal involvement
in the technical seminar presentation and report has been a major source of inspiration for us
to be flexible in our approach and thinking while tackling various issues. He played the critical
role of ensuring that we were always on the right track.
Last but not the least, we would like to give a big thanks to all the staff and
assistants of the Department of Electronics and Communication Engineering.
Abstract
Face recognition is the task of identifying an individual from an image of their face and a
database of known faces. Despite being a relatively easy task for most humans,
“unconstrained” face recognition by machines, specifically in settings such as malls, casinos
and transport terminals, remains an open and active area of research. However, in recent
years, a large number of photos have been crawled by search engines, and uploaded to social
networks, which include a variety of unconstrained material, such as objects, faces and
scenes. This large volume of data and the increase in computational resources have enabled
the use of more powerful statistical models for the general challenge of image classification. This
project evaluates the use of deep learning approaches, such as deep convolutional
neural networks for image classification, on the problem of unconstrained face recognition.
Deep learning is an emerging area of machine learning (ML) research. It comprises multiple
hidden layers of artificial neural networks. The deep learning methodology applies nonlinear
transformations and model abstractions of high level in large databases. The recent
advancements in deep learning architectures within numerous fields have already provided
significant contributions to artificial intelligence. This report presents a state-of-the-art survey
of the contributions and novel applications of deep learning. The following review
chronologically presents how, and in which major applications, deep learning algorithms have
been utilized. Furthermore, the advantages of the deep learning methodology, with
its hierarchy of layers and nonlinear operations, are presented and compared with more
conventional algorithms in common applications. The survey further
provides a general overview of the novel concept and the ever-increasing advantages and
popularity of deep learning.
Theory
Introduction
Face Recognition (FR) is one of the areas of Computer Vision (CV) that has drawn the most
interest for a long time. Its practical applications are many, ranging from biometric security
to automatically tagging your friends in pictures. Because of these possibilities,
many companies and research centers have been working on it.
The uses for an automatic face recognition system are many. Typical ones are biometric
identification (usually combined with other verification methods), automatic border
control, and crowd surveillance. One of its main advantages is its non-intrusiveness. Most
identification methods require some action from people, such as placing a fingerprint on a
reader or entering a password. Face recognition, on the contrary, can work by simply
having a camera recording. Some of its best-known uses belong to
the social network field.
As of 2016, there are already systems in use that rely on face recognition, a brief sample
of which is introduced here. This sample is by no means exhaustive, but it tries to show the
variety of applications. It comes as no surprise that one of the uses that draws the most
attention is tracking criminals. As forensic TV series have shown, a system
automatically scanning city cameras to catch an escapee would be of great help. In fact,
the United States is already using this technology. Although far from the quality level depicted in
fiction, it is already being used to identify people from afar, although there is some skepticism
regarding whether it works. Despite the considerable criticism surrounding this kind
of method, there is little doubt that in the future it will become widely used. A less well-known
use of face recognition is authorizing payments. As part of a pilot test, some users
are, under certain circumstances, asked to take a picture of themselves before the payment is
accepted. This kind of application has a double goal: to facilitate the process for users,
being easier than remembering a password, and to discourage credit card theft.
On a more technical note, there have historically been many approaches to the problem.
However, there is one key issue in face recognition that most of them
share: feature extraction. Most approaches start by transforming
the original images into a more expressive set of features, either manually crafted or
automatically selected for statistical relevance. In fact, working with the raw images
is extremely difficult, due to factors such as light, pose, or background, among others.
Therefore, by keeping only the information relevant to the face, most of this “noise” is
discarded. Finding an efficient feature selection strategy is likely to benefit almost any
subsequent classification method. There have traditionally been two main approaches to the
problem: the geometric ones, which use relevant facial features and the relations between them,
and the photometric ones, which extract statistical information from the image to use in
different kinds of comparisons.
Deep Learning
In recent years a new method has appeared that has affected the whole Computer Vision
community. Since its appearance, Deep Learning, and more concretely Deep Neural
Networks and Convolutional Neural Networks, has steadily achieved state-of-the-art results in
many CV problems, even those in which research was stuck. We provide a more technical
description of this method later, so here we will just say that DL is, roughly, a kind of Neural
Network composed of multiple layers. When applied to CV, these networks are capable of
automatically finding a set of highly expressive features. Based on empirical results, these
features have proven to be better than manually crafted ones on many occasions. They have
the additional advantage of not having to be designed by hand, as the network itself is
in charge of doing so. On top of that, the features learned can be considerably abstract.
Interestingly, the way CNNs work is closely related to the way the biological visual system
works [Itti and Koch, 2001; Kim, Kim, and Lee, 2015]. Whether this is the reason for their
success is outside the scope of this document, but it cannot be denied that the results they
obtain make them a choice to consider when faced with CV problems. In fact, a large
number of the most successful applications of CV in recent years have used CNNs, and this
tendency is expected to continue. Because of this, the work in this report makes use of them.
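To make the idea of a convolutional feature concrete, the sketch below (plain NumPy, not any particular network's implementation) applies one hand-written edge-detection kernel to a toy image; the point is that a CNN's first layer learns many such kernels automatically instead of having them designed by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (strictly, cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# A Sobel-like vertical-edge kernel: one of the low-level features a
# trained CNN typically discovers on its own in its first layer.
image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # left half dark, right half bright
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], float)
feature_map = conv2d(image, kernel)
print(feature_map)  # strong responses only along the vertical edge
```

Stacking many such learned filters, interleaved with nonlinearities and pooling, is what lets deeper layers respond to increasingly abstract structures such as eyes or whole faces.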
Two of the most successful applications of CNNs to the FR problem are DeepFace [Taigman
et al., 2014] and FaceNet [Schroff, Kalenichenko, and Philbin, 2015]. These two have
provided state-of-the-art results in recent years, with the best results obtained by the
latter. Although there are other methods providing close results, such as those involving Joint
Bayesian methods [Cao et al., 2013; Chen et al., 2013], we decided to focus on CNNs. The
reasons were not only result-driven, but also interest-driven, as we were personally interested
in working with them.
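As an illustration of how an embedding-based system in the style of FaceNet reaches a verification decision: faces are mapped to vectors, and two faces are declared the same identity when the distance between their embeddings falls below a tuned threshold. The toy 4-D vectors and the threshold value below are invented for the example; a real FaceNet produces 128-dimensional embeddings from images.

```python
import numpy as np

def same_person(emb_a, emb_b, threshold=1.1):
    """FaceNet-style verification: compare L2-normalised embeddings
    by Euclidean distance against a tuned threshold."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return bool(np.linalg.norm(a - b) < threshold)

# Toy embeddings standing in for a real network's output.
anchor   = np.array([0.9, 0.1, 0.0, 0.1])
positive = np.array([0.8, 0.2, 0.1, 0.1])   # same person, different photo
negative = np.array([0.1, 0.1, 0.9, 0.2])   # different person

print(same_person(anchor, positive))  # True
print(same_person(anchor, negative))  # False
```

The appeal of this design is that recognising a new person requires no retraining: only one reference embedding per identity needs to be stored and compared against.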
Problems
Unfortunately, despite its potential, automatic face recognition has many problems. One
of the most important is face variability for a single person. Many factors can make
two pictures of the same person look totally different, such as light, facial expression,
or occlusion. When dealing with faces in controlled environments,
face recognition systems already deliver quality results, but they still have problems
when faced with faces in the “wild”. Even more, factors such as sunglasses, beards, different
hairstyles, emotions, or even age can greatly complicate the task. An example of these problems
can be seen in Figure. Another problem to be taken into account is the environment. Except
in controlled scenarios, face pictures have very different backgrounds, which can make the
problem of face recognition more difficult. To address this issue, many of the most
successful systems focus on treating the face alone, discarding all the surroundings. Taking
all of this into consideration, our goal was to develop a system capable of working with faces in
uncontrolled environments. To do so, we used Convolutional Neural Networks as a
feature extraction method. We also planned to apply some pre-processing in order to
minimize the impact of the environment and make our system more robust. That being
said, we were aware of the difficulties involved in such a project, so we were cautious about
the expected results.
Technological Details
Theoretical Background: CNN
This section aims to provide an introduction to Convolutional Neural Networks. To
do so, it is necessary to understand the concept of an Artificial Neural Network (ANN), so the
first part is devoted to it. After that, Deep Learning and CNNs are
explained.
ANNs have proven their capacity in many problems, such as Computer Vision ones, which
are difficult to address by extracting features in the traditional way. This section briefly
introduces the main technical concepts of the method, in order to make the Deep Learning
explained afterwards easier to understand.
This is, roughly speaking, the basic structure of an ANN. There are many variations on it,
such as Recurrent Neural Networks, in which connections form a directed cycle, but they are
all based on this. An ANN can be understood as a function f that maps an input X to an output
Y. The training task, then, consists of learning the weight associated with each edge.
Given the data, there are various learning algorithms, of which gradient descent combined
with backpropagation can be considered, given its widespread use, the most successful of
all. In fact, to a certain degree, using it is enough for training most ANNs.
This algorithm starts by initializing all weights in the network, which can be done following
various strategies. Some of the most common ones include drawing them from a probability
distribution, or setting them randomly, although low values are advisable. The process
followed afterwards consists of three phases that are repeated many times over. In the first,
an input instance is propagated through the whole network, and the output values are calculated.
Then, this output is compared, using a loss function, with the correct output, and this is used
to calculate how far off the network is. The final phase consists of updating each weight in
order to minimize the obtained error. This is done by obtaining the gradient of each neuron,
which can be understood as a “step” towards the actual value. When these three phases have been
repeated for all input instances, we call this an epoch. The algorithm can run for as many
epochs as specified, or as required to find the solution. Briefly, the gradient is obtained
as follows. Once the outputs have been calculated for an instance, we obtain the error
achieved for each output neuron o, calling it δo. This value allows finding the gradient of each
o. For this, we need the derivative of the output of o with respect to its input Xo, that
is, the partial derivative of its activation function φ. For the logistic (sigmoid) case, this
becomes φ′(Xo) = φ(Xo)(1 − φ(Xo)).
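The three phases can be sketched end to end on a toy task. The network size, learning rate, and the logical-OR dataset below are choices made for the example, not taken from the report; the sigmoid derivative shows up in the code as O * (1 - O).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Initialization: low random weights, as advised above.
W1 = rng.normal(0.0, 0.5, (2, 4))   # input  -> hidden
W2 = rng.normal(0.0, 0.5, (4, 1))   # hidden -> output

# Toy dataset: learn the logical OR of two inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [1]], dtype=float)

lr = 1.0
first_loss = None
for epoch in range(2000):            # one epoch = one pass over all instances
    # Phase 1: propagate the input through the whole network.
    H = sigmoid(X @ W1)
    O = sigmoid(H @ W2)
    # Phase 2: evaluate the output against the correct one with a loss function.
    loss = np.mean((O - Y) ** 2)
    if first_loss is None:
        first_loss = loss
    # Phase 3: step each weight down its gradient.
    # The sigmoid derivative phi'(x) = phi(x)(1 - phi(x)) appears as O*(1-O).
    delta_o = (O - Y) * O * (1 - O)
    delta_h = (delta_o @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ delta_o
    W1 -= lr * X.T @ delta_h

print(f"loss after training: {loss:.4f} (started at {first_loss:.4f})")
```

Running this, the loss shrinks steadily across epochs, which is exactly the behaviour the three-phase description predicts.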
Deep Learning
One of the key aspects of most machine learning methods is the way data is represented, that
is, which features to use. If the features used are badly chosen, the method will fail regardless
of its quality. Moreover, this selection affects the knowledge the method can
work with: if you have trained your market analysis algorithm with numerical values, it will not be
able to make any sense of a written report, no matter its quality. Therefore, it is no surprise
that there has been a historical interest in finding appropriate features. This becomes
especially relevant in the case of Computer Vision problems.
The reason is that, when faced with an image, there are usually far too many features (a
simple 640 × 480 RGB image has over 300,000 pixels, or almost a million raw values across its
three channels), and most of them are irrelevant. Because of this, it is important to find some
way of condensing this information into a more compact form.
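Spelling out the arithmetic for the 640 × 480 example: the pixel count is about 307,000, and it is the per-channel raw values that approach a million.

```python
# Raw input size of the example image discussed in the text.
width, height, channels = 640, 480, 3
pixels = width * height          # 307,200 pixels
values = pixels * channels       # 921,600 raw values across R, G, B
print(pixels, values)
```

Feeding nearly a million raw numbers per image to a classifier is what makes feature condensation unavoidable.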
There are two main ways of obtaining features: manually choosing them (such as
physiological values in medical applications) or automatically generating them, an approach
known as representation learning. The latter has proven more effective in problems such
as computer vision, as it is very difficult for humans to know what makes an image
distinguishable. Instead, in many cases machines have been able to determine which features
were relevant for them, producing some state-of-the-art results. The most paradigmatic case of
representation learning is the autoencoder. Autoencoders perform a two-step process: first they
encode the information they receive into a compressed representation, and then they try to decode,
or reconstruct, the original input from this reduced representation.
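The two-step encode/decode process can be shown with a minimal linear autoencoder. The dimensions, learning rate, and synthetic data below are all invented for the sketch: the data is generated to secretly lie in a 3-D subspace of an 8-D space, so a 3-value code suffices to reconstruct it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 8-D data that actually lives in a 3-D subspace.
Z = rng.normal(size=(200, 3))
A = rng.normal(size=(3, 8))
X = Z @ A

# Minimal linear autoencoder: 8 inputs -> 3-D code -> 8 reconstructed outputs.
W_enc = rng.normal(0.0, 0.1, (8, 3))
W_dec = rng.normal(0.0, 0.1, (3, 8))

lr = 0.01
first_loss = None
for step in range(3000):
    code = X @ W_enc          # step 1: encode into a compressed representation
    X_hat = code @ W_dec      # step 2: decode, i.e. reconstruct the input
    err = X_hat - X
    loss = np.mean(err ** 2)
    if first_loss is None:
        first_loss = loss
    # gradient descent on the reconstruction error
    W_dec -= lr * code.T @ err / len(X)
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

print(f"reconstruction error: {first_loss:.3f} -> {loss:.3f}")
```

After training, the 3-value code carries nearly everything needed to rebuild the 8 inputs; the code itself is the learned representation.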
We are going to focus on Computer Vision problems from now on, as this will make some of the
next sections easier to understand. Regarding the features extracted, people may have
some clear ideas about what makes an object, such as a car, recognizable: having four wheels,
doors on the sides, glass at the front, being made of metal, and so on. However, these are high-level
features that are not easy for a machine to find in an image. To make things worse, each
kind of object in the world has its own particular features, usually with large intra-class
variability. Because of this, developing a general object recognition application this way would be
impossible, as we would need manually selected features for each object. Therefore, it has
not been a successful line of research recently. On the contrary, if machines are capable of
determining on their own what is representative of an object, they have
the potential to learn how to represent any object they are trained with.
However, there is an additional difficulty in this kind of problem: the variability
depending on the conditions of each picture. We do not only have to deal with intra-class
variability, but also with same-object variability. The same car can be pictured in almost
endless ways, depending on its pose, light conditions, image quality, etc. We
humans are capable of getting rid of this variation by extracting what we could consider
abstract features. These features can include the ones we mentioned before, such as the
number of wheels, but also others we are not aware of, such as the fact that cars are usually
on a road, or that their wheels should be in contact with the ground. In order to be
successful, a representation learning method should be able to extract this kind of high-level
feature, regardless of such variation. The problem is that this process can be extremely
difficult to build into a machine, which may lead one to think that it makes no sense to
make the effort of doing so. This is precisely where Deep Learning has proven to be
extremely useful.
Applications
You’re used to unlocking your door with a key, but maybe not with your face. As strange as
it sounds, our physical appearance can now verify payments, grant access and improve
existing security systems. Protecting physical and digital possessions is a universal concern
which benefits everyone, unless you’re a cybercriminal or a kleptomaniac, of course. Facial
biometrics are gradually being applied to more industries, disrupting design, manufacturing,
construction, law enforcement and healthcare. How is facial recognition software affecting
these different sectors, and who are the companies and organisations behind its development?
1. Payments
It doesn’t take a genius to work out why businesses want payments to be easy. Online
shopping and contactless cards are just two examples that demonstrate the seamlessness of
postmodern purchases. With FaceTech, however, customers wouldn’t even need their cards.
In 2016, MasterCard launched a new selfie pay app called MasterCard Identity Check.
Customers open the app to confirm a payment using their camera, and that’s that. Facial
recognition is already used in stores and at ATMs, but the next step is to bring it to
online payments. Chinese ecommerce firm Alibaba and its affiliated payment platform Alipay are
planning to apply the software to purchases made over the Internet.
3. Criminal identification
If FaceTech can be used to keep unauthorised people out of facilities, surely it can be used to
help put them firmly inside them. This is exactly what the US Federal Bureau of Investigation
is attempting to do by using a machine learning algorithm to identify suspects from their
driver’s licences. The FBI currently has a database which includes half of the national
population’s faces. This is as useful as it is creepy, giving law enforcers another way of
tracking criminals across the country. AI-equipped cameras have also been trialled in the UK
to identify those smuggling contraband into prisons.
4. Advertising
The ability to collect and collate masses of personal data has given marketers and advertisers
the chance to get closer than ever to their target markets. FaceTech could do much the same.
5. Healthcare
Instead of recognising an individual via FaceTech, medical professionals could identify
illnesses by looking at a patient’s features. This would alleviate the ongoing strain on medical
centres by slashing waiting lists and streamlining the appointment process. The question is,
would you really want to find out you had a serious illness from a screen? If it’s a choice
between a virtual consultation and a month-long wait for an appointment, then maybe so.
Another application of facial biometrics within healthcare is to secure patient data by using a
unique patient photo instead of passwords and usernames.
Advantages
The Improvement of Security Level
As we said in the first paragraph, a face biometric system greatly improves your security
measures. All of a corporation’s premises would be protected, since you’ll be able to track both
the employees and any visitors that come into the area. Anyone who doesn’t have access or
permission to be there will be captured by the recognition system, which alerts you instantly
about the trespassing.
As an example, let’s take a 24/7 drugstore. Any owner prefers to keep their money and clients
safe, avoiding unpleasant trouble with difficult visitors. When you have FRT in place,
you’d be instantly alerted as soon as a wanted or suspicious character arrives, which leads
to a significant reduction in the expenses one usually spends on security staff. The
software will successfully track every aspect of attendance to provide a better level of
protection for your facilities.
Accuracy ensures that there won’t be any misunderstandings and awkwardness that
come from bad face recognition software. With high levels of accuracy, you can be sure that the
right person will be recognized at the right time.
Full Automation
Instead of manual recognition, done by security guards or official representatives
on a company’s premises, facial recognition tech automates the identification
process and performs it consistently, without interruption. You won’t even need an
employee to monitor the cameras 24/7.
Automation means convenience and reduces expenses too. Therefore, any entrepreneur
would appreciate the fact that image identification systems are fully automated.
Disadvantages
Image Size and Distance
Consider the distance between a target and a CCTV camera… What proportions will the detected face
have? No more than 100×200 pixels.
It is pretty hard to get a clear identification in such a case. What’s more, scanning a photo for varying
face sizes is a processor-intensive task. Most systems therefore allow specifying a face-size range
to eliminate false recognitions and speed up image processing. The initial investment in
such face tracking software is not a cheap one; however, it will pay off in no time.
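The face-size range mentioned above amounts to a simple pre-filter on detector output. The sketch below assumes axis-aligned (x, y, w, h) bounding boxes from some face detector; the size limits are illustrative values, not taken from any particular system.

```python
def plausible_faces(detections, min_size=(80, 80), max_size=(400, 400)):
    """Keep only detections whose bounding box falls inside a configured
    face-size range, discarding implausibly small or large candidates
    before the (expensive) recognition step runs."""
    keep = []
    for (x, y, w, h) in detections:
        if min_size[0] <= w <= max_size[0] and min_size[1] <= h <= max_size[1]:
            keep.append((x, y, w, h))
    return keep

detections = [(10, 10, 30, 30),     # too small: likely noise at long distance
              (50, 40, 120, 150),   # plausible face
              (0, 0, 600, 600)]     # too large: likely a false positive
print(plausible_faces(detections))  # [(50, 40, 120, 150)]
```

Dropping implausible boxes early is what lets such systems both reduce false recognitions and avoid running recognition on every candidate region.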
Surveillance Angle
The identification process is also strongly affected by the surveillance angle at which the
target’s face was captured. To enroll a face in the recognition software,
multiple angles are used: profile, frontal, 45-degree, etc. But to generate a clear
template for the face, you’ll need nothing less than a frontal view. The higher the resolution
of a photo and the more direct its angle (this goes for both enrolled and compared images), the
more accurate the resulting matches will be.
Then, there are also troubles with things such as facial hair or sunglasses. One can still fool
FRT with a suddenly grown or removed beard, and the same goes for obscuring parts of the face
with glasses or masks. To avoid such failures, the databases must be regularly updated with
the most up-to-date images.
Future Scope
The use of spherical canonical images allows us to perform matching in the spherical
harmonic transform domain, which does not require preliminary alignment of the images.
The errors introduced by embedding into an expression space with some predefined
geometry are avoided. In this facial expression recognition setup, end-to-end processing
comprises face surface acquisition and reconstruction, smoothing, and subsampling to
approximately 2500 points. Facial surface cropping and the measurement of a large set of
distances between all the points are carried out using a parallelized parametric version of the
algorithm.
The general experimental evaluation of the facial expression system promises better face
recognition rates. Having examined techniques to cope with expression variation, future work
may investigate the face classification problem and the optimal fusion of
colour and depth information in more depth. Further study can be directed towards matching
alleles of genes to the geometric factors of facial expressions. The genetic property evolution
framework for the facial expression system can be studied to suit the requirements of different
security models, such as criminal detection, governmental confidential security breaches, etc.
Conclusions
The facial expression recognition system presented in this work contributes a
resilient face recognition model based on the mapping of behavioural characteristics onto
physiological biometric characteristics. The physiological characteristics of the human face
relevant to various expressions, such as happiness, sadness, fear, anger, surprise and
disgust, are associated with geometrical structures which are stored as the base matching template
for the recognition system.
The behavioural aspect of this system relates the attitude behind different expressions as a
property base. The property bases are separated into exposed and hidden categories in genetic
algorithmic genes. The gene training set evaluates the expressional uniqueness of individual
faces and provides a resilient expression recognition model in the field of biometric security.
The design of a novel asymmetric cryptosystem based on biometrics, with features like
hierarchical group security, eliminates the use of passwords and smart cards, as opposed to
earlier cryptosystems. It requires special hardware support, like all other biometric systems.
This work suggests a new direction of research in the field of asymmetric biometric
cryptosystems, which is highly desirable in order to get rid of passwords and smart cards
completely. Experimental analysis shows that the hierarchical security structures are
effective in geometric shape identification for physiological traits.
The facial expression based face recognition system is made efficient with genetic algorithm
invariants of the facial surface, resulting in a recognition rate of 95.4%. The illustration of this
model is given in this work to build expressional representations using the concept
of a hierarchy-based embedding approach. The facial representation model is deployed on a
laptop for the biometric authentication process. The impact of the embedding space choice on the
metric distortion indicates that spaces with spherical geometry are more favourable for the
representation of facial surfaces.
Bibliography
Aarts, Emile and Jan Korst (1989). Simulated Annealing and Boltzmann Machines: A
Stochastic Approach to Combinatorial Optimization and Neural Computing. New York, NY,
USA: John Wiley & Sons, Inc. ISBN: 0-471-92146-7.
Abadi, Martín et al. (2016). “TensorFlow: A system for large-scale machine learning”. In:
CoRR abs/1605.08695.
Berg, Thomas and Peter N. Belhumeur (2012). “Tom-vs-Pete Classifiers and Identity-
Preserving Alignment for Face Verification”. In: BMVC.
Cao, Xudong et al. (2013). “A Practical Transfer Learning Algorithm for Face Verification”.
In: Proceedings of the 2013 IEEE International Conference on Computer Vision. ICCV ’13.
Washington, DC, USA: IEEE Computer Society, pp. 3208–3215. ISBN: 978-1-4799-2840-8.