Viola Jones Presentation

Object Detection: The Viola-Jones Face Detector
Augusto Morgan
Institute of Computing - University of Campinas
augusto.morgan@students.ic.unicamp.br
June 9, 2014
Augusto Morgan (IC)
Viola-Jones Face Detector
June 9, 2014
1 / 22
Overview
Object Detection

Haar-like features and the integral image
AdaBoost
Cascade of Weak Classifiers
Haar-like Features Extended Set
Object Detection
How can we detect objects in an image?

We can use a classifier:
Given an image, is it the object we are looking for or not?
But what if the image contains many other objects?
We are interested in finding where in the image the objects are.
Sliding Window
We can apply the classifier to small portions of the image!

We slice the image into small sub-windows and apply the classifier to each one of them.
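A minimal sketch of the sliding-window enumeration, to give a feeling for how many sub-windows a single image produces. The function name, the step of 1 pixel and the 384x288 image size are assumptions for illustration (24x24 is the window size used later in these slides), and only a single scale is counted.

```python
def sliding_windows(width, height, window=24, step=1):
    """Yield the top-left corner (x, y) of every window x window sub-window."""
    for y in range(0, height - window + 1, step):
        for x in range(0, width - window + 1, step):
            yield x, y


# Number of 24x24 sub-windows in a 384x288 image at a single scale.
count = sum(1 for _ in sliding_windows(384, 288))
print(count)
```

Every one of these sub-windows has to be classified, and a multiscale search multiplies the count further.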
Problems?
Viola-Jones Real-Time Face Detector
Proposed in 2001 by Paul Viola and Michael Jones

It discards a large number of negative sub-windows before spending much processing time on them, achieving high frame rates (the paper reports about 15 frames per second on 384x288 images).
How does it achieve that?
Haar wavelet function

The classifier used in the paper is based on Haar-like features.
Haar wavelet function:
\[ \psi(t) = \begin{cases} 1 & 0 \le t < \tfrac{1}{2}, \\ -1 & \tfrac{1}{2} < t \le 1, \\ 0 & \text{otherwise.} \end{cases} \]
Figure: Haar wavelet
Haar-like Features
Rectangles that produce a score based on positive (white) and negative (black) areas.
Three kinds of features: 2, 3 and 4 rectangles.
Each feature is calculated by:
\[ f(i) = \sum I_{White} - \sum I_{Black} \]
Figure: The different types of Haar-like features
Problem: the number of Haar-like features is too large!
For a 24x24 pixel window there are more than 160,000 distinct Haar-like features.
Note: this set is overcomplete, since a 24x24 window has only 576 pixels.
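The count can be checked with a short sketch. It assumes five feature prototypes (two orientations of the 2-rectangle feature, two of the 3-rectangle feature and one 4-rectangle feature) that may be placed anywhere in the window and stretched by integer multiples of their smallest shape; the function and constant names are hypothetical.

```python
def count_features(window, base_w, base_h):
    """Count positions and sizes of a feature whose smallest form is base_w x base_h."""
    total = 0
    for w in range(base_w, window + 1, base_w):
        for h in range(base_h, window + 1, base_h):
            # Number of top-left positions where a w x h feature still fits.
            total += (window - w + 1) * (window - h + 1)
    return total


# Smallest shape (width, height) of each assumed prototype.
BASE_SHAPES = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]

total = sum(count_features(24, w, h) for w, h in BASE_SHAPES)
print(total)  # 162336
```

Under these assumptions the total is 162,336, in line with the "more than 160,000" quoted above.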

The Integral Image

New intermediate representation of the image, similar to the Summed Area Table used in computer graphics (CG).
Each pixel (x, y) contains the sum of the original pixels above and to the left of (x, y), inclusive.
\[ ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y') \]
It can be computed in one pass over the original image.
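A minimal sketch of the one-pass computation, assuming the image is a list of rows of numbers. It keeps a running sum along each row and adds the integral value of the pixel directly above; the function name is hypothetical.

```python
def integral_image(img):
    """Return ii with ii[y][x] = sum of img[y'][x'] for all x' <= x and y' <= y."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # cumulative sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = row_sum + above
    return ii


ii = integral_image([[1, 2, 3],
                     [4, 5, 6]])
print(ii)  # [[1, 3, 6], [5, 12, 21]]
```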
Figure: The integral image
Features Calculation using the Integral Image
The sum of each rectangle can be calculated using the integral image in four array references.
\[ Sum(R) = ii(A) - ii(B) - ii(D) + ii(C) \]
Figure: The sum of one rectangle using the integral image
Each feature can then be calculated in a few array references (six for a two-rectangle feature, eight for a three-rectangle feature and nine for a four-rectangle feature).
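The four-reference lookup can be sketched as follows. To avoid special cases at the image border, this version assumes an integral image padded with one extra row and column of zeros, which is a common implementation choice rather than something shown on the slides; all function names are hypothetical.

```python
def padded_integral_image(img):
    """ii[y][x] = sum of img[y'][x'] for x' < x and y' < y (row 0 and column 0 are zero)."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left pixel (x, y): four array references."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """White left half minus black right half of a (2*w) x h region at (x, y)."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = padded_integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(ii, 0, 0, 2, 3))  # 33 - 45 = -12
```

For clarity the two-rectangle feature above calls `rect_sum` twice (eight references); because the two rectangles share an edge, an optimized version can reuse the two shared corners and get down to six.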
Advantages and Drawbacks
Rectangular features are very simple and coarse.

However, they are really fast!
They can be calculated at different scales without the need to build a Gaussian pyramid and an integral image for each level, which speeds up multiscale detection: the feature rectangles are scaled instead of the image.
Feature strategies that need the pyramid for multiscale detection pay an extra cost that this approach avoids.
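A small sketch of what scaling the feature instead of the image can look like. The feature representation (a list of `(x, y, w, h, weight)` rectangles defined in the base window) and the example coordinates are hypothetical; the point is only that a scaled feature has the same number of rectangles, so it costs the same number of integral-image lookups at every scale.

```python
def scale_feature(rects, scale):
    """Scale every (x, y, w, h, weight) rectangle of a feature by `scale`."""
    return [
        (round(x * scale), round(y * scale), round(w * scale), round(h * scale), weight)
        for x, y, w, h, weight in rects
    ]


# Hypothetical two-rectangle feature defined inside the 24x24 base window:
# a white rectangle (weight +1) next to a black one (weight -1).
base = [(4, 6, 4, 8, +1), (8, 6, 4, 8, -1)]
doubled = scale_feature(base, 2)
print(doubled)  # [(8, 12, 8, 16, 1), (16, 12, 8, 16, -1)]
```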
Training the Classifier
Given the features and a set of positive and negative examples, any machine learning method could be used to train a classifier.
There is, however, a huge number of features.
A very small number of features can be combined to create an effective classifier.
How can we find these features?
A weak classifier
The weak classifier used in the paper takes as input a sub-window (x) and consists of a feature (f), a threshold (θ) and a polarity (p) indicating the direction of the following inequality:
\[ h(x, f, \theta, p) = \begin{cases} 1 & p f(x) < p \theta, \\ 0 & \text{otherwise.} \end{cases} \]
The weak classifier can be viewed as a single-node decision tree, a stump.
For each feature, the weak learner finds the optimal threshold, the one that minimizes the number of misclassifications.
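A sketch of the stump and of one way to find its optimal threshold for a single feature: sort the examples by feature value and scan once, keeping running sums of the positive and negative weight below the current split. The function names are hypothetical, labels are assumed to be 1 (face) or 0 (non-face), and candidate thresholds are placed halfway between consecutive distinct values, which is an implementation choice.

```python
def stump_predict(value, theta, polarity):
    """h = 1 if p * f(x) < p * theta, else 0."""
    return 1 if polarity * value < polarity * theta else 0


def best_stump(values, labels, weights):
    """Return (weighted_error, theta, polarity) of the best stump for one feature."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    total_neg = sum(w for w, y in zip(weights, labels) if y == 0)
    below_pos = below_neg = 0.0  # weight of positives / negatives below the split
    best = (float("inf"), 0.0, 1)
    n = len(order)
    for k in range(n + 1):  # split k puts the k smallest values below theta
        lo = values[order[k - 1]] if k > 0 else values[order[0]] - 2.0
        hi = values[order[k]] if k < n else values[order[-1]] + 2.0
        if lo < hi:  # equal values cannot be separated, skip that split
            theta = (lo + hi) / 2.0
            # Polarity +1 predicts "face" below theta, polarity -1 above it.
            err_plus = below_neg + (total_pos - below_pos)
            err_minus = below_pos + (total_neg - below_neg)
            if err_plus < best[0]:
                best = (err_plus, theta, 1)
            if err_minus < best[0]:
                best = (err_minus, theta, -1)
        if k < n:
            i = order[k]
            if labels[i] == 1:
                below_pos += weights[i]
            else:
                below_neg += weights[i]
    return best


print(best_stump([1.0, 2.0, 3.0, 4.0], [1, 1, 0, 0], [0.25] * 4))  # (0.0, 2.5, 1)
```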
AdaBoost
AdaBoost is used to boost the performance of a simple learning algorithm.

It combines weak classification functions to create a more powerful one.
At each round the examples are re-weighted to emphasize those which
were incorrectly classified by the previous weak classifier.
The final strong classifier is a weighted combination of weak classifiers
followed by a threshold.
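Written out, the weighted combination and threshold take the following form. The notation (T rounds, weak classifiers h_t, weighted errors \epsilon_t) follows the Viola-Jones paper as I read it and does not appear on the slides.

```latex
\[
C(x) =
\begin{cases}
1 & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \tfrac{1}{2} \sum_{t=1}^{T} \alpha_t, \\
0 & \text{otherwise,}
\end{cases}
\qquad
\alpha_t = \log \frac{1}{\beta_t},
\quad
\beta_t = \frac{\epsilon_t}{1 - \epsilon_t}
\]
```

A weak classifier with a small weighted error \epsilon_t gets a small \beta_t and therefore a large vote \alpha_t.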
AdaBoost
We can see the AdaBoost procedure as a greedy feature selection process:

AdaBoost is actually selecting a small set of good features.
Each weak classifier is restricted to a single feature, so at each round the weak learning algorithm selects the single rectangle feature that best separates the positive and negative examples.
Training
Done in multiple rounds.
All examples start with the same weight.
At each round it searches over a large set of features and thresholds, choosing the feature/threshold pair that minimizes the weighted error.
The weights are then updated so that the wrongly classified examples gain relative weight, and the process is repeated.
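The rounds described above can be sketched as a small training loop. This is an illustration under stated assumptions, not the authors' implementation: `feature_values[j][i]` is assumed to hold the value of feature j on example i, labels are 1 or 0, the threshold search is a plain exhaustive one, and the weight update (multiply correctly classified examples by beta, then renormalize) is the AdaBoost variant I understand the paper to use. All names are hypothetical.

```python
import math


def predict(value, theta, p):
    """Stump: 1 if p * value < p * theta, else 0."""
    return 1 if p * value < p * theta else 0


def candidate_thresholds(values):
    """Midpoints between distinct sorted values, plus one below and one above."""
    vs = sorted(set(values))
    return [vs[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(vs, vs[1:])] + [vs[-1] + 1.0]


def adaboost(feature_values, labels, rounds):
    n = len(labels)
    weights = [1.0 / n] * n  # all examples start with the same weight
    strong = []              # list of (alpha, feature index, theta, polarity)
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]
        # Pick the feature/threshold/polarity with the lowest weighted error.
        best = None
        for j, values in enumerate(feature_values):
            for theta in candidate_thresholds(values):
                for p in (1, -1):
                    err = sum(w for v, y, w in zip(values, labels, weights)
                              if predict(v, theta, p) != y)
                    if best is None or err < best[0]:
                        best = (err, j, theta, p)
        err, j, theta, p = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep beta and alpha finite
        beta = err / (1.0 - err)
        strong.append((math.log(1.0 / beta), j, theta, p))
        # Shrink the weight of correctly classified examples; after the next
        # normalization the misclassified ones carry relatively more weight.
        weights = [w * beta if predict(feature_values[j][i], theta, p) == labels[i] else w
                   for i, w in enumerate(weights)]
    return strong


def classify(strong, x):
    """x[j] is the value of feature j; weighted vote against half the total alpha."""
    score = sum(alpha for alpha, j, theta, p in strong if predict(x[j], theta, p) == 1)
    return 1 if score >= 0.5 * sum(alpha for alpha, _, _, _ in strong) else 0


# Toy data: feature 0 separates the classes, feature 1 does not.
strong = adaboost([[1, 2, 3, 4], [5, 1, 4, 2]], [1, 1, 0, 0], rounds=3)
print([classify(strong, x) for x in [(1, 5), (2, 1), (3, 4), (4, 2)]])
```

On the toy data every round selects feature 0, which illustrates the feature-selection view of AdaBoost: uninformative features never enter the strong classifier.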
Considerations
Huge set of possible features and related thresholds (NK weak classifiers, where N is the number of examples and K the number of features, since each feature has at most N distinct thresholds on the training set).
For 20,000 examples and 160,000 features (the number for the 24x24 pixel sub-window) there are 3.2 billion distinct weak classifiers (20,000 x 160,000 = 3.2 x 10^9)!
With M rounds, AdaBoost takes O(MKN).
With a single strong classifier, all the selected weak classifiers are evaluated on every sub-window and combined to get the final answer.
What if we could eliminate sub-windows earlier?
The Attentional Cascade

The insight is that smaller, and therefore more efficient, boosted classifiers
can be constructed which reject many of the negative sub-windows while
detecting almost all positive instances.
This can be done by lowering the threshold of the strong classifier produced by AdaBoost to minimize false negatives, at the cost of more false positives.
Figure: The first features selected by AdaBoost

The first classifier, with only 2 features, achieves a 100% hit rate with a 50% false positive rate.
This is far from acceptable for a final detector, but with a few operations it discards around 50% of the non-face sub-windows. And this is only the first classifier.
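To see why a 50% false positive rate per stage is still useful, consider how the rates compound when several such classifiers are chained. The symbols below (K stages, per-stage false positive rate f_i and detection rate d_i) are introduced here for illustration, and the 10-stage figure is a hypothetical example, not a result from the slides.

```latex
\[
F = \prod_{i=1}^{K} f_i, \qquad D = \prod_{i=1}^{K} d_i
\]
\[
f_i = 0.5,\; K = 10 \;\Rightarrow\; F = 0.5^{10} = \tfrac{1}{1024} \approx 10^{-3}
\]
```

The same product applies to the detection rate, which is why every stage has to keep d_i very close to 1.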
A cascade of classifiers is built this way: a positive output from one classifier activates the next one, so the more complex classifiers are used only on the sub-windows that are more likely to contain a face, while a negative output at any stage rejects the sub-window immediately.
Since the great majority of sub-windows in an image are negative, the cascade tries to eliminate as many sub-windows as possible at the earliest stage possible.
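The early-rejection logic can be sketched in a few lines. The data layout is an assumption made for this example: each stage is a pair `(stumps, threshold)`, each stump is `(alpha, feature_index, theta, polarity)`, and a sub-window is represented by a list of precomputed feature values. All names and numbers are hypothetical.

```python
def stage_passes(stage, features):
    """A stage passes when the weighted vote of its stumps reaches its threshold."""
    stumps, threshold = stage
    score = sum(alpha for alpha, j, theta, p in stumps if p * features[j] < p * theta)
    return score >= threshold


def cascade_detect(stages, features):
    """Return (is_face, number_of_stages_evaluated); stop at the first rejection."""
    for depth, stage in enumerate(stages, start=1):
        if not stage_passes(stage, features):
            return False, depth
    return True, len(stages)


# Toy two-stage cascade: a cheap one-stump stage, then a stricter two-stump stage.
stages = [
    ([(1.0, 0, 10.0, 1)], 0.5),
    ([(1.0, 1, 0.0, -1), (1.0, 2, 5.0, 1)], 1.5),
]
print(cascade_detect(stages, [20, 2, 1]))  # (False, 1): rejected by the first stage
print(cascade_detect(stages, [3, 2, 1]))   # (True, 2): passed every stage
```

Most sub-windows behave like the first example and leave after the cheapest stage, which is where the speed of the detector comes from.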
Figure: The Classifier Cascade
In the end, a post-processing step merges multiple detections of the same face, so that each face is reported only once.
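One simple way to do this merging, following the rule I understand the Viola-Jones journal paper to describe, is to group detections whose bounding boxes overlap and to average the corners within each group. The sketch below assumes boxes given as `(x1, y1, x2, y2)` tuples and uses a small union-find structure; the function names are hypothetical.

```python
def overlaps(a, b):
    """True if boxes a and b, given as (x1, y1, x2, y2), share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_detections(boxes):
    """Group overlapping boxes and return one averaged box per group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Average each corner coordinate over the boxes of a group.
    return [tuple(sum(c) / len(group) for c in zip(*group)) for group in groups.values()]


merged = merge_detections([(0, 0, 10, 10), (2, 2, 12, 12), (50, 50, 60, 60)])
print(sorted(merged))
```

Here the first two boxes overlap and collapse into one detection, while the third stays on its own.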
Haar-like Features Extended Set

Proposed by Rainer Lienhart and Jochen Maydt in 2002.
Same principle, more variability: the set adds features rotated by 45 degrees and center-surround features.
Figure: The extended Haar-like feature set
Rotated Summed Area Table
Figure: The rotated integral image
References
Viola, P. and Jones, M., CVPR 2001, Rapid Object Detection using a Boosted Cascade of Simple Features.
Viola, P. and Jones, M., International Journal of Computer Vision, v. 57, 2004, Robust Real-Time Face Detection.
Lienhart, R. and Maydt, J., IEEE ICIP 2002, An Extended Set of Haar-like Features for Rapid Object Detection.
Weisstein, Eric W., Haar Function. From MathWorld, a Wolfram Web Resource. http://mathworld.wolfram.com/HaarFunction.html
