
Object Detection and Tracking
Mike Knowles
11th January 2005
http://postgrad.eee.bham.ac.uk/knowlm
Introduction
Goal: to detect and track objects moving independently of the background
Two situations to be considered:
Static Background
Moving Background
Applications of Motion Tracking
Control Applications
Object Avoidance
Automatic Guidance
Head Tracking for Video Conferencing
Surveillance/Monitoring Applications
Security Cameras
Traffic Monitoring
People Counting
My Work
Started by tracking moving objects in a
static scene
Develop a statistical model of the
background
Mark all regions that do not conform to the model as moving objects
My Work
Now working on object detection and
classification from a moving camera
Current focus is motion compensated
background filtering
Determine the motion of the background and apply it to the model

Detecting moving objects in a static
scene
Simplest method:
Subtract consecutive frames.
Ideally this will leave only moving objects.
This is not an ideal world.
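The idea above can be sketched in a few lines of NumPy (a minimal sketch; the threshold of 25 grey levels is an illustrative choice):

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grey level changed by more than `threshold`
    between two consecutive frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# A 5x5 scene in which a single bright "object" moves one pixel right:
prev = np.zeros((5, 5), dtype=np.uint8)
curr = np.zeros((5, 5), dtype=np.uint8)
prev[2, 1] = 255
curr[2, 2] = 255

mask = frame_difference_mask(prev, curr)
# Both the vacated and the newly occupied pixel are flagged - simple
# differencing cannot tell which one is the object, which is exactly
# why this method falls short in practice.
```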

Using a background model
Lack of texture in objects means incomplete object masks are produced.
In order to obtain complete object masks
we must have a model of the background
as a whole.
Adapting to variable backgrounds
In order to cope with varying backgrounds
it is necessary to make the model dynamic
A statistical system is used to update the
model over time
Background Filtering
My algorithm is based on:
C. Stauffer and W.E.L. Grimson, "Learning Patterns of Activity Using Real-Time Tracking", IEEE Trans. on Pattern Analysis and Machine Intelligence, August 2000
The history of each pixel is modelled by a
sequence of Gaussian distributions
Multi-dimensional Gaussian
Distributions
Described mathematically as:

$$\eta(X_t, \mu, \Sigma) = \frac{1}{(2\pi)^{\frac{n}{2}}\,|\Sigma|^{\frac{1}{2}}}\, e^{-\frac{1}{2}(X_t - \mu_t)^T \Sigma^{-1} (X_t - \mu_t)}$$

More easily visualised (in the 2-dimensional case) as a bell-shaped surface plot.
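For concreteness, the density can be evaluated directly with NumPy (a sketch; treating an RGB pixel as an n = 3 dimensional vector with a diagonal covariance is an illustrative choice):

```python
import numpy as np

def gaussian_density(x, mean, cov):
    """Multi-dimensional Gaussian density eta(x; mu, Sigma)."""
    n = x.shape[0]
    diff = x - mean
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(cov))
    exponent = -0.5 * diff @ np.linalg.inv(cov) @ diff
    return np.exp(exponent) / norm

x = np.array([100.0, 120.0, 90.0])    # current RGB pixel value
mu = np.array([100.0, 120.0, 90.0])   # mean of one Gaussian in the model
sigma = 20.0 ** 2 * np.eye(3)         # diagonal covariance, std dev 20
p = gaussian_density(x, mu, sigma)    # density at the mean itself
```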
Simplifying
Calculating the full Gaussian for every pixel in the frame is very, very slow
Therefore I use a linear approximation
How do we use this to represent a
pixel?
Stauffer and Grimson suggest using a
static number of Gaussians for each pixel
This was found to be inefficient so the
number of Gaussians used to represent
each pixel is variable

Weights
Each Gaussian carries a weight value
This weight is a measure of how well the
Gaussian represents the history of the pixel
If a pixel is found to match a Gaussian then the
weight is increased and vice-versa
If the weight drops below a threshold then that
Gaussian is eliminated
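The weight behaviour described above can be sketched with the update rule from Stauffer and Grimson's paper, w ← (1 − α)w + αM, where M is 1 for the matched Gaussian and 0 otherwise (the learning rate and threshold values below are illustrative):

```python
def update_weights(weights, matched_index, alpha=0.05, prune_threshold=0.01):
    """Raise the weight of the matched Gaussian, decay all others,
    and eliminate any Gaussian whose weight falls below the threshold."""
    updated = []
    for k, w in enumerate(weights):
        m = 1.0 if k == matched_index else 0.0
        w = (1.0 - alpha) * w + alpha * m
        if w >= prune_threshold:
            updated.append(w)
    return updated

weights = update_weights([0.6, 0.3, 0.01], matched_index=0)
# The matched Gaussian gains weight, the others decay, and the
# weakest Gaussian drops below the threshold and is eliminated.
```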
Matching
Each incoming pixel value must be
checked against all the Gaussians at that
location
If a match is found then the value of that
Gaussian is updated
If there is no match then a new Gaussian
is created with a low weight
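The matching check can be sketched per pixel as follows; following Stauffer and Grimson, a value matches a Gaussian if it lies within about 2.5 standard deviations of the mean (the list-of-dictionaries representation is an illustrative choice):

```python
import numpy as np

def find_match(pixel, gaussians, k=2.5):
    """Return the index of the first Gaussian within k standard
    deviations of the pixel value, or None if nothing matches."""
    for i, g in enumerate(gaussians):
        if np.all(np.abs(pixel - g["mean"]) < k * g["std"]):
            return i
    return None

gaussians = [
    {"mean": np.array([100.0]), "std": np.array([10.0])},
    {"mean": np.array([200.0]), "std": np.array([10.0])},
]
match = find_match(np.array([105.0]), gaussians)     # matches Gaussian 0
no_match = find_match(np.array([150.0]), gaussians)  # no match: a new
# Gaussian with a low initial weight would be created here
```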
Updating
If a Gaussian matches a pixel, then the
value of that Gaussian is updated using
the current value
The rate of learning is greater in the early
stages when the model is being formed
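The update of a matched Gaussian can be sketched as an exponential moving average of its mean and variance; the schedule below, which uses a higher learning rate while the model is still forming, is an illustrative assumption:

```python
import numpy as np

def update_gaussian(mean, var, pixel, frame_count, base_rate=0.01):
    """Blend the current pixel value into the matched Gaussian."""
    # Learn quickly in the early frames, then settle to a slow rate.
    rho = max(base_rate, 1.0 / (frame_count + 1))
    new_mean = (1.0 - rho) * mean + rho * pixel
    new_var = (1.0 - rho) * var + rho * (pixel - new_mean) ** 2
    return new_mean, new_var

mean, var = np.array([100.0]), np.array([25.0])
mean, var = update_gaussian(mean, var, np.array([110.0]), frame_count=5)
```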
Static Scene Object Detection and
Tracking
Model the background and subtract to
obtain object mask
Filter to remove noise
Group adjacent pixels to obtain objects
Track objects between frames to develop
trajectories
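The detection steps above can be sketched end-to-end in Python (a minimal sketch: noise is rejected with a minimum-size test rather than a proper filter, and adjacent pixels are grouped with a flood fill):

```python
import numpy as np

def detect_objects(mask, min_pixels=4):
    """Group adjacent foreground pixels into objects and return the
    centroid of each object large enough not to be noise."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    objects = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or seen[y, x]:
                continue
            # Flood fill to collect one 4-connected component.
            stack, pixels = [(y, x)], []
            seen[y, x] = True
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                               (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] \
                            and not seen[ny, nx]:
                        seen[ny, nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_pixels:  # reject isolated noise pixels
                ys, xs = zip(*pixels)
                objects.append((sum(ys) / len(ys), sum(xs) / len(xs)))
    return objects

mask = np.zeros((10, 10), dtype=bool)
mask[1:4, 1:4] = True   # a genuine moving object
mask[7, 7] = True       # a single noise pixel
centroids = detect_objects(mask)  # one object, centred at (2.0, 2.0)
```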
Moving Camera Sequences
Basic Idea is the same as before
Detect and track objects moving within a
scene
BUT this time the camera is not
stationary, so everything is moving
Motion Segmentation
Use a motion estimation algorithm on the
whole frame
Iteratively apply the same algorithm to
areas that do not conform to this motion to
find all motions present
Problem: this is very, very slow
Motion Compensated Background
Filtering
Basic Principle
Develop and maintain background model as
previously
Determine global motion and use this to
update the model between frames
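The update between frames can be sketched as a warp of the per-pixel model arrays by the estimated global motion (a nearest-neighbour sketch in NumPy; a full implementation would interpolate and warp every model parameter, not just the mean image):

```python
import numpy as np

def compensate_model(model, A, t):
    """Warp the background-model mean image so it stays aligned with
    the scene: each output pixel reads from input position A @ p + t."""
    h, w = model.shape
    ys, xs = np.indices((h, w))
    coords = np.stack([ys.ravel(), xs.ravel()]).astype(float)
    src = A @ coords + t[:, None]        # source position for each pixel
    sy = np.round(src[0]).astype(int)
    sx = np.round(src[1]).astype(int)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(model)
    out[ys.ravel()[valid], xs.ravel()[valid]] = model[sy[valid], sx[valid]]
    return out

model = np.zeros((8, 8))
model[3, 3] = 1.0
# Pure translation: the camera pans so scene content shifts by one row.
shifted = compensate_model(model, np.eye(2), np.array([1.0, 0.0]))
```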

Advantages
Only one motion model has to be found
This is therefore much faster
Estimating motion for small regions can be
unreliable
Not as easy as it sounds, though.
Motion Models
Trying to determine the exact optical flow
at every point in the frame would be
ridiculously slow
Therefore we try to fit a parametric model
to the motion
Affine Motion Model
$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a_0 \\ a_3 \end{pmatrix} + \begin{pmatrix} a_1 & a_2 \\ a_4 & a_5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
The affine model describes the vector at each
point in the image
Need to find values for the parameters that best
fit the motion present
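Given the six parameters, the flow at any point follows directly (a minimal sketch using the ordering u = a0 + a1·x + a2·y, v = a3 + a4·x + a5·y):

```python
def affine_flow(params, x, y):
    """Flow vector (u, v) at image point (x, y) under the affine model."""
    a0, a1, a2, a3, a4, a5 = params
    u = a0 + a1 * x + a2 * y
    v = a3 + a4 * x + a5 * y
    return u, v

# A small rotation about the origin combined with a horizontal pan:
u, v = affine_flow([1.0, 0.0, -0.1, 0.0, 0.1, 0.0], x=10.0, y=5.0)
```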
Background Motion Estimation
Uses a framework developed by Black and
Anandan
Black, M.J. and Anandan, P., "The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields", Computer Vision and Image Understanding, Vol. 63, No. 1, pp. 75-104, January 1996
For more details see my talk from last year
Examples

Other approaches to Tracking
Many approaches using active contours
a.k.a. snakes
Parameterised curves
Fitted to the image by minimising some cost function, often one based on fitting the contour to edges
Constraining shape
To avoid the snake being influenced by points we aren't interested in, use a model to constrain its shape.
CONDENSATION
No discussion of tracking can omit the CONDENSATION algorithm developed by Isard and Blake.
CONditional DENSity propagATION
Non-Gaussian substitute for the Kalman filter
Uses factored sampling to model non-Gaussian probability densities and propagate them through time
CONDENSATION
Thus we can take a set of parameters and
estimate them from frame to frame, using
current information from the frames
These parameters may be positions or
shape parameters from a snake.
CONDENSATION - Algorithm
Randomly take samples from the previous
distribution.
Apply a deterministic drift and random diffusion to the samples, based on a model of how the parameters behave.
Weight each sample on the basis of the current
information.
The estimate of the actual value can be either a weighted average or the peak value of the distribution.
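The steps above can be sketched as a minimal particle filter tracking a 1-D position (the constant-velocity drift and Gaussian observation likelihood are illustrative assumptions, not Isard and Blake's exact models):

```python
import numpy as np

rng = np.random.default_rng(0)

def condensation_step(samples, weights, observation,
                      drift=1.0, diffusion=0.5, obs_std=1.0):
    """One CONDENSATION iteration over a weighted sample set."""
    n = len(samples)
    # 1. Factored sampling: pick samples in proportion to the old weights.
    idx = rng.choice(n, size=n, p=weights / weights.sum())
    samples = samples[idx]
    # 2. Deterministic drift plus random diffusion.
    samples = samples + drift + rng.normal(0.0, diffusion, size=n)
    # 3. Weight each sample against the current observation.
    weights = np.exp(-0.5 * ((samples - observation) / obs_std) ** 2)
    return samples, weights

samples = rng.normal(0.0, 1.0, size=500)
weights = np.ones(500)
for z in [1.0, 2.0, 3.0]:          # observations from three frames
    samples, weights = condensation_step(samples, weights, z)
estimate = np.average(samples, weights=weights)  # weighted-average estimate
```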
Summary
Static-scene background subtraction
methods
Extensions to moving camera systems
Use of model-constrained active contour
systems
CONDENSATION
