Mean Shift Algorithm

Pi19404
September 23, 2013
Contents
0.1 Introduction
0.2 Kernel density Estimation
0.3 Mean Shift
    0.3.1 Modes of Smooth function
    0.3.2 Using the Gradient
    0.3.3 Local Maxima
0.4 Code

0.1 Introduction
In this article we look at the basics of the mean shift algorithm.
0.2 Kernel density Estimation

Let us first consider a univariate Gaussian PDF and data sampled from it. Kernel density estimation (KDE) uses the fact that the density of samples about a given point is proportional to the probability density at that point. It approximates the PDF by estimating the local density of points. As seen in Figure 1, this is reasonable: a large density of points is observed near the maximum of the PDF. The KDE estimates the PDF by superposing a kernel at each data point, which is equivalent to convolving the data points with a Gaussian kernel.
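The superposition of kernels described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's MATLAB code: the function names `gaussian_kernel` and `kde`, the toy samples and the bandwidth `h = 0.5` are all assumptions made for the example.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, h):
    """Density estimate at x: (1 / (n * h)) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)

# Hypothetical toy data: most samples cluster near 0, one lies far away.
samples = [-0.3, -0.1, 0.0, 0.1, 0.2, 2.5]
h = 0.5  # bandwidth, picked by hand for this illustration

dense = kde(0.0, samples, h)   # estimate inside the cluster
sparse = kde(1.5, samples, h)  # estimate in the gap between cluster and outlier
print(dense, sparse)
```

The estimate inside the cluster comes out larger than the estimate in the gap, which is the "local density of points" idea in numerical form.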
0.3 Mean Shift

Mean shift is a tool for finding the modes in a set of data samples drawn from an underlying PDF. The aim of the mean shift algorithm is to find the densest region in a given set of data samples, since a high density of data points implies a high PDF value. Let us consider a 2D region whose points are sampled from an underlying, unknown PDF. Let \(X = [x, y]\) be random variables associated with a multivariate PDF \(P(X)\). Sampling a point from \(P(X)\) thus gives us a vector \(X^{*} = [x^{*}, y^{*}]\).
Figure 1: Density estimation. Panels: (a) original PDF, (b) sampled data, (c) density estimate.
For example, let us consider a multivariate Gaussian distribution where the random variables x and y take values in the range -3 to 3.
0.3.1 Modes of Smooth function

Let us say we want to find the modes of the PDF. The PDF is approximated using kernel density estimation. Modes are the points at which the PDF exhibits a local maximum, so dense regions of the PDF correspond to modes or local maxima. Since the kernel is smooth, it is differentiable, and it gives rise to a smooth PDF. The gradient of the density estimate is given by

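\[ \hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \]

\[ \nabla \hat{f}_h(x) = \frac{1}{nh}\sum_{i=1}^{n} \nabla K\left(\frac{x - x_i}{h}\right) \]

For a Gaussian kernel \(K(u) = C \exp(-u^2/2)\):

\[ \hat{f}_h(x) = \frac{C}{nh}\sum_{i=1}^{n} \exp\left(-\frac{(x - x_i)^2}{2h^2}\right) \]

\[ \nabla \hat{f}_h(x) = \frac{C}{nh}\sum_{i=1}^{n} \exp\left(-\frac{(x - x_i)^2}{2h^2}\right)\left(-\frac{x - x_i}{h^2}\right) \]

\[ \nabla \hat{f}_h(x) = \frac{1}{nh^3}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)\bigl(-(x - x_i)\bigr) \]

Equating the gradient to 0 at \(x = x^{*}\):

\[ \sum_{i=1}^{n} K\left(\frac{x^{*} - x_i}{h}\right)\bigl(-(x^{*} - x_i)\bigr) = 0 \]

\[ x^{*} = \frac{\sum_{i=1}^{n} K\left(\frac{x^{*} - x_i}{h}\right) x_i}{\sum_{i=1}^{n} K\left(\frac{x^{*} - x_i}{h}\right)} \]

The estimate is \(x^{*} = m(x)\), where

\[ m(x) = \frac{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) x_i}{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)} \]

is called the sample mean at x with kernel K.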
The sample mean is always biased towards the region of high density. Thus, if we move along the vector m(x) - x, we reach a region of higher density: the density at m(x) will be greater than the density at x. This forms the basis of the mean shift algorithm. The vector m(x) - x is called the mean shift vector, and it always points in the direction of a local maximum or mode.
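To make the mean shift vector concrete, here is a small 1D sketch in Python. The helper names `sample_mean` and `density`, the toy samples and the bandwidth are assumptions for illustration only; the kernel constant C is dropped because it cancels in the sample mean.

```python
import math

def sample_mean(x, samples, h):
    """Kernel-weighted sample mean m(x) for a Gaussian kernel of bandwidth h."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples]
    return sum(w * xi for w, xi in zip(weights, samples)) / sum(weights)

def density(x, samples, h):
    """Gaussian KDE at x, up to the constant factor C."""
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
    return total / (len(samples) * h)

# Hypothetical 1D data: a cluster around 1.0 plus one distant point.
samples = [0.8, 0.9, 1.0, 1.1, 1.2, 3.0]
h = 0.5
x = 1.6  # query point to the right of the cluster

m = sample_mean(x, samples, h)
shift = m - x  # mean shift vector m(x) - x
print(shift, density(x, samples, h), density(m, samples, h))
```

For this data the shift is negative, so it points back toward the cluster, and the density estimate at m(x) is higher than at x.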
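The mean shift vector can be written as

\[ m(x) - x = \frac{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)(x_i - x)}{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)} \]

\[ m(x) - x = h^2 \, \frac{\nabla \hat{f}_h(x)}{\hat{f}_h(x)} \]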
This is an estimate of the normalized gradient of \(\hat{f}_h(x)\). Given any point x, we can take a small step in the direction of the vector m(x) - x to move toward the local maximum.
Let us say we have a present estimate of the mode; we then compute m(x) at this point.
For example, let the initial estimate of the location of the mode be (0.96, 2.01). The density at this point can be approximated by interpolation or computed again using nonparametric density estimation. The plot for this is shown in Figure 2. The estimate clearly does not lie at the maximum. To find the direction of the mean shift vector, we find the gradient of the normalized density estimate and take a small step in that direction. This is repeated until the gradient magnitude is very small. A video of the mean shift algorithm using KDE is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a2.avi
Figure 2: Mean Shift
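The iteration just described can be sketched as follows. This is only an illustration under stated assumptions: the function names, the five toy points, the bandwidth `h = 1.0`, and the `step` parameter (a fraction of the mean shift vector taken per iteration, with `step = 1.0` jumping straight to m(x)) are not from the original code. The starting point (0.96, 2.01) is the one used in the text.

```python
import math

def mean_shift_vector(p, points, h):
    """Return m(p) - p for a 2D point p, using a Gaussian kernel of bandwidth h."""
    wsum = mx = my = 0.0
    for (xi, yi) in points:
        d2 = (p[0] - xi) ** 2 + (p[1] - yi) ** 2
        w = math.exp(-0.5 * d2 / (h * h))
        wsum += w
        mx += w * xi
        my += w * yi
    return (mx / wsum - p[0], my / wsum - p[1])

def find_mode(start, points, h, step=1.0, tol=1e-6, max_iter=500):
    """Follow the mean shift vector until its magnitude drops below tol."""
    p = start
    for _ in range(max_iter):
        dx, dy = mean_shift_vector(p, points, h)
        if math.hypot(dx, dy) < tol:
            break
        p = (p[0] + step * dx, p[1] + step * dy)
    return p

# Hypothetical data: a small symmetric cluster whose mode is at the origin.
points = [(0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2), (0.0, 0.0)]
mode = find_mode((0.96, 2.01), points, h=1.0)
print(mode)
```

Because the toy cluster is symmetric about the origin, the iteration should settle very close to (0, 0). If the starting point is so far from all data that every kernel weight underflows to zero, this sketch would divide by zero; a production version would need to guard against that.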
In this case we scale the gradient by the estimated PDF value to obtain a normalized gradient.
\[ m(x) - x = h^2 \, \frac{\nabla \hat{f}_h(x)}{\hat{f}_h(x)} \]
This enables us to adaptively change the step size based on the estimated PDF value. The step size is inversely proportional to the estimated PDF value. If the estimated PDF value is small, we are far away from the maximum and the step size will be large. If the estimated PDF value is large, we are close to the maximum and the step size will be small.
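The relation between the mean shift vector and the normalized gradient can be checked numerically. The sketch below is an assumption-laden illustration in 1D: the helper names, the toy samples and the finite-difference step `eps` are invented for the example, and the gradient is approximated by central differences rather than computed analytically.

```python
import math

def kde(x, samples, h):
    """Gaussian KDE up to the constant C, which cancels in the ratio below."""
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
    return total / (len(samples) * h)

def mean_shift(x, samples, h):
    """Mean shift vector m(x) - x for a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples]
    return sum(wi * xi for wi, xi in zip(w, samples)) / sum(w) - x

def normalized_gradient_step(x, samples, h, eps=1e-5):
    """h^2 * f'(x) / f(x), with f' approximated by central differences."""
    grad = (kde(x + eps, samples, h) - kde(x - eps, samples, h)) / (2.0 * eps)
    return h * h * grad / kde(x, samples, h)

samples = [0.8, 0.9, 1.0, 1.1, 1.2]  # hypothetical cluster with its mode at 1.0
h = 0.5
far, near = 2.0, 1.1  # one point far from the mode, one close to it

for x in (far, near):
    print(x, mean_shift(x, samples, h), normalized_gradient_step(x, samples, h))
```

The two columns agree up to finite-difference error, and the step computed at the far point is larger in magnitude than the step at the near point, which matches the adaptive behavior described above.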
0.3.2 Using the Gradient

To find the modes of the PDF, we are not actually required to estimate the PDF; we only need the gradient of the PDF to move in the direction of the mean shift vector. The gradient of a superposition of kernels centered at each data point is equivalent to convolving the data points with the gradient of the kernel. Instead of convolving with a Gaussian kernel, we can convolve with the gradient of the Gaussian kernel.
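\[ k(X) = C \exp\left(-\frac{x^2 + y^2}{2}\right) \]

\[ \frac{\partial}{\partial x} k(X) = -x \, k(X), \qquad \frac{\partial}{\partial y} k(X) = -y \, k(X) \]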
Thus, given an initial point X, we evaluate the gradient kernels \(\partial_x k_h(X)\) and \(\partial_y k_h(X)\) at X, which gives us the direction of the gradient at the point X.

Since we do not actually estimate the PDF at a point, but only the gradient of the PDF, at each mean shift iteration we need to take a step in the direction of the mean shift vector. In the earlier case, we scaled the gradient by the estimated PDF value to obtain a normalized measure. In the present case we do not adaptively change the step size, but scale the gradient by a fixed step size. This still incorporates some adaptive behavior, since the mean shift vector magnitude depends on the gradient magnitude: if the gradient magnitude is large, the step taken will be large; otherwise the step will be small and refined, as happens near the maximum. A video of the mean shift algorithm using gradient estimates is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a3.avi

This iterative algorithm is a standard gradient ascent algorithm, and convergence is guaranteed for an infinitesimally small step size. Since the algorithm depends on the kernel density estimate, the bandwidth of the kernel plays an important role in the mean shift algorithm as well.
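A gradient-only variant can be sketched as below: the density itself is never evaluated, only the sum of kernel gradients, and the iterate moves uphill with a fixed step size. This is not the author's kgde2x routine; the function names, the toy points, the bandwidth and the value `step = 0.5` are assumptions chosen so the example converges.

```python
import math

def kde_gradient(p, points, h):
    """Gradient of the (unnormalized) Gaussian KDE at the 2D point p.

    Each data point p_i contributes k_h(p - p_i) * (p_i - p) / h^2, which is
    the gradient of a Gaussian kernel centered at p_i.
    """
    gx = gy = 0.0
    for (xi, yi) in points:
        d2 = (p[0] - xi) ** 2 + (p[1] - yi) ** 2
        k = math.exp(-0.5 * d2 / (h * h))
        gx += k * (xi - p[0]) / (h * h)
        gy += k * (yi - p[1]) / (h * h)
    n = len(points)
    return (gx / n, gy / n)

def gradient_ascent(start, points, h, step=0.5, tol=1e-6, max_iter=1000):
    """Move uphill with a fixed step size until the gradient is tiny."""
    p = start
    for _ in range(max_iter):
        gx, gy = kde_gradient(p, points, h)
        if math.hypot(gx, gy) < tol:
            break
        p = (p[0] + step * gx, p[1] + step * gy)
    return p

# Hypothetical data: a small symmetric cluster whose mode is at the origin.
points = [(0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2), (0.0, 0.0)]
mode = gradient_ascent((0.96, 2.01), points, h=1.0)
print(mode)
```

Far from the data the gradient is small, so early steps are short; a step size that is too large for the chosen bandwidth can overshoot the mode, which is one reason the bandwidth and step size need to be tuned together.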
0.3.3 Local Maxima

If we reach a region where the local density is flat, or we have reached a local maximum, the algorithm will terminate. This is a problem for all algorithms that try to reach a global maximum. An animation of this behavior is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a1.avi
Figure 3: Mean shift. Panel: (a) original PDF.
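The dependence on the starting point can be seen in a small 1D experiment. The sketch below uses hypothetical data with a larger cluster near 0 and a smaller one near 5; the function names, samples and bandwidth are assumptions for illustration.

```python
import math

def sample_mean(x, samples, h):
    """Kernel-weighted sample mean m(x) for a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples]
    return sum(wi * xi for wi, xi in zip(w, samples)) / sum(w)

def density(x, samples, h):
    """Gaussian KDE at x, up to a constant factor."""
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
    return total / (len(samples) * h)

def climb(x, samples, h, tol=1e-8, max_iter=1000):
    """Repeat x <- m(x) until the mean shift vector is negligible."""
    for _ in range(max_iter):
        m = sample_mean(x, samples, h)
        if abs(m - x) < tol:
            break
        x = m
    return x

# Hypothetical data: five samples near 0 and three samples near 5.
samples = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.9, 5.0, 5.1]
h = 0.5

mode_a = climb(1.0, samples, h)  # start closer to the larger cluster
mode_b = climb(4.0, samples, h)  # start closer to the smaller cluster
print(mode_a, density(mode_a, samples, h))
print(mode_b, density(mode_b, samples, h))
```

The run started at 4.0 stops at the smaller cluster even though the density there is lower than at the other mode. A common workaround is to start from several points and keep the mode with the highest density estimate.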
0.4 Code
The code is written in MATLAB and is available in the repository
https://github.com/pi19404/m19404/tree/master/meanshift
The file mean_shift.m is the main file. The file kgde2 implements the kernel density estimator using bivariate Gaussian windows for 2D distributions. The file kgde2x implements estimation of the gradient of the KDE; its dim parameter selects whether the gradient is computed along the x or the y direction.