Non-deterministic Estimation of Parameters

Harish K (EP12B009), Aditya Gurunathan (EE12B126), and T R Sriram (EE12B056)

We study nonparametric methods to estimate the density of a random variable. Nonparametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from the data.

We show that, asymptotically, the error function of the optimally binned histogram decays as $N^{-2/3}$ while that of the kernel density estimator (KDE) decays as $N^{-4/5}$, where $N$ is the sample size, so the KDE is a better estimator of the density than the histogram.

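For the mean squared error, we have the convenient decomposition $\mathrm{MSE}(f_N(x)) = V(f_N(x)) + \left(E(f_N(x)) - f(x)\right)^2$, that is, variance plus squared bias.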

I. INTRODUCTION

Nonparametric statistics are statistics not based on parameterized families of probability distributions. The term nonparametric means that the number and nature of the parameters are flexible and not fixed in advance.

II. HISTOGRAM

The histogram estimate of the density is

$$ f_N(x) = \sum_{j=1}^{m} \frac{p_j}{h}\, I(x \in B_j) \tag{1} $$

where $m$ is the number of bins, $h$ is the bin width, $Y_j$ is the number of observations falling in the bin $B_j$, and $p_j = Y_j / N$. There is no single optimal procedure for determining the number of bins, and different bin sizes can reveal different features of the data. Using wider bins where the density is low reduces noise due to sampling randomness; using narrower bins where the density is high gives greater precision to the density estimate. It can be shown that, under certain conditions, $E(f_N(x)) \to f(x)$.
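As an illustration of the histogram estimate, the following is a minimal sketch in plain Python, assuming equal-width bins that start at a chosen origin. The function name `histogram_density`, the `origin` parameter, and the sample values are hypothetical and are not taken from the paper.

```python
import math

def histogram_density(samples, x, origin, h):
    """Histogram estimate at x with bins [origin + j*h, origin + (j+1)*h)."""
    n = len(samples)
    # Index j of the bin B_j that contains the query point x.
    j = math.floor((x - origin) / h)
    # Y_j: number of observations that fall in the same bin as x.
    y_j = sum(1 for xi in samples if math.floor((xi - origin) / h) == j)
    # p_j = Y_j / N, and the density estimate is p_j / h.
    return (y_j / n) / h

# Hypothetical data, used only to exercise the function.
samples = [0.1, 0.2, 0.25, 0.7, 0.9, 1.3, 1.4, 1.45, 1.6, 2.2]
print(histogram_density(samples, x=0.3, origin=0.0, h=0.5))
```

Only the single indicator that is non-zero at `x` contributes, so the sum over bins reduces to one bin count divided by `N * h`.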

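For $x$ in a bin $[x_o, x_o + h]$, the expectation of the estimator is

$$
\begin{aligned}
E(f_N(x)) &= \frac{1}{h} E\big(I(x_i \in [x_o, x_o + h])\big) \\
&= \frac{1}{h} \int I(x_i \in [x_o, x_o + h])\, f(x_i)\, dx_i \\
&= \frac{1}{h} \int_{x_o}^{x_o + h} \left[ f(x_o) + f'(\bar{x})(x_i - x_o) \right] dx_i \\
&= f(x_o)\, \frac{1}{h} \int_{x_o}^{x_o + h} dx_i + \frac{1}{h} \int_{x_o}^{x_o + h} f'(\bar{x})(x_i - x_o)\, dx_i
\end{aligned}
\tag{2}
$$

where $\bar{x}$ is a point between $x_o$ and $x_i$. The first term in the above equation reduces to $f(x_o)$, and the second term can be shown to be bounded in magnitude by $Ch$ if the derivative $f'$ is assumed to be bounded by $C$. Consistency of the histogram requires that $h \to 0$ as $N \to \infty$.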
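The bound on the bias can be checked numerically. The sketch below assumes a standard normal density as the true $f$ (an assumption made only for this illustration), for which the expectation of the histogram estimate in a bin is the bin probability divided by $h$, and $C = \max |f'|$ is attained at $x = \pm 1$. All function names are hypothetical.

```python
import math

def normal_pdf(x):
    """Standard normal density, used here as the assumed true f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_histogram(x_o, h):
    """Expected estimate in the bin [x_o, x_o + h]: bin probability over h."""
    return (normal_cdf(x_o + h) - normal_cdf(x_o)) / h

# For the standard normal, f'(x) = -x f(x), so max |f'| = f(1).
C = normal_pdf(1.0)

x_o = 0.5
for h in (0.4, 0.2, 0.1, 0.05):
    bias = expected_histogram(x_o, h) - normal_pdf(x_o)
    print(f"h={h:<5} bias={bias:+.5f}  C*h={C * h:.5f}")
```

The printed bias stays within $Ch$ and shrinks as the bin width is reduced, in line with the requirement $h \to 0$.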

Kernel density estimation is a fundamental data smoothing problem in which inferences about the population are made based on a finite data sample. Let $(x_1, x_2, \ldots, x_N)$ be an independent and identically distributed sample drawn from some distribution with an unknown density $f$. We are interested in estimating the shape of this function $f$. Its kernel density estimator is

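$$ f_N(x) = \frac{1}{N} \sum_{i=1}^{N} K_h(x - x_i) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{h}\right) \tag{4} $$

where $K(\cdot)$ is the kernel, a function that integrates to one and has zero mean, and $h > 0$ is a smoothing parameter called the bandwidth.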
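The estimator can be sketched directly from its definition. The example below assumes a Gaussian kernel, which is one common choice satisfying the unit-integral and zero-mean conditions; the paper does not fix a particular kernel, and the function names and sample values are hypothetical.

```python
import math

def gaussian_kernel(t):
    """Standard normal kernel: integrates to 1 and has zero mean."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def kde(samples, x, h, kernel=gaussian_kernel):
    """Kernel density estimate: 1/(N h) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(kernel((x - xi) / h) for xi in samples) / (n * h)

# Hypothetical data, used only to exercise the function.
samples = [0.1, 0.2, 0.25, 0.7, 0.9, 1.3, 1.4, 1.45, 1.6, 2.2]
for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"x={x:.1f}  f_N(x)={kde(samples, x, h=0.3):.4f}")
```

Unlike the histogram, the result varies smoothly with `x` because every sample contributes a smooth bump centred on itself.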

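For our estimator $f_N(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{h} K\!\left(\frac{x - x_i}{h}\right)$, the expectation is

$$
\begin{aligned}
E(f_N(x)) &= \int \frac{1}{h} K\!\left(\frac{x - t}{h}\right) f(t)\, dt \\
&= \int K(t)\, f(x - ht)\, dt \\
&= f(x) + \frac{1}{2} h^2 f''(x) \int t^2 K(t)\, dt + \dots
\end{aligned}
\tag{5}
$$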

We have used the fact that the kernel integrates to 1 and has zero mean in deriving the last equation.
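The role of these two kernel properties can be made explicit with a Taylor expansion of $f(x - ht)$ about $x$. The following is a sketch of that intermediate step, assuming $f$ is smooth enough for the expansion to hold:

```latex
\begin{aligned}
\int K(t)\, f(x - ht)\, dt
&= \int K(t) \left[ f(x) - ht\, f'(x) + \tfrac{1}{2} h^2 t^2 f''(x) + \dots \right] dt \\
&= f(x) \underbrace{\int K(t)\, dt}_{=\,1}
 \;-\; h f'(x) \underbrace{\int t\, K(t)\, dt}_{=\,0}
 \;+\; \tfrac{1}{2} h^2 f''(x) \int t^2 K(t)\, dt + \dots
\end{aligned}
```

The unit integral leaves $f(x)$ as the leading term, and the zero mean removes the term that is linear in $h$.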

The bias can now be written as,

$$ E(f_N(x)) - f(x) = \frac{1}{2} h^2 f''(x) \int t^2 K(t)\, dt + O(h^4) \tag{6} $$

The variance can be written as

$$ V(f_N(x)) = \frac{f(x) \int K^2(t)\, dt}{N h} + O\!\left(\frac{1}{N}\right) \tag{7} $$

so that $V(f_N(x)) \to 0$ as $N h \to \infty$.
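Combining the leading terms of the bias and the variance through the decomposition of the mean squared error into squared bias plus variance gives an explicit error function of $h$. The following is a sketch; the shorthands $\mu_2$ and $R(K)$ and the integrated error $\mathrm{MISE}$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\mathrm{MSE}(f_N(x)) &= \big( E(f_N(x)) - f(x) \big)^2 + V(f_N(x)) \\
&\approx \tfrac{1}{4} h^4 \mu_2^2\, (f''(x))^2 + \frac{f(x)\, R(K)}{N h}, \\
\mathrm{MISE}(h) &= \int \mathrm{MSE}(f_N(x))\, dx
 \approx \tfrac{1}{4} h^4 \mu_2^2 \int (f''(x))^2\, dx + \frac{R(K)}{N h}, \\
\mu_2 &= \int t^2 K(t)\, dt, \qquad R(K) = \int K^2(t)\, dt .
\end{aligned}
```

The first term grows with $h$ while the second shrinks, so the error has an interior minimum in $h$.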

Differentiating the error function with respect to $h$ and setting the derivative equal to zero, the optimal bandwidth that minimizes the error function can be asymptotically approximated as

$$ h = \left[ \frac{\int K(x)^2\, dx}{\left( \int x^2 K(x)\, dx \right)^2 \int (f''(x))^2\, dx} \cdot \frac{1}{N} \right]^{1/5} \tag{8} $$

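This bandwidth is proportional to $N^{-1/5}$, so $h \to 0$ and $N h \to \infty$ as $N \to \infty$. It therefore also satisfies $E(f_N(x)) - f(x) \to 0$ as $h \to 0$ and $V(f_N(x)) \to 0$ as $N h \to \infty$.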
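To make the bandwidth formula concrete, the sketch below evaluates it for a Gaussian kernel and, as a stand-in for the unknown $f$, a normal density with standard deviation `sigma`. Both choices are assumptions made for illustration only, since in practice $f''$ is unknown; the function names are hypothetical.

```python
import math

def optimal_bandwidth(r_k, mu2, roughness, n):
    """h = [R(K) / (mu2^2 * int (f'')^2 dx * N)]^(1/5)."""
    return (r_k / (mu2 ** 2 * roughness * n)) ** 0.2

# Closed-form constants for the Gaussian kernel.
R_K = 1.0 / (2.0 * math.sqrt(math.pi))  # int K(x)^2 dx
MU2 = 1.0                               # int x^2 K(x) dx

def normal_roughness(sigma):
    """int (f''(x))^2 dx for a normal density (assumed reference density)."""
    return 3.0 / (8.0 * math.sqrt(math.pi) * sigma ** 5)

sigma = 1.0
for n in (100, 1_000, 10_000):
    h = optimal_bandwidth(R_K, MU2, normal_roughness(sigma), n)
    print(f"N={n:<6} h={h:.4f}  N*h={n * h:.1f}")
```

Under these assumptions the formula simplifies to $h = (4/3)^{1/5} \sigma N^{-1/5}$, and the loop shows $h$ shrinking while $N h$ grows as $N$ increases.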

V. CONCLUSION

Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel.


