
# Non-Classical Non-Parametric Methods

• Minimum Variance Method (MVSE)

LO-2.4.3, H-8.3

Recall: Family of Non-Parametric Methods

“Classical” Methods (FT-Based):
• Periodogram-Based: Periodogram, Modified Periodogram, Bartlett, Welch
• ACF-Estimate-Based: Blackman-Tukey

“Non-Classical” Methods (Non-FT-Based):
• Filter Bank View: Minimum Variance

Minimum Variance Method

Minimum Variance Method – Terminology
The Minimum Variance Spectral Estimation (MVSE) method has
two other names:
• Maximum Likelihood Method (MLM)
• Capon’s Method

Note: The names MVSE and MLM are actually misnomers – this
method:
• does NOT minimize the variance of the estimate
• does NOT maximize the “likelihood function”

Recall: Filter Bank View of Periodogram
See Fig. 8.4 & 8.3 of Hayes
The problem is leakage from nearby frequencies:

[Figure: the true PSD $S_x(\omega)$ overlaid with a channel response $|H_i(\omega)|^2$; the filter sidelobes leak “out-of-band” power into the estimate at $\omega_i$.]
Goal for MVSE Method
Figure out a way to design each filter bank channel response to
minimize the leakage – this is thus a data-dependent design.
Collect Data → “Design” Filters for Filter Bank
Want to “design” filters to minimize the sidelobes while keeping
the mainlobe height at 1:
“Design” Goals:
1. Want $H_i(\omega_i) = 1$ … to let through the desired $S_x(\omega_i)$
2. Minimize the total output power of the filter:
$$\rho_i = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left| H_i(\omega) \right|^2 S_x(\omega)\, d\omega$$

This is equivalent to minimizing the sidelobe contribution, even though the integral includes the desired $\omega_i$.
LO-2.4.3
Get Useable Form for ρ

The frequency response of filter $h_i[n]$ (of length $p$, with $p < N$) is:
$$H_i(\omega) = \sum_{n=0}^{p-1} h_i[n]\, e^{-j\omega n}$$

Using this in the expression for ρ gives:
$$\rho_i = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left[\sum_{k=0}^{p-1} h_i[k]\, e^{-j\omega k}\right] \left[\sum_{l=0}^{p-1} h_i^*[l]\, e^{j\omega l}\right] S_x(\omega)\, d\omega$$
$$= \sum_{k=0}^{p-1}\sum_{l=0}^{p-1} h_i[k]\, h_i^*[l]\, \underbrace{\frac{1}{2\pi}\int_{-\pi}^{\pi} S_x(\omega)\, e^{j\omega(l-k)}\, d\omega}_{=\; r_x[l-k]}$$
$$= \sum_{k=0}^{p-1}\sum_{l=0}^{p-1} h_i[k]\, h_i^*[l]\, r_x[l-k] = \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i$$

Now… recognize this as a vector-matrix-vector multiplication, with $\mathbf{R}_x$ = autocorrelation matrix. The “H” superscript denotes the Hermitian transpose (transpose & conjugate).
Autocorrelation Matrix
The AC matrix is the p×p matrix whose (i, j) element is $r_x[j - i]$. Example for p = 4:
$$\mathbf{R}_x = \begin{bmatrix} r_x[0] & r_x[1] & r_x[2] & r_x[3] \\ r_x[-1] & r_x[0] & r_x[1] & r_x[2] \\ r_x[-2] & r_x[-1] & r_x[0] & r_x[1] \\ r_x[-3] & r_x[-2] & r_x[-1] & r_x[0] \end{bmatrix}$$
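The Toeplitz structure above maps directly onto `scipy.linalg.toeplitz`. A minimal sketch, assuming NumPy/SciPy are available (`ac_matrix` and the lag values are illustrative, not from the notes):

```python
import numpy as np
from scipy.linalg import toeplitz

def ac_matrix(r, p):
    """Build the p x p autocorrelation matrix whose (i, j) element is
    r_x[j - i], given the non-negative lags r = [r_x[0], ..., r_x[p-1]].
    Negative lags come from the Hermitian symmetry r_x[-k] = r_x[k]*."""
    r = np.asarray(r)[:p]
    # first column = [r_x[0], r_x[-1], ...] = conj of the first row
    return toeplitz(np.conj(r), r)

# The p = 4 example above, with made-up real lag values:
Rx = ac_matrix([1.0, 0.6, 0.3, 0.1], p=4)
print(Rx[0])  # first row: [1.  0.6 0.3 0.1]
```

For real lags the matrix is symmetric Toeplitz, as in the example above.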
Not in Book
Now Minimize the Matrix Form:

For each $i$, minimize: $\rho_i = \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i$

Under this constraint:
$$H_i(\omega_i) = 1 \;\Rightarrow\; \mathbf{h}_i^H \mathbf{e}_i = 1$$
where $\mathbf{e}_i = \left[1 \;\; e^{j\omega_i} \;\; e^{j2\omega_i} \;\cdots\; e^{j(p-1)\omega_i}\right]^T$

The most common way to do constrained optimization is the Lagrange multiplier method:
$$J = \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i - \lambda\left(\mathbf{h}_i^H \mathbf{e}_i - 1\right)$$
Lagrange says: choose $\mathbf{h}_i$ and $\lambda$ to minimize $J$.
Lagrange Minimization:

To find the $\mathbf{h}_i$ and $\lambda$ that minimize $J$, in general set:
$$\frac{\partial J}{\partial \mathbf{h}_i} = \mathbf{0}^T \quad\text{and}\quad \frac{\partial J}{\partial \lambda} = 0$$
But often an easier way is to do these two steps:
1. Take the partial w.r.t. $\mathbf{h}_i$ and solve for $\mathbf{h}_i$
2. Then choose $\lambda$ to ensure the solution meets the constraint

So… we need: $\dfrac{\partial J}{\partial \mathbf{h}_i} = \mathbf{0}^T$

What is a partial derivative w.r.t. a vector? It is called the gradient. (Comes from multi-dimensional calculus.)
Aside: Gradient

If $g(\mathbf{x})$ is a scalar-valued function of a real-valued vector $\mathbf{x}$, then the gradient of $g(\mathbf{x})$ is defined as:
$$\nabla_{\mathbf{x}}(g) = \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} = \left[\frac{\partial g(\mathbf{x})}{\partial x_1} \;\; \frac{\partial g(\mathbf{x})}{\partial x_2} \;\cdots\; \frac{\partial g(\mathbf{x})}{\partial x_N}\right]$$
The gradient here is nothing more than the vector whose elements are the partials w.r.t. each element of $\mathbf{x}$. (Note: there are similar definitions when $g(\mathbf{x})$ is vector-valued.)

Example:
$$g(\mathbf{x}) = \mathbf{c}^T\mathbf{x} = \sum_{i=1}^{N} c_i x_i = c_1 x_1 + c_2 x_2 + \cdots + c_N x_N$$
$$\frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} = \left[\frac{\partial g(\mathbf{x})}{\partial x_1} \;\; \frac{\partial g(\mathbf{x})}{\partial x_2} \;\cdots\; \frac{\partial g(\mathbf{x})}{\partial x_N}\right] = \left[c_1 \;\; c_2 \;\cdots\; c_N\right] = \mathbf{c}^T$$
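A quick finite-difference check of this example. A minimal sketch, assuming NumPy; the vectors are arbitrary test values:

```python
import numpy as np

# For g(x) = c^T x the gradient should be the row vector c^T.
c = np.array([3.0, -1.0, 2.0])
g = lambda x: c @ x

x0 = np.array([0.5, 1.5, -2.0])
eps = 1e-6
# central-difference approximation of each partial d g / d x_i
grad = np.array([(g(x0 + eps * np.eye(3)[i]) - g(x0 - eps * np.eye(3)[i])) / (2 * eps)
                 for i in range(3)])
print(np.allclose(grad, c))  # True
```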
Lagrange Minimization (cont.):

Step #1: For our scalar-valued function $J$, using standard results for gradients of common functions of vectors, we get:
$$\frac{\partial J}{\partial \mathbf{h}_i} = \mathbf{h}_{i,o}^H \mathbf{R}_x - \lambda \mathbf{e}_i^H = \mathbf{0}^T$$
(The subscript “o” indicates “optimal” – the $\mathbf{h}$ needed to get $\mathbf{0}^T$.)

Now solve this for $\mathbf{h}_{i,o}$:
$$\mathbf{h}_{i,o}^H = \lambda\, \mathbf{e}_i^H \mathbf{R}_x^{-1}$$
But… this depends on $\lambda$!!

Step #2: Choose $\lambda$ to make this solution satisfy the constraint:
$$\mathbf{h}_{i,o}^H \mathbf{e}_i = 1 \;\Rightarrow\; \lambda\, \mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i = 1 \;\Rightarrow\; \lambda = \frac{1}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i}$$

…Now use this $\lambda$ in the optimal $\mathbf{h}_{i,o}$:
$$\mathbf{h}_{i,o}^H = \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i} \qquad \mathbf{h}_{i,o} = \frac{\mathbf{R}_x^{-1}\mathbf{e}_i}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i}$$
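The two steps can be verified numerically. A minimal sketch, assuming NumPy; `mvse_filter` and the sample AC matrix values are illustrative, not from the notes:

```python
import numpy as np

def mvse_filter(R, omega):
    """Optimal MVSE filter h_o = R^{-1} e / (e^H R^{-1} e) for frequency omega."""
    p = R.shape[0]
    e = np.exp(1j * omega * np.arange(p))   # e = [1, e^{jw}, ..., e^{j(p-1)w}]^T
    Rinv_e = np.linalg.solve(R, e)          # R^{-1} e, without forming the inverse
    return Rinv_e / (e.conj() @ Rinv_e)

# Check the constraint H(omega_i) = h_o^H e = 1 on a sample AC matrix.
R = np.array([[2.0, 0.5, 0.25],
              [0.5, 2.0, 0.5],
              [0.25, 0.5, 2.0]])
omega = 0.3
h = mvse_filter(R, omega)
e = np.exp(1j * omega * np.arange(3))
print(np.isclose(h.conj() @ e, 1.0))  # True: constraint satisfied
```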
LO-2.4.3
MVSE – Filter Solution
$$\mathbf{h}_{i,o}^H = \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i} \qquad \mathbf{h}_{i,o} = \frac{\mathbf{R}_x^{-1}\mathbf{e}_i}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i}$$
where $\mathbf{e}_i = \left[1 \;\; e^{j\omega_i} \;\; e^{j2\omega_i} \;\cdots\; e^{j(p-1)\omega_i}\right]^T$

This gives the optimal filter for estimating the power at the frequency $\omega_i$.

In principle, we need to solve this for each frequency $\omega_i$ at which we wish to get a PSD estimate. Then we would compute the output power of each filter, and that would be our PSD estimate.

BUT… we already have an equation for the output power: $\rho_i$
MVSE – Power Estimate in Each Channel
The estimate of the power at ωi is nothing more than the
minimized value of ρi:
$$\rho_{i,o} = \mathbf{h}_{i,o}^H \mathbf{R}_x \mathbf{h}_{i,o} = \left[\frac{\mathbf{R}_x^{-1}\mathbf{e}_i}{\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{e}_i}\right]^H \mathbf{R}_x \left[\frac{\mathbf{R}_x^{-1}\mathbf{e}_i}{\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{e}_i}\right] = \frac{\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{R}_x\mathbf{R}_x^{-1}\mathbf{e}_i}{\left(\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{e}_i\right)^2} = \frac{\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{e}_i}{\left(\mathbf{e}_i^H\mathbf{R}_x^{-1}\mathbf{e}_i\right)^2}$$

Thus, the estimated power at frequency $\omega_i$ is:
$$\hat{\sigma}_x^2(\omega_i) = \frac{1}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i}$$

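The cancellation above is easy to confirm numerically. A minimal sketch, assuming NumPy; the AC matrix values are arbitrary:

```python
import numpy as np

# Verify that the minimized output power h_o^H R h_o equals 1 / (e^H R^{-1} e).
R = np.array([[3.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 3.0]])
omega = 0.7
e = np.exp(1j * omega * np.arange(3))
Rinv_e = np.linalg.solve(R, e)
d = np.real(e.conj() @ Rinv_e)      # e^H R^{-1} e (real for Hermitian R)
h = Rinv_e / d                      # optimal filter h_o
rho = np.real(h.conj() @ R @ h)     # minimized output power
print(np.isclose(rho, 1.0 / d))     # True
```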
MVSE – PSD Estimate in Each Channel
To get the power spectral density we need to divide by the
filter’s bandwidth – for a filter of length p the BW is
approximately 1/p so our MVSE PSD estimate is:

$$\hat{S}_{MV}(\omega_i) = \frac{p}{\mathbf{e}_i^H \mathbf{R}_x^{-1}\mathbf{e}_i}$$

But… this estimate requires the ACF in matrix form – and if we had that, we’d probably know the PSD, too!

So… we need an estimate of the ACF in matrix form….

MVSE – Estimating The AC Matrix

$$\hat{\mathbf{R}}_x = \begin{bmatrix} \hat{r}_x[0] & \hat{r}_x[1] & \cdots & \hat{r}_x[p-1] \\ \hat{r}_x[-1] & \hat{r}_x[0] & \ddots & \vdots \\ \vdots & \ddots & \ddots & \hat{r}_x[1] \\ \hat{r}_x[-p+1] & \cdots & \hat{r}_x[-1] & \hat{r}_x[0] \end{bmatrix} \qquad \hat{r}_x[k] = \frac{1}{N}\sum_{n=0}^{N-k-1} x[n+k]\, x^*[n]$$

Note: The p×p AC matrix MUST be estimated.

$$\hat{S}_{MV}(\omega_i) = \frac{p}{\mathbf{e}_i^H \hat{\mathbf{R}}_x^{-1}\mathbf{e}_i}$$
Choose p < N so that the high-order ACF lag estimates are reasonably accurate.

Note: There are other ways to estimate the AC matrix!
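Putting the whole procedure together, from biased lag estimates to the MV PSD. A minimal sketch, assuming NumPy; `mvse_psd`, the test signal, and the parameter choices are illustrative:

```python
import numpy as np

def mvse_psd(x, p, omegas):
    """MV PSD estimate S_MV(w) = p / (e^H R_hat^{-1} e), with the p x p AC
    matrix built from biased lags r_hat[k] = (1/N) sum_n x[n+k] x*[n]."""
    x = np.asarray(x)
    N = len(x)
    # biased ACF estimates for lags k = 0 .. p-1
    r = np.array([np.sum(x[k:] * np.conj(x[:N - k])) / N for k in range(p)])
    # (i, j) element is r_hat[j - i], with r_hat[-k] = r_hat[k]*
    R = np.array([[r[j - i] if j >= i else np.conj(r[i - j])
                   for j in range(p)] for i in range(p)])
    Rinv = np.linalg.inv(R)
    e_all = np.exp(1j * np.outer(np.arange(p), omegas))   # one e_i per column
    return p / np.real(np.sum(e_all.conj() * (Rinv @ e_all), axis=0))

# Example: a sinusoid at w = 0.4*pi in white noise; the MV estimate
# should peak near the sinusoid frequency.
rng = np.random.default_rng(0)
n = np.arange(512)
x = np.cos(0.4 * np.pi * n) + 0.1 * rng.standard_normal(512)
omegas = np.linspace(0, np.pi, 257)
S = mvse_psd(x, p=20, omegas=omegas)
print(omegas[np.argmax(S)] / np.pi)  # close to 0.4
```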

MVSE – Comments
Implementation of MVSE:
Generally done directly on the data matrix X for efficiency
(see more advanced books)
Even with that, it is more complex than classical methods

Performance of MVSE:
Provides better resolution than classical methods
Mostly used when spiky spectra are expected
(Although, the AR methods are usually better in that case)

If the needed resolution can be met with classical methods – use them.
If not – consider either MVSE or parametric methods.
