
Non-Classical

Non-Parametric Methods
• Minimum Variance Method (MVSE)
LO-2.4.3, H-8.3

Recall: Family of Non-Parametric Methods

• “Classical” Methods (FT-Based Methods)
    • Periodogram-Based: Periodogram, Modified, Bartlett, Welch
    • ACF-Est.-Based: Blackman-Tukey
• “Non-Classical” Methods (Non-FT-Based Methods)
    • Filter Bank View: Minimum Variance

Minimum Variance Method

Minimum Variance Method – Terminology
The Minimum Variance Spectral Estimation (MVSE) method has
two other names:
• Maximum Likelihood Method (MLM)
• Capon’s Method

Note: The names MVSE and MLM are actually misnomers – this
method:
• does NOT minimize the variance of the estimate
• does NOT maximize the “likelihood function”

Recall: Filter Bank View of Periodogram
See Fig. 8.4 & 8.3 of Hayes
The problem is leakage from nearby frequencies:

[Figure: true PSD Sx(ω) and filter response |Hi(ω)|² plotted against ω. The filter sidelobes leak “out-of-band” power into the estimate at ωi.]

Goal for MVSE Method
Figure out a way to design each filter bank channel response to
minimize the leakage – this is thus a data-dependent design.
Collect Data → “Design” Filters for Filter Bank
Want to “design” filters to minimize the sidelobes while keeping
the mainlobe height at 1:
“Design” Goals:
1. Want Hi(ωi) = 1 …to let through the desired Sx(ωi)
2. Minimize total output power in the filter:

$$\rho_i = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| H_i(\omega) \right|^2 S_x(\omega)\, d\omega$$

This is equivalent to minimizing the sidelobe contribution, even though the integral includes the desired ωi.
LO-2.4.3
Get Usable Form for ρ
The frequency response of filter hi[n] is:

$$H_i(\omega) = \sum_{n=0}^{p-1} h_i[n]\, e^{-j\omega n}$$

where p = Filter Length, with p < N.
Using this in the expression for ρ gives:

$$
\begin{aligned}
\rho_i &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ \sum_{k=0}^{p-1} h_i[k]\, e^{-j\omega k} \right] \left[ \sum_{l=0}^{p-1} h_i^*[l]\, e^{j\omega l} \right] S_x(\omega)\, d\omega \\
&= \sum_{k=0}^{p-1} \sum_{l=0}^{p-1} h_i[k]\, h_i^*[l]\, \underbrace{\frac{1}{2\pi} \int_{-\pi}^{\pi} S_x(\omega)\, e^{j\omega(l-k)}\, d\omega}_{=\; r_x[l-k]} \\
&= \sum_{k=0}^{p-1} \sum_{l=0}^{p-1} h_i[k]\, h_i^*[l]\, r_x[l-k] \\
&= \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i
\end{aligned}
$$

Now… recognize the double sum as a vector-matrix-vector multiplication, where Rx = Autocorrelation Matrix.
“H” superscript = Hermitian Transpose = Transpose & Conjugate
Autocorrelation Matrix
The AC matrix is the p×p matrix whose i,j element is rx[j – i].
Example for p = 4:
$$
\mathbf{R}_x =
\begin{bmatrix}
r_x[0] & r_x[1] & r_x[2] & r_x[3] \\
r_x[-1] & r_x[0] & r_x[1] & r_x[2] \\
r_x[-2] & r_x[-1] & r_x[0] & r_x[1] \\
r_x[-3] & r_x[-2] & r_x[-1] & r_x[0]
\end{bmatrix}
$$
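The matrix layout above can be sketched in NumPy. This is only an illustration: the function name `autocorr_matrix` and the lag values are hypothetical, and the sketch assumes the usual symmetry r_x[−k] = r_x*[k] of a wide-sense-stationary process to fill in the negative lags.

```python
import numpy as np

def autocorr_matrix(r_pos):
    """Build the p x p matrix whose (i, j) element is r_x[j - i].

    r_pos holds the lags 0..p-1; negative lags are filled in using
    the assumed symmetry r_x[-k] = conj(r_x[k]).
    """
    r_pos = np.asarray(r_pos)
    p = len(r_pos)
    i = np.arange(p)
    lag = i[None, :] - i[:, None]  # lag[i, j] = j - i
    return np.where(lag >= 0, r_pos[np.abs(lag)], np.conj(r_pos[np.abs(lag)]))

# Hypothetical lag values for p = 4, chosen only to show the layout
r = np.array([4.0, 2.0 + 1.0j, 1.0, 0.5j])
R = autocorr_matrix(r)
```

The result is Toeplitz (constant along each diagonal) and Hermitian, which matches the p = 4 example.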

Not in Book
Now Minimize the Matrix Form:
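For each i, minimize this:

$$\rho_i = \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i$$

Under this constraint:

$$H_i(\omega_i) = 1 \;\Rightarrow\; \mathbf{h}_i^H \mathbf{e}_i = 1$$

where $\mathbf{e}_i = \begin{bmatrix} 1 & e^{j\omega_i} & e^{j2\omega_i} & \cdots & e^{j(p-1)\omega_i} \end{bmatrix}^T$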


Most common way to do constrained optimization is using the
Lagrange Multiplier method:

$$J = \mathbf{h}_i^H \mathbf{R}_x \mathbf{h}_i - \lambda \left( \mathbf{h}_i^H \mathbf{e}_i - 1 \right)$$
Lagrange says: Choose hi and λ to minimize J

Lagrange Minimization:
To find the hi and λ that minimize J, in general set:

$$\frac{\partial J}{\partial \mathbf{h}_i} = \mathbf{0}^T \quad \& \quad \frac{\partial J}{\partial \lambda} = 0$$

But often an easier way is to do these two steps:
1. Do the partial w.r.t. hi and solve for hi
2. Then choose λ to ensure solution meets the constraint
So… we need:

$$\frac{\partial J}{\partial \mathbf{h}_i} = \mathbf{0}^T$$

What is a partial derivative w.r.t. a vector???!!!
It is called the Gradient.
(Comes from Multi-D Calculus)
Aside: Gradient
If g(x) is a scalar-valued function of a real-valued vector x,
Then the gradient of g(x) is defined as:
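$$\nabla_{\mathbf{x}}(g) = \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial g(\mathbf{x})}{\partial x_1} & \dfrac{\partial g(\mathbf{x})}{\partial x_2} & \cdots & \dfrac{\partial g(\mathbf{x})}{\partial x_N} \end{bmatrix}$$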
Gradient here is nothing more than: the vector whose elements
are the partials w.r.t. each element of x.
(Note: There are similar definitions when g(x) is vector-valued.)
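“Example”:

$$g(\mathbf{x}) = \mathbf{c}^T \mathbf{x} = \sum_{i=1}^{N} c_i x_i = c_1 x_1 + c_2 x_2 + \cdots + c_N x_N$$

$$\frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial g(\mathbf{x})}{\partial x_1} & \dfrac{\partial g(\mathbf{x})}{\partial x_2} & \cdots & \dfrac{\partial g(\mathbf{x})}{\partial x_N} \end{bmatrix} = \begin{bmatrix} c_1 & c_2 & \cdots & c_N \end{bmatrix} = \mathbf{c}^T$$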
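The gradient example can be checked numerically. The sketch below approximates each partial derivative with a central difference and compares the result to c; the helper `numerical_gradient` and the values of `c` and `x0` are hypothetical, chosen only for illustration.

```python
import numpy as np

def numerical_gradient(g, x, eps=1e-6):
    """Approximate the gradient of a scalar function g at x, one element at a time."""
    grad = np.zeros(len(x))
    for n in range(len(x)):
        step = np.zeros(len(x))
        step[n] = eps
        # Central difference for the partial derivative w.r.t. x[n]
        grad[n] = (g(x + step) - g(x - step)) / (2 * eps)
    return grad

c = np.array([2.0, -1.0, 0.5])    # hypothetical coefficient vector
x0 = np.array([0.3, 1.2, -0.7])   # hypothetical evaluation point
grad = numerical_gradient(lambda x: c @ x, x0)
```

Because g is linear, the gradient is the same at every point, so the choice of `x0` does not matter here.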
Lagrange Minimization (cont.):
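Step #1: For our scalar-valued function J we get:

$$\frac{\partial J}{\partial \mathbf{h}_i} = \mathbf{h}_{i,o}^H \mathbf{R}_x - \lambda \mathbf{e}_i^H \overset{\text{set}}{=} \mathbf{0}^T$$

using standard results for gradients of common functions of vectors.
Subscript “o” indicates “Optimal” – the h needed to get $\mathbf{0}^T$.

Now solve this for hi,o:

$$\mathbf{h}_{i,o}^H = \lambda\, \mathbf{e}_i^H \mathbf{R}_x^{-1}$$

But… Depends on λ!!

Step #2: Choose λ to make this solution satisfy constraint:

$$\mathbf{h}_{i,o}^H \mathbf{e}_i = 1 \;\Rightarrow\; \left( \lambda\, \mathbf{e}_i^H \mathbf{R}_x^{-1} \right) \mathbf{e}_i = 1 \;\Rightarrow\; \lambda = \frac{1}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}$$

…. Now use this λ in optimal hi,o:

$$\mathbf{h}_{i,o}^H = \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \qquad \mathbf{h}_{i,o} = \frac{\mathbf{R}_x^{-1} \mathbf{e}_i}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}$$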
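As a rough numerical check of this solution, the sketch below computes the optimal filter for one frequency and compares its output power with that of a fixed, uniformly weighted filter that meets the same constraint. The matrix `R` is a hypothetical Hermitian positive-definite example (identity plus a rank-one term), not an autocorrelation estimated from data, and the names `steering_vector` and `mv_filter` are illustrative.

```python
import numpy as np

def steering_vector(omega, p):
    """e_i = [1, e^{j w}, e^{j 2w}, ..., e^{j (p-1) w}]^T"""
    return np.exp(1j * omega * np.arange(p))

def mv_filter(R, omega):
    """h_o = R^{-1} e / (e^H R^{-1} e), computed with a linear solve instead of an explicit inverse."""
    e = steering_vector(omega, R.shape[0])
    Rinv_e = np.linalg.solve(R, e)
    # np.vdot conjugates its first argument, so this is e^H R^{-1} e (real for Hermitian R)
    return Rinv_e / np.real(np.vdot(e, Rinv_e))

p = 8
strong = steering_vector(1.0, p)                       # hypothetical strong component at omega = 1.0
R = np.eye(p) + 10.0 * np.outer(strong, strong.conj())

omega_i = 0.6
e = steering_vector(omega_i, p)
h_mv = mv_filter(R, omega_i)
h_rect = e / p                                         # uniform weights; also satisfies h^H e = 1

power_mv = np.real(np.vdot(h_mv, R @ h_mv))            # h^H R h for the optimal filter
power_rect = np.real(np.vdot(h_rect, R @ h_rect))      # h^H R h for the fixed filter
```

Both filters pass ωi with unit gain, but the data-dependent filter should never produce more output power than the fixed one, since it minimizes h^H R h under that constraint.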
LO-2.4.3
MVSE – Filter Solution
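$$\mathbf{h}_{i,o}^H = \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \qquad \mathbf{h}_{i,o} = \frac{\mathbf{R}_x^{-1} \mathbf{e}_i}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}$$

where $\mathbf{e}_i = \begin{bmatrix} 1 & e^{j\omega_i} & e^{j2\omega_i} & \cdots & e^{j(p-1)\omega_i} \end{bmatrix}^T$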


This gives the optimal filter for estimating the power at the
frequency ωi

In principle – we need to solve this for each frequency ωi at


which we wish to get a PSD estimate.

Then we would compute the output power at each filter and


that would be our PSD estimate.

BUT…. we have an equation for the output power: ρi


MVSE – Power Estimate in Each Channel
The estimate of the power at ωi is nothing more than the
minimized value of ρi:
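$$
\begin{aligned}
\rho_{i,o} &= \mathbf{h}_{i,o}^H \mathbf{R}_x \mathbf{h}_{i,o} \\
&= \left[ \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \right] \mathbf{R}_x \left[ \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \right]^H
= \left[ \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1}}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \right] \mathbf{R}_x \left[ \frac{\mathbf{R}_x^{-1} \mathbf{e}_i}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i} \right] \\
&= \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{R}_x \mathbf{R}_x^{-1} \mathbf{e}_i}{\left( \mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i \right) \left( \mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i \right)}
= \frac{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}{\left( \mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i \right) \left( \mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i \right)}
\end{aligned}
$$

Thus, the estimated power at frequency ωi is:

$$\hat{\sigma}_x^2(\omega_i) = \frac{1}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}$$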
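The simplification above can be spot-checked numerically: the output power computed directly as h^H R h should agree with the closed form 1/(e^H R⁻¹ e). The sketch uses a hypothetical random Hermitian positive-definite matrix in place of Rx; the algebra relies only on Rx being Hermitian and invertible, so the matrix does not need to be Toeplitz for the check to hold.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 6

# Hypothetical Hermitian positive-definite stand-in for R_x
A = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
R = A @ A.conj().T + p * np.eye(p)

omega_i = 0.9
e = np.exp(1j * omega_i * np.arange(p))        # e_i for this frequency

Rinv_e = np.linalg.solve(R, e)                 # R^{-1} e without forming the inverse
d = np.real(np.vdot(e, Rinv_e))                # e^H R^{-1} e
h_o = Rinv_e / d                               # optimal filter

rho_direct = np.real(np.vdot(h_o, R @ h_o))    # h_o^H R h_o
rho_closed = 1.0 / d                           # closed-form power estimate
```

In practice only the closed form is needed, which avoids computing a filter for every frequency.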

MVSE – PSD Estimate in Each Channel
To get the power spectral density we need to divide by the
filter’s bandwidth – for a filter of length p the BW is
approximately 1/p so our MVSE PSD estimate is:

$$\hat{S}_{MV}(\omega_i) = \frac{p}{\mathbf{e}_i^H \mathbf{R}_x^{-1} \mathbf{e}_i}$$

But… this estimate requires the ACF in matrix form; if we had that, we’d probably know what the PSD is, too!!!

So… we need an estimate of the ACF in matrix form….

MVSE – Estimating The AC Matrix

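$$
\hat{\mathbf{R}}_x =
\begin{bmatrix}
\hat{r}_x[0] & \hat{r}_x[1] & \cdots & \hat{r}_x[p-1] \\
\hat{r}_x[-1] & \hat{r}_x[0] & \ddots & \vdots \\
\vdots & \ddots & \ddots & \hat{r}_x[1] \\
\hat{r}_x[-p+1] & \cdots & \hat{r}_x[-1] & \hat{r}_x[0]
\end{bmatrix}
\qquad
\hat{r}_x[k] = \frac{1}{N} \sum_{n=0}^{N-k-1} x[n+k]\, x^*[n]
$$

Note: The p×p AC Matrix MUST be Estimated

$$\hat{S}_{MV}(\omega_i) = \frac{p}{\mathbf{e}_i^H \hat{\mathbf{R}}_x^{-1} \mathbf{e}_i}$$

Choose p < N so that high-order ACF lag estimates are reasonably accurate.

Note: There are other ways to estimate the AC Matrix!!!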
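Putting the pieces together, a minimal sketch of the whole estimator might look like the following. It uses the biased ACF estimate given above, fills in negative lags assuming r̂x[−k] = r̂x*[k], and evaluates the estimate on a frequency grid. The function names, the test signal (a real sinusoid in white noise), and the values of N, p and ω0 are all hypothetical choices for illustration, and this direct form is not the efficient data-matrix implementation.

```python
import numpy as np

def acf_estimate(x, p):
    """r_hat[k] = (1/N) * sum_{n=0}^{N-k-1} x[n+k] x*[n] for k = 0..p-1."""
    x = np.asarray(x)
    N = len(x)
    # np.vdot conjugates its first argument: sum of conj(x[n]) * x[n+k]
    return np.array([np.vdot(x[:N - k], x[k:]) / N for k in range(p)])

def mvse_psd(x, p, omegas):
    """S_MV(w) = p / (e^H R_hat^{-1} e), evaluated at each frequency in omegas."""
    r = acf_estimate(x, p)
    i = np.arange(p)
    lag = i[None, :] - i[:, None]                 # (i, j) element uses lag j - i
    R = np.where(lag >= 0, r[np.abs(lag)], np.conj(r[np.abs(lag)]))
    E = np.exp(1j * np.outer(i, omegas))          # column m is e_i for omegas[m]
    quad = np.real(np.sum(np.conj(E) * np.linalg.solve(R, E), axis=0))
    return p / quad

# Hypothetical test signal: real sinusoid at omega0 plus white noise
rng = np.random.default_rng(0)
N, p, omega0 = 256, 16, 0.3 * np.pi
n = np.arange(N)
x = np.cos(omega0 * n) + 0.1 * rng.standard_normal(N)

omegas = np.linspace(0.0, np.pi, 512)
S_mv = mvse_psd(x, p, omegas)
omega_peak = omegas[np.argmax(S_mv)]              # location of the largest spectral peak
```

Solving the linear system once for all frequencies avoids forming the inverse explicitly, and the peak of the resulting estimate should sit close to ω0 for this test signal.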


MVSE – Comments
Implementation of MVSE:
Generally done directly on the data matrix X for efficiency
(see more advanced books)
Even with that, it is more complex than classical methods

Performance of MVSE:
Provides better resolution than classical methods
Mostly used when spiky spectra are expected
(Although, the AR methods are usually better in that case)

If needed resolution can be met with classical – use them.


If not – consider either MVSE or Parametric Methods.
