Lecture 7
Least Mean Square (LMS)
Adaptive Filtering
Dr. Tahir Zaidi

Steepest Descent

The update rule for SD is

w(n+1) = w(n) + (1/2) μ [−∇J(n)],   n = 0, 1, 2, ...

where

∇J(n) = −2p + 2R w(n),   p = E[u(n) d*(n)],   R = E[u(n) u^H(n)],

or

w(n+1) = w(n) + μ [p − R w(n)].

SD is a deterministic algorithm, in the sense that p and R are assumed to be exactly known.

In practice we can only estimate these quantities.


Basic Idea

The simplest estimate of the expectations is the instantaneous (single-sample) estimate, i.e. we remove the expectation terms and replace them with the instantaneous values:

R̂(n) = u(n) u^H(n),   p̂(n) = u(n) d*(n).

Then, the gradient becomes

∇̂J(n) = −2 u(n) d*(n) + 2 u(n) u^H(n) ŵ(n).

Eventually, the new update rule is

ŵ(n+1) = ŵ(n) + μ u(n) [d*(n) − u^H(n) ŵ(n)]

(no expectations, only instantaneous samples!).

Basic Idea

However, the term in the brackets is the (conjugated) error, i.e.

e(n) = d(n) − ŵ^H(n) u(n),

then

ŵ(n+1) = ŵ(n) + μ u(n) e*(n).

Note that ∇̂J(n) is the gradient of the instantaneous squared error |e(n)|², instead of the mean-square error E[|e(n)|²] as in SD.


Basic Idea

Filter weights are updated using instantaneous values


Update Equation for Method of Steepest Descent

w(n+1) = w(n) + μ E[u(n) e*(n)]

Update Equation for Least Mean-Square

ŵ(n+1) = ŵ(n) + μ u(n) e*(n)
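To make the contrast concrete, here is a minimal NumPy sketch (not from the lecture; the 3-tap system, step size, and noise level are made-up illustrative values). It shows that an SD step needs the exact statistics R and p, whereas an LMS step needs only the current tap-input vector u(n) and desired sample d(n).

```python
import numpy as np

def sd_update(w, R, p, mu):
    """One steepest-descent step: needs the exact statistics R and p."""
    return w + mu * (p - R @ w)

def lms_update(w, u, d, mu):
    """One LMS step: needs only the current tap-input vector u(n) and sample d(n)."""
    e = d - w @ u              # instantaneous error e(n)
    return w + mu * u * e      # w(n+1) = w(n) + mu * u(n) * e(n)

# Tiny demonstration on a synthetic 3-tap identification problem (illustrative values).
rng = np.random.default_rng(0)
M, mu, n_samples = 3, 0.05, 2000
w_true = np.array([0.8, -0.4, 0.2])
w = np.zeros(M)
x = rng.standard_normal(n_samples + M)
for n in range(n_samples):
    u = x[n:n + M][::-1]                           # tap-input vector u(n)
    d = w_true @ u + 0.01 * rng.standard_normal()  # noisy desired response d(n)
    w = lms_update(w, u, d, mu)
print(w)  # close to w_true
```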


LMS Algorithm
The instantaneous gradient estimate is unbiased, but since the expectations are omitted, the estimates have a high variance. Therefore, the recursive computation of each tap weight in the LMS algorithm suffers from gradient noise.

In contrast to SD, which is a deterministic algorithm, LMS is a member of the family of stochastic gradient descent algorithms.

LMS has a higher MSE (J(∞)) compared to SD (J_min, the Wiener solution) as n → ∞,
i.e., J(n) → J(∞) > J_min as n → ∞.
The difference is called the excess mean-square error, J_ex(∞) = J(∞) − J_min.
The ratio J_ex(∞)/J_min is called the misadjustment.
If J(∞) is a finite value, LMS is said to be stable in the mean-square sense.
LMS performs a random motion around the Wiener solution.

LMS Algorithm

The LMS recursion involves a feedback connection.

Although LMS might seem very difficult to work with due to the randomness, the feedback acts as a low-pass filter, or performs averaging, so that the randomness can be filtered out.
The time constant of this averaging is inversely proportional to μ.
Actually, if μ is chosen small enough, the adaptive process progresses slowly and the effects of the gradient noise on the tap weights are largely filtered out.

The computational complexity of LMS is very low, which makes it very attractive:
only 2M + 1 complex multiplications and 2M complex additions per iteration.


LMS Algorithm

[Signal-flow graph of the LMS algorithm, showing the feedback connection.]

Canonical Model

The LMS algorithm for complex signals / with complex coefficients can be represented in terms of four separate LMS algorithms for real signals, with cross-coupling between them.

Write the input / desired signal / tap weights / output / error in complex notation:

u(n) = u_I(n) + j u_Q(n)
d(n) = d_I(n) + j d_Q(n)
ŵ(n) = ŵ_I(n) + j ŵ_Q(n)
ŷ(n) = ŷ_I(n) + j ŷ_Q(n)
e(n) = e_I(n) + j e_Q(n)


Canonical Model

Then the relations between these expressions are

ŷ_I(n) = ŵ_I^T(n) u_I(n) + ŵ_Q^T(n) u_Q(n)
ŷ_Q(n) = ŵ_I^T(n) u_Q(n) − ŵ_Q^T(n) u_I(n)

e_I(n) = d_I(n) − ŷ_I(n),   e_Q(n) = d_Q(n) − ŷ_Q(n)

ŵ_I(n+1) = ŵ_I(n) + μ [u_I(n) e_I(n) + u_Q(n) e_Q(n)]
ŵ_Q(n+1) = ŵ_Q(n) + μ [u_Q(n) e_I(n) − u_I(n) e_Q(n)]
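As an illustration of the canonical model (not part of the lecture; the synthetic complex signal and all parameter values are assumptions made for this sketch), the NumPy snippet below runs the complex LMS recursion and its four cross-coupled real recursions side by side and checks that they produce identical weights.

```python
import numpy as np

rng = np.random.default_rng(1)
M, mu, N = 4, 0.02, 500

# Synthetic complex input and desired response (illustrative only).
u_sig = (rng.standard_normal(N + M) + 1j * rng.standard_normal(N + M)) / np.sqrt(2)
w_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)

w_c = np.zeros(M, dtype=complex)   # complex LMS weights
wI = np.zeros(M)                   # canonical-model (real) weights, in-phase part
wQ = np.zeros(M)                   # canonical-model (real) weights, quadrature part

for n in range(N):
    u = u_sig[n:n + M][::-1]       # tap-input vector u(n)
    d = np.conj(w_true) @ u        # d(n) = w_true^H u(n)

    # Complex LMS: w(n+1) = w(n) + mu * u(n) * e*(n), with e(n) = d(n) - w^H(n) u(n)
    e = d - np.conj(w_c) @ u
    w_c = w_c + mu * u * np.conj(e)

    # Canonical model: four cross-coupled real LMS updates
    uI, uQ, dI, dQ = u.real, u.imag, d.real, d.imag
    yI = wI @ uI + wQ @ uQ
    yQ = wI @ uQ - wQ @ uI
    eI, eQ = dI - yI, dQ - yQ
    wI = wI + mu * (uI * eI + uQ * eQ)
    wQ = wQ + mu * (uQ * eI - uI * eQ)

print(np.allclose(w_c, wI + 1j * wQ))  # True: the two formulations coincide
```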

Canonical Model

[Signal-flow graphs of the canonical model: the complex LMS algorithm realised as four cross-coupled real LMS filters.]

Analysis of the LMS Algorithm

Although the filter is a linear combiner, the algorithm is highly nonlinear and violates superposition and homogeneity.

Assume the initial condition ŵ(0) = 0.

The analysis will continue using the weight-error vector

ε(n) = w_o − ŵ(n),

then, in terms of the input u(n) and the output error e(n),

ε(n+1) = ε(n) − μ u(n) e*(n),

and its autocorrelation (correlation matrix)

K(n) = E[ε(n) ε^H(n)].

(Here we use the expectation; however, it actually stands for the ensemble average.)


Analysis of the LMS Algorithm

We have

ε(n+1) = ε(n) − μ u(n) e*(n).

Let

e(n) = e_o(n) + ε^H(n) u(n),   e_o(n) = d(n) − w_o^H u(n),

where e_o(n) is the estimation error of the optimum (Wiener) filter. Then the update eqn. can be written as

ε(n+1) = [I − μ u(n) u^H(n)] ε(n) − μ u(n) e_o*(n).

Analyse convergence in an average sense:
the algorithm is run many times → study the ensemble-average behaviour.

Small Step Size Analysis

Assumption I: the step size μ is small (how small?), so that the LMS filter acts like a low-pass filter with a very low cut-off frequency.

Assumption II: the desired response is described by a linear multiple regression model that is matched exactly by the optimum Wiener filter,

d(n) = w_o^H u(n) + e_o(n),

where e_o(n) is the irreducible estimation error, with E[|e_o(n)|²] = J_min and e_o(n) uncorrelated with the input u(n).

Assumption III: the input and the desired response are jointly Gaussian.


Small Step Size Analysis

Applying the similarity transformation resulting from the eigendecomposition R = Q Λ Q^H to the small-step-size weight-error recursion

ε(n+1) ≈ (I − μ R) ε(n) − μ u(n) e_o*(n)

(we do not have the stochastic driving term −μ u(n) e_o*(n) in Wiener filtering!), i.e. defining

v(n) = Q^H ε(n),

then we have

v(n+1) = (I − μ Λ) v(n) + φ(n),

where

φ(n) = −μ Q^H u(n) e_o*(n).

Components of v(n) are uncorrelated!

Small Step Size Analysis

Components of v(n) are uncorrelated, so each one obeys a scalar recursion

v_k(n+1) = (1 − μ λ_k) v_k(n) + φ_k(n),   k = 1, ..., M,

a first-order difference equation driven by the stochastic force φ_k(n) (Brownian motion, thermodynamics).

Solution: iterating from n = 0,

v_k(n) = (1 − μ λ_k)^n v_k(0) + Σ_{i=0}^{n−1} (1 − μ λ_k)^{n−1−i} φ_k(i),

where the first term is the natural component of v(n) and the second term is the forced component of v(n).
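A minimal sketch of one such decoupled mode is given below; modelling the stochastic force φ_k(n) as white noise, and the particular numbers used, are assumptions made only for this illustration.

```python
import numpy as np

# One decoupled mode: v_k(n+1) = (1 - mu*lam) v_k(n) + phi_k(n),
# with phi_k(n) modelled here as white noise (an assumption for the sketch).
rng = np.random.default_rng(2)
mu, lam, N = 0.05, 1.0, 400
a = 1 - mu * lam                      # mode decay factor, |a| < 1 for stability
phi = 0.05 * rng.standard_normal(N)   # stochastic force

v = np.zeros(N + 1)
v[0] = 1.0                            # initial weight error in this mode
for n in range(N):
    v[n + 1] = a * v[n] + phi[n]

natural = a ** np.arange(N + 1) * v[0]  # decaying transient (natural component)
forced = v - natural                    # remaining driven, fluctuating part (forced component)
print(v[-1], natural[-1], forced[-1])
```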


Learning Curves

Two kinds of learning curves:

Mean-square error (MSE) learning curve:  J(n) = E[|e(n)|²]
Mean-square deviation (MSD) learning curve:  D(n) = E[||ε(n)||²]

Ensemble averaging: the results of many (→ ∞) independent realizations are averaged.

What is the relation between MSE and MSD? For small μ,

J_ex(n) = J(n) − J_min = Σ_k λ_k E[|v_k(n)|²]   and   D(n) = Σ_k E[|v_k(n)|²],

so the excess MSE and the MSD are built from the same mode powers E[|v_k(n)|²].
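One possible way to generate both learning curves by ensemble averaging is sketched below (illustrative only: the filter length, step size, noise level, and number of runs are assumptions, not values from the lecture).

```python
import numpy as np

# Ensemble-averaged MSE and MSD learning curves for a short LMS filter.
rng = np.random.default_rng(3)
M, mu, N, runs, Jmin = 4, 0.05, 1000, 200, 1e-3
w_o = np.array([1.0, 0.5, -0.3, 0.1])   # "true" system (illustrative)

mse = np.zeros(N)   # J(n): ensemble-averaged squared error
msd = np.zeros(N)   # D(n): ensemble-averaged squared weight-error norm

for _ in range(runs):
    w = np.zeros(M)
    x = rng.standard_normal(N + M)
    for n in range(N):
        u = x[n:n + M][::-1]
        d = w_o @ u + np.sqrt(Jmin) * rng.standard_normal()
        e = d - w @ u
        mse[n] += e ** 2
        msd[n] += np.sum((w_o - w) ** 2)
        w = w + mu * u * e

mse /= runs
msd /= runs
print(mse[-1], msd[-1])   # J(n) levels off slightly above Jmin; D(n) stays small
```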

Learning Curves
For small μ, under the small-step-size assumptions above,

J(n) ≈ J_min + Σ_k λ_k E[|v_k(n)|²].

Excess MSE:

J_ex(n) = J(n) − J_min = Σ_k λ_k E[|v_k(n)|²] ≥ 0,

so LMS performs worse than SD: there is always an excess MSE. To evaluate E[|v_k(n)|²], use the solution for v_k(n) iterated from n = 0 above.


Learning Curves

or

(1/λ_max) J_ex(n) ≤ D(n) ≤ (1/λ_min) J_ex(n).

The mean-square deviation D is lower- and upper-bounded by scaled versions of the excess MSE.

They have a similar response: both decay as n grows.

Convergence

For small μ, the modes decay as (1 − μ λ_k)^{2n}, so the learning curve converges provided that

|1 − μ λ_k| < 1   for all k.

Hence, for convergence,

0 < μ < 2 / λ_max,

or, more conservatively (since λ_max ≤ tr(R) = Σ_k λ_k), 0 < μ < 2 / tr(R).

The ensemble-average learning curve of an LMS filter does not exhibit oscillations; rather, it decays exponentially to the constant value

J(∞) = J_min + J_ex(∞).
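The bound can be checked numerically for a given correlation matrix. The sketch below uses an invented Toeplitz matrix with r(k) = ρ^|k| (an assumption for the example), computes 2/λ_max, and verifies that all modes decay for a chosen μ.

```python
import numpy as np

# Step-size bound check for an illustrative correlation matrix R with r(k) = rho**|k|.
M, rho = 5, 0.8
R = np.array([[rho ** abs(i - j) for j in range(M)] for i in range(M)])

eigvals = np.linalg.eigvalsh(R)        # R is symmetric, eigenvalues are real
lam_max, lam_sum = eigvals.max(), eigvals.sum()

mu_bound = 2.0 / lam_max               # convergence bound from the analysis
mu_conservative = 2.0 / lam_sum        # safer bound using tr(R) >= lambda_max
print(mu_bound, mu_conservative)

mu = 0.1
print(np.all(np.abs(1 - mu * eigvals) < 1))  # True -> every mode decays
```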


Misadjustment

Misadjustment: define

ℳ = J_ex(∞) / J_min.

For small μ, the steady state of the mode recursions gives

J_ex(∞) ≈ (μ/2) J_min Σ_k λ_k,

or equivalently

ℳ ≈ (μ/2) Σ_k λ_k = (μ/2) tr(R),

but

tr(R) = M r(0) = M E[|u(n)|²],

then

ℳ ≈ (μ/2) M E[|u(n)|²].

Average Time Constant

From SD we know that the k-th natural mode of the MSE decays with time constant

τ_{k,mse} ≈ 1 / (2 μ λ_k),

but, averaging over the modes,

λ_av = (1/M) Σ_k λ_k,

then the average time constant is

τ_{mse,av} ≈ 1 / (2 μ λ_av).

Combining this with the misadjustment: ℳ ≈ (μ/2) M λ_av = M / (4 τ_{mse,av}).
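A small numeric sketch tying the last two slides together (the correlation matrix, filter length, and step size are illustrative assumptions): it evaluates ℳ ≈ (μ/2) tr(R) and τ_mse,av ≈ 1/(2 μ λ_av), and confirms that ℳ = M / (4 τ_mse,av).

```python
import numpy as np

# Misadjustment and average time constant for an illustrative 5-tap example.
M, rho, mu = 5, 0.8, 0.02
R = np.array([[rho ** abs(i - j) for j in range(M)] for i in range(M)])
lam = np.linalg.eigvalsh(R)

misadjustment = 0.5 * mu * lam.sum()     # (mu/2) * tr(R)
lam_av = lam.mean()
tau_mse_av = 1.0 / (2.0 * mu * lam_av)   # average MSE time constant

print(misadjustment, tau_mse_av)
print(M / (4 * tau_mse_av))              # equals the misadjustment above
```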


Observations

The misadjustment is
directly proportional to the filter length M, for a fixed τ_{mse,av};
inversely proportional to the time constant τ_{mse,av}:
slower convergence results in lower misadjustment;
directly proportional to the step size μ:
smaller step size results in lower misadjustment.

The time constant is
inversely proportional to the step size μ:
smaller step size results in slower convergence,
so there is a trade-off between misadjustment and convergence speed.

Large μ requires the inclusion of higher-order terms into the analysis:
difficult to analyse, the small-step analysis is no longer valid, and the learning curve becomes more noisy.

LMS vs. SD

The main goal is to minimise the mean-square error (MSE).

The optimum solution is found by the Wiener-Hopf equations:
requires the auto-/cross-correlations (R and p);
achieves the minimum value of the MSE, J_min.

LMS and SD are iterative algorithms designed to find w_o.

SD has direct access to the auto-/cross-correlations (exact measurements)
→ it can approach the Wiener solution w_o and can go down to J_min.

LMS uses instantaneous estimates instead (noisy measurements)
→ it fluctuates around w_o in a Brownian-motion manner, and its MSE settles at J(∞) > J_min.


LMS vs. SD

Learning curves:
SD has a well-defined curve composed of decaying exponentials.
For LMS, the curve is composed of noisy decaying exponentials.

LMS Limits

As the filter length M increases, it imposes a tighter limit on the step size needed to avoid instability:

0 < μ < 2 / (M S_max),

where S_max is the maximum value of the power spectral density S(ω) of the tap inputs u(n).

If this upper bound is exceeded, instability is observed.
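One way to use this bound in practice is to estimate S_max from data. The sketch below does so with a crude segment-averaged periodogram of a synthetic AR(1) input; the signal model, segment length, and filter length are assumptions made for the illustration, and a better PSD estimator would normally be preferred.

```python
import numpy as np

# Estimate the step-size upper bound 2 / (M * S_max) from data.
rng = np.random.default_rng(4)
M = 16

# Synthetic coloured input: AR(1) process u(n) = 0.9 u(n-1) + v(n).
v = rng.standard_normal(10000)
u = np.zeros_like(v)
for n in range(1, len(v)):
    u[n] = 0.9 * u[n - 1] + v[n]

# Segment-averaged periodogram as a rough estimate of the PSD S(omega).
segs = u.reshape(20, -1)                                    # 20 segments of 500 samples
psd = (np.abs(np.fft.rfft(segs, axis=1)) ** 2 / segs.shape[1]).mean(axis=0)

S_max = psd.max()
mu_max = 2.0 / (M * S_max)
print(S_max, mu_max)
```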


LMS Example

One-tap predictor of an order-one AR process. Let

u(n) = −a u(n−1) + v(n),

where v(n) is white noise. The one-tap LMS predictor is

û(n) = ŵ(n) u(n−1),   f(n) = u(n) − ŵ(n) u(n−1),   ŵ(n+1) = ŵ(n) + μ u(n−1) f(n),

and the optimum weight is w_o = −a.
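A runnable sketch of this example follows; the AR coefficient, noise variance, step size, and number of runs are illustrative choices for the sketch, not necessarily the values behind the lecture's plots. It also produces the ensemble-averaged learning curve discussed on the next slide.

```python
import numpy as np

# One-tap LMS predictor of an AR(1) process u(n) = -a*u(n-1) + v(n).
rng = np.random.default_rng(5)
a, sigma_v, mu, N, runs = 0.9, 0.1, 0.1, 2000, 100

w_hist = np.zeros(N)   # ensemble-averaged weight trajectory
mse = np.zeros(N)      # ensemble-averaged squared prediction error f(n)**2

for _ in range(runs):
    u_prev, w = 0.0, 0.0
    for n in range(N):
        u = -a * u_prev + sigma_v * rng.standard_normal()
        f = u - w * u_prev           # prediction error f(n)
        w = w + mu * u_prev * f      # LMS update of the single tap
        w_hist[n] += w
        mse[n] += f ** 2
        u_prev = u

w_hist /= runs
mse /= runs
print(w_hist[-1])   # approaches the optimum weight w_o = -a = -0.9
print(mse[-1])      # levels off slightly above the noise variance sigma_v**2
```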


LMS Example: Learning Curves



H∞ Optimality of LMS

A single realisation of LMS is not optimum in the MSE sense; the ensemble average is.

The previous derivation is heuristic (replacing the auto-/cross-correlations with their instantaneous estimates).

In what sense is LMS optimum?

It can be shown that LMS minimises the maximum energy gain of the filter, i.e. the worst-case ratio of error energy to disturbance energy, under a constraint on the step size.

Minimising the maximum of something → a minimax problem, i.e. the optimisation of an H∞ criterion.

H∞ Optimality of LMS

Provided that the step-size parameter satisfies the limits on the previous slide, then, no matter how different the initial weight vector ŵ(0) is from the unknown parameter vector w_o of the multiple regression model, and irrespective of the value of the additive disturbance e_o(n), the error energy produced at the output of the LMS filter will never exceed a certain level.


Limits on the Step Size

