
Adaptive Filtering
Lecture 6
Steepest Descent Method
Dr. Tahir Zaidi

Mean Square Error (Revisited)

For a transversal filter of length M (tap weights w_0, \ldots, w_{M-1}), the output is written as

y(n) = \sum_{k=0}^{M-1} w_k^*\, u(n-k) = \mathbf{w}^H \mathbf{u}(n),

and the error with respect to a desired response d(n) is

e(n) = d(n) - y(n) = d(n) - \mathbf{w}^H \mathbf{u}(n).
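A small numeric illustration of these two relations (the values and variable names below are arbitrary, chosen only for the example):

import numpy as np

w = np.array([0.5, -0.2, 0.1])      # tap-weight vector w (M = 3)
u = np.array([1.0, 0.3, -0.7])      # tap-input vector u(n) = [u(n), u(n-1), u(n-2)]
d = 0.4                             # desired response d(n)

y = np.conj(w) @ u                  # filter output y(n) = w^H u(n)
e = d - y                           # error e(n) = d(n) - y(n)
print(y, e)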


Mean Square Error (Revisited)

Following these terms, the mean-square error (MSE) criterion is defined as

J(\mathbf{w}) = E\big[\,|e(n)|^2\,\big],

which is quadratic in w!

Substituting e(n) and manipulating the expression, we get

J(\mathbf{w}) = \sigma_d^2 - \sum_{k=0}^{M-1} w_k^*\, p(-k) - \sum_{k=0}^{M-1} w_k\, p^*(-k) + \sum_{k=0}^{M-1}\sum_{i=0}^{M-1} w_k^* w_i\, r(i-k),

where \sigma_d^2 = E[|d(n)|^2] is the variance of the desired response, p(-k) = E[u(n-k)\,d^*(n)] is the cross-correlation between the input and the desired response, and r(i-k) = E[u(n-k)\,u^*(n-i)] is the autocorrelation of the input.


Mean Square Error (Revisited)

For notational simplicity, express the MSE in vector/matrix form:

J(\mathbf{w}) = \sigma_d^2 - \mathbf{w}^H\mathbf{p} - \mathbf{p}^H\mathbf{w} + \mathbf{w}^H\mathbf{R}\,\mathbf{w},

where \mathbf{p} = E[\mathbf{u}(n)\,d^*(n)] is the M \times 1 cross-correlation vector and \mathbf{R} = E[\mathbf{u}(n)\,\mathbf{u}^H(n)] is the M \times M autocorrelation matrix of the tap-input vector.
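Written out from the definitions above, the quadratic form follows directly (this intermediate step is implicit in the slide):

\begin{aligned}
J(\mathbf{w}) &= E\big[(d(n)-\mathbf{w}^H\mathbf{u}(n))\,(d(n)-\mathbf{w}^H\mathbf{u}(n))^*\big] \\
              &= E[|d(n)|^2] - \mathbf{w}^H E[\mathbf{u}(n)\,d^*(n)] - E[d(n)\,\mathbf{u}^H(n)]\,\mathbf{w} + \mathbf{w}^H E[\mathbf{u}(n)\,\mathbf{u}^H(n)]\,\mathbf{w} \\
              &= \sigma_d^2 - \mathbf{w}^H\mathbf{p} - \mathbf{p}^H\mathbf{w} + \mathbf{w}^H\mathbf{R}\,\mathbf{w}.
\end{aligned}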


Mean Square Error (Revisited)

We found that the solution (the optimum filter coefficients \mathbf{w}_o) is given by the Wiener-Hopf equations

\mathbf{R}\,\mathbf{w}_o = \mathbf{p} \quad\Longrightarrow\quad \mathbf{w}_o = \mathbf{R}^{-1}\mathbf{p}.

Inversion of R can be very costly.

J(w) is quadratic in w, hence convex in w; the error surface has a single minimum at \mathbf{w}_o, and it is global:

J(\mathbf{w}) \ge J(\mathbf{w}_o) = J_{\min} \quad \text{for all } \mathbf{w}.

Can we reach \mathbf{w}_o, i.e. obtain

\lim_{n\to\infty} \mathbf{w}(n) = \mathbf{w}_o,

with a less demanding algorithm?
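For reference, the direct (non-iterative) solution amounts to one linear solve; a minimal sketch (using a linear solve rather than an explicit inverse, which is standard practice):

import numpy as np

def wiener_solution(R, p):
    # Solve R w_o = p directly; the cost is O(M^3) for an M-tap filter,
    # which motivates the cheaper iterative search developed next.
    return np.linalg.solve(R, p)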


Basic Idea of the Method of Steepest Descent

Can we find \mathbf{w}_o in an iterative manner?


Basic Idea of the Method of Steepest Descent

Starting from an initial guess \mathbf{w}(0), generate a sequence \{\mathbf{w}(n)\} with the property

J(\mathbf{w}(n+1)) < J(\mathbf{w}(n)), \quad n = 0, 1, 2, \ldots

Many such sequences can be found following different rules.

The method of steepest descent generates the points using the gradient:

The gradient of J at the point w, \nabla J(\mathbf{w}), gives the direction in which the function increases most.
Then -\nabla J(\mathbf{w}) gives the direction in which the function decreases most.
Release a tiny ball on the surface of J: it rolls along the negative gradient of the surface.


Basic Idea of the Method of Steepest Descent

For notational simplicity, let \mathbf{g} = \nabla J(\mathbf{w}); then, going in the direction given by the negative gradient,

\mathbf{w}(n+1) = \mathbf{w}(n) + \tfrac{1}{2}\mu\,[-\mathbf{g}(n)] = \mathbf{w}(n) - \tfrac{1}{2}\mu\,\mathbf{g}(n).

How far we go along -\mathbf{g} is defined by the step-size parameter \mu.

The optimum step size can be obtained by a line search, which is difficult; generally a constant step size is taken for simplicity.

Then, at each step, the improvement in J is (from a first-order Taylor series expansion)

J(\mathbf{w}(n+1)) \approx J(\mathbf{w}(n)) - \tfrac{1}{2}\mu\,\|\mathbf{g}(n)\|^2 < J(\mathbf{w}(n)).
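The approximation above, written out for the real-valued case (this intermediate step is not on the slide):

J(\mathbf{w}(n+1)) \approx J(\mathbf{w}(n)) + \mathbf{g}^T(n)\,[\mathbf{w}(n+1)-\mathbf{w}(n)]
                  = J(\mathbf{w}(n)) - \tfrac{1}{2}\mu\,\mathbf{g}^T(n)\,\mathbf{g}(n)
                  = J(\mathbf{w}(n)) - \tfrac{1}{2}\mu\,\|\mathbf{g}(n)\|^2.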


Application of SD to Wiener Filter

For \mathbf{w}(n),

\mathbf{w}(n+1) = \mathbf{w}(n) + \tfrac{1}{2}\mu\,[-\nabla J(n)].

From the theory of the Wiener filter we know that

\nabla J(n) = -2\mathbf{p} + 2\mathbf{R}\,\mathbf{w}(n).

Then the update equation becomes

\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\,[\mathbf{p} - \mathbf{R}\,\mathbf{w}(n)], \quad n = 0, 1, 2, \ldots,

which defines a feedback connection.
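A minimal sketch of this recursion in code (the function name, iteration count, and zero initialisation are illustrative choices, not part of the lecture):

import numpy as np

def steepest_descent(R, p, mu, n_iter=500, w0=None):
    # Iterate w(n+1) = w(n) + mu * (p - R w(n)) and record the trajectory.
    M = p.shape[0]
    w = np.zeros(M) if w0 is None else np.array(w0, dtype=float)
    history = [w.copy()]
    for _ in range(n_iter):
        w = w + mu * (p - R @ w)
        history.append(w.copy())
    return w, np.array(history)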


Convergence Analysis

Feedback may cause stability problems under certain conditions.

This depends on
  the step size \mu,
  the autocorrelation matrix \mathbf{R}.

Does SD converge? Under which conditions? What is the rate of convergence?

We may use the canonical representation. Let the weight-error vector be

\mathbf{c}(n) = \mathbf{w}(n) - \mathbf{w}_o;

then the update equation becomes

\mathbf{c}(n+1) = (\mathbf{I} - \mu\,\mathbf{R})\,\mathbf{c}(n).
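The step from the weight update to this form uses \mathbf{R}\,\mathbf{w}_o = \mathbf{p} (written out here for completeness):

\mathbf{c}(n+1) = \mathbf{w}(n+1) - \mathbf{w}_o
              = \mathbf{w}(n) + \mu\,[\mathbf{p} - \mathbf{R}\,\mathbf{w}(n)] - \mathbf{w}_o
              = \mathbf{c}(n) - \mu\,\mathbf{R}\,[\mathbf{w}(n) - \mathbf{w}_o]
              = (\mathbf{I} - \mu\,\mathbf{R})\,\mathbf{c}(n).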


Convergence Analysis

Let

\mathbf{R} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^H

be the eigendecomposition of R. Then

\mathbf{c}(n+1) = (\mathbf{I} - \mu\,\mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^H)\,\mathbf{c}(n).

Using \mathbf{Q}\mathbf{Q}^H = \mathbf{I},

\mathbf{Q}^H\mathbf{c}(n+1) = (\mathbf{I} - \mu\,\boldsymbol{\Lambda})\,\mathbf{Q}^H\mathbf{c}(n).

Apply the change of coordinates

\mathbf{v}(n) = \mathbf{Q}^H\mathbf{c}(n).

Then, the update equation becomes

\mathbf{v}(n+1) = (\mathbf{I} - \mu\,\boldsymbol{\Lambda})\,\mathbf{v}(n).

Convergence Analysis

We know that \boldsymbol{\Lambda} is diagonal, so the k-th natural mode obeys

v_k(n+1) = (1 - \mu\lambda_k)\,v_k(n), \quad k = 1, \ldots, M,

or, with the initial value v_k(0),

v_k(n) = (1 - \mu\lambda_k)^n\,v_k(0).

Note the geometric series.
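A quick numerical check that the coordinate change really decouples the recursion into these modes (the matrix R, step size, and initial vector below are arbitrary illustrative values):

import numpy as np

R = np.array([[1.0, 0.5],
              [0.5, 1.0]])                     # example autocorrelation matrix
mu = 0.1
lam, Q = np.linalg.eigh(R)                     # R = Q diag(lam) Q^H

c0 = np.array([1.0, -2.0])                     # some weight-error vector c(0)
v0 = Q.conj().T @ c0                           # v(0) = Q^H c(0)

n = 25
c_n = np.linalg.matrix_power(np.eye(2) - mu * R, n) @ c0   # c(n) from the matrix recursion
v_n = (1 - mu * lam) ** n * v0                             # v_k(n) = (1 - mu*lam_k)^n v_k(0)
print(np.allclose(Q.conj().T @ c_n, v_n))                  # True: the modes evolve independently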


Convergence Analysis

Obviously, for stability (convergence of every mode) we need

|1 - \mu\lambda_k| < 1 \quad \text{for all } k,

or

-1 < 1 - \mu\lambda_k < 1,

or, simply,

0 < \mu < \frac{2}{\lambda_{\max}}.

Why? Because the mode associated with \lambda_{\max} places the tightest bound on \mu; if it converges, all other modes do as well.

The geometric series results in an exponentially decaying curve with time constant \tau_k, where, letting

(1 - \mu\lambda_k) = e^{-1/\tau_k}, \qquad \tau_k = \frac{-1}{\ln(1 - \mu\lambda_k)}.
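A quick numerical check of this time-constant relation (the values of mu and lambda_k below are arbitrary illustrative choices):

import numpy as np

mu, lam_k = 0.1, 1.5
tau_k = -1.0 / np.log(1 - mu * lam_k)          # time constant of the k-th mode
n = np.arange(50)
print(np.allclose((1 - mu * lam_k) ** n, np.exp(-n / tau_k)))   # True: geometric decay = exp(-n/tau_k)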


Convergence Analysis

We have

\mathbf{v}(n) = \mathbf{Q}^H\mathbf{c}(n) = \mathbf{Q}^H[\mathbf{w}(n) - \mathbf{w}_o],

then

\mathbf{w}(n) = \mathbf{w}_o + \mathbf{Q}\,\mathbf{v}(n),

but v_k(n) = (1 - \mu\lambda_k)^n\,v_k(0). We know that Q is composed of the eigenvectors of R, \mathbf{Q} = [\mathbf{q}_1\ \mathbf{q}_2\ \cdots\ \mathbf{q}_M], then

\mathbf{w}(n) = \mathbf{w}_o + \sum_{k=1}^{M} \mathbf{q}_k\, v_k(0)\,(1 - \mu\lambda_k)^n,

or, for the i-th tap weight (with (\mathbf{q}_k)_i the i-th element of \mathbf{q}_k),

w_i(n) = w_{oi} + \sum_{k=1}^{M} (\mathbf{q}_k)_i\, v_k(0)\,(1 - \mu\lambda_k)^n.

Each filter coefficient decays exponentially towards its optimum value.

The overall rate of convergence is limited by the slowest and the fastest modes.


Convergence Analysis

For a small step size (\mu\lambda_k \ll 1), \ln(1 - \mu\lambda_k) \approx -\mu\lambda_k, so

\tau_k \approx \frac{1}{\mu\lambda_k}.

What is v(0)? The initial value is

\mathbf{v}(0) = \mathbf{Q}^H[\mathbf{w}(0) - \mathbf{w}_o].

For simplicity, assume that \mathbf{w}(0) = \mathbf{0}; then

\mathbf{v}(0) = -\mathbf{Q}^H\mathbf{w}_o.


Convergence Analysis

Transient behaviour:
From the canonical form we know that

J(n) = J_{\min} + \sum_{k=1}^{M} \lambda_k\,|v_k(n)|^2,

then

J(n) = J_{\min} + \sum_{k=1}^{M} \lambda_k\,(1 - \mu\lambda_k)^{2n}\,|v_k(0)|^2.

As long as the upper limit on the step-size parameter (0 < \mu < 2/\lambda_{\max}) is satisfied,

\lim_{n\to\infty} J(n) = J_{\min},

regardless of the initial point \mathbf{w}(0).


Convergence Analysis

The progress of J(n) for n = 0, 1, 2, \ldots is called the learning curve.

The learning curve of the steepest-descent algorithm consists of a sum of exponentials, each of which corresponds to a natural mode of the problem.

Number of natural modes = number of filter taps (M).
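A sketch that evaluates this learning curve for a small made-up problem (the matrix R, vector p, and sigma_d^2 below are illustrative, not the lecture's example):

import numpy as np

R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
p = np.array([0.5, 0.2])
sigma_d2 = 1.0
mu = 0.3

w_o = np.linalg.solve(R, p)                    # Wiener solution
J_min = sigma_d2 - p @ w_o                     # minimum MSE (real-valued case)
lam, Q = np.linalg.eigh(R)
v0 = -Q.T @ w_o                                # v(0) for w(0) = 0

n = np.arange(100)[:, None]
J = J_min + ((lam * np.abs(v0) ** 2) * (1 - mu * lam) ** (2 * n)).sum(axis=1)
# J is the learning curve: a sum of M exponentials, one per natural mode.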


Example

A predictor with 2 taps (w_1(n) and w_2(n)) is used to find the parameters of the AR process

u(n) + a_1 u(n-1) + a_2 u(n-2) = v(n).

Examine the transient behaviour for
  fixed step size, varying eigenvalue spread;
  fixed eigenvalue spread, varying step size.
The variance \sigma_v^2 of the driving noise v(n) is adjusted so that \sigma_u^2 = 1.
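A sketch of such an experiment in the ideal, known-statistics setting (the AR coefficients and step size below are illustrative choices, not the values used in the lecture):

import numpy as np

a1, a2 = -0.5, 0.25                       # illustrative AR(2) coefficients (stable)

# Two-tap predictor of u(n) from [u(n-1), u(n-2)]: the desired response is d(n) = u(n).
# With sigma_u^2 normalised to 1, the Yule-Walker relations give r(1) and r(2):
r0 = 1.0
r1 = -a1 / (1.0 + a2) * r0
r2 = -a1 * r1 - a2 * r0
R = np.array([[r0, r1],
              [r1, r0]])                  # autocorrelation matrix of the tap inputs
p = np.array([r1, r2])                    # cross-correlation with d(n) = u(n)

w_o = np.linalg.solve(R, p)               # optimum predictor, equals (-a1, -a2)
mu = 0.3
w = np.zeros(2)
for n in range(200):
    w = w + mu * (p - R @ w)              # steepest-descent update
print(w, w_o)                             # w converges towards w_o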



Example

The AR process: u(n) + a_1 u(n-1) + a_2 u(n-2) = v(n), with

\mathbf{R} = \begin{bmatrix} r(0) & r(1) \\ r(1) & r(0) \end{bmatrix}.

Two eigenmodes:

\lambda_1 = r(0) + r(1), \qquad \lambda_2 = r(0) - r(1).

Condition number (eigenvalue spread):

\chi(\mathbf{R}) = \frac{\lambda_{\max}}{\lambda_{\min}}.


Example (Experiment 1)

Experiment 1: keep the step size \mu fixed and change the eigenvalue spread \chi(\mathbf{R}).


Example (Experiment 2)

Keep the eigenvalue spread \chi(\mathbf{R}) fixed and change the step size \mu (maximum value 1.1).


Example (Experiment 2)

Depending on the value of \mu, the learning curve can be

  overdamped: moves smoothly to the minimum (for (very) small \mu),
  underdamped: oscillates towards the minimum (for large \mu < \mu_{\max}),
  critically damped.

Generally, the rate of convergence is slow for the first two cases.


Observations

SD is a deterministic algorithm, i.e. we assume that

  R and p are known exactly.
  In practice they can only be estimated, e.g. by sample averages (a sketch of such an estimator follows this list).

It can have high computational complexity.

SD is a local search algorithm, but for Wiener filtering

  the cost surface is convex (quadratic), so
  convergence is guaranteed as long as 0 < \mu < 2/\lambda_{\max} is satisfied.
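A sketch of the sample-average estimates mentioned above (the estimator form is the standard time average over the N available samples; the function and variable names are mine):

import numpy as np

def estimate_R_p(u, d, M):
    # Time-average estimates of the autocorrelation matrix R and
    # the cross-correlation vector p from data u(0..N-1), d(0..N-1).
    N = len(u)
    R = np.zeros((M, M))
    p = np.zeros(M)
    count = 0
    for n in range(M - 1, N):
        u_vec = u[n - M + 1:n + 1][::-1]   # tap-input vector [u(n), ..., u(n-M+1)]
        R += np.outer(u_vec, u_vec)
        p += u_vec * d[n]
        count += 1
    return R / count, p / count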


Observations

The origin of SD lies in the (first-order) Taylor series expansion, as with many other local search optimization algorithms.

Convergence can be very slow.

To speed up the process, the second-order term can also be included, as in Newton's method:

\mathbf{w}(n+1) = \mathbf{w}(n) - \mathbf{H}^{-1}(n)\,\mathbf{g}(n),

where \mathbf{H}(n) = \nabla^2 J(\mathbf{w}(n)) is the Hessian.

This brings high computational complexity (a matrix inversion per step) and possible numerical stability problems.
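For the quadratic MSE cost, the Newton step can be written out explicitly (a direct consequence of the formulas above, not stated on the slide): with \mathbf{g}(n) = 2[\mathbf{R}\,\mathbf{w}(n) - \mathbf{p}] and \mathbf{H} = 2\mathbf{R},

\mathbf{w}(n+1) = \mathbf{w}(n) - (2\mathbf{R})^{-1}\,2[\mathbf{R}\,\mathbf{w}(n) - \mathbf{p}] = \mathbf{R}^{-1}\mathbf{p} = \mathbf{w}_o,

i.e. Newton's method reaches the Wiener solution in a single step, at the price of inverting \mathbf{R}.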
