Prof. Dr. Guido Schuster, University of Applied Sciences of Eastern Switzerland in Rapperswil (HSR)
Chapter 4: Signal Modeling. These slides follow closely the book Statistical Digital Signal Processing and Modeling by Monson H. Hayes; most of the figures and formulas are taken from there.
Introduction
The goal of signal modeling is to obtain a parametric description of the signal. This can be used for filter design, interpolation, extrapolation, and compression. We always use the same model: the output of a causal linear shift-invariant filter with a rational system function. The filter input is typically the discrete impulse δ(n).
Padé Approximation
The previous example showed that matching a number of unit sample response values h(n) equal to the degrees of freedom (p+q+1) in the system function H(z) can result in a set of nonlinear equations. This is true in general, but there is an elegant trick that avoids these nonlinear equations and STILL matches a given number of x(n) values with the unit sample response of a linear time-invariant filter.
Instead of working with the system function directly, we use a little trick: multiply both sides by the denominator. In the time domain, this becomes a convolution. For n > q, the right-hand side is zero, which can also be written in matrix notation.
This equation is solved in two steps: first for the poles, then, with the now known poles, for the zeros.
Hence the ap(k) parameters can be found by solving this set of linear equations.
There are three cases which have to be handled
Case I: Xq is non-singular
Hence the inverse exists and the coefficients of Ap(z) are unique
Having found the coefficients of Ap(z), the second step is to find the coefficients of Bq(z) using the first q+1 equations, again written in matrix notation.
Or in matrix notation: since X0 is a lower triangular Toeplitz matrix, the numerator coefficients bq(n) may be found easily, each equation giving one coefficient directly.
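The two-step procedure above can be sketched in code. This is a minimal numpy sketch, assuming real data and a nonsingular Xq (Case I); the function name and interface are my own, not from the book.

```python
import numpy as np

def pade(x, p, q):
    """Pade approximation: match x(0)..x(p+q) exactly.

    Step 1 solves the p linear equations (rows n = q+1..q+p of the
    convolution, where the right-hand side is zero) for a(1)..a(p),
    with a(0) = 1.  Step 2 computes the numerator b(0)..b(q) from the
    first q+1 convolution equations.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < p + q + 1:
        raise ValueError("need at least p+q+1 samples")
    # Step 1: build Xq with entries x(n-k) for n = q+1..q+p, k = 1..p,
    # and move the known a(0)=1 column to the right-hand side.
    Xq = np.array([[x[n - k] if n - k >= 0 else 0.0
                    for k in range(1, p + 1)]
                   for n in range(q + 1, q + p + 1)])
    a = np.concatenate(([1.0], np.linalg.solve(Xq, -x[q + 1:q + p + 1])))
    # Step 2: b(n) = sum_k a(k) x(n-k) for n = 0..q (X0 is lower triangular)
    b = np.array([sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
                  for n in range(q + 1)])
    return a, b
```

For example, if x(n) is the unit sample response of (1 + 2z^-1)/(1 - 0.5z^-1), the sketch with p = q = 1 returns a = [1, -0.5] and b = [1, 2], i.e. it recovers the filter exactly.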
[Figures: impulse response plots, amplitude vs. n (samples)]
Hence the model is simply the resulting filter: the first three values of the unit sample response are matched, and all remaining values are zero.
Prony's Method
[Figure: impulse response, amplitude vs. n (samples)]
The Padé method matches perfectly the number of samples of x(n) that corresponds to its degrees of freedom (p+q+1); what happens after these p+q+1 samples is of no concern to the Padé method. Prony's method does not match the first p+q+1 samples perfectly, but instead uses its degrees of freedom so that an overall mean squared error is minimized.
Prony's method uses the same trick as Padé, but is also concerned about values beyond n = p+q, where the equality no longer holds. The way the error is defined results in a linear problem, which is much easier to solve than the nonlinear equations of the direct least squares method.
Since bq(n)=0 for n>q, the error can be written explicitly (with ap(0)=1). Instead of setting e(n)=0 for n=0,...,p+q as in the Padé approximation, Prony's method begins by finding the coefficients ap(k) that minimize the squared error. As with Padé, since we focus on the samples n>q, the error depends only on the coefficients ap(k).
These coefficients can be found by setting the partial derivatives to zero. Since the partial derivative of e*(n) with respect to ap*(k) is x*(n-k), this leads to the orthogonality principle; substituting the error expression into it gives the normal equations, in two equivalent forms.
This can be simplified by using the following definition, which is very similar to the sample autocorrelation sequence (here we do not divide by the number of samples N, and the sum does not run over all samples but starts at q+1). The result is a set of p linear equations in the p unknowns ap(1),...,ap(p), referred to as the Prony normal equations, which can also be written in matrix notation.
The Prony normal equations can be expressed using the data matrix Xq, which contains p infinite-dimensional column vectors. The autocorrelation matrix Rx may be written in terms of Xq, and the vector of autocorrelations rx may also be expressed in terms of Xq; this yields an equivalent form of the Prony normal equations.
If Rx is nonsingular, then the coefficients ap(k) that minimize the squared error are obtained by inverting Rx. Equivalently, the solution can be written in terms of (Xq^H Xq)^(-1) Xq^H, which is also called the pseudo-inverse of Xq.
Now the value of the modeling error can be determined. It follows from the orthogonality principle that the second term is zero; therefore the minimum modeling error can be written in terms of the autocorrelation sequence.
The normal equations can be written slightly differently, in a form that will come in handy later on: the so-called augmented normal equations, which can also be stated in matrix notation.
Once the coefficients ap(k) have been found, the coefficients bq(k) are found in the same fashion as in the Padé approximation: the error is set to zero for n=0,...,q. This can be done using the convolution formula directly, or with a matrix multiplication.
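Both steps together can be sketched as follows. This is a minimal numpy sketch for real, finite-length data (the infinite sums are truncated at the data length N); the function name and interface are my own.

```python
import numpy as np

def prony(x, p, q):
    """Prony's method: least squares denominator, Pade-style numerator."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def r(k, l):
        # r(k,l) = sum_{n=q+1}^{N-1} x(n-l) x(n-k)  (sum starts at q+1)
        return sum(x[n - l] * x[n - k]
                   for n in range(q + 1, N) if n - l >= 0 and n - k >= 0)

    # Prony normal equations: sum_l a(l) r(k,l) = -r(k,0), k = 1..p
    R = np.array([[r(k, l) for l in range(1, p + 1)]
                  for k in range(1, p + 1)])
    rv = np.array([r(k, 0) for k in range(1, p + 1)])
    a = np.concatenate(([1.0], np.linalg.solve(R, -rv)))
    # Numerator: set e(n) = 0 for n = 0..q (as in the Pade approximation)
    b = np.array([sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
                  for n in range(q + 1)])
    return a, b
```

When the data is exactly the unit sample response of a rational filter of the chosen order, this recovers the filter; for general data the denominator is the least squares fit.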
The error minimized here corresponds to a minimum squared Prony error; the true error, between the data and the model's unit sample response, results in a different squared value.
[Figures: impulse response of the data and of the model, amplitude vs. n (samples)]
We may also formulate Prony's method as finding the least squares solution to an (infinitely large) overdetermined set of linear equations, each of which wants the error to be zero. This is not possible, since we have only p+q+1 degrees of freedom. For such an overdetermined set of linear equations, the least squares solution can be found using the pseudo-inverse.
Shanks' Method
In Prony's method the numerator coefficients are found by setting the error to zero for n=0,...,q. This forces the model to be exact for those values but does not take into account values greater than q. A better approach is to perform a least squares minimization of the model error over the entire length of the data record.
Note that H(z) can be interpreted as a cascade of two filters. Once Ap(z) has been determined, the unit sample response of the first filter, 1/Ap(z), can be computed. Instead of forcing e(n) to zero for the first q+1 values of n as in Prony's method, Shanks' method minimizes the squared error over all n.
Note that this is the same error as in the direct method, but since the poles are already determined, solving for the zeros is a linear problem and hence much simpler. As with Prony's method, we use the deterministic auto- and cross-correlation sequences. Note that here the lower limit of the sum starts at 0, while for Prony's method it starts at q+1.
In matrix form, these equations are as follows. Since g(n)=0 for n<0 (causal filter), the last term is zero for k>=0 and l>=0; hence each term rg(k,l) depends only on the difference between k and l.
The normal equations therefore involve only rg(k-l). Since rg(k-l)=rg*(l-k), the equations can be written in matrix form, or more compactly.
The minimum squared error can now be found. Since e(n) and g(n) are orthogonal, the second term is zero; therefore the minimum error can be expressed in terms of rx(k) and rxg(k), using rgx(-k)=rxg*(k).
Shanks' method can also be interpreted as finding a least squares solution to an overdetermined set of linear equations: writing the convolution of bq(n) with g(n) as an overdetermined linear system, the pseudo-inverse then finds the least squares solution (Matlab: bq = G0\x0).
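The Matlab one-liner translates directly into numpy. A minimal sketch for real data, assuming the denominator a comes from Prony's first step; the function name and interface are my own.

```python
import numpy as np

def shanks(x, a, q):
    """Shanks' method: least squares numerator given the denominator a."""
    x = np.asarray(x, dtype=float)
    N, p = len(x), len(a) - 1
    # g(n): unit sample response of 1/A(z), via its difference equation
    g = np.zeros(N)
    for n in range(N):
        g[n] = (1.0 if n == 0 else 0.0) - sum(
            a[k] * g[n - k] for k in range(1, p + 1) if n - k >= 0)
    # Columns of G0 are delayed copies of g(n); least squares fit to x(n)
    G0 = np.column_stack([np.concatenate((np.zeros(l), g[:N - l]))
                          for l in range(q + 1)])
    b, *_ = np.linalg.lstsq(G0, x, rcond=None)
    return b
```

When the data is exactly the unit sample response of (1 + 2z^-1)/(1 - 0.5z^-1) and a = [1, -0.5], the fit is exact and the sketch returns b = [1, 2].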
All-Pole Modeling
The main advantage of all-pole models is that there are fast algorithms (the Levinson-Durbin recursion) to solve the Prony normal equations. Another reason is that many physical processes, such as speech, can be well modeled with an all-pole model. The error that we are concerned with in Prony's method for finding the ap(k) coefficients is the one above with q=0.
Since x(n)=0 for n<0, the error at time n=0 is equal to x(0)-b(0) and hence does not depend on the a coefficients. We can therefore include it in the sum of squared errors, and the minimization will still result in the same a coefficients. Following the steps of the Prony derivation, we arrive at the all-pole normal equations, where now the autocorrelation sum starts at n=0. (For Prony the sum would start at q+1=1, but since we minimize the modified error it starts at 0.)
Again this can be simplified: since x(n)=0 for n<0, the term on the right is zero for k>=0 and l>=0. This means that rx(k,l) depends only on the difference between k and l, so we define the autocorrelation rx(k). Hence the all-pole normal equations follow, written in matrix form using the conjugate symmetry of rx(k).
The modeling error is given above. We still need to find b(0). The obvious choice would be b(0)=x(0). However, since the entire unit sample response is scaled by b(0), it may be better to select b(0) such that the overall energy in x(n), rx(0), is equal to the overall energy rh(0) of the unit sample response h(n). Without proof, this can be achieved by setting b(0) equal to the square root of the minimum modeling error.
As with the Padé approximation and Prony's method, all-pole modeling can be interpreted as finding the least squares solution to a set of overdetermined linear equations.
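The whole all-pole procedure (autocorrelation method) can be sketched as follows — a minimal numpy sketch for real data, with b(0) chosen by the energy-matching rule above. A plain linear solver stands in for the Levinson-Durbin recursion; names are my own.

```python
import numpy as np

def allpole(x, p):
    """All-pole model of x via the autocorrelation method."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Deterministic autocorrelation r(k) = sum_n x(n) x(n-k), sum from n=0
    r = np.array([np.dot(x[k:], x[:N - k]) for k in range(p + 1)])
    # Toeplitz normal equations: sum_l a(l) r(|k-l|) = -r(k), k = 1..p
    R = np.array([[r[abs(k - l)] for l in range(1, p + 1)]
                  for k in range(1, p + 1)])
    a = np.concatenate(([1.0], np.linalg.solve(R, -r[1:p + 1])))
    # Energy matching: b(0) is the square root of the minimum modeling error
    b0 = np.sqrt(np.dot(a, r))
    return a, b0
```

For x(n) = 0.5^n the sketch returns a ≈ [1, -0.5] and b(0) ≈ 1, i.e. the model energy matches the signal energy.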
The all-pole normal equations can again be brought into a special augmented form, which contains the modeling error.
Linear Prediction
We now establish the equivalence between all-pole signal modeling and linear prediction. Recall that Prony's method finds the set of all-pole parameters that minimizes the sum of squared errors.
In this example the solution for h(0) and h(1) can be computed, together with the resulting squared error and system function.
The main difference is that, for the summation of the autocorrelation sequence, the lower limit is now k and the upper limit N. The Toeplitz structure of the normal equations is preserved, which means that the Levinson-Durbin recursion can be used to efficiently find the solution.
Hence it makes sense to define the sum of squared errors such that only those e(n) are used which can actually be calculated.
Comparison Example
The goal is to model x(n) as the unit sample response of a second-order all-pole filter; the first 20 values of x(n) are given. The autocorrelation method applies a window to the signal and then uses Prony's method, leading to the normal equations.
Evaluating this sum at k=0, 1, and 2 gives the autocorrelation values, and the normal equations follow. Solving for a(1) and a(2) yields the denominator polynomial and the modeling error; b(0) is then set to match the energy.
The goal is again to model x(n) as the unit sample response of a second-order all-pole filter from the same first 20 values. The covariance method uses a different definition of the error and then applies Prony's method, leading to its own normal equations.
Evaluating this sum, we find that the normal equations become singular! Hence a lower model order is possible, i.e., a(2)=0. Solving for a(1) gives the denominator polynomial. Setting b(0)=1, the model matches the data perfectly for n=0,1,...,N.
The autocorrelation sequence of an ARMA(p,q) process satisfies the Yule-Walker equations (Eq. 3.115), where cq(k) is the convolution of bq(k) and h*(-k), and rx(k) is a statistical autocorrelation.
These equations are called the modified Yule-Walker equations; hence this approach is called the Modified Yule-Walker Equation (MYWE) method. If the autocorrelation is unknown, an estimate is used.
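For k > q the Yule-Walker equations are linear in the ap(k), so a sketch of the MYWE method only needs the autocorrelation values r(0)..r(q+p). This is a minimal numpy sketch assuming a real process (so r(-k) = r(k)); names are my own.

```python
import numpy as np

def mywe(r, p, q):
    """Modified Yule-Walker estimate of the AR part of an ARMA(p,q) process.

    r holds the (estimated) autocorrelations r(0)..r(q+p).  The p linear
    equations r(k) + sum_l a(l) r(k-l) = 0 for k = q+1..q+p are solved
    for a(1)..a(p), using r(-k) = r(k) for a real process.
    """
    rx = lambda k: r[abs(k)]
    R = np.array([[rx(k - l) for l in range(1, p + 1)]
                  for k in range(q + 1, q + p + 1)])
    rhs = -np.array([rx(k) for k in range(q + 1, q + p + 1)])
    return np.concatenate(([1.0], np.linalg.solve(R, rhs)))
```

For an autocorrelation that decays by a factor of 0.5 beyond lag q = 1, for instance r = [3, 2, 1] with p = q = 1, the estimated denominator is [1, -0.5].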
Since cq(k)=0 for k>q, the sequence cq(k) is known for all k>=0 (but not for k<0). We denote the z-transform of the causal part of cq(k) accordingly; the anti-causal part is defined similarly.
Multiplying Cq(z) by Ap*(1/z*) results in the power spectrum of the MA(q) process. Since ap(k)=0 for k<0, Ap*(1/z*) only contains positive powers of z; the causal part of Py(z) therefore follows.
In this example the denominator is found directly. This was the easy part; now let's find the MA coefficients.
In this example, forming Cq(z) and multiplying by Ap*(1/z*) gives the causal part of Py(z).
Stochastic AR Models
This is clearly a special case of an ARMA(p, q=0) model; hence its autocorrelation sequence must satisfy the Yule-Walker equations. These can be written in matrix form for k>0 using the conjugate symmetry of rx(k). Solving these p equations for the p unknowns ap(k) is called the Yule-Walker method.
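A minimal numpy sketch of the Yule-Walker method with an estimated autocorrelation sequence; the AR(1) example process and all names are my own.

```python
import numpy as np

def yule_walker(x, p):
    """AR(p) fit: estimate r(k) from data, then solve the p equations."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Sample autocorrelation estimate (biased, divided by N)
    r = np.array([np.dot(x[k:], x[:N - k]) / N for k in range(p + 1)])
    R = np.array([[r[abs(k - l)] for l in range(1, p + 1)]
                  for k in range(1, p + 1)])
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:p + 1])))

# Simulated AR(1) process x(n) = 0.8 x(n-1) + w(n), w(n) white noise:
rng = np.random.default_rng(0)
w = rng.standard_normal(100_000)
x = np.zeros_like(w)
for n in range(1, len(w)):
    x[n] = 0.8 * x[n - 1] + w[n]
a = yule_walker(x, 1)  # a(1) should be close to -0.8
```

With a long enough record the estimate converges to the true AR coefficient; here the fitted a(1) is within a few percent of -0.8.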
Note that these equations are equivalent to the normal equations for all-pole modeling using Prony's method. The only difference is in the definition of the autocorrelation sequence: a statistical definition for the Yule-Walker method, a deterministic definition for Prony's method. But what if we need to estimate the autocorrelation sequence for the Yule-Walker method?
Stochastic MA Models
This is clearly a special case of an ARMA(p=0, q) model. The Yule-Walker equations relating the autocorrelation sequence to the filter coefficients bq(k) are nonlinear in this case. Instead of solving them directly, one approach uses spectral factorization: since the autocorrelation sequence of an MA(q) process is zero for |k|>q, the power spectrum has a correspondingly simple form.
Using the spectral factorization given in Eq. 3.102, where Q(z) is a minimum phase monic (q(0)=1) polynomial of degree q, i.e., all of its zeros zk satisfy |zk|<=1: Q(z) is the minimum phase version of Bq(z), formed by replacing each zero of Bq(z) that lies outside the unit circle with one that lies inside the unit circle at the conjugate reciprocal location.
In summary: given the autocorrelation sequence, we obtain the power spectrum, which is then factored into a minimum phase polynomial Q(z) and a maximum phase polynomial Q*(1/z*). Hence the process can be modeled as the output of a minimum phase FIR filter driven by unit variance white noise. Note that the model is not unique: any zero of Q(z) may be reflected to its conjugate reciprocal location (with the gain adjusted) without changing the power spectrum.
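The zero-reflection step can be sketched numerically: reflect each zero of Bq(z) that lies outside the unit circle to its conjugate reciprocal location, scaling the gain so the magnitude response (and hence the power spectrum) is preserved. A minimal numpy sketch for real coefficient vectors; names are my own.

```python
import numpy as np

def minimum_phase(b):
    """Return the minimum phase version of the FIR filter b (real coeffs).

    Each zero z0 with |z0| > 1 is replaced by 1/conj(z0); multiplying the
    gain by |z0| keeps |Q(e^jw)| = |B(e^jw)|, so the power spectrum is
    unchanged.
    """
    b = np.asarray(b, dtype=float)
    zeros = np.roots(b)          # zeros of B(z) in the z-plane
    gain = b[0]
    q_zeros = []
    for z0 in zeros:
        if abs(z0) > 1.0:
            gain *= abs(z0)
            q_zeros.append(1.0 / np.conj(z0))
        else:
            q_zeros.append(z0)
    return np.real(gain * np.poly(q_zeros))
```

For example, b = [1, 2.5, 1] has zeros at -0.5 and -2; the sketch reflects -2 to -0.5 and returns [2, 2, 0.5], which has the same magnitude response with both zeros inside the unit circle.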
In particular σ0 = 4 and k = -1/4; hence the process can be modeled either as a minimum phase FIR filter or as a maximum phase FIR filter.
The estimate can be improved by including prior knowledge about the process, for example that x(n) is an AR(p) process. The Yule-Walker method with estimated autocorrelations can then be used to estimate the missing parameters.
Exercises