
Identification and Estimation

- identification of the model structure (find a suitable class of models)
- design of experiment; selecting input and output signals
- parameter estimation (estimation of the parameter values θ in the chosen model)
- model validation

[Block diagram: the input u drives both the process (output y) and the model with parameter estimate θ̂ (output ym); the error y − ym is fed back to adjust θ̂.]

The parameters are estimated by minimising the output-error criterion

min_{θ̂} J(θ, θ̂) = min_{θ̂} ∫_0^T [ y(t, θ) − ŷ(t, θ̂) ]^2 dt
Identification (cont.)

- Model structures:
  - regression models
  - general (SISO) models
  - state models
  - "black-box" models (e.g. impulse response models such as the residence time distribution, neural net models; any input-output model can be considered in this class)
Identification (cont.)

- Input signal:
  - The estimation result depends crucially on the characteristics of the input signal
    - convergence of the estimate
    - the signal must be rich enough to excite the dynamics ("persistently exciting")
  - If the model structure is too simple, changes in the output are explained by parameter variations, which is undesirable. A model that is too complex does not usually improve the input-output prediction much.
Least Squares Estimation

The model

y(t) = ϕ_1(t) θ_1 + ϕ_2(t) θ_2 + ⋯ + ϕ_n(t) θ_n = ϕ(t)^T θ

is linear in the parameters. The estimation problem can then be
formulated as an optimisation problem, which is analytically
solvable.

ϕ(t) = [ϕ_1(t)  ϕ_2(t)  ⋯  ϕ_n(t)]^T   (n×1, regressors)
θ = [θ_1  θ_2  ⋯  θ_n]^T   (n×1, parameters)
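
For example, the first-order difference-equation model used later in the
closed-loop section,

y(t) = a y(t−1) + b u(t−1) + e(t)

is linear in the parameters with ϕ(t) = [y(t−1)  u(t−1)]^T and θ = [a  b]^T;
the noise e(t) plays the role of the equation error.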
Least Squares (cont.)

The process experiment gives the observations

{ ( y(i), ϕ(i) ),  i = 1, 2, …, t }

Use the following notation:

Y(t) = [y(1)  y(2)  ⋯  y(t)]^T

Residual = estimation error:

E(t) = [ε(1)  ε(2)  ⋯  ε(t)]^T,   ε(i) = y(i) − ŷ(i) = y(i) − ϕ^T(i) θ
Least Squares (cont.)

Φ(t) = [ϕ^T(1); ϕ^T(2); … ; ϕ^T(t)]   (rows ϕ^T(i))

P(t) = [Φ^T(t) Φ(t)]^{-1} = [ ∑_{i=1}^{t} ϕ(i) ϕ^T(i) ]^{-1}

Gauss: "minimize the sum of squares of the estimation error"

V(θ, t) = (1/2) ∑_{i=1}^{t} ε^2(i) = (1/2) ∑_{i=1}^{t} [ y(i) − ϕ^T(i) θ ]^2 = (1/2) E^T E = (1/2) ‖E‖^2
Least Squares (cont.)

in which

E = Y − Ŷ = Y − Φθ

Solution:

2 V(θ, t) = E^T E = (Y − Φθ)^T (Y − Φθ)
          = Y^T Y − Y^T Φθ − θ^T Φ^T Y + θ^T Φ^T Φ θ = V_1(θ, t)

But θ^T Φ^T Y = (θ^T Φ^T Y)^T = Y^T Φθ   (a scalar)
Least Squares (cont.)

Note that for a square matrix A and a vector x

(∂/∂x)(Ax) = A,   (∂/∂x)(x^T A x) = x^T (A + A^T)

in which the gradient with respect to x is considered to be
a row vector.

Now, search for the minimum:

∂V_1(θ, t)/∂θ = −Y^T Φ − Y^T Φ + θ^T (Φ^T Φ + Φ^T Φ) = −2 Y^T Φ + 2 θ^T Φ^T Φ = 0
Least Squares (cont.)

which gives   θ^T Φ^T Φ = Y^T Φ

and by taking the transpose

Φ^T Φ θ = Φ^T Y   (normal equations)

If Φ^T Φ is non-singular, a unique solution exists. It is

θ = θ̂ = (Φ^T Φ)^{-1} Φ^T Y
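
As a minimal sketch of the batch solution (NumPy, with simulated data;
all signal values and names below are illustrative, not from the lecture):

```python
import numpy as np

# Batch least squares via the normal equations for the ARX example
# y(t) = a*y(t-1) + b*u(t-1) + e(t), phi(t) = [y(t-1), u(t-1)]^T.
rng = np.random.default_rng(0)
a_true, b_true = 0.8, 0.5
N = 200
u = rng.normal(size=N)                      # persistently exciting input
y = np.zeros(N)
for t in range(1, N):
    y[t] = a_true * y[t - 1] + b_true * u[t - 1] + 0.05 * rng.normal()

Phi = np.column_stack([y[:-1], u[:-1]])     # rows phi^T(t)
Y = y[1:]

# Normal equations: (Phi^T Phi) theta = Phi^T Y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)
print("estimate of [a, b]:", theta_hat)     # close to [0.8, 0.5]
```

In practice np.linalg.lstsq(Phi, Y, rcond=None) is numerically preferable
to forming Φ^T Φ explicitly.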
Least Squares (cont.)

The solution is a minimum, because the Hessian matrix

∂^2 V_1(θ, t)/∂θ^2 = 2 Φ^T Φ

is always positive semidefinite (and positive definite when Φ^T Φ is
non-singular).

(Note: a square matrix A is positive definite if x^T A x > 0 for all
non-zero vectors x, and positive semidefinite if x^T A x ≥ 0.
Negative (semi)definiteness is defined accordingly. A symmetric
matrix A is positive definite iff all its eigenvalues are positive.)
Least Squares (cont.)

Note that the solution can be written in the form

θ̂(t) = [ ∑_{i=1}^{t} ϕ(i) ϕ^T(i) ]^{-1} ∑_{i=1}^{t} ϕ(i) y(i) = P(t) ∑_{i=1}^{t} ϕ(i) y(i)

The condition that Φ^T Φ is non-singular is called the
excitation condition. Note that the dimension of the matrix
is n × n, in which n is the number of parameters to be
estimated.
Least Squares (cont.)

Exercise: Prove that if A is a real n × p matrix and x a
p-dimensional column vector, then

A^T A x = 0  ⇔  A x = 0

Prove further that if rank(A) = p, then A^T A is non-singular.
Prove that A^T A is non-singular iff the columns of A are
linearly independent.
Recursive Least Squares (RLS)

In on-line identification the algorithms must run continuously


as new measurement data is flowing in. Two points are of
interest:
- how to develop a recursive form of the least squares
estimation algorithm?
- how to give more weight to the ”new” data?
Let us first try to write the least squares algorithm in a
recursive form.
Recursive Least Squares (cont.)

P(t) = [Φ^T(t) Φ(t)]^{-1} = [ ∑_{i=1}^{t} ϕ(i) ϕ^T(i) ]^{-1}

which gives easily

P(t)^{-1} = P(t−1)^{-1} + ϕ(t) ϕ^T(t)

and

θ̂(t) = P(t) ∑_{i=1}^{t} ϕ(i) y(i) = P(t) [ ∑_{i=1}^{t−1} ϕ(i) y(i) + ϕ(t) y(t) ]
Recursive Least Squares (cont.)

Using the formula of the estimate and then the expression
of P(t)^{-1} gives

∑_{i=1}^{t−1} ϕ(i) y(i) = P(t−1)^{-1} θ̂(t−1) = P(t)^{-1} θ̂(t−1) − ϕ(t) ϕ^T(t) θ̂(t−1)

and

θ̂(t) = θ̂(t−1) − P(t) ϕ(t) ϕ^T(t) θ̂(t−1) + P(t) ϕ(t) y(t)
     = θ̂(t−1) + P(t) ϕ(t) [ y(t) − ϕ^T(t) θ̂(t−1) ]
     = θ̂(t−1) + K(t) ε(t)
Recursive Least Squares (cont.)

where

K(t) = P(t) ϕ(t)
ε(t) = y(t) − ϕ^T(t) θ̂(t−1)

The residual ε(t) can be interpreted as the prediction error
of y(t) (one-step predictor), based on the old estimate θ̂(t−1).
Recursive Least Squares (cont.)

The matrix inversion lemma: Let A, C and C^{-1} + D A^{-1} B be
non-singular matrices of appropriate dimensions. Then
A + BCD is non-singular and

(A + BCD)^{-1} = A^{-1} − A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}

Proof: Multiplying by A + BCD from the left gives


Recursive Least Squares (cont.)

(A + BCD) [ A^{-1} − A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1} ]
  = I − B (C^{-1} + D A^{-1} B)^{-1} D A^{-1} + BCD A^{-1} − BCD A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}
  = I + BCD A^{-1} − B (I + C D A^{-1} B) (C^{-1} + D A^{-1} B)^{-1} D A^{-1}
  = I + BCD A^{-1} − B C (C^{-1} + D A^{-1} B) (C^{-1} + D A^{-1} B)^{-1} D A^{-1}
  = I + BCD A^{-1} − BCD A^{-1}
  = I
Recursive Least Squares (cont.)

Apply the inversion lemma to

P(t) = [Φ^T(t) Φ(t)]^{-1} = [ P(t−1)^{-1} + ϕ(t) ϕ^T(t) ]^{-1}

which gives

P(t) = P(t−1) − P(t−1) ϕ(t) [ I + ϕ^T(t) P(t−1) ϕ(t) ]^{-1} ϕ^T(t) P(t−1)

Note that I = 1 above (a scalar).
Recursive Least Squares (cont.)

It follows that

K(t) = P(t) ϕ(t) = P(t−1) ϕ(t) [ 1 − ϕ^T(t) P(t−1) ϕ(t) / (1 + ϕ^T(t) P(t−1) ϕ(t)) ]
     = P(t−1) ϕ(t) / (1 + ϕ^T(t) P(t−1) ϕ(t))

Collecting the results together gives the desired RLS
algorithm.
Recursive Least Squares (cont.)

θ̂(t) = θ̂(t−1) + K(t) [ y(t) − ϕ^T(t) θ̂(t−1) ]                          (n×1)

K(t) = P(t) ϕ(t) = P(t−1) ϕ(t) / (1 + ϕ^T(t) P(t−1) ϕ(t))               (n×1)

P(t) = P(t−1) − P(t−1) ϕ(t) ϕ^T(t) P(t−1) / (1 + ϕ^T(t) P(t−1) ϕ(t))
     = [ I − K(t) ϕ^T(t) ] P(t−1)                                        (n×n)

K(t) determines how to correct the previous estimate
based on the new measurement data.
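
A minimal NumPy sketch of these update equations (the function name and
initial values are illustrative; the lam argument anticipates the
forgetting factor introduced on the later slides, lam = 1 reproducing
the algorithm above):

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One RLS step; lam = 1.0 gives the basic algorithm,
    lam < 1 adds exponential forgetting."""
    phi = phi.reshape(-1, 1)                   # regressor as column vector
    denom = lam + (phi.T @ P @ phi).item()
    K = P @ phi / denom                        # gain K(t)
    eps = y - (phi.T @ theta).item()           # prediction error
    theta = theta + K * eps                    # estimate update
    P = (P - K @ phi.T @ P) / lam              # covariance update
    return theta, P

# Initialisation with a "large" P0 (see the slides below):
n = 2
theta = np.zeros((n, 1))
P = 1e4 * np.eye(n)
# For each new sample: theta, P = rls_update(theta, P, phi_t, y_t)
```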
Recursive Least Squares (cont.)

RLS can be interpreted as the optimal state estimator


(Kalman filter) to the system

θ(t+1) = θ(t)
y(t) = ϕ^T(t) θ(t) + e(t)

where e is white noise. The RLS can be interpreted in


the stochastic framework. Also, in a geometric sense, it
is a consequence of the projection theorem.
Recursive Least Squares (cont.)

Note that P(t) is defined only when Φ^T(t) Φ(t) is
non-singular. Choose the initial value at a time instant t_0
when this holds:

P(t_0) = [Φ^T(t_0) Φ(t_0)]^{-1}
θ̂(t_0) = P(t_0) Φ^T(t_0) Y(t_0)

and use the recursion for t > t_0.
Recursive Least Squares (cont.)

Choose a positive definite matrix P_0 and

P(0) = P_0
P(t) = [ P_0^{-1} + Φ^T(t) Φ(t) ]^{-1}

Choose P_0 "large". Interpretation in the stochastic
setting: set the covariance of the parameter estimates
large in the beginning.
Recursive Least Squares (cont.)

The second important question was: how to give more weight to
new data and forget the old history. For example, if the
parameters change, the old values cause problems as time
goes by.
Solution: use a forgetting factor. The parameter values
are assumed to change slowly with respect to the dynamics
of the estimator. If the estimator is tuned to be too
fast, this can cause severe robustness problems;
the estimator may exhibit oscillations.
Recursive Least Squares (cont.)

New cost function:

V(θ, t) = (1/2) ∑_{i=1}^{t} λ^{t−i} [ y(i) − ϕ^T(i) θ ]^2

where the forgetting factor λ is between 0 and 1; usually
λ ∈ [0.95, 1].
Recursive Least Squares (cont.)

The new RLS estimator becomes

θ̂(t) = θ̂(t−1) + K(t) [ y(t) − ϕ^T(t) θ̂(t−1) ]

K(t) = P(t) ϕ(t) = P(t−1) ϕ(t) / (λ + ϕ^T(t) P(t−1) ϕ(t))

P(t) = [ I − K(t) ϕ^T(t) ] P(t−1) / λ

Problem: if ϕ(t) = 0 (no excitation), then K(t) = P(t) ϕ(t) = 0 and
P(t) = P(t−1)/λ keeps growing. This is called estimator windup.
There are methods, e.g. constant-trace algorithms, to deal with
the problem.
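
Continuing the earlier sketch (it assumes the illustrative rls_update
helper defined above), a forgetting factor lets the estimator track a
parameter that changes during the experiment:

```python
import numpy as np

# Assumes rls_update from the earlier sketch is available.
rng = np.random.default_rng(1)
theta_hat = np.zeros((2, 1))
P = 1e4 * np.eye(2)
lam = 0.98                                  # forgetting factor in [0.95, 1]

b = 0.5
y_prev, u_prev = 0.0, 0.0
for t in range(1000):
    a = 0.8 if t < 500 else 0.6             # parameter changes halfway
    u = rng.normal()                        # persistently exciting input
    y = a * y_prev + b * u_prev + 0.05 * rng.normal()
    phi = np.array([y_prev, u_prev])        # [y(t-1), u(t-1)]
    theta_hat, P = rls_update(theta_hat, P, phi, y, lam=lam)
    y_prev, u_prev = y, u

print("final estimate [a, b]:", theta_hat.ravel())   # tracks [0.6, 0.5]
```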
Identification in Closed Loop

Consider a forward-path system

y(t) = a y(t−1) + b u(t−1) + e(t)

with the proportional regulator

u(t) = g y(t)
Identification in Closed Loop

[Block diagram: the input u(t) drives the plant b z^{-1} / (1 − a z^{-1});
the noise e(t) is filtered by 1 / (1 − a z^{-1}) and added to produce the
output y(t).]

By substituting the controller equation into the system
equation we obtain either one of the following equations:

y(t) = (a + g b) y(t−1) + e(t)
y(t) = (a/g + b) u(t−1) + e(t)

i.  It is not possible to identify both parameters a and b;
    we can only estimate a + g b.
ii. There are two equivalent low-order models driven by the
    same white noise.
Identifiability can be improved by:

i.   adding an independent signal ("dither") into the
     feedback loop (illustrated in the sketch below),
ii.  adding delay in the feedback,
iii. using a time-variable or non-linear feedback.
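
A small simulation sketch of point i (all numerical values are made up):
with pure proportional feedback the regressors y(t−1) and u(t−1) are
proportional, so Φ^T Φ is singular and only a + gb can be recovered;
adding dither restores identifiability.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, g = 0.8, 0.5, -0.4
N = 500

def simulate(dither_std):
    y = np.zeros(N); u = np.zeros(N)
    for t in range(1, N):
        y[t] = a * y[t - 1] + b * u[t - 1] + 0.05 * rng.normal()
        u[t] = g * y[t] + dither_std * rng.normal()   # feedback (+ dither)
    return np.column_stack([y[:-1], u[:-1]]), y[1:]

for std in (0.0, 0.2):
    Phi, Y = simulate(std)
    # Without dither the Gram matrix is (numerically) singular.
    print("dither std", std, "cond(Phi^T Phi) =", np.linalg.cond(Phi.T @ Phi))
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    print("  least squares estimate [a, b]:", theta)
```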


Simplified algorithms

To save calculation effort, the updating of P can be
avoided by introducing simplified estimation algorithms.
These are based on different ideas, e.g. the geometrical
interpretation, and usually lead to slower convergence rates.
Example: Kaczmarz's algorithm:

θ̂(t) = θ̂(t−1) + [ γ ϕ(t) / (ϕ^T(t) ϕ(t)) ] ( y(t) − ϕ^T(t) θ̂(t−1) )
Simplified algorithms

Avoiding the potential problem of division by zero leads to

θ̂(t) = θ̂(t−1) + [ γ ϕ(t) / (α + ϕ^T(t) ϕ(t)) ] ( y(t) − ϕ^T(t) θ̂(t−1) )

where α ≥ 0 and 0 < γ < 2.

In the stochastic framework the stochastic approximation
(SA) algorithm and its simplified version, the least mean
square (LMS) algorithm, are obtained.
Simplified algorithms

SA:

θ̂(t) = θ̂(t−1) + P(t) ϕ(t) ( y(t) − ϕ^T(t) θ̂(t−1) )

where P(t) = [ ∑_{i=1}^{t} ϕ^T(i) ϕ(i) ]^{-1} is a scalar.

LMS:

θ̂(t) = θ̂(t−1) + γ ϕ(t) ( y(t) − ϕ^T(t) θ̂(t−1) )

where γ is a constant.
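
A minimal sketch of the LMS update (illustrative names; compare with the
rls_update sketch above, which must also propagate P(t)):

```python
import numpy as np

def lms_update(theta, phi, y, gamma=0.01):
    """One LMS step: move the estimate along phi, scaled by the
    prediction error and a constant step size gamma."""
    phi = phi.reshape(-1, 1)
    eps = y - (phi.T @ theta).item()     # prediction error
    return theta + gamma * phi * eps     # no covariance update needed

# Usage: theta = lms_update(theta, phi_t, y_t) for each new sample;
# cheap per step, but convergence is slower than full RLS.
```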
Continuous-Time Models

The least squares algorithm can also be formulated and
solved in the case of continuous-time systems.

Model:      y(t) = ϕ^T(t) θ

Criterion:  V(θ) = ∫_0^t e^{−α(t−τ)} ( y(τ) − ϕ^T(τ) θ )^2 dτ

where α corresponds to the forgetting factor.

Continuous-Time Models

The solution is expressed by the normal equation

[ ∫_0^t e^{−α(t−τ)} ϕ(τ) ϕ^T(τ) dτ ] θ̂(t) = ∫_0^t e^{−α(t−τ)} ϕ(τ) y(τ) dτ

The estimate is unique if the matrix

R(t) = ∫_0^t e^{−α(t−τ)} ϕ(τ) ϕ^T(τ) dτ

is invertible.
Continuous-Time Models

The solution can be formulated as the algorithm

dθ̂(t)/dt = P(t) ϕ(t) e(t)

e(t) = y(t) − ϕ^T(t) θ̂(t)

dP(t)/dt = α P(t) − P(t) ϕ(t) ϕ^T(t) P(t)

where P(t) = R(t)^{-1}.
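
As a rough sketch, these differential equations could be integrated with a
simple forward-Euler step (the step size dt, the value of α and the signal
sources are placeholders):

```python
import numpy as np

def ct_rls_step(theta, P, phi, y, alpha, dt):
    """One forward-Euler step of the continuous-time estimator ODEs."""
    phi = phi.reshape(-1, 1)
    e = y - (phi.T @ theta).item()               # e(t) = y - phi^T theta_hat
    dtheta = P @ phi * e                         # d(theta_hat)/dt
    dP = alpha * P - P @ phi @ phi.T @ P         # dP/dt
    return theta + dt * dtheta, P + dt * dP

# Usage: at each sampling instant feed the current phi(t) and y(t):
# theta, P = ct_rls_step(theta, P, phi_t, y_t, alpha=0.1, dt=0.01)
```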
