
Optimal Control

Lecture Notes
February 9, 2016

Preliminary Edition

Department of Control Engineering, Institute of Electronic Systems

Aalborg University, Fredrik Bajers Vej 7, DK-9220 Aalborg, Denmark


Preamble
These notes have been prepared for a basic course in optimal control at the 8th term of studies
in Electronics and Information Technology, Aalborg University. The students are expected
to be acquainted with classical feedback control theory.

Key Words
Optimal control, Riccati equation, Linear Quadratic Gaussian, Kalman observer, Stability.

Affiliation of the Authors

The notes are based on original notes in Danish by Ole Sørensen. The notes have been
translated and adapted by Palle Andersen.
Department home page: http://www.control.auc.dk.
The email of the last author is: pa@es.aau.dk


Contents

1  Introduction                                                               1
   1.1  General Description of Plant and Performance                         3

2  Dynamic Programming                                                        6

3  Time varying LQ-Control                                                    9
   3.1  LQ control of discrete time systems                                   9
   3.2  Summary of LQ-method for discrete time systems                       12
   3.3  Choice of Weight Matrices                                            15
   3.4  LQ-Control of Continuous Time Systems                                16
   3.5  Summary of LQ-method for continuous time systems                     17

4  Stationary LQ-Controllers                                                 19
   4.1  Steady State Values of L(k) and S(k)                                 19
   4.2  Example: LQ control of a discrete time 2nd order system              20
   4.3  Example: LQ control of a continuous time 2nd order system            24

5  References and disturbances                                               27
   5.1  Modelling reference and disturbance                                  27

6  Control of System with a Reference Model                                  31
   6.1  Example, First Order System with constant reference                  35
   6.2  Example, second order system with constant reference                 37

7  Using a Disturbance Model in the Control Law                              39

8  Stochastic LQ Control with Full State Information                         42
   8.1  Stochastic Optimal Control                                           42
   8.2  Full state information                                               42
   8.3  Performance function                                                 43
   8.4  Expectation of a Quadratic form                                      45
   8.5  Derivation of Recursive expressions for L(k) and S(k)                45
   8.6  Summary of LQ Method for stochastic, discrete time systems with
        complete state information                                           48

9  Stochastic LQ Control with Incomplete State Information                   49
   9.1  Incomplete State Information                                         49
   9.2  Separation theorem                                                   50
   9.3  Summary of LQ method for stochastic, discrete time systems with
        incomplete state information                                         51
   9.4  Duality between controller and observer                              53
   9.5  Innovation model                                                     54

10 Stability of Multivariable Systems                                        55
   10.1 Stability for Multivariable System in general                        55
   10.2 Stability for an LQ-controlled System                                58

Chapter 1
Introduction
In this course we will explore a method for design of controllers in which the goals of the
control are formulated using a performance function which quantifies the objectives. Typical goals
can be to keep the plant state or output close to a reference by the use of as small control
signals as possible. In the performance (or cost) function such goals are quantified by defining
a cost for deviations from the setpoint and a cost for the use of the control signal. Having defined
the performance function, the best control signal over time may be found by minimizing this
performance function. In this way a control law relating the control signal from the controller
to signals measured in the plant may be synthesized from the performance function.
The systems we will consider will primarily be linear systems described in state space. The
theory will primarily be developed for discrete time control, although we will also give formulas
for the continuous time case.
Most of you are familiar with methods for design of controllers based on specification of
desired closed loop poles. You may even have experienced that it can be a difficult task to
specify a sensible set of desired poles. For instance, if you specify a set of poles it may turn
out that the resulting controller gives control signals which exceed the physical bounds even
for reasonable disturbances and reference signals.
If the plant you are going to control has multiple inputs and outputs you will also discover
that specification of closed loop poles is not sufficient to determine the controller coefficients.
If for instance you want to control a plant described by

    x(k+1) = Φ x(k) + Γ u(k)                                                (1.1)
    y(k)   = H x(k)                                                         (1.2)

In case the plant has two control inputs, two outputs and four states:

    x(k) = [x1(k)  x2(k)  x3(k)  x4(k)]^T,
    u(k) = [u1(k)  u2(k)]^T,
    y(k) = [y1(k)  y2(k)]^T                                                 (1.3)

a state feedback controller can be of the form

    u(k) = L (xref(k) - x(k))                                               (1.4)

with

    L = [ l11  l12  l13  l14 ]
        [ l21  l22  l23  l24 ]                                              (1.5)

This feedback law results in a system which in closed loop can be described by the equations

    x(k+1) = (Φ - ΓL) x(k) + ΓL xref(k)                                     (1.6)
    y(k)   = H x(k)                                                         (1.7)

The closed loop poles of this system are the eigenvalues of the matrix (Φ - ΓL). The closed
loop poles can be placed at the desired locations if (Φ, Γ) is controllable. Controllability of
a discrete time system can be investigated by studying where you can steer the state vector of
the system using control signals at n successive sample instants. This results in

    x(k+1) = Φ x(k) + Γ u(k)                                                 (1.8)
    x(k+2) = Φ² x(k) + ΦΓ u(k) + Γ u(k+1)                                    (1.9)
    ...                                                                      (1.10)
    x(k+n) = Φⁿ x(k) + [Γ, ΦΓ, Φ²Γ, ..., Φⁿ⁻¹Γ] [u(k+n-1), ..., u(k+1), u(k)]^T   (1.11)

The plant is controllable if you can reach the full state space. We define the controllability
matrix

    C = [Γ, ΦΓ, Φ²Γ, ..., Φⁿ⁻¹Γ]                                             (1.13)

If C has full rank, i.e. the rank is equal to the order n of the system, the system is controllable.

Specification of the n = 4 closed loop poles of the system above is for instance not sufficient
to determine the p·n = 2·4 coefficients of the state feedback matrix L of a plant with
p = 2 inputs and n = 4 state variables. The closed loop poles can be placed at the desired
locations for a multitude of L values if (Φ, Γ) is controllable.
Optimal Control offers design methods which instead of focusing on pole locations use opti-
mization to minimize a performance function which ideally may be related directly to the
goals of the control task. This setup can be used for MIMO as well as SISO systems. These
methods can be used for

  - Deterministic as well as stochastic processes
  - Design of a controller where all states can be measured
  - Design of a controller to be combined with an observer (state estimator) if some or all
    states cannot be measured directly
  - Continuous time as well as discrete time processes


1.1 General Description of Plant and Performance

In a rather general setting we could introduce a discrete time plant model which could be
nonlinear (later in these notes we will again confine ourselves to linear systems)

    x(k+1) = G(x(k), u(k))                                                  (1.14)

If the system has n state variables and p inputs, x(k) will be an n-dimensional vector and u(k)
will be a p-dimensional vector; G is a vector valued function of dimension n.
In the first setting it is supposed that

  - the system is deterministic
  - the system is time invariant
  - all system states are measurable
  - there are no limits on the control signals

Later we will introduce problems with disturbances, non-measurable states and noisy measurements.
The performance function is important in optimal control, because it is the basis for the design
and partly for the evaluation of the controller. It has the form

    I = Σ_{k=0}^{N} H(x(k), u(k))                                           (1.15)

where k is the sampling number. In these notes optimal control problems are formulated
as minimization problems, such that I is a quantity we want to minimize (it is of course also
possible to formulate performance functions which should be maximized). The problem is
now to find a sequence of control signals u(k), k = 0, 1, 2, ..., N, which minimizes I with
x(k) determined by the state equation.
It is in the form and parameters of H that the weighting of large control signals versus large
states is determined. For the problem to be tractable you should usually take care that H(x(k), u(k))
is a convex function of x(k) and u(k). Often the goal will be to bring the state fast from an
initial value to the origin with as small an amount of control effort as possible.
In this way the design of a control law can be reformulated as the choice of a suitable performance
function followed by the solution of an optimization problem.
In the choice of performance we have several degrees of freedom:
1. The structure of H. In this course we will only consider H as a quadratic function in
   u(k) and x(k). In principle several others could be thought of; some of these are being
   explored in current research.
2. The weighting between x(k) and u(k). It must be determined how you wish to weight
   large states versus large control signals, which may be seen as good control with small
   states versus cheap control with small control effort. Through careful weighting it is also
   possible to allow use of certain elements in u more than others, and allow deviations in
   some state variables more than others.

3. The choice of N. The time horizon N will determine the weighting of the long term
   steady state performance (large N) versus the short term dynamic performance (small N).

The choice of performance function will be crucial to the behavior of the controlled system, and
it is important to see the minimization of the performance function as a tool to obtain a good
controller. There is no controller which is optimal in any absolute sense. A controller which is
optimal for one performance function will not be optimal for another. So with optimization the
designer will have to tune the controller by tuning the parameters of a performance function
instead of tuning the controller parameters directly or tuning the position of the closed loop
poles.
At first we will consider the problem of finding a control sequence u(k), k = 0, 1, 2, ..., N,
which minimizes a given performance function with an initial value of the state x(0). This
can be seen as an open loop problem.
After having solved this problem we will successively expand the method to be used in more
practical closed loop problems, see figure 1.1.
[Figure 1.1: Block diagram for optimal control. The reference r and the disturbance d1 act on
the state space dynamics, whose state x passes through a static output relation affected by the
measurement noise d2; the performance function acts on these signals.]

1. Output reference equal to zero
2. Output reference different from zero is considered
3. Constant and other deterministic disturbances d1 are considered
4. Stochastic disturbances d1 and measurement noise d2 are considered
In the next chapter we will derive a method for minimization of the performance function
using dynamic programming.
The performance (or cost) function used in linear quadratic control, which is the focus of these
notes, is the integral (or in discrete time a sum) of weighted squared states and weighted
squared inputs. When the system is linear this performance leads to a controller in which the
control signal u is a linear function of the state variables.
Model of plant

    x(k+1) = Φ x(k) + Γ u(k)                                                (1.16)
    y(k)   = H x(k)                                                         (1.17)

Performance function (discrete time)

    I = Σ_{k=0}^{N} [ x(k)^T Q1 x(k) + u(k)^T Q2 u(k) ]                     (1.18)

Control law

    u(k) = -L(k) x(k)                                                       (1.19)

The linear controller which minimizes a quadratic performance function is often called Linear
Quadratic control (LQ) or a Linear Quadratic Regulator (LQR). If the process is stochastic and
the states are not directly measurable, the states may be estimated using an observer designed
to minimize the estimation error, that is a Kalman filter. With this observer the controller is
called Linear Quadratic Gaussian (LQG).
[Figure 1.2: Block diagram for optimal control with state estimation. As in figure 1.1, but the
state x is reconstructed by a state estimator from the measured output before being used by the
performance-function based controller.]

Chapter 2
Dynamic Programming
Dynamic programming is a principle which breaks complex decisions into a series of simpler
decisions. We will use this idea to find a sequence of control inputs which minimizes the
performance function.
The optimization takes advantage of the fact that

  a control strategy which is optimal in the interval of samples [0; N] must also be optimal in
  any interval [k; N] with 0 ≤ k ≤ N.

This is true since if it was possible to improve the performance in the interval [k; N] this
would also improve the performance in the entire interval [0; N].
Based on this we will split the optimization up. We will introduce the notation

    I_0^N = Σ_{k=0}^{N} H(x(k), u(k))
          = I_0^N(x(0), u(0), u(1), ..., u(N))                              (2.1)

indicating that the performance depends on the initial state vector and the sequence of
control signals. This is sensible since x(1) is determined from x(0) and u(0) etc.
We will also use the notation

    J_0^N(x(0)) = min_{u(0),...,u(N)} I_0^N(x(0), u(0), u(1), ..., u(N))    (2.2)

for the obtainable minimum of the performance function. We will also consider the contribu-
tion to the performance function from a part of the interval:

    I_k^N = Σ_{i=k}^{N} H(x(i), u(i))
          = I_k^N(x(k), u(k), u(k+1), ..., u(N))                            (2.3)
We will now determine the minimal performance contribution from the last part of the interval

    J_k^N(x(k)) = min_{u(k),...,u(N)} I_k^N(x(k), u(k), u(k+1), ..., u(N))
                = min_{u(k)} [ H(x(k), u(k)) + min_{u(k+1),...,u(N)} Σ_{i=k+1}^{N} H(x(i), u(i)) ]
                = min_{u(k)} [ H(x(k), u(k)) + J_{k+1}^N(x(k+1)) ]          (2.4)

Notice that in the last expression x(k+1) will depend on u(k) through the plant equation.
This may be summarized in the following algorithm

  STEP 0: J_N^N = H(x(N), 0)

  STEP 1: J_{N-1}^N = min_{u(N-1)} [ H(x(N-1), u(N-1)) + J_N^N(x(N)) ]
          We will call the minimizing control signal u*(N-1)

  STEP i: J_{N-i}^N = min_{u(N-i)} [ H(x(N-i), u(N-i)) + J_{N-i+1}^N(x(N-i+1)) ]
          We will call the minimizing control signal u*(N-i)

  STEP N: J_0^N = min_{u(0)} [ H(x(0), u(0)) + J_1^N(x(1)) ]
          We will call the minimizing control signal u*(0)

Example 2.1 (Minimum search for a 1st order system)

We consider the first order system

    x(k+1) = a x(k) + b u(k)                                                (2.5)

The performance function is chosen to be quadratic in x(k) and u(k)

    I = Σ_{k=0}^{N} [ x²(k) + q u²(k) ]                                     (2.6)

    H(k) = x²(k) + q u²(k)                                                  (2.7)

We choose N = 2.

STEP 0:

    J_2^2(x(2)) = x²(2)                                                     (2.8)
    u*(2) = 0                                                               (2.9)

u*(2) is set to zero since it does not influence any x in the performance; therefore
any nonzero value would only increase the performance function.

STEP 1:

    J_1^2(x(1)) = min_{u(1)} [ x²(1) + q u²(1) + (a x(1) + b u(1))² ]       (2.10)

    dJ_1^2(x(1))/du(1) = 2 q u(1) + 2 (a x(1) + b u(1)) b = 0               (2.11)

    u*(1) = - [ ab / (q + b²) ] x(1)                                        (2.12)

    J_1^2(x(1)) = ( 1 + q a²/(q + b²) ) x²(1)                               (2.13)

STEP 2:

    J_0^2(x(0)) = min_{u(0)} [ x²(0) + q u²(0)
                  + ( 1 + q a²/(q + b²) ) (a x(0) + b u(0))² ]              (2.14)

    u*(0) = - [ ab (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) ) ] x(0)   (2.15)

    J_0^2(x(0)) = ( 1 + q a² (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) ) ) x²(0)   (2.16)

This calculation has now given us the optimal control sequence

    u*(0) = - [ ab (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) ) ] x(0)   (2.17)

    u*(1) = - [ ab / (q + b²) ] x(1)                                        (2.18)
          = - [ ab / (q + b²) ] ( a - b [ ab (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) ) ] ) x(0)   (2.19)

    u*(2) = 0                                                               (2.21)
(2.21)
The optimization results may be interpreted and applied in two ways:
1. With a known initial state we can calculate in advance the entire input sequence
   u*(0), u*(1), ..., u*(N)
   and apply it in open loop to bring the plant from the initial state to zero.
2. Use the calculated gain L(k) (in general a matrix, in this simple example a scalar) and
   the measured state x to calculate

       u*(k) = -L(k) x(k)                                                   (2.22)

   This is closed loop control with a time varying gain (dependent on the sample number
   k), still with the purpose of bringing the plant from an initial state to zero. This closed
   loop control is preferable because open loop control is vulnerable to disturbances and
   uncertainties in the model parameters.
The control is still very specialized because its only purpose is to bring the plant from an initial
state to zero. Later we will see that the controller is extendable to more realistic cases
where reference and disturbance signals are present, and where not all states are measurable.
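The backward recursion of the example can be written down in a few lines of code. The sketch
below is not part of the original notes; it uses Python/NumPy-style scalar arithmetic and
illustrative parameter values a, b, q and N that are assumptions, not values from the text.

```python
# Minimal sketch: backward dynamic programming for the scalar example
# x(k+1) = a*x(k) + b*u(k) with cost sum of x^2(k) + q*u^2(k).
# Parameter values are illustrative assumptions.
a, b, q, N = 0.9, 1.0, 0.5, 2

S = 1.0                        # S(N) = 1, since H(N) = x^2(N)
L = [0.0] * (N + 1)            # L(N) = 0, i.e. u*(N) = 0
for k in range(N - 1, -1, -1):
    L[k] = a * b * S / (q + b**2 * S)                # minimizing gain at step k
    S = 1.0 + q * L[k]**2 + (a - b * L[k])**2 * S    # scalar cost-to-go "matrix"

# Closed-loop use of the time varying gains: u(k) = -L(k) x(k)
x = 1.0
for k in range(N + 1):
    u = -L[k] * x
    print(f"k={k}  x={x:.4f}  u={u:.4f}")
    x = a * x + b * u
```

Running it for N = 2 should reproduce the gains found above: L(1) = ab/(q+b²) and the
longer expression for L(0).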


Chapter 3
Time varying LQ-Control

3.1 LQ control of discrete time systems

We will consider the following linear system

    x(k+1) = Φ x(k) + Γ u(k)

The system has n states and p inputs. Φ is an (n × n)-matrix and Γ is an (n × p)-matrix.
We will suppose that the system is deterministic. We may not need to require that the system
is fully controllable, but as a minimum the unstable modes of the system must be controllable
using state feedback, that is, the system must be state feedback stabilizable. At this point we will
also suppose that all states can be measured without error and that the input signal u(k) is
unlimited.
The performance function is chosen to be a quadratic function of x(k) and u(k):

    I = Σ_{k=0}^{N-1} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) ) + x^T(N) QN x(N)

In the terms introduced in the general form of the performance function we have chosen:

    H(k) = x^T(k) Q1 x(k) + u^T(k) Q2 u(k)      for 0 ≤ k ≤ N-1
    H(N) = x^T(N) QN x(N)                       for k = N

The matrices QN, Q1 and Q2 are square and have the dimensions (n × n), (n × n) and
(p × p). The matrices may be interpreted as weight matrices punishing large values of the
final state, the current state and the input.
We will suppose that QN and Q1 are positive semidefinite and that Q2 is positive definite.
The matrix Q2 is positive definite if the scalar u^T Q2 u is positive for all values u ≠ 0. This
implies that nonzero control signals will give a positive contribution to the performance function.
Q1 is positive semidefinite if x^T Q1 x is positive or zero for all x. This implies that we will
allow some states or linear combinations of states to give zero contribution to the performance
function.

The three Q-matrices are furthermore symmetric and will therefore be equal to their own transposes.
Note that with our choice of performance function we have the possibility to give special
attention to the final state using the matrix QN.
It is obvious that you will usually want the final state to be as close to the desired value - for
now a vector of zeros - as possible.
If we are not specially interested in the final state, we can put QN = Q1 and obtain

    I = Σ_{k=0}^{N} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) ),    u(N) = 0

For a simple first order system we have earlier calculated an optimal input sequence. This
turned out to give inputs u(k) which for each step were proportional to the current state x(k),
a linear but time varying feedback control law. Furthermore, the contribution to the performance
function from the samples k ... N turned out to be a quadratic function of the current
state x(k). We will now try to generalize this experience to an n-dimensional system, that is,
we assume:

    u*(k) = -L(k) x(k)

    J_k^N(x(k)) = x^T(k) S(k) x(k)

We will seek expressions for the matrices L(k) and S(k), where L(k) has
the dimension (p × n) and S(k) has the dimension (n × n). We start with the general dynamic
programming expression from earlier

    J_k^N(x(k)) = min_{u(k)} [ H(x(k), u(k)) + J_{k+1}^N(x(k+1)) ]

We insert the assumed quadratic expression from above for J_{k+1}^N(x(k+1)) and search
for the control signal u*(k) which will give us the minimal performance J_k^N(x(k)):

    J_k^N(x(k)) = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + x^T(k+1) S(k+1) x(k+1) ]
                = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k)
                  + (Φ x(k) + Γ u(k))^T S(k+1) (Φ x(k) + Γ u(k)) ]

To keep expressions of a reasonable size we relax the notation and leave out the argument k
for the vectors x and u (but of course not for the matrix S which is in focus now!).

    J_k^N(x) = min_u [ x^T Q1 x + u^T Q2 u + (Φx + Γu)^T S(k+1) (Φx + Γu) ]

To find the optimal control signal u at the time (sample number) k the performance function
is differentiated with respect to u:

    ∂J_k^N(x)/∂u = Q2 u + Q2^T u + 2 Γ^T S(k+1) (Φx + Γu)
                 = 2 Q2 u + 2 Γ^T S(k+1) (Φx + Γu) = 0

If [Q2 + Γ^T S(k+1) Γ] is invertible this can be solved to give the optimal control signal
(re-entering the argument k):

    u*(k) = -[Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ x(k)

This value of u gives the minimal performance and we can now identify the proportionality
matrix L(k):

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

We will now insert u = -Lx in the expression for J_k^N(x) and obtain:

    J_k^N(x) = x^T Q1 x + (Lx)^T Q2 (Lx) + (Φx - ΓLx)^T S(k+1) (Φx - ΓLx)
             = x^T [ Q1 + L^T Q2 L + (Φ - ΓL)^T S(k+1) (Φ - ΓL) ] x
             = x^T S(k) x

We have thus found that J_k^N(x(k)) is quadratic in x(k) (we assumed that J_{k+1}^N(x(k+1)) was
quadratic - in fact, since J_N^N(x(N)) = x^T QN x is quadratic, we can prove by induction that
the assumption holds). We can identify a recursive expression for the matrix S(k):

    S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))

From this expression it may be seen that since Q1 is assumed to be non-negative definite, S(k)
will be non-negative definite as well. We will now derive two alternative expressions for S(k).
First S(k) is rewritten, and next L(k) is inserted:

    S(k) = Q1 + L^T Q2 L + (Φ - ΓL)^T S(k+1) (Φ - ΓL)
         = Q1 + L^T Q2 L + (Φ^T - L^T Γ^T) S(k+1) (Φ - ΓL)
         = Q1 + L^T [Q2 + Γ^T S(k+1) Γ] L + Φ^T S(k+1) Φ - Φ^T S(k+1) ΓL
           - L^T Γ^T S(k+1) Φ
         = Q1 + Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ [Q2 + Γ^T S(k+1) Γ]
           [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
           + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
           - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

    S(k) = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

or

    S(k) = Q1 + Φ^T S(k+1) [Φ - ΓL(k)]
3.2 Summary of LQ-method for discrete time systems

We consider the linear discrete time dynamical system

    x(k+1) = Φ x(k) + Γ u(k)

and the quadratic performance function

    I = Σ_{k=0}^{N-1} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) ) + x^T(N) QN x(N)

where Q1 and QN are positive semidefinite and Q2 is positive definite.
An optimal input sequence can be found by:

    u*(k) = -L(k) x(k)

with

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

and

    S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))
         = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
         = Q1 + Φ^T S(k+1) [Φ - ΓL(k)]

with

    S(N) = QN

The value of the performance function using this input sequence will be:

    J_0^N(x(0)) = x^T(0) S(0) x(0)

Note that we have assumed Q2 to be positive definite. This is not always strictly necessary,
since it is sufficient that [Q2 + Γ^T S(k+1) Γ] is positive definite.
The first and the last expressions for S(k) are best suited for recursive calculations because
they do not include matrix inversions.
The second expression is often suitable if you wish to let N → ∞ and find the value of
S(0). This may be done on the condition that the recursive equation for S(k) converges to
a stationary value, which can be found by letting S(k) = S(k+1) = S; this leads to an
Algebraic Riccati Equation.
It is important to note:

  - In some problems the interval [0; N] will represent the only interesting time interval and
    you will not be interested in the behaviour of the plant after N samples.
    Here u(k) = -L(k)x(k) is a time varying feedback and the control task is finished after
    N samples.

  - In other problem areas, i.e. control systems, u(k) = -L(0)x(k) represents a constant
    feedback.
    This means that at each sample time you let the time horizon of the performance function
    be N samples ahead in time.
    In the performance function k = 0 represents the current time and you push the time
    horizon N samples in front of you. This is called receding horizon control.

These notes will focus on the second control application.
This leads to the following procedure for calculation of the optimal controller:

1. k := N
2. S(k) = QN
3. REPEAT
     k := k - 1
     L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
     S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))
          = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
          = Q1 + Φ^T S(k+1) [Φ - ΓL(k)]
4. UNTIL k = 0
5. Use the linear feedback u(k) = -L(0)x(k)

The obtained performance function will be

    J_0^N(x(0)) = x^T(0) S(0) x(0)
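The procedure above translates directly into code. The sketch below is a minimal Python/NumPy
illustration of the matrix recursion; the system matrices, weights and horizon are illustrative
assumptions and not taken from the notes.

```python
import numpy as np

# Minimal sketch of the finite-horizon recursion from the procedure above.
def lq_gains(Phi, Gamma, Q1, Q2, QN, N):
    """Backward recursion returning the gain sequence L(0..N-1) and S(0)."""
    S = QN.copy()
    gains = [None] * N
    for k in range(N - 1, -1, -1):
        M = Q2 + Gamma.T @ S @ Gamma
        L = np.linalg.solve(M, Gamma.T @ S @ Phi)       # L(k)
        S = Q1 + Phi.T @ S @ (Phi - Gamma @ L)          # S(k), last form above
        gains[k] = L
    return gains, S

# Illustrative data (assumptions)
Phi   = np.array([[1.0, 0.1], [0.0, 0.9]])
Gamma = np.array([[0.0], [0.1]])
Q1    = np.diag([1.0, 0.0])
Q2    = np.array([[0.01]])
gains, S0 = lq_gains(Phi, Gamma, Q1, Q2, QN=Q1, N=50)
print("L(0) =", gains[0])   # receding-horizon feedback u(k) = -L(0) x(k)
```

The form S(k) = Q1 + Φ^T S(k+1)[Φ - ΓL(k)] is used in the loop because, as noted above, it
avoids a second matrix inversion.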

Example 3.1 (The LQ-method with a 1st order system)

We will illustrate the algorithm using our previously used 1st order system, where we calcu-
lated the optimal input sequence using a minimum search in each step.
We repeat the system

    x(k+1) = a x(k) + b u(k)

The performance is as before

    I = Σ_{k=0}^{N-1} ( x²(k) + q u²(k) ) + x²(N)

We choose N = 2 and have for this system:

    Φ = a,  Γ = b,  Q1 = 1,  Q2 = q  and  QN = 1.

According to the algorithm we calculate, in the order listed,

    S(2) = 1

    L(1) = [q + b²]⁻¹ ab = ab / (q + b²)

    S(1) = 1 + a ( a - b · ab/(q + b²) ) = 1 + q a²/(q + b²)

    L(0) = [ q + b² (1 + q a²/(q + b²)) ]⁻¹ ab (1 + q a²/(q + b²))
         = ab (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) )

    S(0) = 1 + a (1 + q a²/(q + b²)) ( a - b L(0) )
         = 1 + q a² (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) )

The optimal controller looking 2 samples ahead will be

    u(k) = -L(0) x(k),   with
    L(0) = ab (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) )

Using this controller we will obtain the minimal performance function

    J_0^2(x(0)) = [ 1 + q a² (1 + q a²/(q + b²)) / ( q + b² (1 + q a²/(q + b²)) ) ] x²(0)

Most often we will not need to know the value of the performance function.
Now let us continue the example and calculate the optimal controller when we let the time
horizon go to infinity, N → ∞.

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
    S(k) = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

With the parameters of this example these equations amount to

    L(k) = ab S(k+1) / ( q + b² S(k+1) )

    S(k) = 1 + a² S(k+1) - a² b² S²(k+1) / ( q + b² S(k+1) )

First we solve for the stationary value S = S(k) = S(k+1):

    S = 1 + a² S - a² b² S² / ( q + b² S )

    0 = S² - [ (b² + q a² - q) / b² ] S - q / b²

As S must be positive, only one solution of the equation is feasible:

    S = [ b² + q a² - q + sqrt( (b² + q a² - q)² + 4 q b² ) ] / ( 2 b² )

Inserting the stationary S in the expression for L(k) gives the stationary value of L:

    L = a [ b² + q a² - q + sqrt( (b² + q a² - q)² + 4 q b² ) ]
        / ( b [ b² + q a² + q + sqrt( (b² + q a² - q)² + 4 q b² ) ] )

3.3 Choice of Weight Matrices

Using the pole-placement method the design parameters are the desired closed loop poles,
and the state feedback matrix L is designed to give the matrix Φ - ΓL the desired poles as
eigenvalues.
If the number of inputs r > 1 there will be an infinite number of L-matrices fulfilling this.
The number of design parameters is equal to the number of closed loop poles, that is n.
Using the LQ method the design parameters are the weight matrices Q1 and Q2. The
number of design parameters now apparently becomes very large, n² + r², but in return L
will be uniquely determined.
The large number can at first sight be frustrating, and some "trial and error" with an inter-
active computer program is necessary before a satisfactory result is obtained.
On the other hand, some problems are almost created to be attacked with LQ design, because
physical insight makes it natural to use weighting matrices to give a balance between tight
control and good economy.
If you cannot use the many degrees of freedom you may obtain a drastic reduction by limiting
the weighting matrices to be diagonal. This reduces the number of parameters to determine
to n + r.
In this case you can use the diagonal elements Q1(i,i) and Q2(j,j) as measures of the relative
punishment you will put on large deviations in x_i(k) and u_j(k).
Having made a choice of weighting matrices you solve the Riccati equations and find the
L-matrix, and you can test the controller by simulation or in practice.
If the simulation or practical test shows that an input signal u_j(k) becomes unacceptably
large/small, then you might want to increase/decrease the weight Q2(j,j).
If the simulation or practical test shows that a state x_i(k) has a too slow/fast response you
might want to increase/decrease the weight Q1(i,i).
Often it will be necessary to change single elements relatively much, say by a factor 10, to see
substantial differences in the closed loop response.
If you know your open loop plant well enough to specify maximal values of each state and
input (control) signal - that might be from physical limitations or as an assessment of desired
maximal deviations - you can very straightforwardly choose the diagonal elements as

    Q1(i,i) = 1 / x²_{i,max}        Q2(i,i) = 1 / u²_{i,max}

In addition you should use a scalar weighting factor ρ to balance tight control and economy.
The performance function now becomes

    I = Σ_{k=0}^{N} ( x^T(k) Q1 x(k) + ρ u^T(k) Q2 u(k) )
      = Σ_{k=0}^{N} ( [ x1²(k)/x²_{1,max} + x2²(k)/x²_{2,max} + ... + xn²(k)/x²_{n,max} ]
          + ρ [ u1²(k)/u²_{1,max} + u2²(k)/u²_{2,max} + ... + ur²(k)/u²_{r,max} ] )

The terms in the performance function are now scaled by their desired maximal values to
lie in the interval [0; 1].
If you are able to assess these max-values there is only one degree of freedom left, represented
by the scalar ρ, with the task of weighting tight control against economy.
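The scaling rule above is short enough to state in code. The following sketch is not from the
notes; the maximal values and the trade-off factor rho are illustrative assumptions, and rho is
simply folded into Q2.

```python
import numpy as np

# Sketch of the scaling rule above; all numbers are illustrative assumptions.
x_max = np.array([0.015, 2.0])     # assumed maximal state deviations
u_max = np.array([25.0])           # assumed maximal control signal
rho   = 1.0                        # remaining knob: control effort vs. tight control

Q1 = np.diag(1.0 / x_max**2)
Q2 = rho * np.diag(1.0 / u_max**2)
print(Q1, Q2, sep="\n")
```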

3.4 LQ-Control of Continuous Time Systems

Until now we have only considered discrete time systems to be optimally controlled.
Corresponding expressions can be derived for optimal control of continuous time systems.
Here we will only summarize the results for continuous time optimal control.

3.5 Summary of LQ-method for continuous time systems

Consider the linear, continuous time, dynamical system

    dx(t)/dt = A x(t) + B u(t)

and the quadratic performance function

    I = ∫_0^T ( x^T(t) Q1 x(t) + u^T(t) Q2 u(t) ) dt + x^T(T) QT x(T)

with Q1, Q2 and QT positive definite. The optimal input signal will be given by
u(t) = -L(t) x(t), where

    L(t) = Q2⁻¹ B^T S(t)

with

    -dS(t)/dt = Q1 + A^T S(t) + S(t) A - S(t) B Q2⁻¹ B^T S(t)
              = Q1 + A^T S(t) + S(t) (A - B L(t))

and

    S(T) = QT

The performance function using this input signal will be:

    J_0^T(x(0)) = x^T(0) S(0) x(0)

It is interesting to compare L(t) in the discrete and the continuous time cases when we omit
the input (control) signal from the performance function, that is, when we put Q2 = 0.
For Q2 = 0 we obtain, if Γ and B are square (if not, the calculations will be more complex):

Discrete time:
    L(k) = Γ⁻¹ Φ
    u(k) = -L(k) x(k) is inserted in the state equations to obtain the closed loop equations:
    x(k+1) = Φ x(k) + Γ u(k)
    x(k+1) = Φ x(k) + Γ (-Γ⁻¹ Φ x(k)) = 0
    This is a dead-beat regulator, taking the state from the initial x(0) to the origin of the
    state space in one step.

Continuous time:
    L(t) = ∞, i.e. u(t) = ∞.
    It takes an infinitely large input (control) signal to take the state from a finite initial
    state x(0) to the origin of the state space in zero time.
    For this reason we require Q2 to be positive definite.

Chapter 4
Stationary LQ-Controllers

4.1 Steady State Values of L(k) and S(k)

Very often - perhaps most often - LQ control will be used with values of L(0) and S(0)
determined for N → ∞.
For convenience the recursive expressions for the controller calculations are repeated here.

Discrete time:

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
    S(k) = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ

Continuous time:

    L(t) = Q2⁻¹ B^T S(t)
    -dS(t)/dt = Q1 + A^T S(t) + S(t) A - S(t) B Q2⁻¹ B^T S(t)

If we seek an optimal controller for a problem where the performance index is a sum or integral
over a time interval increasing towards infinity, the controller will become independent of time,
L(0) = L(k) = L(k+1) = L. The steady state value of L and the corresponding steady state
value of S(0) = S(k) = S(k+1) = S are solutions of a set of steady state Riccati equations.

Discrete time:

    L = [Q2 + Γ^T S Γ]⁻¹ Γ^T S Φ
    S = Q1 + Φ^T S Φ - Φ^T S Γ [Q2 + Γ^T S Γ]⁻¹ Γ^T S Φ

Continuous time:

    L = Q2⁻¹ B^T S
    0 = Q1 + A^T S + S A - S B Q2⁻¹ B^T S

In both cases the equations for S are nonlinear and difficult to solve. The equations are
referred to as algebraic Riccati equations, AREs.
If the order of the system is three or larger the equations are almost impossible to solve
by hand. Fortunately it is possible to solve the equations iteratively, simply by using the
same algorithm as in the case with a finite time horizon in the performance sum, with the one
difference that you do not stop the iteration loop for S(k) and L(k) after a fixed number (N)
of repetitions, but continue the iteration until no more significant (using a suitable criterion)
changes of S(k) and L(k) appear.
Solutions of the stationary Riccati equations in continuous or discrete time may also be found
using the MatLab Control Toolbox function lqr.
Having found S and L we will use the optimal controller u(k) = -Lx(k). The obtained value
of the performance index will be

    J_0^∞(x(0)) = x^T(0) S x(0)

In practice there is seldom any use for the value of the performance function. However, the
performance function can only attain a stationary value if the state vector x(k) goes to the origin
of the state space. An optimal controller using the steady state value of L to calculate the control
signal will therefore always force the state from an initial value x(0) towards the origin.
4.2 Example: LQ control of a discrete time 2nd order system

We consider the servomechanism shown in the block diagram in figure 4.1. A continuous time
state space model of this system is

    dx(t)/dt = [ 0   1    ] x(t) + [ 0    ] u(t) = A x(t) + B u(t)
               [ 0  -1.53 ]        [ 76.7 ]

    y(t) = [0.013  0] x(t) = C x(t)

[Figure 4.1: Block diagram for the servomechanism: u(s) drives the transfer function
76.7/(s + 1.53) giving x2(s), which is integrated (1/s) to x1(s); the output is y(s) = 0.013 x1(s).]

The model is transformed to discrete time using a sampling time of 10 ms:

    x(k+1) = [ 1  0.009924 ] x(k) + [ 0.003816 ] u(k) = Φ x(k) + Γ u(k)
             [ 0  0.9848   ]        [ 0.7612   ]

    y(k) = [0.013  0] x(k) = H x(k)
We will use the performance function

    I = Σ_{k=0}^{∞} ( x^T(k) Q1 x(k) + u(k) Q2 u(k) )

with the diagonal weighting matrix

    Q1 = [ 1  0 ]
         [ 0  0 ]

As may be seen, only the state variable representing position is weighted, while there is no
weight on the state variable representing speed.
The weight matrix Q2 is in this case, with only one input, a scalar q2. We will now give q2
several values and calculate the optimal controller in each case using the general recursive
equations until L(k) has reached its steady state value within 1 %.
The result is

    q2 = 0        :  L(0) = [262.055  2.601]
    q2 = 0.00001  :  L(0) = [111.275  1.689]
    q2 = 0.001    :  L(0) = [22.267   0.743]
    q2 = 0.1      :  L(0) = [2.792    0.251]

Using these four controllers we have simulated the closed loop progress of the output y(t) and
the input u(t) with the initial state

    x(0) = [ x1(0) ]  =  [ 1 ]
           [ x2(0) ]     [ 0 ]

The results of these simulations are shown in figures 4.2, 4.3, 4.4 and 4.5.
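A simulation of the kind shown in those figures can be sketched in a few lines. The code below
is not part of the notes; it uses the discrete model and one of the gains listed above (q2 = 0.001)
and simply prints the sampled input and output.

```python
import numpy as np

# Sketch: closed-loop simulation from x(0) = [1, 0]^T with a gain from the list above.
Phi   = np.array([[1.0, 0.009924], [0.0, 0.9848]])
Gamma = np.array([[0.003816], [0.7612]])
H     = np.array([[0.013, 0.0]])

L = np.array([[22.267, 0.743]])    # gain listed above for q2 = 0.001
x = np.array([[1.0], [0.0]])
for k in range(20):
    u = -L @ x
    y = H @ x
    print(f"k={k:2d}  u={u[0, 0]:8.3f}  y={y[0, 0]:.5f}")
    x = Phi @ x + Gamma @ u
```

Repeating the loop for the other three gains should reproduce the qualitative behaviour
commented on below.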

Comments on the simulations:

q2 = 0
In this case the input signal is not weighted in the performance function. This results in a
DEAD-BEAT controller forcing the state to zero as fast as possible.
From the output plot alone this seems to be the perfect controller, where the state is driven
from the initial state to zero after only one sampling period. But it is important to note that
the system is simulated in discrete time and the signals are shown only at the sampling times.
Consequently we know nothing about how the real continuous time system behaves between
sampling times.
The plot of the input behaviour shows that the price for the apparently perfect control is
high. The input is highly oscillatory with amplitudes of approximately 500, indicating that
the output probably also will oscillate with large amplitudes between sampling times.

q2 = 0.00001
This gives a very fast progress of the output with a small overshoot, but still the control input
is very large with a maximum of approximately 100.

q2 = 0.001
This gives a reasonably fast progress of the output with a small overshoot, and now the
control input is more reasonable with a maximum of 22.

q2 = 0.1
This gives a very slow progress of the output, and only a very small control signal with a
maximum of 2.7.
[Figure 4.2: Simulation with q2 = 0; upper plot u, lower plot y.]

[Figure 4.3: Simulation with q2 = 0.00001; upper plot u, lower plot y.]

[Figure 4.4: Simulation with q2 = 0.001; upper plot u, lower plot y.]

[Figure 4.5: Simulation with q2 = 0.1; upper plot u, lower plot y.]

4.3 Example: LQ control of a continuous time 2nd order system

We will consider the continuous time system

    ÿ(t) + a ẏ(t) = u(t)

A state space model of the system can be given with the state variables x1(t) = y(t) and
x2(t) = ẏ(t), resulting in the equations

    ẋ1(t) = ẏ(t) = x2(t)
    ẋ2(t) = ÿ(t) = -a ẏ(t) + u(t) = -a x2(t) + u(t)

These equations combine directly into a full state space model

    dx(t)/dt = [ 0   1 ] x(t) + [ 0 ] u(t)
               [ 0  -a ]        [ 1 ]

    y(t) = [1  0] x(t)

We choose the performance function

    I = ∫_0^∞ ( x^T(t) [ 1  0 ] x(t) + u(t) q u(t) ) dt
                       [ 0  0 ]

So we have the constants below for our problem

    A = [ 0   1 ]    B = [ 0 ]    Q1 = [ 1  0 ]    Q2 = q
        [ 0  -a ]        [ 1 ]         [ 0  0 ]

We have chosen only to "punish" the output y(t) = x1(t), and put no restrictions on the
second state variable x2(t). We partition the steady state value of the matrix S(t) as below

    S = [ S11  S12 ] = [ S11  S12 ]
        [ S21  S22 ]   [ S12  S22 ]

Now we have for the determination of L

    L = Q2⁻¹ B^T S = (1/q) [0  1] [ S11  S12 ] = (1/q) [S12  S22]
                                  [ S12  S22 ]

and as we seek the stationary values, the derivative of S will be zero:

    0 = Q1 + A^T S + S A - S B Q2⁻¹ B^T S

      = [ 1  0 ] + [ 0   0 ] [ S11  S12 ] + [ S11  S12 ] [ 0   1 ]
        [ 0  0 ]   [ 1  -a ] [ S12  S22 ]   [ S12  S22 ] [ 0  -a ]

        - [ S11  S12 ] [ 0 ] (1/q) [0  1] [ S11  S12 ]
          [ S12  S22 ] [ 1 ]              [ S12  S22 ]

      = [ 1  0 ] + [      0               0       ] + [ 0   S11 - a S12 ]
        [ 0  0 ]   [ S11 - a S12   S12 - a S22    ]   [ 0   S12 - a S22 ]

        - [ S12 ] (1/q) [S12  S22]
          [ S22 ]

This may be written as three scalar equations (the two off-diagonal elements give the same
equation because S is symmetric):

    1.  0 = 1 - (1/q) S12²
    2.  0 = S11 - a S12 - (1/q) S12 S22
    3.  0 = 2 S12 - 2 a S22 - (1/q) S22²

From 1. we obtain:  S12 = √q
From 3. we obtain:  S22 = -a q + sqrt( a² q² + 2 q √q )

This gives us the optimal controller u(t) = -L x(t):

    u(t) = -(1/q) [S12  S22] x(t)
         = -(1/q) ( S12 x1(t) + S22 x2(t) )
         = -(1/q) ( √q x1(t) + ( -a q + sqrt( a² q² + 2 q √q ) ) x2(t) )

Using dx(t)/dt = A x(t) + B u(t) and u(t) = -L x(t) we achieve:

    dx(t)/dt = (A - B L) x(t)
             = ( [ 0   1 ] - [ 0 ] (1/q) [S12  S22] ) x(t)
               ( [ 0  -a ]   [ 1 ]                  )
             = [    0            1         ] x(t)
               [ -S12/q   -a - S22/q       ]

The closed loop poles are determined by the characteristic equation for the matrix A - BL:

    det( sI - (A - BL) ) = 0

    s² + ( a + S22/q ) s + S12/q = 0

Comparing this with the characteristic equation for a second order system in standard form
you will find

    characteristic frequency :  ωn = 1 / q^(1/4)

    damping ratio            :  ζ = (1/√2) sqrt( 1 + (a² √q)/2 )

For some extreme values of a and q we find:

    q = 0 :  ωn = ∞,          ζ = 1/√2
    q = ∞ :  ωn = 0,          ζ = ∞
    a = 0 :  ωn = 1/q^(1/4),  ζ = 1/√2

These values, found with the LQ criterion, justify that classical controller design often uses
values of the damping ratio in the proximity of 1/√2.
For pure inertia systems, ÿ = k_st u(t) (i.e. a = 0), a damping ratio of 1/√2 will be optimal
independent of the choice of the weighting q.
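The damping ratio formula above can be checked numerically. The sketch below is not from
the notes: it solves the continuous ARE with SciPy for illustrative values of a and q, reads the
natural frequency and damping ratio off the closed-loop poles, and compares with the formula.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Sketch: verify the closed-loop damping ratio for illustrative a and q (assumptions).
a, q = 1.5, 0.2
A = np.array([[0.0, 1.0], [0.0, -a]])
B = np.array([[0.0], [1.0]])
Q1, Q2 = np.diag([1.0, 0.0]), np.array([[q]])

S = solve_continuous_are(A, B, Q1, Q2)
L = np.linalg.solve(Q2, B.T @ S)              # L = Q2^{-1} B^T S
poles = np.linalg.eigvals(A - B @ L)
wn   = np.sqrt(np.prod(np.abs(poles)))        # natural frequency from the poles
zeta = -np.sum(poles).real / (2 * wn)         # damping ratio from the poles

print("zeta from poles  :", zeta)
print("zeta from formula:", np.sqrt(1 + a**2 * np.sqrt(q) / 2) / np.sqrt(2))
print("wn   from formula:", q**(-0.25))
```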

Chapter 5
References and disturbances
Until now we have considered optimal control of a system described in state space using a
quadratic performance function. The purpose was to bring the system from an initial state
to the origin of the state space. The balance between speed and reasonable use of the control
signal was determined by the weighting in the performance index.
A practical control problem will naturally most often be to make the output y(t) track a
reference signal r(t) and/or keep the output close to the reference without using too large
control signals u(t). Furthermore the system will most often have disturbances influencing
the output.
It is obvious that it is not possible to optimise a controller for all possible reference signals
and disturbances. In classical control it is a common practice to adjust parameters in for
example a PID controller to give the "best possible" step response. However this does not
imply that the response to other reference forms is "best possible".
If you use optimal control the same conditions are valid. It is possible to find a controller
which is optimal for certain classes of reference and disturbance signals.

5.1 Modelling reference and disturbance

In this section we will consider models of signals with the intention to use such models to
describe the behavior of an external signal such as a reference or disturbance, call it r(t). Some
simple signal types may be modelled using the general state space description, where the initial
condition x(0) is given

    x(k+1) = Φr x(k)
    r(k)   = Hr x(k)

A description like this without any input, just initiated by the value of x(0), is called an
autonomous state space description, see figure 5.1.

[Figure 5.1: Block diagram of an autonomous state space model: the state x(t) is fed back
through Φr and a unit delay z⁻¹, and the reference r(t) = Hr x(t) is taken as output.]

a) Constant (step), r(t) = K

K is any constant.
The state space description of this is very simple:

    x(k+1) = x(k),    x(0) = K
    r(k)   = x(k)

b) Ramp, r(t) = K1·t

K1 is an arbitrary constant.
r(t) may be modelled as the solution of the difference equation (Ts: sampling time):

    ( r(k) - r(k-1) ) / Ts = ( r(k-1) - r(k-2) ) / Ts

or:

    r(k) = 2 r(k-1) - r(k-2)

Here we introduce x1(k) = r(k) and x2(k) = r(k-1), and achieve
x2(k+1) = x1(k) and x1(k+1) = 2 x1(k) - x2(k), leading to

    x(k+1) = [ 2  -1 ] x(k),    x(0) = [  0  ]
             [ 1   0 ]                 [ -K1 ]

    r(k) = [1  0] x(k)
c) Acceleration, r(t) = K2·t²

K2 may be an arbitrary constant.
r(t) may be found as a solution to the difference equation (Ts: sampling time):

    [ (r(k) - r(k-1))/Ts - (r(k-1) - r(k-2))/Ts ] / Ts
        = [ (r(k-1) - r(k-2))/Ts - (r(k-2) - r(k-3))/Ts ] / Ts

or:

    r(k) = 3 r(k-1) - 3 r(k-2) + r(k-3)

Here we introduce the state variables x1(k) = r(k), x2(k) = r(k-1) and x3(k) = r(k-2),
leading to:

    x3(k+1) = x2(k)
    x2(k+1) = x1(k)
    x1(k+1) = 3 x1(k) - 3 x2(k) + x3(k)

and achieve

    x(k+1) = [ 3  -3  1 ] x(k),    x(0) = [  0   ]
             [ 1   0  0 ]                 [  K2  ]
             [ 0   1  0 ]                 [ 4 K2 ]

    r(k) = [1  0  0] x(k)

Similar descriptions may be found for continuous time signals. As an example we show the
model of a cosine.

d) Cosine, r(t) = K3·cos(a·t)

K3 is an arbitrary constant.
r(t) is the solution of the differential equation:

    r̈(t) = -a² r(t)

We introduce the state variables x1(t) = r(t) and x2(t) = ṙ(t), to achieve
ẋ1(t) = x2(t) and ẋ2(t) = -a² x1(t), or as a full state space description

    dx(t)/dt = [  0   1 ] x(t),    x(0) = [ K3 ]
               [ -a²  0 ]                 [ 0  ]

    r(t) = [1  0] x(t)

e) System with more than one reference

If a system has m outputs, the reference vector must also have m elements. We will show an
example of this:
A system with two outputs should also have two references. We want to model one reference
as a constant of size K. The other reference is modelled as a ramp with slope K1.
The constant and the ramp may be modelled as shown below in an autonomous state space
model with two outputs

    xr(k+1) = [ 1  0   0 ] xr(k),    xr(0) = [  K  ]
              [ 0  2  -1 ]                   [  0  ]
              [ 0  1   0 ]                   [ -K1 ]

    r(k) = [ 1  0  0 ] xr(k)
           [ 0  1  0 ]

The first state variable represents the constant and the second state variable represents the
ramp. The third state variable is not present in the reference vector, which is taken as the
output of a 2 × 3 output matrix with zeros in the third column.
The full reference model is thus

    xr(k+1) = Φr xr(k),    where  Φr = [ 1  0   0 ]    and  xr(0) = [  K  ]
                                       [ 0  2  -1 ]                [  0  ]
                                       [ 0  1   0 ]                [ -K1 ]

    r(k) = Hr xr(k),       where  Hr = [ 1  0  0 ]
                                       [ 0  1  0 ]
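The combined model can be verified by simulating it forward. The sketch below is not from the
notes; the values of K and K1 and the horizon are illustrative assumptions, and the initial state
follows the reconstruction above.

```python
import numpy as np

# Sketch: simulate the combined constant + ramp reference model above and check
# that it generates r1(k) = K and r2(k) = K1*k.
K, K1 = 2.0, 0.5
Phi_r = np.array([[1, 0, 0],
                  [0, 2, -1],
                  [0, 1, 0]], dtype=float)
H_r   = np.array([[1, 0, 0],
                  [0, 1, 0]], dtype=float)
x_r = np.array([[K], [0.0], [-K1]])     # initial state, as in the model above

for k in range(5):
    r = H_r @ x_r
    print(f"k={k}: r1={r[0, 0]:.2f}  r2={r[1, 0]:.2f}")
    x_r = Phi_r @ x_r
```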

Chapter 6
Control of System with a Reference Model

System model:

    xs(k+1) = Φs xs(k) + Γs u(k)
    y(k)    = Hs xs(k)

Reference model:

    xr(k+1) = Φr xr(k)
    r(k)    = Hr xr(k)

For this system we will introduce the control error

    e(k) = r(k) - y(k) = Hr xr(k) - Hs xs(k)

We want to minimize the performance function:

    I = Σ_{k=0}^{N-1} ( e^T(k) Q1e e(k) + u^T(k) Q2 u(k) ) + e^T(N) QNe e(N)

The system state xs(k) is augmented by the reference state xr(k), giving the augmented state
vector x(k) = [xs^T(k)  xr^T(k)]^T. The augmented system is described by

    [ xs(k+1) ] = [ Φs   0  ] [ xs(k) ] + [ Γs ] u(k)
    [ xr(k+1) ]   [  0   Φr ] [ xr(k) ]   [  0 ]

    e(k) = [-Hs | Hr] [ xs(k) ]
                      [ xr(k) ]

We will write this with the more compact notation

    x(k+1) = Φ x(k) + Γ u(k)
    e(k)   = H x(k)

where

    Φ = [ Φs   0  ]      Γ = [ Γs ]      H = [-Hs | Hr]
        [  0   Φr ]          [  0 ]

The performance function may be rewritten:

    I = Σ_{k=0}^{N-1} ( e^T(k) Q1e e(k) + u^T(k) Q2 u(k) ) + e^T(N) QNe e(N)
      = Σ_{k=0}^{N-1} ( (Hx(k))^T Q1e (Hx(k)) + u^T(k) Q2 u(k) ) + (Hx(N))^T QNe (Hx(N))
      = Σ_{k=0}^{N-1} ( x^T(k) H^T Q1e H x(k) + u^T(k) Q2 u(k) ) + x^T(N) H^T QNe H x(N)
      = Σ_{k=0}^{N-1} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) ) + x^T(N) QN x(N)

Note that this performance function has the structure we have used earlier. Therefore we
can use the recursive expressions derived earlier to find L(k) and S(k), using the weighting
matrices

    Q1 = H^T Q1e H = [ -Hs^T ] Q1e [-Hs | Hr] = [  Hs^T Q1e Hs   -Hs^T Q1e Hr ]
                     [  Hr^T ]                  [ -Hr^T Q1e Hs    Hr^T Q1e Hr ]

    QN = H^T QNe H = [  Hs^T QNe Hs   -Hs^T QNe Hr ]
                     [ -Hr^T QNe Hs    Hr^T QNe Hr ]

These values of Φ, Γ, Q1, QN and Q2 result in a state feedback matrix L(0) representing
an optimal controller:

    u(k) = -L(0) x(k) = -[Ls(0) | Lr(0)] [ xs(k) ]
                                         [ xr(k) ]
         = -Ls(0) xs(k) - Lr(0) xr(k)

A block diagram of this control system is shown in figure 6.1.
Most often the reference can be modelled as a step with the autonomous model

    xr(k+1) = I xr(k)
    r(k)    = I xr(k)

This results in the structure shown in figure 6.2.
[Figure 6.1: Block diagram of the optimal control system with a reference model: the reference
state xr(k) is fed through -Lr(0) and the system state xs(k) through -Ls(0); their sum forms
u(k), which drives the system whose output y(k) = Hs xs(k) is compared with r(k) = Hr xr(k).]

[Figure 6.2: Block diagram of the optimal control system with a step reference: as figure 6.1,
but the reference r(k) itself is fed through -Lr(0).]

We will now show that the feedback from the states of the original system, -Ls(0) xs(k), will
be equal to the feedback we would get if the reference had been zero, as it was earlier.
For the augmented system we have the equations

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
    S(k) = Q1 + Φ^T S(k+1) [Φ - ΓL(k)]

We will partition the matrices according to the partition of the state vector into system states
and reference states

    S(k) = [ S11(k)  S12(k) ]
           [ S21(k)  S22(k) ]

    L(k) = [Ls(k) | Lr(k)]

This gives us the following partition of the Riccati equation

    S(k) = [ S11(k)  S12(k) ]
           [ S21(k)  S22(k) ]

         = [  Hs^T Q1e Hs   -Hs^T Q1e Hr ] + [ Φs^T   0   ] [ S11(k+1)  S12(k+1) ]
           [ -Hr^T Q1e Hs    Hr^T Q1e Hr ]   [  0    Φr^T ] [ S21(k+1)  S22(k+1) ]
           · ( [ Φs   0  ] - [ Γs ] [Ls(k) | Lr(k)] )
             ( [  0   Φr ]   [  0 ]                 )

         = [  Hs^T Q1e Hs + Φs^T S11(k+1) (Φs - Γs Ls(k))     -Hs^T Q1e Hr - Φs^T S11(k+1) Γs Lr(k) + Φs^T S12(k+1) Φr ]
           [ -Hr^T Q1e Hs + Φr^T S21(k+1) (Φs - Γs Ls(k))      Hr^T Q1e Hr - Φr^T S21(k+1) Γs Lr(k) + Φr^T S22(k+1) Φr ]

For the feedback matrix we find:

    L(k) = [Ls(k) | Lr(k)]

         = [ Q2 + [Γs^T | 0] [ S11(k+1)  S12(k+1) ] [ Γs ] ]⁻¹
           [                 [ S21(k+1)  S22(k+1) ] [  0 ] ]
           · [Γs^T | 0] [ S11(k+1)  S12(k+1) ] [ Φs   0  ]
                        [ S21(k+1)  S22(k+1) ] [  0   Φr ]

         = [ Q2 + Γs^T S11(k+1) Γs ]⁻¹ [Γs^T S11(k+1) | Γs^T S12(k+1)] [ Φs   0  ]
                                                                       [  0   Φr ]

         = [ Q2 + Γs^T S11(k+1) Γs ]⁻¹ [ Γs^T S11(k+1) Φs  |  Γs^T S12(k+1) Φr ]

From this we can pull out the sub matrix corresponding to the original system:

    Ls(k)  = [ Q2 + Γs^T S11(k+1) Γs ]⁻¹ [ Γs^T S11(k+1) Φs ]

    S11(k) = Hs^T Q1e Hs + Φs^T S11(k+1) (Φs - Γs Ls(k))

These matrices give an optimal controller for the system alone, i.e. with the reference equal
to zero and with the performance index

    I = Σ_{k=0}^{N-1} ( y^T(k) Q1e y(k) + u^T(k) Q2 u(k) ) + y^T(N) QNe y(N)
      = Σ_{k=0}^{N-1} ( xs^T(k) Hs^T Q1e Hs xs(k) + u^T(k) Q2 u(k) ) + xs^T(N) Hs^T QNe Hs xs(N)

We have now shown the interesting result that Ls(0), the feedback proportionality matrix
from the system state xs(k), is independent of the presence of a reference vector and can be
obtained using the weight matrix

    Q1 = Hs^T Q1y Hs = Hs^T Q1e Hs

In other words, Ls(0) is independent of the reference model.
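This independence can be checked numerically. The sketch below is not from the notes: it uses
an illustrative scalar plant with a constant reference, runs the finite backward recursion on the
augmented model, and compares the Ls part with a design for the system alone. (The recursion
is used rather than an ARE solver because the augmented reference state is uncontrollable.)

```python
import numpy as np

# Sketch: Ls(0) is the same with or without the constant-reference state.
def lq_recursion(Phi, Gam, Q1, Q2, N=500):
    """Finite-horizon backward recursion; returns L(0)."""
    S = Q1.copy()
    for _ in range(N):
        L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
        S = Q1 + Phi.T @ S @ (Phi - Gam @ L)
    return L

# Illustrative scalar plant (assumptions)
Phi_s, Gam_s = np.array([[0.9]]), np.array([[0.1]])
H_s, q1e, Q2 = np.array([[1.0]]), 1.0, np.array([[0.1]])

# Augmented model with a constant reference (Phi_r = 1, H_r = 1)
Phi = np.array([[0.9, 0.0], [0.0, 1.0]])
Gam = np.array([[0.1], [0.0]])
H_e = np.array([[-1.0, 1.0]])                 # e(k) = r(k) - y(k)
L_aug = lq_recursion(Phi, Gam, q1e * H_e.T @ H_e, Q2)

# Design for the system alone with Q1 = Hs^T q1e Hs
L_sys = lq_recursion(Phi_s, Gam_s, q1e * H_s.T @ H_s, Q2)

print("Ls from augmented design  :", L_aug[0, 0])
print("Ls from system-only design:", L_sys[0, 0])
print("Lr from augmented design  :", L_aug[0, 1])
```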

6.1 Example, First Order System with constant reference

System:

    xs(k+1) = a xs(k) + b u(k)
    y(k)    = c xs(k)

Reference:

    xr(k+1) = xr(k),    xr(0) = K
    r(k)    = xr(k)

With e(t) = r(t) - y(t), we seek a minimum for the performance index

    I = Σ_{k=0}^{N-1} ( e^T(k) q1e e(k) + u^T(k) q2 u(k) ) + e^T(N) qNe e(N)
      = Σ_{k=0}^{N-1} ( q1e e²(k) + q2 u²(k) ) + qNe e²(N)

Because the weighting matrices are scalars we can fix q1e = 1 and consider q2 = q in the
interval [0; ∞]. For the augmented system the two dimensional state vector
x^T(k) = [xs(k) | xr(k)] is introduced, giving the model:

    x(k+1) = [ a  0 ] x(k) + [ b ] u(k) = Φ x(k) + Γ u(k)
             [ 0  1 ]        [ 0 ]

    e(k) = [-c  1] x(k) = H x(k)

The performance index is:

    I = Σ_{k=0}^{N-1} ( x^T(k) H^T q1e H x(k) + u^T(k) q2 u(k) ) + x^T(N) H^T qNe H x(N)

with the weighting matrices

    Q1 = H^T q1e H = [ -c ] 1 [-c  1] = [  c²  -c ]
                     [  1 ]            [ -c    1 ]

    Q2 = q2 = q

    QN = H^T qNe H = qNe [  c²  -c ]
                         [ -c    1 ]

Combining this with the augmented matrices Φ, Γ and H you may find L(0) and S(0) using
the usual recursive expressions.
This leads to the controller

    u(k) = -L(0) x(k)
         = -[Ls(0) | Lr(0)] [ xs(k) ]
                            [ xr(k) ]
         = -Ls(0) xs(k) - Lr(0) xr(k)

corresponding to the structure shown in figure 6.3.

[Figure 6.3: Structure of the first order system with a constant reference: the reference state
xr(k) is fed through -Lr(0), the system state xs(k) through -Ls(0), and the sum u(k) drives
the plant b/(z - a), whose state is scaled by c to give the output.]

We will now detail the expressions for L(k) and S(k).

    L(k) = [Q2 + Γ^T S(k+1) Γ]⁻¹ Γ^T S(k+1) Φ
         = [q + b² S11(k+1)]⁻¹ [ ab S11(k+1)   b S12(k+1) ]
         = [ ab S11(k+1) / (q + b² S11(k+1))   b S12(k+1) / (q + b² S11(k+1)) ]
         = [Ls(k) | Lr(k)]

    S(k) = Q1 + Φ^T S(k+1) [Φ - ΓL(k)]

         = [  c²  -c ] + [ a  0 ] [ S11(k+1)  S12(k+1) ] [ a - b Ls(k)   -b Lr(k) ]
           [ -c    1 ]   [ 0  1 ] [ S21(k+1)  S22(k+1) ] [     0             1    ]

         = [ c² + a S11(k+1)(a - b Ls(k))    -c - a S11(k+1) b Lr(k) + a S12(k+1) ]
           [ -c + S21(k+1)(a - b Ls(k))       1 - S21(k+1) b Lr(k) + S22(k+1)     ]

         = [ S11(k)  S12(k) ]
           [ S21(k)  S22(k) ]

From this calculation it may be seen that the feedback from the system state will be deter-
mined from the coupled pair of recursive equations

    Ls(k)  = ab S11(k+1) / (q + b² S11(k+1)),
    S11(k) = c² + a S11(k+1)(a - b Ls(k))

that is, precisely the expressions you would achieve if the reference was zero and you chose to
weight the output y(t) using the weighting matrix Q1y = Q1e.

6.2 Example, second order system with constant reference

Once again we consider the discrete time system

    x(k+1) = [ 1  0.009924 ] x(k) + [ 0.003816 ] u(k)
             [ 0  0.9848   ]        [ 0.7612   ]

    y(k) = [0.013  0] x(k)

We used

    Q1 = [ 1  0 ]
         [ 0  0 ]

i.e. the state x1(k) was punished with a factor 1, corresponding to a weight on the output y(t)
of 1/(0.013)² = 5917, while the state x2 was weighted with a factor 0.
Simulations showed that a suitable value of Q2 = q2 would be 0.001.
For this example we will augment the state description with a reference state as a third state
variable:

    x(k+1) = [ 1  0.009924  0 ] x(k) + [ 0.003816 ] u(k)
             [ 0  0.9848    0 ]        [ 0.7612   ]
             [ 0  0         1 ]        [ 0        ]

    e(k) = [-0.013  0  1] x(k)

We will use the performance index

    I = Σ_{k=0}^{∞} ( e^T(k) q1e e(k) + u^T(k) q2 u(k) )

Again we choose q2 = 0.001 and q1e = 5917, i.e. the same weight on the output as we used
when the reference was zero. We achieve:

    Q1 = [ -0.013 ] 5917 [-0.013  0  1] = [    1     0  -76.92 ]
         [  0     ]                       [    0     0     0   ]
         [  1     ]                       [ -76.92   0   5917  ]

Use of the above weighting matrices leads to the optimal controller

    L(0) = [22.267  0.743  -1713]

A simulation of the closed loop, showing input and output, is presented in figure 6.4, with the
initial state

    x^T(0) = [0  0  1]

The 3rd state variable, which is the reference state, is initialized to 1, introducing a unity step
on the reference with the system state initially being zero.
It can be seen that the feedback from the system states in L(0) is unchanged from the earlier
calculation, resulting in the same closed loop dynamics. This is confirmed by the simulated
step response.
Note that the steady state error is zero only because the system itself has an integration. If
this had not been the case you would have seen a steady state error, because the controller in
principle is a proportional controller.
[Figure 6.4: Closed loop simulation of the step response; upper plot u(t), lower plot y(t).]
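The design of this example can be reproduced with a few lines of code. The sketch below is
not from the notes; it iterates the recursive equations on the augmented servo model with the
weights given above, and the resulting gain should come close to the L(0) reported in the text.

```python
import numpy as np

# Sketch: recursive design for the reference-augmented servo model above.
Phi = np.array([[1.0, 0.009924, 0.0],
                [0.0, 0.9848,   0.0],
                [0.0, 0.0,      1.0]])
Gam = np.array([[0.003816], [0.7612], [0.0]])
H_e = np.array([[-0.013, 0.0, 1.0]])       # e(k) = r(k) - y(k)
Q1  = 5917.0 * H_e.T @ H_e
Q2  = np.array([[0.001]])

S = Q1.copy()
for _ in range(2000):                      # iterate until the gain settles
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + Phi.T @ S @ (Phi - Gam @ L)
print("L(0) =", L)
```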


Chapter 7
Using a Disturbance Model in the
Control Law
Model of system:

    xs(k+1) = Φs xs(k) + Γs u(k) + Γd d(k)
    y(k)    = Hs xs(k)

Model of disturbance:

    xd(k+1) = Φd xd(k)
    d(k)    = Hd xd(k)

We want to minimize the performance function:

    I = Σ_{k=0}^{N-1} ( y^T(k) Q1y y(k) + u^T(k) Q2 u(k) ) + y^T(N) QNy y(N)

The system state vector xs(k) is augmented with the disturbance state vector xd(k); thus
the augmented state vector is x(k) = [xs^T(k)  xd^T(k)]^T. The equations describing the
augmented states are:

    [ xs(k+1) ] = [ Φs   Γd Hd ] [ xs(k) ] + [ Γs ] u(k)
    [ xd(k+1) ]   [  0     Φd  ] [ xd(k) ]   [  0 ]

    y(k) = [Hs | 0] [ xs(k) ]
                    [ xd(k) ]

In short notation this can be written

    x(k+1) = Φ x(k) + Γ u(k)
    y(k)   = H x(k)

with

    Φ = [ Φs   Γd Hd ]      Γ = [ Γs ]      H = [Hs | 0]
        [  0     Φd  ]          [  0 ]

The performance function may be rewritten:

    I = Σ_{k=0}^{N-1} ( y^T(k) Q1y y(k) + u^T(k) Q2 u(k) ) + y^T(N) QNy y(N)
      = Σ_{k=0}^{N-1} ( x^T(k) H^T Q1y H x(k) + u^T(k) Q2 u(k) ) + x^T(N) H^T QNy H x(N)
      = Σ_{k=0}^{N-1} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) ) + x^T(N) QN x(N)

We observe that the performance index has the same form as we have used in the general
derivations; therefore we can use the earlier derived general recursive expressions to determine
L(k) and S(k) using

    Q1 = H^T Q1y H = [ Hs^T ] Q1y [Hs | 0] = [ Hs^T Q1y Hs   0 ]
                     [  0   ]                [      0        0 ]

    QN = H^T QNy H = [ Hs^T ] QNy [Hs | 0] = [ Hs^T QNy Hs   0 ]
                     [  0   ]                [      0        0 ]

This will give us a value of L(0) for the optimal controller:

    u(k) = -L(0) x(k) = -[Ls(0) | Ld(0)] [ xs(k) ]
                                         [ xd(k) ]
         = -Ls(0) xs(k) - Ld(0) xd(k)

Also in this case it can be shown that Ls(0), the feedback from the system state vector, is
precisely the same as if there was no disturbance.
Ls(0) is in other words independent of the disturbance model. The structure of the controlled
system is shown in figure 7.1.

[Figure 7.1: Structure of the controlled system with a disturbance model: the disturbance state
xd(k) is fed through -Ld(0) and the system state xs(k) through -Ls(0); their sum u(k), together
with the disturbance Γd Hd xd(k), drives the system whose output is y(k) = Hs xs(k).]
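The augmentation can be sketched in code for the simplest case, a constant disturbance on a
first order plant. The example below is not from the notes and all numbers are illustrative
assumptions; it returns both the state feedback Ls and the disturbance feedforward Ld. The
finite recursion is used because the disturbance state itself is not controllable.

```python
import numpy as np

# Sketch: LQ design on a disturbance-augmented model (constant disturbance,
# Phi_d = 1, H_d = 1); all numerical values are illustrative assumptions.
Phi_s, Gam_s, Gam_d = 0.9, 0.1, 0.05
Phi = np.array([[Phi_s, Gam_d],
                [0.0,   1.0]])
Gam = np.array([[Gam_s], [0.0]])
H   = np.array([[1.0, 0.0]])          # y(k) = x_s(k)
Q1, Q2 = H.T @ H, np.array([[0.1]])

S = Q1.copy()
for _ in range(1000):
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + Phi.T @ S @ (Phi - Gam @ L)
print("Ls =", L[0, 0], " Ld =", L[0, 1])
```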


Chapter 8
LQ-Control with integral action
In chapter 6 and chapter 7 we introduced state vectors representing references and distur-
bances acting on the system. Controllers for systems including a disturbance model rely on
measurements of the disturbance state vector. If a model and measurements of the disturbances
are available, this method may be suitable.
If no detailed model and measurements of the disturbances are available, it may be a better
approach to account for at least constant disturbances by the introduction of an extra state
vector representing the integral of the control error:

    xi(k+1) = xi(k) + e(k)

where e(k) = r(k) - y(k) is the control error:

    xi(k+1) = xi(k) + r(k) - y(k)

In the performance index we want in this way to use weighting matrices to punish the control
error e(k), the integrated control error xi(k) and the input signal u(k):

    I = Σ_{k=0}^{N-1} ( e^T(k) Q1e e(k) + xi^T(k) Q1i xi(k) + u^T(k) Q2 u(k) )
        + e^T(N) QNe e(N) + xi^T(N) QNi xi(N)

This may be interpreted in the way that the choice of the matrices Q1e and Q1i determines a
kind of PI-controller. As usual we will tackle the problem by adding extra dimensions to the
state space.

System model:

    xs(k+1) = Φs xs(k) + Γs u(k)
    y(k)    = Hs xs(k)

Reference model:

    xr(k+1) = Φr xr(k)
    r(k)    = Hr xr(k)

Integral state:

    xi(k+1) = xi(k) + e(k)
    e(k)    = Hr xr(k) - Hs xs(k)

With the augmented state vector x(k) = [xs^T(k)  xr^T(k)  xi^T(k)]^T the state space
description is

Page 43 of 61

s
0 0
s
x(k + 1) = 0
r 0 x(k) + 0 u(k) = x(k) + u(k)
H s H r I
0
y(k) = [H s | 0 | 0]x(k) = H y x(k)
e(k) = [H s | H r | 0]x(k) = H e x(k)
xi (k) = [0 | 0 | I]x(k) = H i x(k)
The performan e index is rewritten

I =

N
1
X

(eT (k)Q1e e(k) + xTi (k)Q1i xi (k) + uT (k)Q2 u(k)) +

k=0
T

e (N )QN e e(N ) + xTi (N )QN i xi (N )


=

N
1
X

(xT (k)[H Te Q1e H e + H Ti (k)Q1i H i (k)]x(k) + uT (k)Q2 u(k)) +

k=0
T

x (N )[H Te QN e H e + H Ti (k)QN i H i (k)]x(N )


=

N
1
X

(xT (k)Q1 x(k) + uT (k)Q2 u(k)) + xT (N )QN x(N )

k=0

with

Q1 = H Te Q1e H e + H Ti Q1i H i

H Ts (k)Q1e H s H Ts (k)Q1e H r 0
= H Tr (k)Q1e H s H Tr (k)Q1e H r
0
0
0
Q1i
QN

= H Te QN e H e + H Ti QN i H i

H Ts (k)QN e H s H Ts (k)QN e H r
0
= H Tr (k)QN e H s H Tr (k)QN e H r
0
0
0
QN i

With these values of the matrices Φ, Γ, Q1, Q2 and QN we have a performance index of the form we have solved earlier, and consequently we can use the earlier results to calculate L(k) and S(k). The optimal controller will be of the form:

u(k) = -L(0)x(k) = -[Ls(0) | Lr(0) | Li(0)] [xs(k); xr(k); xi(k)] = -Ls(0)xs(k) - Lr(0)xr(k) - Li(0)xi(k)

corresponding to the structure shown in figure 8.1.

Figure 8.1: Structure of the optimal controller with reference state and integral state.

8.1 Example, first order system with reference and integral state

System model

xs(k+1) = a xs(k) + b u(k)
y(k) = c xs(k)

Reference model

xr(k+1) = xr(k)
r(k) = xr(k)

Integral state

xi(k+1) = xi(k) + e(k)
e(k) = r(k) - y(k) = xr(k) - c xs(k)

Performance index

I = Σ_{k=0}^{N-1} (e^2(k) q1e + xi^2(k) q1i + u^2(k) q2) + e^2(N) qNe + xi^2(N) qNi

With the augmented state vector x(k) = [xs(k) xr(k) xi(k)]^T the state equation is

x(k+1) = [a, 0, 0; 0, 1, 0; -c, 1, 1] x(k) + [b; 0; 0] u(k) = Φ x(k) + Γ u(k)
y(k) = [c 0 0] x(k) = Hy x(k)
e(k) = [-c 1 0] x(k) = He x(k)
xi(k) = [0 0 1] x(k) = Hi x(k)

The performance index is

I = Σ_{k=0}^{N-1} (x^T(k)[He^T q1e He + Hi^T q1i Hi] x(k) + u^T(k) Q2 u(k)) + x^T(N)[He^T qNe He + Hi^T qNi Hi] x(N)

  = Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N)

with

Q1 = [-c; 1; 0] q1e [-c 1 0] + [0; 0; 1] q1i [0 0 1] = [q1e c^2, -q1e c, 0; -q1e c, q1e, 0; 0, 0, q1i]

Q2 = q2

QN = [qNe c^2, -qNe c, 0; -qNe c, qNe, 0; 0, 0, qNi]

With these values of the matrices Φ, Γ, Q1, Q2 and QN the feedback matrix can be found using the usual algorithm:

u(k) = -L(0)x(k) = -[Ls(0) | Lr(0) | Li(0)] [xs(k); xr(k); xi(k)] = -Ls(0)xs(k) - Lr(0)xr(k) - Li(0)xi(k)

which corresponds to the block diagram shown in figure 8.2.

Figure 8.2: The structure of the optimal controller with reference and integral state.
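A small numerical sketch of example 8.1 may be helpful. It is not part of the original notes; the parameter values below are assumptions chosen only for illustration, and the backward recursion for S(k) and L(k) is the general one used throughout the notes.

import numpy as np

a, b, c = 0.8, 0.5, 1.0                           # hypothetical first order plant
q1e, q1i, q2, qNe, qNi, N = 1.0, 0.2, 0.05, 1.0, 0.2, 100

Phi = np.array([[a, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [-c, 1.0, 1.0]])
Gam = np.array([[b], [0.0], [0.0]])
He  = np.array([[-c, 1.0, 0.0]])                  # e(k)  = He x(k)
Hi  = np.array([[0.0, 0.0, 1.0]])                 # xi(k) = Hi x(k)

Q1 = q1e * He.T @ He + q1i * Hi.T @ Hi
QN = qNe * He.T @ He + qNi * Hi.T @ Hi
Q2 = np.array([[q2]])

S = QN.copy()
for k in range(N - 1, -1, -1):                    # backward recursion for S(k), L(k)
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + L.T @ Q2 @ L + (Phi - Gam @ L).T @ S @ (Phi - Gam @ L)

Ls, Lr, Li = L[0, 0], L[0, 1], L[0, 2]            # u(k) = -Ls*xs - Lr*xr - Li*xi
print(Ls, Lr, Li)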

Chapter 9
Stochastic LQ Control with Full State Information

9.1 Stochastic Optimal Control

Until now we have dealt exclusively with optimal control of deterministic systems, that is, we have determined the optimal controller, u(k) = -L(0)x(k), where x(k) may contain elements from the states of the system, a reference state, a disturbance state and an integral state vector, but these have been modelled as deterministic systems.
Now we will allow stochastic models for the states, and we will - as in the deterministic case - begin with the case with no reference and no deterministic disturbance.
Later it will be simple to include cases with references and disturbances by extending the state space with the necessary state variables.
We will differentiate distinctly between two cases, complete state information and incomplete state information.
In the case with complete state information, which we will treat first, all states are measured.
In the case with incomplete state information some states are not measured, and in addition the available measurements may be noisy. Therefore it is necessary to reconstruct the state vector using measured values of the outputs from the system, y(k), using an observer, also called a state estimator.

9.2 Full state information

In this section we consider the case where all state variables of the system can be measured at each sampling time.
We will describe the system using a stochastic state space model

x(k+1) = Φ x(k) + Γ u(k) + ex(k)

where ex(k) is a stochastic process and consequently x(k) is also stochastic.
At each sampling time k we can measure x(k) and determine u(k). In the time between samples k and k+1 the control signal u(k) will be known, but the next state vector x(k+1) will be uncertain because of the non-measurable stochastic noise ex(k).
The system has n state variables and p inputs. Φ has dimensions n x n and Γ dimensions n x p. Furthermore we will assume that the system is stabilizable and that the input signal u(k) is unlimited.
The state noise ex(k) is a vector of dimension n consisting of white, normally distributed sequences with

Expected value:       E{ex(k)} = 0
Variance:             E{ex(k) ex^T(k)} = Rex (symmetric, n x n)
Covariance matrix:    E{ex(k) ex^T(k+l)} = Rex δ(l)

About the initial state we assume:

Expected value:       E{x(0)} = xm(0)
Covariance matrix:    E{(x(0) - xm(0))(x(0) - xm(0))^T} = Rx(0)

Rex and Rx(0) are both positive definite.

9.3 Performance function

If we use the performance function from the deterministic case

I_0^N = Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N)

we will not be able to tell in advance (at sample 0) which values of u(k) will minimize the performance function, or what the value of the minimum will be. These values depend on the inputs ex(k), which are not known at the time we need to decide the value of u. If we know the stochastic properties of ex, as we assume, we may however calculate stochastic properties of the performance resulting from input sequences u1 and u2, for instance whether E{I1} < E{I2}.
We will use the expected value of the squared sum directly as performance function:

I_0^N = E{ Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N) }
      = E{ Σ_{k=0}^{N} H(x(k), u(k)) }

where the last expression may be regarded as an abbreviation.


We sear h the minimum of the performan e index with respe t to the input sequen e
u(0), u(1), u(2), , u(N 1), u(N ). We begin setting u(N ) = 0, and ontinue as in
the deterministi ase with a part of this interval, [k; N ].


I_k^N = E{ Σ_{i=k}^{N} H(x(i), u(i)) }

min_{u(k),...,u(N)} I_k^N = min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) }

This can be elaborated as follows

min_{u(k),...,u(N)} I_k^N
  = min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) | x(k) }
  = min_{u(k),...,u(N)} E{ H(x(k), u(k)) + Σ_{i=k+1}^{N} H(x(i), u(i)) | x(k) }
  = min_{u(k),...,u(N)} ( H(x(k), u(k)) + E{ Σ_{i=k+1}^{N} H(x(i), u(i)) } )
  = min_{u(k)} ( H(x(k), u(k)) + min_{u(k+1),...,u(N)} E{ Σ_{i=k+1}^{N} H(x(i), u(i)) } )
  = min_{u(k)} ( H(x(k), u(k)) + E{ min_{u(k+1),...,u(N)} E{ Σ_{i=k+1}^{N} H(x(i), u(i)) | x(k+1) } } )

Here E{...|x(k)} designates expectation conditioned on x(k). We will now introduce the term

J_k^N(x(k)) = min_{u(k),...,u(N)} E{ Σ_{i=k}^{N} H(x(i), u(i)) | x(k) }

with the relation

min_{u(k),...,u(N)} I_k^N = E{ J_k^N(x(k)) }

with

J_k^N(x(k)) = min_{u(k)} ( H(x(k), u(k)) + E{ J_{k+1}^N(x(k+1)) } )

Using H(x(k), u(k)) = x^T(k) Q1 x(k) + u^T(k) Q2 u(k) we achieve:

min_{u(k),...,u(N)} I_k^N = E{ J_k^N(x(k)) }

with

J_k^N(x(k)) = min_{u(k)} ( x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + E{ J_{k+1}^N(x(k+1)) } )

To solve the minimization we will use the following


9.4 Expectation of a Quadratic Form

An expression for the expectation of a quadratic form like

E{ v^T(k) A v(k) }

will be derived for use in the following sections. v(k) is assumed to be a stochastic process with expectation vm(k) and covariance matrix Rv, and A is a constant square matrix.

E{ v^T(k) A v(k) } = E{ (v(k) - vm(k))^T A (v(k) - vm(k)) } + E{ v^T(k) A vm(k) } + E{ vm^T(k) A v(k) } - E{ vm^T(k) A vm(k) }
                   = E{ (v(k) - vm(k))^T A (v(k) - vm(k)) } + vm^T(k) A vm(k) + vm^T(k) A vm(k) - vm^T(k) A vm(k)
                   = vm^T(k) A vm(k) + E{ (v(k) - vm(k))^T A (v(k) - vm(k)) }
                   = vm^T(k) A vm(k) + E{ tr[ A (v(k) - vm(k))(v(k) - vm(k))^T ] }
                   = vm^T(k) A vm(k) + tr[ E{ A (v(k) - vm(k))(v(k) - vm(k))^T } ]
                   = vm^T(k) A vm(k) + tr[ A E{ (v(k) - vm(k))(v(k) - vm(k))^T } ]
                   = vm^T(k) A vm(k) + tr[ A Rv ]

tr[A] is termed the trace of A and is defined as the sum of the diagonal elements of A.
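The identity can be spot checked numerically by Monte Carlo simulation. The following short sketch is not part of the notes; the matrices A, vm and Rv are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(0)
A  = np.array([[2.0, 0.5], [0.5, 1.0]])
vm = np.array([1.0, -2.0])
Rv = np.array([[0.5, 0.1], [0.1, 0.3]])

v = rng.multivariate_normal(vm, Rv, size=200_000)        # samples of v(k)
mc = np.mean(np.einsum('ni,ij,nj->n', v, A, v))          # Monte Carlo estimate of E{v'Av}
exact = vm @ A @ vm + np.trace(A @ Rv)                   # vm'A vm + tr[A Rv]
print(mc, exact)                                         # the two values agree closely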

9.5 Derivation of Recursive Expressions for L(k) and S(k)

In the deterministic case we assumed the minimum of the performance index to have the form:

J_k^N(x(k)) = x^T(k) S(k) x(k)

In the stochastic case we will assume the form

J_k^N(x(k)) = x^T(k) S(k) x(k) + w(k)

where the extra term w(k) is a scalar independent of x(k).
This implies:

J_k^N(x(k)) = x^T(k) S(k) x(k) + w(k)
  = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + E{ x^T(k+1) S(k+1) x(k+1) + w(k+1) } ]
  = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + E{ (Φx(k) + Γu(k) + ex(k))^T S(k+1) (Φx(k) + Γu(k) + ex(k)) + w(k+1) } ]
  = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + (Φx(k) + Γu(k))^T S(k+1) (Φx(k) + Γu(k)) + E{ ex^T(k) S(k+1) ex(k) } + w(k+1) ]

In the last expression two terms have vanished because ex(k) is uncorrelated with x(k) and u(k). The expectation of ex(k) is zero and the covariance matrix is E{ex(k) ex^T(k)} = Rex. Using the expression for the expectation of a quadratic form we find:

J_k^N(x(k)) = min_{u(k)} [ x^T(k) Q1 x(k) + u^T(k) Q2 u(k) + (Φx(k) + Γu(k))^T S(k+1) (Φx(k) + Γu(k)) + tr[S(k+1) Rex] + w(k+1) ]

dJ_k^N(x(k))/du(k) = 2 Q2 u(k) + 2 Γ^T S(k+1) (Φx(k) + Γu(k)) = 0

for

u(k) = u°(k) = -[Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ x(k)

This value of u(k) will give the minimal performance contribution for the interval [k; N], J_k^N(x(k)). The optimal controller can now be written

u°(k) = -L(k) x(k)

with

L(k) = [Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ

Insertion of u°(k) = -L(k)x(k) in J_k^N(x(k)) gives

J_k^N(x(k)) = x^T(k) S(k) x(k) + w(k)
  = x^T(k) Q1 x(k) + x^T(k) L^T(k) Q2 L(k) x(k) + (Φx(k) - ΓL(k)x(k))^T S(k+1) (Φx(k) - ΓL(k)x(k)) + tr[S(k+1) Rex] + w(k+1)
  = x^T(k) [ Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k)) ] x(k) + tr[S(k+1) Rex] + w(k+1)

From this can be seen:

S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))
w(k) = tr[S(k+1) Rex] + w(k+1)

with S(N) = QN and w(N) = 0.

The expressions for L(k) and S(k) are identical to the corresponding expressions for the deterministic case.
The two alternative expressions for S(k) will therefore also be identical and are given in the summary for completeness.
All we need now is to calculate the value of the performance index obtained by use of the optimal controller.


min_{u(0),...,u(N)} I_0^N = E{ J_0^N(x(0)) }
  = E{ x^T(0) S(0) x(0) + w(0) }
  = xm^T(0) S(0) xm(0) + tr[S(0) Rx(0)] + Σ_{k=0}^{N-1} tr[S(k+1) Rex]

The first term of the minimal performance index corresponds to the deterministic case, while the last two terms are due to the noise affecting the states.
The two first terms are constant, while the last term grows with N.
We can conclude that for the stochastic system with complete state information, the optimal controller will be identical to the one derived for the deterministic case (with no state noise).
The only difference is that the minimal performance index is larger in the stochastic case.


9.6 Summary of LQ Method for stochastic, discrete time systems with complete state information

For the linear, discrete time, stochastic system

x(k+1) = Φ x(k) + Γ u(k) + ex(k)

with

E{ex(k)} = 0,    E{ex(k) ex^T(k)} = Rex
E{x(0)} = xm(0),    E{(x(0) - xm(0))(x(0) - xm(0))^T} = Rx(0)

and a quadratic performance index

I = E{ Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N) }

where Q1 and QN are positive definite and Q2 is positive semidefinite, the optimal input sequence will be determined by:

u(k) = -L(k) x(k),

with

L(k) = [Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ,

with

S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))
     = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ
     = Q1 + Φ^T S(k+1) (Φ - ΓL(k)),

and

S(N) = QN

With this input sequence we will obtain the performance index:

min I_0^N = xm^T(0) S(0) xm(0) + tr[S(0) Rx(0)] + Σ_{k=0}^{N-1} tr[S(k+1) Rex]

where the two first terms are as in the deterministic case and the last term is due to the state noise.
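As a numerical illustration of the summary, the following sketch runs the backward recursion and evaluates the minimal performance index including the trace terms. It is not from the notes; all system matrices, weights and covariances below are assumed values chosen only for illustration.

import numpy as np

Phi = np.array([[1.0, 0.1], [0.0, 0.9]])
Gam = np.array([[0.0], [0.1]])
Q1, Q2, QN = np.diag([1.0, 0.1]), np.array([[0.05]]), np.diag([1.0, 0.1])
Rex, Rx0 = 0.01 * np.eye(2), 0.1 * np.eye(2)
xm0, N = np.array([[1.0], [0.0]]), 50

S, noise_cost = QN.copy(), 0.0
for k in range(N - 1, -1, -1):
    noise_cost += np.trace(S @ Rex)        # tr[S(k+1) Rex]; S currently holds S(k+1)
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + L.T @ Q2 @ L + (Phi - Gam @ L).T @ S @ (Phi - Gam @ L)

I_min = float(xm0.T @ S @ xm0) + np.trace(S @ Rx0) + noise_cost
print(I_min)      # xm(0)' S(0) xm(0) + tr[S(0) Rx(0)] + sum of tr[S(k+1) Rex]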

Chapter 10
Stochastic LQ Control with Incomplete State Information

10.1 Incomplete State Information

If it is not - as assumed till now - possible to measure all state variables of the system, it is said to have incomplete state information.
In this case it is necessary to reconstruct/estimate/observe the state vector of the system from simultaneous values of the inputs to the system u(k) and the outputs y(k). The measurements of the output will be uncertain because of a stochastic noise contribution, which will certainly increase the value of the performance index.
The system can be modelled using a stochastic state space description:

x(k+1) = Φ x(k) + Γ u(k) + ex(k)
y(k) = H x(k) + ey(k)

We will for the present assume that the state noise ex(k) and the output noise ey(k) are both white, normally distributed and mutually independent sequences with mean value zero and with covariance matrices Rex and Rey.
The performance index is as in the case with complete state information

I = E{ Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N) }

This apparently complicated problem - to find the optimal controller for a system which has both state noise and output noise, implying that exact knowledge of the states is not available - can fortunately be solved relatively simply using the separation theorem, which we will state without proof.


10.2 Separation theorem

The problem of determining the optimal controller for a system with incomplete state information can be separated into two parts:

1. Determine an estimator giving the optimal estimate of the system state vector from observations of the system input and output. This is also called an observer.

2. Determine an optimal feedback law from the estimated states. This feedback law is the same as if complete state information was available.

In the following two pages a summary of the LQ method for a stochastic discrete time system with incomplete state information will be given.


10.3 Summary of LQ method for stochastic, discrete time systems with incomplete state information

For the linear, discrete time, stochastic system

x(k+1) = Φ x(k) + Γ u(k) + ex(k)
y(k) = H x(k) + ey(k)

where ex(k) and ey(k) are white, normally distributed, mutually independent sequences with

E{ex(k)} = 0,    E{ex(k) ex^T(k)} = Rex
E{ey(k)} = 0,    E{ey(k) ey^T(k)} = Rey
E{x(0)} = xm(0),    E{(x(0) - xm(0))(x(0) - xm(0))^T} = Rx(0)

and with a quadratic performance index:

I = E{ Σ_{k=0}^{N-1} (x^T(k) Q1 x(k) + u^T(k) Q2 u(k)) + x^T(N) QN x(N) }

where Q1 and QN are positive definite and Q2 is positive semidefinite, the optimal input sequence will be determined by:

u(k) = -L(k) x̂(k),

with

L(k) = [Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ,

with

S(k) = Q1 + L^T(k) Q2 L(k) + (Φ - ΓL(k))^T S(k+1) (Φ - ΓL(k))
     = Q1 + Φ^T S(k+1) Φ - Φ^T S(k+1) Γ [Q2 + Γ^T S(k+1) Γ]^{-1} Γ^T S(k+1) Φ
     = Q1 + Φ^T S(k+1) (Φ - ΓL(k)),

and

S(N) = QN

and where the optimal observer x̂(k), giving minimum variance of the reconstruction error x(k) - x̂(k), is determined by:

x̂(k+1) = Φ x̂(k) + Γ u(k) + K(k)[y(k) - H x̂(k)]

with

K(k) = Φ P(k) H^T [Rey + H P(k) H^T]^{-1}

where

P(k+1) = Rex + K(k) Rey K^T(k) + (Φ - K(k)H) P(k) (Φ - K(k)H)^T
       = Rex + Φ P(k) Φ^T - Φ P(k) H^T [Rey + H P(k) H^T]^{-1} H P(k) Φ^T
       = Rex + (Φ - K(k)H) P(k) Φ^T,

and

P(0) = Rx(0)

With this input sequence we obtain the performance index:

min I_0^N = xm^T(0) S(0) xm(0) + tr[S(0) Rx(0)] + Σ_{k=0}^{N-1} tr[S(k+1) Rex] + Σ_{k=0}^{N-1} tr[P(k) L^T(k) Γ^T S(k+1) Φ]

where the two first terms are as in the deterministic case, the third term is due to the state noise and the last term is due to the output noise.
The structure of the controller is shown in the block diagram in figure 10.1.
References and Disturbances

If the reference is not zero and/or the disturbances ex(k) and ey(k) are not white sequences, you can, as in the deterministic case, model the reference and/or disturbance and include the states of this model in an extended state vector.
The system with extended state vector will then have a performance index in standard form, and you determine an optimal controller as

u(k) = -L(0) x̂(k)

where the extended state vector is used:

u(k) = -[Ls(0) Lr(0) Lex(0) Ley(0)] [x̂s(k); x̂r(k); x̂ex(k); x̂ey(k)]
     = -Ls(0) x̂s(k) - Lr(0) x̂r(k) - Lex(0) x̂ex(k) - Ley(0) x̂ey(k)


Figure 10.1: Structure of the optimal controller with state observer.
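The structure in figure 10.1 can be sketched in a few lines of code. The following is not part of the notes; the plant, weights and covariances below are assumptions chosen for illustration. The (near) stationary gains L and K are obtained by simply iterating the recursions from the summary, and the closed loop is then simulated with u(k) = -L x̂(k).

import numpy as np

rng = np.random.default_rng(1)
Phi = np.array([[1.0, 0.1], [0.0, 0.95]])
Gam = np.array([[0.005], [0.1]])
H   = np.array([[1.0, 0.0]])
Q1, Q2   = np.diag([1.0, 0.1]), np.array([[0.1]])
Rex, Rey = 0.01 * np.eye(2), np.array([[0.05]])

S, P = Q1.copy(), 0.1 * np.eye(2)
for _ in range(500):                          # iterate to (near) stationary L and K
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + L.T @ Q2 @ L + (Phi - Gam @ L).T @ S @ (Phi - Gam @ L)
    K = Phi @ P @ H.T @ np.linalg.inv(Rey + H @ P @ H.T)
    P = Rex + K @ Rey @ K.T + (Phi - K @ H) @ P @ (Phi - K @ H).T

x, xhat = np.array([[1.0], [0.0]]), np.zeros((2, 1))
for k in range(100):                          # closed loop: u(k) = -L xhat(k)
    u = -L @ xhat
    y = H @ x + rng.multivariate_normal(np.zeros(1), Rey).reshape(1, 1)
    xhat = Phi @ xhat + Gam @ u + K @ (y - H @ xhat)          # Kalman predictor
    x = Phi @ x + Gam @ u + rng.multivariate_normal(np.zeros(2), Rex).reshape(2, 1)
print(np.round(x.ravel(), 3), np.round(xhat.ravel(), 3))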

10.4 Duality between controller and observer

In the summary for the case with incomplete state information you notice a distinct duality between the controller part and the observer part.
For the steady state controller and observer, note

Controller:  L = [Q2 + Γ^T S Γ]^{-1} Γ^T S Φ
             S = Q1 + L^T Q2 L + (Φ - ΓL)^T S (Φ - ΓL),    S(N) = QN

Observer:    K = Φ P H^T [Rey + H P H^T]^{-1}
             P = Rex + K Rey K^T + (Φ - KH) P (Φ - KH)^T,    P(0) = Rx(0)

Immediately you see the following duality:
Controller    Observer
Q2            Rey
Γ^T           H
S             P
Q1            Rex
L^T           K
QN            Rx(0)

This remarkable connection makes it possible, in principle, to use the same software for computation of the optimal controller and the optimal observer.
You also see that the recursive iteration moves backward in time for the controller part and forward in time for the observer. In the calculations this is of no importance.
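The duality can be verified directly in a few lines: calling the controller recursion with the dual arguments from the table above reproduces the observer recursion. The system below is an assumed example, not from the notes.

import numpy as np

def riccati_step(Phi, Gam, Q1, Q2, S):
    # One step of the controller recursion for L and S
    L = np.linalg.solve(Q2 + Gam.T @ S @ Gam, Gam.T @ S @ Phi)
    S = Q1 + L.T @ Q2 @ L + (Phi - Gam @ L).T @ S @ (Phi - Gam @ L)
    return L, S

Phi = np.array([[1.0, 0.1], [0.0, 0.9]])
H   = np.array([[1.0, 0.0]])
Rex, Rey, Rx0 = 0.02 * np.eye(2), np.array([[0.1]]), 0.5 * np.eye(2)

# Observer recursion (forward in time)
P, K = Rx0.copy(), None
for _ in range(200):
    K = Phi @ P @ H.T @ np.linalg.inv(Rey + H @ P @ H.T)
    P = Rex + K @ Rey @ K.T + (Phi - K @ H) @ P @ (Phi - K @ H).T

# Same controller routine, dual arguments (Phi', H', Rex, Rey, Rx(0))
S, Ld = Rx0.copy(), None
for _ in range(200):
    Ld, S = riccati_step(Phi.T, H.T, Rex, Rey, S)

print(np.allclose(K, Ld.T), np.allclose(P, S))   # both True: K = Ld' and P = S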

10.5 Innovation model

The optimal observer just derived is termed the Kalman predictor:

x̂(k+1) = Φ x̂(k) + Γ u(k) + K(k)[y(k) - ŷ(k)]
ŷ(k) = H x̂(k)

The calculation of K(k) is done by solving the Riccati equation and presumes knowledge of the covariances Rex, Rey and Rx(0).

Often this information is not available!

In this case you have two possibilities:

1. You can use Rex and Rey as design parameters, precisely as you use Q1 and Q2 in the controller part. Using iterations of guesses on Rex and Rey and calculations of pole placements or simulations, you can rather quickly determine an observer having suitable dynamics.

2. As an alternative the problem may be solved using an observer based on an innovation model.
This can be derived from the Kalman predictor by the introduction of the prediction error ε(k) = y(k) - ŷ(k), giving:

x̂(k+1) = Φ x̂(k) + Γ u(k) + K(k) ε(k)
y(k) = ŷ(k) + ε(k)
ŷ(k) = H x̂(k)

The essential difference between the Kalman predictor and the innovation model is that in the Kalman predictor K(k) must be calculated using the often well known Φ, Γ and H together with the most often poorly known covariance matrices Rex, Rey and Rx(0). On the other hand, K in the innovation model is directly parameterized and can be determined using system identification together with the Φ, Γ and H matrices.


Chapter 11
Deterministic Observer Design

For stochastically modelled systems it has just been described how an optimal observer can be introduced, either in the form of the Kalman predictor or in the form of the innovation model in connection with system identification.
For deterministically modelled systems one cannot speak of an optimal observer; here we must choose the K matrix ourselves, such that the observer obtains a suitable dynamics relative to the dynamics of the system.
We will now describe how to design different types of observers, namely:

Prediction observers, "full state".
Current observers, "full state".
Observers of reduced order.

11.1 Prediction observers

The system is described by

x(k+1) = Φ x(k) + Γ u(k)
y(k) = H x(k)

It then seems natural to construct a model of the system as

x̂(k+1) = Φ x̂(k) + Γ u(k)
ŷ(k) = H x̂(k)

where x̂(k) is an estimate of x(k). Introducing the observer error x̃(k) = x(k) - x̂(k) gives:

x̃(k+1) = Φ x̃(k)

Regardless of the initial error x̃(0), the observer error will thus converge towards the zero vector with a dynamics determined by the matrix Φ - provided of course that Φ represents a stable system. Structurally, this is an open loop observer: a model of the system running in parallel with it, driven only by the input u(k).
Since one of the main purposes of control is precisely to change an unsatisfactory open loop dynamics (Φ), the proposed model will be a poor observer.
It is then obvious to form an observer by feedback of the difference between the measured output y(k) and the estimated ŷ(k).
If the feedback matrix is called K, the following closed loop observer is obtained:

x̂(k+1) = Φ x̂(k) + Γ u(k) + K[y(k) - ŷ(k)]
        = Φ x̂(k) + Γ u(k) + K[y(k) - H x̂(k)]

This is called a prediction observer, because the estimate x̂(k+1) is determined on the basis of the measurements up to y(k). The prediction observer has in principle the same structure as the Kalman filter.
Introducing again the observer error x̃(k) = x(k) - x̂(k) gives

x̃(k+1) = (Φ - KH) x̃(k)

This is a homogeneous equation and the dynamics of the observer error is determined by (Φ - KH). If this matrix has stable eigenvalues, x̃(k) will thus converge towards the zero vector independently of the initial error x̃(0).
In other words x̂(k) will converge towards x(k) independently of the initial estimate x̂(0), and this can happen faster than the normal (open loop) motion of x(k). If Φ, Γ and H are not known exactly - or if they vary slightly with time - the observer error will of course not be given exactly by x̃(k+1) = (Φ - KH)x̃(k), but with a sensible choice of the K matrix the observer will still be stable and the observer error acceptably small.

11.2 Choice of the K matrix

The feedback matrix K of the observer is chosen as follows:

1. Based on knowledge of the poles of the system (usually identical to the eigenvalues of Φ), suitable desired observer poles β1, β2, ..., βn are chosen.
These are of course placed inside the unit circle in the z-plane, and if they are furthermore placed close to the origin one gets the advantage of a rapidly decaying observer error.
On the other hand, the observer then becomes more sensitive to measurement errors or noise in the output signal y(k).
The observer is adjusted according to the difference y(k) - ŷ(k) = y(k) - H x̂(k), and a measurement error or noise will erroneously be interpreted by the observer as a poor estimate x̂(k), which causes x̂(k) to be quickly adjusted to a new (erroneous) value.
The choice of observer poles thus becomes a compromise between observer speed and noise sensitivity.
Normally, however, the desired observer poles β1, β2, ..., βn are placed somewhat closer to the origin than the system poles.
2. The feedback matrix K is now determined such that the eigenvalues of the matrix Φ - KH become equal to the desired observer poles, that is:

det(zI - Φ + KH) = (z - β1)(z - β2) ... (z - βn)
                 = z^n + α1 z^(n-1) + α2 z^(n-2) + ... + αn
                 = α(z)

The determination of K is of course conditioned on the system being observable, i.e. that the observability matrix O, where

O = [H; HΦ; HΦ^2; ...; HΦ^(n-1)]

has full rank.

Note that for a system with more than one output, K will not be uniquely determined, since the above matching between the eigenvalues of the matrix Φ - KH and the desired observer poles gives fewer equations than there are elements in K.
If the system is observable and single-output, the K matrix can alternatively be determined uniquely by means of Ackermann's formula:

K = α(Φ) [H; HΦ; HΦ^2; ...; HΦ^(n-1)]^{-1} [0; 0; 0; ...; 1]
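A small sketch of this pole placement using Ackermann's formula follows. It is not part of the notes; the system matrices and the desired observer poles are assumptions chosen only for illustration.

import numpy as np

Phi = np.array([[1.0, 0.1], [0.0, 0.9]])
H   = np.array([[1.0, 0.0]])
betas = [0.2, 0.3]                                   # desired observer poles

n = Phi.shape[0]
alpha = np.poly(betas)                               # coefficients of alpha(z)
alpha_Phi = sum(c * np.linalg.matrix_power(Phi, n - i) for i, c in enumerate(alpha))
O = np.vstack([H @ np.linalg.matrix_power(Phi, i) for i in range(n)])   # observability matrix
e_n = np.zeros((n, 1)); e_n[-1] = 1.0

K = alpha_Phi @ np.linalg.solve(O, e_n)              # K = alpha(Phi) O^{-1} [0 ... 0 1]'
print(np.round(np.linalg.eigvals(Phi - K @ H), 4))   # eigenvalues land at the desired poles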
11.3 Current observer

In the prediction observer just described, the estimate x̂(k) was, as mentioned, determined on the basis of a measurement y(k-1).
This means that an optimal controller u(k) = -L x̂(k) does not depend on the current value of the output, and therefore is not as accurate as it might be.
In some cases - higher order systems with a computation time that is large compared to the sampling time - this delay between a measurement y(k-1) and a control action u(k) = -L x̂(k) can actually be a blessing.
In other cases - lower order systems with a computation time that is small compared to the sampling time - one may on the other hand become restless from having to wait almost a whole sampling period before the new control signal can be applied, since the signal has become outdated in the meantime.
It is therefore natural to formulate a so-called current observer, which bases the estimate x̂(k) on the current output y(k).
The method separates the prediction and the correction. If at time k we have an estimate x̂(k), we predict the next state by

x̄(k+1) = Φ x̂(k) + Γ u(k)

At time k+1 the output y(k+1) is measured, after which x̄(k+1) is corrected:

x̂(k+1) = x̄(k+1) + K[y(k+1) - H x̄(k+1)]

x̂(k) is then called a current observer for x.
In practice this observer cannot of course be implemented exactly, since it is impossible to measure, perform computations and control without a certain delay.
However, this delay between reading the measurement and applying the input can be minimized by using the last part of each sampling period to carry out in advance the part of the computations that is based on "old" values.
Computing again the observer error x̃(k) = x(k) - x̂(k), elimination of x̄(k) gives:

x̃(k+1) = [Φ - KHΦ] x̃(k)
The feedback matrix K is therefore found exactly as described for the prediction observer, with H simply replaced by HΦ, i.e. K is determined by

det(zI - Φ + KHΦ) = α(z)

where α(z) = z^n + α1 z^(n-1) + α2 z^(n-2) + ... + αn is the desired characteristic polynomial of the observer.
For an observable, single-output system K can alternatively be determined by Ackermann's formula:

K = α(Φ) [HΦ; HΦ^2; HΦ^3; ...; HΦ^n]^{-1} [0; 0; 0; ...; 1]

11.4 Observer of reduced order

The two observers described above were designed to estimate the complete state vector x(k) by means of measurements of the input u(k) and the output y(k), together with knowledge of the system matrices Φ, Γ and H.
Often, however, some of the elements of the state vector x(k) can be measured directly, and one may therefore ask whether it is necessary to estimate the complete state vector, or whether it is sufficient to estimate only the non-measurable state elements.
If the measurements are heavily contaminated by noise, one will often choose to estimate the complete state vector (full state observer), since the noise will then be filtered by the dynamics of the observer.
If the measurements are not particularly noisy, one can on the other hand settle for estimating the non-measurable state elements (reduced order observer).
In this case the state vector x(k) is partitioned into two parts, where xa(k) is the part containing the measurable state elements, and xb(k) contains the non-measurable state elements.
The latter must therefore be estimated as x̂b(k).
The state space description of the system is thus rewritten in the form:

[xa(k+1); xb(k+1)] = [Φaa, Φab; Φba, Φbb] [xa(k); xb(k)] + [Γa; Γb] u(k)

This corresponds to the following two equations:

xb(k+1) = Φbb xb(k) + Φba xa(k) + Γb u(k)          (known input)
xa(k+1) - Φaa xa(k) - Γa u(k) = Φab xb(k)          (known output)

Comparing these with a normal state space description

x(k+1) = Φ x(k) + Γ u(k)
y(k) = H x(k)

which has the prediction observer

x̂(k+1) = Φ x̂(k) + Γ u(k) + K[y(k) - H x̂(k)]
it is seen that the reduced order observer is found by introducing the following substitutions:

x(k)   →  xb(k)
Φ      →  Φbb
Γu(k)  →  Φba xa(k) + Γb u(k)
y(k)   →  xa(k+1) - Φaa xa(k) - Γa u(k)
H      →  Φab

Hereby the reduced order observer becomes:

x̂b(k+1) = Φbb x̂b(k) + Φba xa(k) + Γb u(k) + K[xa(k+1) - Φaa xa(k) - Γa u(k) - Φab x̂b(k)]

Forming as before the observer error x̃b(k) = xb(k) - x̂b(k) gives:
x̃b(k+1) = [Φbb - KΦab] x̃b(k)

The feedback matrix K is therefore found, as described earlier, from

det(zI - Φbb + KΦab) = α(z)

where α(z) is the desired characteristic polynomial of order nb.
For an observable system with only one measurable state xa(k), K can alternatively be found by Ackermann's formula:

K = α(Φbb) [Φab; Φab Φbb; Φab Φbb^2; ...; Φab Φbb^(nb-1)]^{-1} [0; 0; 0; ...; 1]

Chapter 12
Stability of Multivariable Systems

12.1 Stability for Multivariable Systems in general

In this chapter we will consider stability of multivariable systems. We will confine ourselves to linear continuous time systems with linear state feedback. The material in this section is general for state feedback. In section 12.2 stability for linear quadratic regulators is considered.
In SISO systems stability analysis can be based on R(s) = 1 + T(s), where T(s) is the open loop transfer function and R(s) is denoted the return difference.
Also in MIMO systems the return difference matrix R(s) = I + T(s) has an important role for stability, but whereas it in SISO systems is a uniquely defined scalar function of the Laplace operator, it will for MIMO systems depend on where the loop is 'cut' open to calculate the loop gain.
We can demonstrate this:

System:     ẋ(t) = A x(t) + B u(t)
            y(t) = C x(t)
Regulator:  u(t) = -L x(t)

In the Laplace domain we have:

x(s) = (sIn - A)^{-1} B u(s) = F(s) B u(s)
u(s) = -L x(s)

where we have introduced the shorthand notation F(s) = (sIn - A)^{-1}.


Opening the loop at the input (I) gives the open loop matrix T(s) = LF(s)B. The return difference matrix will here be of dimension r, where r is the number of inputs:

RI(s) = Ir + L F(s) B

Opening the loop at the output side (II) gives us the loop gain T(s) = F(s)BL, and the return difference matrix is of dimension n, equal to the number of state variables:

Figure 12.1: Illustration of points to open the loop.

RII(s) = In + F(s) B L


Since RI(s) most often will be of lower order than RII(s), it will most often be preferable to consider RI(s) when stability is analysed.
We will now find a relation between the return difference matrix and the characteristic polynomials for open and closed loop.
We start with the closed loop characteristic polynomial

det(sI - A + BL) = det[(sI - A)(I + (sI - A)^{-1} BL)]
                 = det(sI - A) det(I + (sI - A)^{-1} BL)
                 = det(sI - A) det(I + F(s) BL)
                 = det(sI - A) det(RII(s))
                 = det(sI - A) det(RI(s))


This gives us the relation

det(R(s)) = det(sI - A + BL) / det(sI - A) = Pc(s) / Po(s)
          = (characteristic polynomial for closed loop) / (characteristic polynomial for open loop)

If we write the relation as

Pc(s) = Po(s) det(R(s))

and if we suppose that the system is open loop stable, such that Po(s) has no zeros in the right half plane, the system will be closed loop stable if and only if det(R(s)) has no zeros in the right half plane.
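The relation Pc(s) = Po(s) det(R(s)) is easy to spot check numerically at a single test point. The sketch below is not from the notes; the matrices A, B and L are assumptions chosen only for illustration.

import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.array([[0.0], [1.0]])
L = np.array([[3.0, 1.5]])
s0 = 1.0 + 2.0j                                   # arbitrary test point

I2 = np.eye(2)
F  = np.linalg.inv(s0 * I2 - A)                   # F(s0) = (sI - A)^{-1}
RI = np.eye(1) + L @ F @ B                        # return difference, loop cut at the input
Pc = np.linalg.det(s0 * I2 - A + B @ L)           # closed loop characteristic polynomial at s0
Po = np.linalg.det(s0 * I2 - A)                   # open loop characteristic polynomial at s0
print(np.isclose(Pc, Po * np.linalg.det(RI)))     # True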
This connection between the closed loop characteristic polynomial and the return difference matrix makes it possible to generalize the Nyquist criterion for SISO systems to include also MIMO systems.
It is therefore a condition for closed loop stability that the mapping of

det(R(jω)) = det(Ir + L F(jω) B)

encircles the origin one time counterclockwise for each open loop pole in the right half plane, when s goes one time round the Nyquist path which encircles all the unstable poles.
In figure 12.2 this is illustrated for an open loop stable system, for which the map must not encircle the origin.

Figure 12.2: Closed loop stability using plot of the return difference matrix det(R(jω)).

Since the determinant of a matrix is the product of the eigenvalues of the matrix, which in this case are functions li(jω),

det(R(jω)) = Π_{i=1}^{r} li(jω)

the closed loop stability for an open loop stable system can also be formulated such that none of the eigenvalue functions li(jω) of the return difference matrix R(jω) may encircle the origin of the complex plane when s traverses the Nyquist path.
This may also be illustrated for a stable system.

Figure 12.3: Closed loop stability using plot of eigenvalues li(jω) of R(jω).

Since R(s) = Ir + LF(s)B, the eigenvalues of R(s) can also be expressed as

li(s) = 1 + ki(s)

where ki(s) are the eigenvalues of the loop transfer matrix T(s) = LF(s)B.
Figure 12.4: Closed loop stability using plots of eigenvalues ki(jω) of the open loop transfer matrix.

12.2 Stability for an LQ-controlled System

In this section we will use the MIMO stability tests from the previous section to derive stability conditions for LQ-controlled systems.

System:             ẋ(t) = A x(t) + B u(t)
                    y(t) = C x(t)

Performance index:  I = ∫_0^∞ (y^T(t) Q1y y(t) + u^T(t) Q2 u(t)) dt
                      = ∫_0^∞ (x^T(t) Q1 x(t) + u^T(t) Q2 u(t)) dt
with                Q1 = C^T Q1y C

Controller:         u(t) = -L x(t)
with                L = Q2^{-1} B^T S, where
                    0 = C^T Q1y C + A^T S + S A - S B Q2^{-1} B^T S
We use the stationary LQ-controller. With the abbreviation F(s) = (sIn - A)^{-1} we have

Open loop input-output matrix:  W(s) = C F(s) B
Loop gain matrix:               T(s) = L F(s) B
Return difference matrix:       R(s) = Ir + T(s) = Ir + L F(s) B

We will now perform some manipulations of the stationary Riccati equation.

First we rearrange the terms and introduce L in the last term:

C^T Q1y C = -S A - A^T S + L^T Q2 L

On the right hand side we add and subtract the Laplace operator s multiplied by S:

C^T Q1y C = S(sI - A) + (-sI - A^T)S + L^T Q2 L

We multiply from the left with B^T(-sI - A^T)^{-1} and from the right with (sI - A)^{-1}B, and achieve


B^T(-sI - A^T)^{-1} C^T Q1y C (sI - A)^{-1} B = B^T(-sI - A^T)^{-1} S B + B^T S (sI - A)^{-1} B + B^T(-sI - A^T)^{-1} L^T Q2 L (sI - A)^{-1} B

Now we introduce F(s) = (sI - A)^{-1} and W(s) = C F(s) B:

W^T(-s) Q1y W(s) = B^T F^T(-s) S B + B^T S F(s) B + B^T F^T(-s) L^T Q2 L F(s) B
                 = B^T F^T(-s) S B Q2^{-1} Q2 + Q2 Q2^{-1} B^T S F(s) B + B^T F^T(-s) L^T Q2 L F(s) B
                 = B^T F^T(-s) L^T Q2 + Q2 L F(s) B + B^T F^T(-s) L^T Q2 L F(s) B

Q2 is added on both sides of the equation:

W^T(-s) Q1y W(s) + Q2 = Q2 (Ir + L F(s) B) + B^T F^T(-s) L^T Q2 (Ir + L F(s) B)
                      = (Ir + B^T F^T(-s) L^T) Q2 (Ir + L F(s) B)
                      = R^T(-s) Q2 R(s)

We have now achieved the following important relation between the return difference matrix R(s) and the open loop gain matrix W(s) for LQ-controlled systems:

R^T(-s) Q2 R(s) = Q2 + W^T(-s) Q1y W(s)        (12.1)

This relation is rewritten a bit further by introducing the eigenvalue functions li(jω) and the matching eigenvector functions vi(jω) of the return difference matrix R(jω), defined by

R(jω) vi(jω) = li(jω) vi(jω)
vi^H(jω) R^H(jω) = l̄i(jω) vi^H(jω)

where ^H denotes the conjugate transpose and l̄i the complex conjugate of li. On the imaginary axis, s = jω, relation (12.1) reads R^H(jω) Q2 R(jω) = Q2 + W^H(jω) Q1y W(jω). Now multiply this relation with vi^H(jω) from the left and with vi(jω) from the right:

vi^H(jω) R^H(jω) Q2 R(jω) vi(jω) = vi^H(jω) Q2 vi(jω) + vi^H(jω) W^H(jω) Q1y W(jω) vi(jω)

and achieve:

l̄i(jω) vi^H(jω) Q2 li(jω) vi(jω) = vi^H(jω) Q2 vi(jω) + vi^H(jω) W^H(jω) Q1y W(jω) vi(jω)

and:

|li(jω)|^2 vi^H(jω) Q2 vi(jω) = vi^H(jω) Q2 vi(jω) + vi^H(jω) W^H(jω) Q1y W(jω) vi(jω)

The last term on the right side is a positive number because Q1y is positive definite. Since also Q2 is positive definite we see that:

|li(jω)|^2 vi^H(jω) Q2 vi(jω) > vi^H(jω) Q2 vi(jω)

or

|li(jω)|^2 > 1,    i.e.    |li(jω)| > 1

This shows that all the eigenvalue functions li(jω) of R(jω) are numerically larger than 1, which implies that

|det(R(jω))| = |Π_{i=1}^{r} li(jω)| > 1

If we use this result in Nyquist considerations it simply implies that the plot of det(R(jω)) in the complex plane, for ω running from 0 to ∞, stays strictly outside a circle with centre in (0, 0) and with radius 1.
Figure 12.5: Sketch showing stability margins for the LQ regulator (the plot of det(R(jω)) stays outside the unit circle).


With the definitions of phase margin and gain margin known from SISO systems, it may be seen from the Nyquist curve that for a continuous time LQ-controlled system:

Phase margin > 60°
Gain margin  = ∞

With the use of continuous time LQ-control, nice stability margins are thus ensured.
Unfortunately these results do not extend to discrete time systems, for which Åström and Wittenmark have shown that the gain margin is finite.
When the controller is used with an observer the stability margins can not be guaranteed, as has been shown by Doyle and Stein, although the same authors also have given a recipe (LTR: Loop Transfer Recovery) for recovery of the properties.
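The guaranteed margins rest on the inequality |det R(jω)| ≥ 1 derived above, which can be illustrated numerically. The sketch below is not from the notes; the plant and weights are assumptions chosen for illustration, and the stationary Riccati solution is obtained with SciPy's solve_continuous_are.

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
Q1, Q2 = C.T @ np.eye(2) @ C, np.array([[0.2]])

S = solve_continuous_are(A, B, Q1, Q2)            # stationary Riccati solution
L = np.linalg.solve(Q2, B.T @ S)                  # L = Q2^{-1} B' S

w = np.logspace(-2, 2, 400)
detR = [np.linalg.det(np.eye(1) + L @ np.linalg.inv(1j * wi * np.eye(2) - A) @ B)
        for wi in w]
print(min(abs(np.asarray(detR))) >= 1.0 - 1e-9)   # True: det R(jw) never enters the unit circle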


Bibliography

[AM89]  Brian D. O. Anderson and John B. Moore. Optimal Control: Linear Quadratic Methods. Prentice-Hall Information and System Sciences Series. Prentice-Hall International, Englewood Cliffs, NJ, USA, 1989.

[ÅW84]  K. J. Åström and B. Wittenmark. Computer Controlled Systems: Theory and Design. Prentice-Hall Information and System Sciences Series. Prentice-Hall Inc., Englewood Cliffs, NJ, USA, 1984.

[FPW97] Gene F. Franklin, J. David Powell, and Michael L. Workman. Digital Control of Dynamic Systems. Addison Wesley, California, USA, 1997.

[KS72]  H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. Wiley Interscience, New York, USA, 1972.
