Control Engineering Practice 15 (2007) 1577–1587

www.elsevier.com/locate/conengprac
A stable self-learning PID control for multivariable time varying systems
D.L. Yu a,*, T.K. Chang a, D.W. Yu b
a Control Systems Research Group, School of Engineering, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK
b Department of Automation, Northeast University at Qinhuangdao, China
Available online 29 March 2007
Abstract
A stable self-learning PID (proportional–integral–derivative) control scheme for multivariable nonlinear systems with unknown dynamics is proposed in this paper. The control scheme is based on a neural network (NN) model of the plant. The NN model is adapted by an extended Kalman filter (EKF) to learn plant dynamic change, while the PID control parameters are adapted by the Lyapunov method to minimize squared tracking error. Therefore, the model output is guaranteed to converge to the desired trajectory asymptotically, and the plant output also tracks the desired trajectory due to model adaptation. The proposed scheme is evaluated by applying it to a simulated multivariable continuous stirred tank reactor (CSTR). The self-learning PID controller is also compared with a fixed parameter PID controller for a single-input single-output CSTR and the superiority of the self-learning PID is demonstrated.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Self-learning PID; Adaptive NN models; Nonlinear systems; EKF; CSTR
1. Introduction
It is well known that PID (proportional–integral–derivative) controllers have dominated industrial control applications for half a century, although there has been considerable research interest in the implementation of advanced controllers. This is because PID control has a simple structure that is easily understood by field engineers and is robust against disturbance and system uncertainty. As most industrial processes exhibit nonlinear dynamics over a wide operating range, different self-tuning PID control strategies have been investigated in the past two decades.
Relay feedback is a simple and reliable test that keeps the process output under closed-loop control and close to the operating point. Astrom and Hagglund (1984) combined the strengths of both PID and relay control and invented the relay auto-tuner for a single-input single-output (SISO) PID controller. The tuner has been widely and successfully applied in industry, and further research to improve this technique has followed (Hang, Astrom, & Ho, 1993; Ho, Hong, Hansson, Hjalmarsson, & Deng, 2003; Park, Sung, & Lee, 1997). A tutorial given by Hang, Astrom, and Wang (2002) outlined the recent developments in this aspect. Some other techniques have also been used in developing auto-tuning PID controllers for SISO systems, such as the gain and phase margin-based method (Ho, Hang, & Cao, 1995). In addition, Kim and Han (2006) applied a robust PID-like neuro-fuzzy controller to induction motor servo drive systems. Tavakoli, Griffin, and Fleming (2006) presented tuning of decentralized PID controllers for two-input two-output (TITO) processes. Gyongy and Clarke (2006) described automatic tuning and adaptation of a PID controller.
* Corresponding author. Tel.: +44 151 231 2229; fax: +44 151 298 2624.
E-mail address: D.Yu@ljmu.ac.uk (D.L. Yu).
0967-0661/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.conengprac.2007.02.004
Many industrial processes are inherently multivariable in nature and need multivariable control. Multivariable auto-tuning PID controllers have been in development for the past two decades. The early work includes a method for tuning the integral part of the multivariable PID controller developed by Davidson (1976). Pentinnen and Koivo (1980) proposed a method for tuning the P and I parts of the multivariable PID controller. The limitation of these methods is that some experimental and graphical procedures are required, which can be rather time consuming. Therefore, such methods are not suitable for on-line tuning. Zgorzelski, Unbehauen, and Niederlinski (1990), Loh, Hang, Quek, and Vasnani (1993) and Zhuang and Atherton (1994) developed multivariable PID controllers based on the method by Astrom and Hagglund (1984). For the early multivariable PID control, Koivo and Tanttu (1991) gave a survey of its tuning techniques. These techniques primarily aim to decouple the plant at certain frequencies. The decentralized PID control structure has also been targeted, with auto-tuning methods developed by Halevi, Palmor, and Efrati (1997) and Palmor, Halevi, and Krasney (1993).
Methods based on on-line parameter estimation have also been proposed for the automatic tuning of PID regulators. Some authors proposed auto-tuning regulators based on minimum variance, pole placement or linear quadratic Gaussian (LQG) design methods. Gawthrop (1986) and Radke and Isermann (1987) proposed auto-tuning PID using adaptive parameter estimation methods. Hang and his co-workers have proposed auto-tuning PID regulators using alternative methods, including a knowledge-based PID auto-tuner (Lee, Hang, Ho, & Yue, 1993). Based on the method given in Nishikawa, Sannomya, Ohta, and Tanaka (1984), Ruano, Fleming, and Jones (1992) proposed a connectionist approach to PID auto-tuning, which used integral measures of the step response as the input to neural networks to determine the required PID parameter values. However, most of these methods are for SISO systems.
In this paper, a self-learning PID control for multivariable time varying systems is proposed based on an NN model of the plant. The NN model is updated on-line with the EKF algorithm to learn changes in the plant dynamics, while the PID controller parameters are updated based on the plant output predictions by the model. The proposed auto-tuning algorithm is operated iteratively and is developed using the Lyapunov method. Hence, the convergence of the tracking control is guaranteed. The proposed auto-tuning PID controller is evaluated by applying it to a simulated two-input two-output CSTR process. In order to compare the auto-tuning PID controller with a fixed parameter PID controller, a simulated SISO CSTR is used as a test bench. The superiority of the developed method over the fixed parameter PID is clearly shown.
2. Adaptive neural network model
2.1. The NN model
For a multivariable nonlinear sampled-data system represented by the following NARX (nonlinear autoregressive with exogenous inputs) model,
$$y(k) = g[u(k-d-1), \ldots, u(k-d-n_u), y(k-1), \ldots, y(k-n_y)] + e(k), \tag{1}$$
where $u \in R^m$ and $y \in R^p$ are the sampled process input and output vectors, $n_u$ and $n_y$ are the input order and output order, respectively, $d$ denotes the process transmission delay and $e$ is a noise vector, a multi-layer perceptron (MLP) network of the following form can be used to model the system:
$$\hat{y}(k) = \hat{g}[u(k-d-1), \ldots, u(k-d-n_u), y(k-1), \ldots, y(k-n_y)], \tag{2}$$
where $\hat{y} \in R^p$ is the output estimated by the NN model and $\hat{g}(\cdot)$ is an approximation of the nonlinear function $g(\cdot)$. It has been previously proved (Funahashi, 1989) that if $g(\cdot)$ is sufficiently smooth, a network model can approximate it to any pre-specified accuracy, provided that enough hidden layer nodes are used. The commonly used structure of an MLP network with one hidden layer of $q$ neurons is adopted,
$$\hat{y}(k) = W^y \begin{bmatrix} o(k) \\ 1 \end{bmatrix}, \quad o(k) = f(z(k)), \quad z(k) = W^h \begin{bmatrix} x(k) \\ 1 \end{bmatrix}, \tag{3}$$
where $x(k) \in R^n$ is the network input vector and is given, according to (1), by
$$x(k) = [u(k-d-1)^T, \ldots, u(k-d-n_u)^T, y(k-1)^T, \ldots, y(k-n_y)^T]^T, \tag{4}$$
where $n = m n_u + p n_y$, $o(k) \in R^q$ is the hidden layer output, $W^h \in R^{q \times (n+1)}$ and $W^y \in R^{p \times (q+1)}$ are the weight matrices in the hidden and output layers, respectively, and $f(\cdot)$ is the nonlinear activation function in the hidden layer, for which the sigmoid activation function is used in this study,
$$o(k) = \frac{1}{1 + e^{-z(k)}}.$$
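The structure in (3) and (4) can be illustrated with a minimal sketch in plain Python. The function names `regressor` and `mlp_forward`, and the convention that `u_hist[j]` holds $u(k-1-j)$, are assumptions introduced here for illustration only; they do not come from the paper.

```python
import math

def regressor(u_hist, y_hist, n_u, n_y, d=0):
    """Stack past inputs and outputs into x(k) as in (4).

    Assumed layout: u_hist[j] holds u(k-1-j) and y_hist[j] holds y(k-1-j),
    each given as a list of floats.
    """
    x = []
    for j in range(d, d + n_u):      # u(k-d-1), ..., u(k-d-n_u)
        x.extend(u_hist[j])
    for j in range(n_y):             # y(k-1), ..., y(k-n_y)
        x.extend(y_hist[j])
    return x

def mlp_forward(W_h, W_y, x):
    """One-hidden-layer MLP of (3): returns (y_hat, o).

    W_h has q rows of length n+1, W_y has p rows of length q+1;
    the last column of each matrix multiplies the constant bias input 1.
    """
    x1 = list(x) + [1.0]
    z = [sum(w * v for w, v in zip(row, x1)) for row in W_h]
    o = [1.0 / (1.0 + math.exp(-zi)) for zi in z]     # sigmoid activation
    o1 = o + [1.0]
    y_hat = [sum(w * v for w, v in zip(row, o1)) for row in W_y]
    return y_hat, o
```

With $m = p = 2$ and $n_u = n_y = 2$, `regressor` returns a vector of length $n = m n_u + p n_y = 8$, which matches the 8:10:2 structure used later in Section 2.3.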
2.2. EKF training algorithm

The extended Kalman filter (EKF) is chosen for the on-line updating of the MLP network in this study because this algorithm is much faster than the commonly used back-propagation algorithm. The back-propagation algorithm is a gradient descent method used for nonlinear optimization, whereas the EKF algorithm applies the Kalman filter (KF) to the linearized nonlinear optimization problem. As linear optimization using the KF is much faster than nonlinear optimization using the back-propagation method, the EKF is adopted in this research for MLP model updating.
There are two conditions for the EKF to be applied. One is that the process dynamics must be differentiable, or smooth, so that the dynamics can be linearized around the current operating point. The second condition is that all relevant input/output data should be measurable, and therefore available for use by the EKF. Both conditions are satisfied in the updating of the MLP model and for the
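PID controller. The model parameters to be trained are the weight matrices $W^h$ and $W^y$. To enable these parameter matrices to be adjusted with the EKF algorithm, a parameter vector $\hat{\theta} \in R^{[q(n+1)+p(q+1)] \times 1}$ is formulated as follows:
$$\hat{\theta} = \begin{bmatrix} \hat{\theta}^h \\ \hat{\theta}^y \end{bmatrix} \tag{5}$$
with
$$\hat{\theta}^h = \begin{bmatrix} (w_1^h)^T \\ \vdots \\ (w_q^h)^T \end{bmatrix}, \quad \hat{\theta}^y = \begin{bmatrix} (w_1^y)^T \\ \vdots \\ (w_p^y)^T \end{bmatrix}, \tag{6}$$
where $w_i^h$ is the $i$th row vector in $W^h$ and $w_i^y$ is the $i$th row vector in $W^y$. Let the desired network model, which the network being trained is learning to achieve, have a parameter vector $\theta$ that is defined in the same way as above. This target network model is then described as
$$\theta(k+1) = \theta(k), \tag{7}$$
$$y(k) = \hat{y}(k) + e_m(k), \tag{8}$$
where
$$\hat{y}(k) = \hat{g}[\theta(k), x(k)] = W^y \begin{bmatrix} \left( \dfrac{1}{1 + e^{-W^h [x^T(k) \;\; 1]^T}} \right)^T & 1 \end{bmatrix}^T \tag{9}$$
and $e_m(k) \in R^p$ is the modelling error vector. The parameter vector of the target network, $\theta$, is estimated by $\hat{\theta}$, which is updated in every sample step by applying the EKF algorithm to Eqs. (8) and (9) as
$$\hat{\theta}(k) = \hat{\theta}(k-1) + K_w(k)[y(k) - \hat{y}_{k|k-1}], \tag{10}$$
$$E_w(k) = [I - K_w(k) C_w(k)] E_w(k-1), \tag{11}$$
where
$$\hat{y}_{k|k-1} = \hat{g}[\hat{\theta}(k-1), x(k)], \tag{12}$$
$$K_w(k) = E_w(k-1) C_w(k)^T [R_{k|k-1} + C_w(k) E_w(k-1) C_w(k)^T]^{-1}, \tag{13}$$
$$C_w(k) = \left. \frac{\partial \hat{y}(k)}{\partial \theta} \right|_{\theta = \hat{\theta}(k-1)} \tag{14}$$
and where $R_{k|k-1} \in R^{p \times p}$ is an unknown a priori error covariance matrix. Iiguni, Sakai, and Tokumaru (1992) proposed an on-line estimation with the following equations according to Ljung and Soderstrom (1983):
$$R_{k|k-1} = R(k-1) + \frac{1}{k} \left\{ [y(k) - \hat{y}_{k|k-1}][y(k) - \hat{y}_{k|k-1}]^T - R(k-1) \right\}, \tag{15}$$
$$R(k) = R(k-1) + \frac{1}{k} \left\{ [y(k) - \hat{y}(k)|_{\theta = \hat{\theta}(k)}][y(k) - \hat{y}(k)|_{\theta = \hat{\theta}(k)}]^T - R(k-1) \right\}. \tag{16}$$
$C_w(k) \in R^{p \times [q(n+1)+p(q+1)]}$ in (14) is the Jacobian matrix of the model output $\hat{y}(k)$ with respect to $\theta$, and is expressed as
$$C_w(k) = \begin{bmatrix} \left. \dfrac{\partial \hat{y}(k)}{\partial \theta^h} \right|_{\theta = \hat{\theta}(k-1)} & \left. \dfrac{\partial \hat{y}(k)}{\partial \theta^y} \right|_{\theta = \hat{\theta}(k-1)} \end{bmatrix}. \tag{17}$$
The first term in the matrix on the right-hand side of Eq. (17) can be expressed as
$$\left. \frac{\partial \hat{y}(k)}{\partial \theta^h} \right|_{\theta = \hat{\theta}(k-1)} = \left. \frac{\partial \hat{y}(k)}{\partial h(k)} \frac{\partial h(k)}{\partial z(k)} \frac{\partial z(k)}{\partial \theta^h} \right|_{\theta = \hat{\theta}(k-1)} = \begin{bmatrix} \hat{w}_{1,1}^y(k-1) & \cdots & \hat{w}_{1,q}^y(k-1) \\ \vdots & \ddots & \vdots \\ \hat{w}_{p,1}^y(k-1) & \cdots & \hat{w}_{p,q}^y(k-1) \end{bmatrix} \left. \begin{bmatrix} \dfrac{\partial h_1(k)}{\partial z_1(k)} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \dfrac{\partial h_q(k)}{\partial z_q(k)} \end{bmatrix} \right|_{\theta = \hat{\theta}(k-1)} \begin{bmatrix} [x(k)^T \;\; 1] & \cdots & 0_{1 \times (n+1)} \\ \vdots & \ddots & \vdots \\ 0_{1 \times (n+1)} & \cdots & [x(k)^T \;\; 1] \end{bmatrix}_{q \times q(n+1)}, \tag{18}$$
where $h(k)$ denotes the hidden layer output, written $o(k)$ in (3), and
$$\left. \frac{\partial h_i(k)}{\partial z_i(k)} \right|_{\theta = \hat{\theta}(k-1)} = h_i(k)|_{\theta = \hat{\theta}(k-1)} \left[ 1 - h_i(k)|_{\theta = \hat{\theta}(k-1)} \right], \quad i = 1, \ldots, q, \tag{19}$$
with
$$h(k)|_{\theta = \hat{\theta}(k-1)} = \frac{1}{1 + e^{-z(k)|_{\theta = \hat{\theta}(k-1)}}}, \quad z(k)|_{\theta = \hat{\theta}(k-1)} = \hat{W}^h(k-1) \begin{bmatrix} x(k) \\ 1 \end{bmatrix}.$$
The second term, $\partial \hat{y}(k) / \partial \theta^y |_{\theta = \hat{\theta}(k-1)}$, can be expressed as
$$\left. \frac{\partial \hat{y}(k)}{\partial \theta^y} \right|_{\theta = \hat{\theta}(k-1)} = \frac{\partial}{\partial \theta^y} \left( \hat{W}^y(k-1) \begin{bmatrix} h(k)|_{\theta = \hat{\theta}(k-1)} \\ 1 \end{bmatrix} \right) = \begin{bmatrix} [h^T(k)|_{\theta = \hat{\theta}(k-1)} \;\; 1] & \cdots & 0_{1 \times (q+1)} \\ \vdots & \ddots & \vdots \\ 0_{1 \times (q+1)} & \cdots & [h^T(k)|_{\theta = \hat{\theta}(k-1)} \;\; 1] \end{bmatrix}_{p \times p(q+1)}. \tag{20}$$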
The procedure of applying the EKF algorithm to the adaptive model is given as follows:
Step 1: At sample time $k$, obtain the past process output $y$ and the past control variable $u$ to form the NN model input vector $x(k)$ in (4).
Step 2: Obtain the current process measurement output, $y(k)$, which is used as the training target.
Step 3: Update the error covariance matrix $R_{k|k-1}$ using Eqs. (15)–(16).
Step 4: Implement the EKF training algorithm, using the Jacobian in (17)–(20), to update the weight parameter vector $\hat{\theta}(k)$ with (10)–(14).
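The four steps above can be sketched in plain Python for the special case of a single network output ($p = 1$). This restriction is an assumption made here so that the bracketed term in (13) is a scalar and no matrix inversion is needed; the function names `forward`, `jacobian` and `ekf_step` are likewise illustrative and are not taken from the paper.

```python
import math

def forward(theta, x, q):
    """Scalar-output MLP: returns (y_hat, h) for theta = [theta_h; theta_y]."""
    x1 = list(x) + [1.0]
    n1 = len(x1)
    h = []
    for i in range(q):
        z = sum(theta[i * n1 + j] * x1[j] for j in range(n1))
        h.append(1.0 / (1.0 + math.exp(-z)))
    w_y = theta[q * n1:]                  # q output weights followed by the bias
    return sum(w_y[i] * h[i] for i in range(q)) + w_y[q], h

def jacobian(theta, x, q):
    """Row vector C_w of (17): d y_hat / d theta, from (18)-(20) with p = 1."""
    x1 = list(x) + [1.0]
    n1 = len(x1)
    _, h = forward(theta, x, q)
    w_y = theta[q * n1:]
    C = []
    for i in range(q):
        g = w_y[i] * h[i] * (1.0 - h[i])  # output weight times sigmoid slope (19)
        C.extend(g * xj for xj in x1)     # block i of (18)
    C.extend(h + [1.0])                   # (20)
    return C

def ekf_step(theta, E, R_prev, k, x, y, q):
    """One EKF update at sample k >= 1; returns (theta, E, R, innovation)."""
    N = len(theta)
    y_pred, _ = forward(theta, x, q)                        # (12)
    innov = y - y_pred
    R_prior = R_prev + (innov * innov - R_prev) / k         # (15)
    C = jacobian(theta, x, q)                               # (14)
    EC = [sum(E[r][c] * C[c] for c in range(N)) for r in range(N)]  # E C^T
    s = R_prior + sum(C[r] * EC[r] for r in range(N))       # scalar R + C E C^T
    K = [v / s for v in EC]                                 # (13)
    theta_new = [t + g * innov for t, g in zip(theta, K)]   # (10)
    # (11): (I - K C) E = E - K (C E); C E equals (E C^T)^T because E is symmetric
    E_new = [[E[r][c] - K[r] * EC[c] for c in range(N)] for r in range(N)]
    resid = y - forward(theta_new, x, q)[0]
    R_new = R_prev + (resid * resid - R_prev) / k           # (16)
    return theta_new, E_new, R_new, innov
```

In line with the initialization reported in Section 2.3, a caller would start from small random weights, `E` equal to the identity matrix and `R_prev = 0.0`, and then call `ekf_step` once per sample with `k = 1, 2, ...`.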
2.3. Model evaluation
To evaluate the learning performance of the developed adaptive network model, the network is used to model a multivariable, nonlinear CSTR process that is often used as a nonlinear process simulation to evaluate control methods. The schematic diagram of the process is shown in Fig. 1; for details of the CSTR, see Lightbody and Irwin (1997).
The process works in the following way. The reactant A with constant concentration $c_i$ and temperature $T_i(t)$ flows into the tank with the flow rate $q_i(t)$. A second order endothermic chemical reaction $2A \to B$ takes place in the tank; it depends on the temperature and absorbs heat energy. As a result, the reaction influences the temperature and concentration of the outflow liquid. The two inputs and two outputs are chosen as follows:
$$u = \begin{bmatrix} q_i \\ T_i \end{bmatrix}, \quad y = \begin{bmatrix} c \\ T_r \end{bmatrix}.$$
Fig. 1. CSTR process.
The following equations can be derived to describe the process dynamics:
$$\frac{dc(t)}{dt} = \frac{1}{Ah} \left[ c_i q_i(t) - c(t) \frac{h}{R_v} - 2 A h k_o c^2(t) \exp\left( -\frac{E_A}{R T_r(t)} \right) \right], \tag{21}$$
$$\frac{dT_r(t)}{dt} = \frac{1}{\rho_r s_r A h} \left[ \rho_r s_r \left( q_i(t) T_i(t) - \frac{h}{R_v} T_r(t) \right) - \Delta H A h k_o c^2(t) \exp\left( -\frac{E_A}{R T_r(t)} \right) + U_1 (T_h - T_r(t)) - U_2 (T_r(t) - T_x) \right]. \tag{22}$$
Eq. (21) is a mass balance equation for chemicals while Eq. (22) is a heat energy balance equation. The operating ranges of the input variables applied by the actuators are
$$q_i(t) \in [2, 5] \ \text{(l/s)}; \quad T_i(t) \in [273, 480] \ \text{(K)}. \tag{23}$$
(23)
It can be seen in (21) and (22) that the main nonlinearities in the process dynamics are introduced by the exponential terms. Because the energy $E_A$ is limited and the exponent is negative, the nonlinearity is mild in a mathematical sense; nevertheless, the CSTR process is widely recognized in process engineering as a typical nonlinear process and is used as a test system for control algorithms (Lightbody & Irwin, 1997). Although the model of the CSTR process used in the simulation is known, the process model is treated as unknown and these known parameters have not been used in the modelling and control. White noise with zero mean and unit variance, scaled to an appropriate magnitude, is superimposed on the process output to simulate the measurement noise. Two input-output data sets were generated from the CSTR simulation, with two different random amplitude binary sequences used as excitation signals. One set containing 5000 samples is for network training and the other containing 1200 samples is used for network validation. The sample period was chosen as 30 s.
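As a rough illustration of how such an excitation signal might be produced, the sketch below generates a piecewise-constant sequence whose level is redrawn at random inside the actuator range of (23). The hold time of 10 samples, the uniform distribution and the function name `random_amplitude_sequence` are all assumptions made for this example; the paper does not state how its sequences were generated.

```python
import random

def random_amplitude_sequence(n_samples, low, high, hold=10, seed=0):
    """Piecewise-constant excitation with random levels in [low, high].

    `hold` (samples per level) and the uniform draw are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    seq = []
    while len(seq) < n_samples:
        level = rng.uniform(low, high)
        seq.extend([level] * hold)
    return seq[:n_samples]

# Training-set excitation for the two inputs, using the ranges in (23)
q_i = random_amplitude_sequence(5000, 2.0, 5.0, seed=1)      # flow rate, l/s
T_i = random_amplitude_sequence(5000, 273.0, 480.0, seed=2)  # temperature, K
```

Using different seeds for the training and validation sets gives two different sequences, as the text requires.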
In this simulation, $n_u = 2$, $n_y = 2$, $d = 0$ and 10 hidden layer nodes are found to be most appropriate for the MLP model. Hence, the MLP model has the following structure:
$$\hat{y}(k) = \hat{g}_{8:10:2}[u(k-1), u(k-2), y(k-1), y(k-2)],$$
where the subscript (8:10:2) denotes that the network has eight inputs, 10 nodes in the hidden layer and two outputs. The MLP model is trained in the on-line mode with the EKF algorithm using the training data set. The initial value of the vector $\hat{\theta}$ is assigned small random values. The initial values of the EKF parameters $E_w(0)$ and $R_{k|k-1}$ are assigned an identity matrix and a zero matrix, respectively. All the initial values of the Jacobian matrix in the learning calculation are set to zero. After training, the model is evaluated using the test data set for one-step-ahead prediction, during which the on-line learning of the model continues after each prediction. The validation results are shown in Fig. 2.
Fig. 2 compares the process output and the model output for the two output variables. As the modelling error is very small, the two are virtually identical. To numerically evaluate the modelling performance, an error index of the normalized mean absolute modelling error (NMAE) is used as shown below:
$$\text{NMAE} = \frac{1}{N} \sum_{k=1}^{N} \left| \frac{y(k) - \hat{y}(k)}{y(k)} \right|. \tag{24}$$
Fig. 2. Process and the MLP model outputs. [Panels: $c(k)$ (mol/l) and $T_r(k)$ (K) versus sample; process output and NN output.]
The NMAE of Fig. 2 is $[2.0267 \times 10^{-3}, 2.0357 \times 10^{-3}]$.
To demonstrate the on-line learning ability of the developed adaptive NN model, another data set is generated in which an actuator malfunction occurs in the process. The fault is simulated by a 100% abrupt change in the actuator output, $\Delta q_i$, starting at the sample $k = 301$:
$$\tilde{q}_i(k) = q_i(k) + \Delta q_i(k) = 2 q_i(k), \quad k > 300,$$
where the tilde denotes the faulty data. In order to clearly show the learning performance with a fault present, the CSTR is driven by a constant control variable with noise between the sample instants $k = 201$ and 700. This tests whether the EKF-based learning algorithm works well for data that are not fully persistently exciting.
For comparison, a fixed parameter MLP model with the same structure as the adaptive one is also trained using the same set of training data, and is then evaluated with the same faulty test data. The modelling result of the fixed parameter model is displayed in Fig. 3 with the absolute modelling error in Fig. 4, while the modelling error of the adaptive model is displayed in Fig. 5 for comparison.
Fig. 3. Modelling performance of the fixed parameter MLP model. [Panels: $c(k)$ (mol/l) and $T_r(k)$ (K) versus sample; process output and NN output.]
Fig. 4. Absolute modelling error of the fixed MLP model. [Panels: $e_c(k)$ and $e_{T_r}(k)$ versus sample.]
Fig. 5. Absolute modelling error of adaptive MLP model. [Panels: $e_c(k)$ and $e_{T_r}(k)$ versus sample.]
In Fig. 3 it is observed that an obvious mismatch between the model output and the process output appears after the fault occurs at $k = 300$. This indicates that without on-line learning the MLP model is unable to track the change in system dynamics. This error is shown more clearly in Fig. 4 and is easily compared with Fig. 5. In contrast, Fig. 5 shows the EKF on-line learning performance of the adaptive model for the data set with the fault. When an abrupt fault occurs at $k > 300$, the modelling error caused by the abrupt change of the system dynamics is quickly reduced to its normal level, and the new dynamics are modelled quickly by the adaptive model. These results indicate that the on-line learning performs very well even when the CSTR process is subjected to a fault. It is evident that the EKF on-line learning algorithm is able to learn dynamics change quickly. This is important for an adaptive model to be used in a fault-tolerant control (FTC) scheme. The modelling error index of the adaptive MLP trained by EKF is $[2.0635 \times 10^{-3}, 2.0744 \times 10^{-3}]$.
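The NMAE index of (24), evaluated separately for each output channel as in the two-element values quoted above, can be computed with a few lines of Python. The function name `nmae` and the list-of-rows data layout are assumptions made for this illustration.

```python
def nmae(y, y_hat):
    """Normalized mean absolute error of (24), one value per output channel.

    y and y_hat are lists of samples, each sample a list with one entry per
    output. The index is undefined if any measured output is zero.
    """
    n = len(y)
    p = len(y[0])
    return [
        sum(abs((y[k][j] - y_hat[k][j]) / y[k][j]) for k in range(n)) / n
        for j in range(p)
    ]

# Toy data (not from the paper): two samples of [c, T_r]
y = [[1.0, 400.0], [2.0, 500.0]]
y_hat = [[1.1, 396.0], [1.8, 505.0]]
index = nmae(y, y_hat)
```

For the toy data the two channel indices are approximately 0.1 and 0.01.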
3. PID controller auto-tuning algorithm

Consider the discrete-time multivariable PID controller of the following form:
$$u(k) = K_p(k) e_T(k) + K_i(k) \sum_{i=1}^{k} e_T(i) + K_d(k) \frac{e_T(k) - e_T(k-1)}{T}, \tag{25}$$
where $e_T(k) \in R^p$ is the process tracking error defined as
$$e_T(k) = y_d(k) - y(k)$$
with $y_d(k) \in R^p$ being the desired trajectory. $T$ is the sampling time, $K_p, K_i, K_d \in R^{m \times p}$ are the PID controller parameter matrices and $m$ is the number of inputs.
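The control law (25) with fixed gain matrices can be sketched as a small Python class. The class name `DiscretePID` and the choice $e_T(0) = 0$ for the first difference are assumptions made here; in the proposed scheme the three gain matrices would additionally be replaced at every sample by the auto-tuned values.

```python
class DiscretePID:
    """Multivariable discrete PID of (25); gains are m x p lists of lists."""

    def __init__(self, Kp, Ki, Kd, T):
        self.Kp, self.Ki, self.Kd, self.T = Kp, Ki, Kd, T
        p = len(Kp[0])
        self.err_sum = [0.0] * p    # running sum of e_T(i), i = 1..k
        self.err_prev = [0.0] * p   # e_T(k-1); assumed zero before the first sample

    def step(self, y_d, y):
        """Return u(k) for the desired output y_d(k) and measured output y(k)."""
        e = [a - b for a, b in zip(y_d, y)]
        self.err_sum = [s + ei for s, ei in zip(self.err_sum, e)]
        de = [(ei - ep) / self.T for ei, ep in zip(e, self.err_prev)]
        u = [
            sum(
                self.Kp[j][l] * e[l]
                + self.Ki[j][l] * self.err_sum[l]
                + self.Kd[j][l] * de[l]
                for l in range(len(e))
            )
            for j in range(len(self.Kp))
        ]
        self.err_prev = e
        return u

# SISO example with illustrative gains
pid = DiscretePID(Kp=[[2.0]], Ki=[[0.5]], Kd=[[1.0]], T=1.0)
u1 = pid.step([1.0], [0.0])   # e = 1, sum = 1, difference = 1
u2 = pid.step([1.0], [0.5])   # e = 0.5, sum = 1.5, difference = -0.5
```

Note that, as in (25), the integral term is a plain running sum of the error and is not multiplied by the sampling time $T$.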
The objective of the proposed method is to achieve tracking control when system dynamics change. To realize this objective, the configuration of the control system is proposed and displayed in Fig. 6.
Fig. 6. Configuration of the NN-based auto-learning PID control system. [Block diagram: the NN-based PID auto-tuning controller, comprising the PID, the auto-tuning algorithm and an NN model, drives the process, which is subject to fault and noise; a second NN model in parallel with the process gives the modelling error $e_m(k)$.]
In this system, an MLP model is updated on-line to capture the time-varying dynamics of the process. Then, the updated model is used in an iterative algorithm for auto-learning of the PID controller. In this way an optimal control is achieved by minimizing the squared error between the desired trajectory and the model prediction. The thick line in Fig. 6 denotes the information sharing between the two MLP models.
In order to estimate the optimum PID parameters, $K_p(k)$, $K_i(k)$ and $K_d(k)$, a parameter vector $\theta_{pid}(k) \in R^{3mp}$ is formulated as
$$\theta_{pid}(k) = \begin{bmatrix} [k_{p1}(k) \;\; k_{i1}(k) \;\; k_{d1}(k)]^T \\ \vdots \\ [k_{pm}(k) \;\; k_{im}(k) \;\; k_{dm}(k)]^T \end{bmatrix}, \tag{26}$$
where $k_{pj}(k)$ denotes the $j$th row in $K_p(k)$, and likewise for $K_i(k)$ and $K_d(k)$, $j = 1, \ldots, m$. An iterative algorithm is to be developed and used to obtain the optimal control variable in each sample period $k$ by minimizing an objective function of the predictive tracking error. The PID parameters tuned in this way will be used at the end of the sample period to produce a control variable. A new argument, $i$, is introduced to denote the iterative step within one sample period, to distinguish it from the sample time, $k$. To avoid confusion with the iterative step $i$, a variable at sample time $k$ is written with the subscript $|k$ in the iterative process, such as $\theta_{pid}(k) \to \theta_{pid|k}$.
When predicting the optimum PID parameters $\theta_{pid}(k)$, $\hat{\theta}_{pid}(i) \in R^{3mp}$ is used to denote the estimated parameter vector in the iterative process, and the PID auto-tuning algorithm is defined as
$$\hat{\theta}_{pid}(i+1) = \hat{\theta}_{pid}(i) + \Delta \hat{\theta}_{pid}(i) = \hat{\theta}_{pid}(i) + K_{pid}(i) e_t(i), \tag{27}$$
where $K_{pid}(i)$ is the gain matrix and $e_t(i) \in R^p$ is the NN model tracking error, which is different from the process tracking error, $e_T$,
$$e_t(i) = y_{d|k} - \hat{y}(i), \tag{28}$$
$$\hat{y}(i) = \hat{g}[\hat{u}(i), X_{|k}]. \tag{29}$$
$y_{d|k} \in R^p$ is the desired output at sample time $k$ and $\hat{y}(i)$ is the NN model output in iterative step $i$ within the sample period. $\hat{y}(i)$ is a function of the predicted optimum control variable, $\hat{u}(i) \in R^m$,
$$\hat{u}(i) = K_p(i) e_{T|k} + K_i(i) E_{pid|k} + K_d(i) \frac{e_{T|k} - e_{T|k-1}}{T}. \tag{30}$$
To derive a gain matrix $K_{pid}$ in (27) at each iterative step such that the convergence of the NN model output to the desired process output is guaranteed, a discrete-time Lyapunov function is chosen as follows:
$$V(i) = \delta e_t(i)^T e_t(i), \tag{31}$$
where $\delta$ is a positive constant. Obviously, $V(i)$ is positive definite. Define
$$e_t(i) = e_t(i-1) + \Delta e_t(i-1) \tag{32}$$
for the calculation of $V(i)$ in (31). In discrete-time operation, $\Delta \hat{\theta}_{pid}(i)$ can be approximated by the derivative of $\hat{\theta}_{pid}(i)$ with respect to $i$. Thus,
$$\Delta e_t(i) = \frac{\partial e_t(i)}{\partial \hat{\theta}_{pid}(i)} \frac{\partial \hat{\theta}_{pid}(i)}{\partial i} \approx \frac{\partial [y_{d|k} - \hat{y}(i)]}{\partial \hat{\theta}_{pid}(i)} \Delta \hat{\theta}_{pid}(i) = -\frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} K_{pid}(i) e_t(i). \tag{33}$$
Then, the increment of the Lyapunov function, $\Delta V(i)$, can be expressed as
$$\begin{aligned} \Delta V(i) &= V(i+1) - V(i) \\ &= 2 \delta \Delta e_t(i)^T [e_t(i) + \delta \Delta e_t(i)] \\ &= 2 \delta \left[ -\frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} K_{pid}(i) e_t(i) \right]^T \left[ e_t(i) - \delta \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} K_{pid}(i) e_t(i) \right] \\ &= -2 \delta e_t(i)^T \left[ \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} K_{pid}(i) \right]^T \left[ I - \delta \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} K_{pid}(i) \right] e_t(i). \end{aligned} \tag{34}$$
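The gain matrix in (27) is designed as
$$K_{pid}(i) = \frac{1 - \delta}{\delta} \left[ \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} \right]^T \left\{ \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} \left[ \frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} \right]^T \right\}^{-1}. \tag{35}$$
Then, $\Delta V(i)$ in (34) becomes
$$\Delta V(i) = -2 (1 - \delta) e_t^T(i) e_t(i). \tag{36}$$
It is obvious that $\Delta V(i)$ in (36) is negative if $\delta < 1$ is chosen. Therefore, the stable convergence to zero of the predictive tracking error is guaranteed by the design of the gain matrix $K_{pid}$ in (35), in which $\delta < 1$ can be chosen as a learning rate to adjust the self-learning speed of the PID parameters.
The Jacobian matrix in (35) can be expressed as
$$\frac{\partial \hat{y}(i)}{\partial \hat{\theta}_{pid}(i)} = \frac{\partial \hat{y}(i)}{\partial \hat{u}(i)} \frac{\partial \hat{u}(i)}{\partial \hat{\theta}_{pid}(i)}. \tag{37}$$
The first term on the right-hand side of (37), $\partial \hat{y}(i) / \partial \hat{u}(i)$, for the MLP network model is derived as follows:
$$\frac{\partial \hat{y}(i)}{\partial \hat{u}(i)} = \frac{\partial \hat{y}(i)}{\partial o(i)} \frac{\partial o(i)}{\partial z(i)} \frac{\partial z(i)}{\partial \hat{u}(i)} = \begin{bmatrix} w_{1,1}^y & \cdots & w_{1,q}^y \\ \vdots & \ddots & \vdots \\ w_{p,1}^y & \cdots & w_{p,q}^y \end{bmatrix} \begin{bmatrix} \dfrac{\partial o_1(i)}{\partial z_1(i)} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \dfrac{\partial o_q(i)}{\partial z_q(i)} \end{bmatrix} \begin{bmatrix} w_{1,1}^h & \cdots & w_{1,m}^h \\ \vdots & \ddots & \vdots \\ w_{q,1}^h & \cdots & w_{q,m}^h \end{bmatrix}, \tag{38}$$
where
$$\frac{\partial o_j(i)}{\partial z_j(i)} = o_j(i)[1 - o_j(i)], \quad j = 1, \ldots, q; \quad z(i) = W^h \begin{bmatrix} \hat{u}(i) \\ X_{|k} \\ 1 \end{bmatrix}.$$
The second term in (37) is given by
$$\frac{\partial \hat{u}(i)}{\partial \hat{\theta}_{pid}(i)} = \frac{\partial}{\partial \hat{\theta}_{pid}(i)} \left[ K_p(i) e_{T|k} + K_i(i) \sum e_{T|k} + K_d(i) \frac{e_{T|k} - e_{T|k-1}}{T} \right] = \begin{bmatrix} Q_{pid} & \cdots & 0_{1 \times 3p} \\ \vdots & \ddots & \vdots \\ 0_{1 \times 3p} & \cdots & Q_{pid} \end{bmatrix} \tag{39}$$
with
$$Q_{pid} = \begin{bmatrix} e_{T|k} \\ \sum e_{T|k} \\ \dfrac{e_{T|k} - e_{T|k-1}}{T} \end{bmatrix}^T.$$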
From the above analysis, it is clear that when the gain matrix $K_{pid}(i)$ is chosen according to (35), the PID parameter vector, $\hat{\theta}_{pid}(k)$, will asymptotically converge to the optimum value at each sampling time in the sense of driving the model tracking error to a minimum. The PID auto-tuning procedure is given as follows:
Step 1: At sample time $k$, obtain the desired trajectory $y_d(k+1)$, the past process output $y(k)$ and the past control variable $u(k)$ to form the NN model input vector $X(k)$.
Step 2: Implement the PID auto-tuning algorithm given in Eqs. (28), (35), (37)–(39) and (27) in iterative form to predict the optimum PID parameter vector $\hat{\theta}_{pid}(k)$. The initial value $\hat{\theta}_{pid}(0)$ is set to the value at the last sample time, $\theta_{pid}(k-1)$.
Step 3: Apply the obtained $\hat{\theta}_{pid}(i)$ to the PID controller to calculate the optimum control variable $\hat{u}(i)$ in (30).
Step 4: Apply the obtained $\hat{u}(i)$ to the NN model to calculate the model output $\hat{y}(i)$ in (9) and the NN model tracking error, $e_t(i)$, in (28) at iteration step $i$.
Step 5: Repeat Steps 1–4 until the NN model tracking error, $e_t(i)$, is less than a desired threshold or a specified bound on the number of iterative steps is reached.
Step 6: Set $\theta_{pid}(k)$ equal to $\hat{\theta}_{pid}(i_{final})$, and then apply it to the PID controller in the process.
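The inner iteration of Steps 2 to 5 can be sketched for the SISO case ($m = p = 1$), where the Jacobian in (37) is a row vector with three entries and the inverse in (35) reduces to a scalar division. Everything named here is an assumption for illustration: `toy_model` is a hypothetical smooth stand-in for the trained NN model (it is not the paper's network), and `autotune`, the tolerance and the iteration bound are chosen arbitrarily.

```python
import math

def toy_model(u):
    """Hypothetical stand-in for the NN model: returns (y_hat, d y_hat / d u)."""
    return u + 0.1 * math.tanh(u), 1.0 + 0.1 * (1.0 - math.tanh(u) ** 2)

def autotune(theta, Q, y_d, delta=0.6, tol=1e-6, max_iter=50, model=toy_model):
    """Iterate (27) with the gain (35) for one sample period (SISO case).

    theta = [k_p, k_i, k_d]; Q = Q_pid of (39), i.e. the current error,
    the summed error and the error difference divided by T.
    Returns (theta, e_t, iterations used).
    """
    theta = list(theta)
    for i in range(max_iter + 1):
        u_hat = sum(t * qv for t, qv in zip(theta, Q))        # (30)
        y_hat, dy_du = model(u_hat)                           # (29)
        e_t = y_d - y_hat                                     # (28)
        if abs(e_t) < tol or i == max_iter:
            return theta, e_t, i
        J = [dy_du * qv for qv in Q]                          # (37)-(39)
        JJt = sum(j * j for j in J)
        if JJt == 0.0:                                        # gain (35) undefined
            return theta, e_t, i
        K = [(1.0 - delta) / delta * j / JJt for j in J]      # (35)
        theta = [t + g * e_t for t, g in zip(theta, K)]       # (27)

theta, e_t, iters = autotune([1.0, 0.1, 0.5], Q=[0.4, 1.2, -0.1], y_d=1.0)
```

With $\delta = 0.6$, the value used in Section 4.1, each pass changes $\hat{u}$ by $\frac{1-\delta}{\delta} e_t / (\partial \hat{y} / \partial \hat{u})$, so on this toy model the iteration behaves like a damped Newton step and the model tracking error shrinks at every pass. The guard on `JJt` covers the case $Q_{pid} = 0$, in which the inverse in (35) does not exist.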
There is a possibility that the learning process of the NN model and the parameter tuning process of the PID controller interact. However, the NN model and the PID controller have been trained or tuned to an acceptable level before being used on-line, to guarantee a satisfactory performance in the absence of disturbance and time varying effects. Once in on-line mode, the effect of this interaction is not significant and is ignored in this research. In each sampling period, the NN model is updated with the current measurement while the system under the current PID control has constant parameters. Conversely, when the PID controller is updated, the NN model is not changed. Viewing the adaptation process as a whole, both the model and the controller are adapted gradually, sample by sample. Therefore, the convergence of neither update is significantly influenced.
4. Application to CSTR process

4.1. Evaluation on MIMO CSTR
The developed auto-tuning PID control scheme shown in Fig. 6, in conjunction with the PID parameter auto-tuning algorithm developed in the last section, is applied to the simulated multivariable nonlinear CSTR process. The process was described in Section 2, where it was used to demonstrate the adaptive neural network modelling performance. In the tracking control applications, the NN model is trained with a set of process data before it is used on-line. The initial values of the PID parameters are chosen as $K_P = 5 I_2$, $T_I = 20 I_2$, $T_D = 8 I_2$ according to experience, and the learning rate $\delta = 0.6$ is chosen. Fig. 7 shows the process tracking response, and the tracking error index in Fig. 7 is $[2.4227 \times 10^{-3}, 2.7556 \times 10^{-3}]$. The tracking performance shows that the proposed self-learning PID control is capable of tracking the step input over the entire operating space with fast response and without overshoot. The corresponding control variables of the process without disturbance are shown in Fig. 8. The thick dashed line with double dots indicates the bounded range of each control variable.
The simulation result shows that under no-fault conditions the process is well controlled by the FTC based on the auto-tuning PID controller.
Fig. 7. Tracking performance of auto-tuning PID controller. [Panels: $c(k)$ (mol/l) and $T_r(k)$ (K) versus sample; desired trajectory and process output.]
Fig. 8. Control variables for the tracking results in Fig. 7. [Panels: $q_i(k)$ (l/s) and $T_i(k)$ (K) versus sample.]

4.2. Comparison with fixed-parameter PID controller
As a typical design of PID controller is for SISO systems, a SISO CSTR process is used here for comparison.

4.2.1. The SISO CSTR
The SISO CSTR process given by Morningred, Paden, Seborg, and Mellichamp (1992) is chosen as a realistic example for the application in this research project. The CSTR process is a typical dynamic process used in the chemical and biochemical industries. The schematic diagram of the reactor is shown in Fig. 9.
Fig. 9. SISO CSTR process.
This CSTR process is chosen because it has highly nonlinear dynamics. The following equations are available to describe the process dynamics:
$$\frac{dC_a(t)}{dt} = \frac{q}{v} (C_{ai} - C_a(t)) - k_o C_a(t) e^{-E/RT(t)}, \tag{40}$$
$$\frac{dT(t)}{dt} = \frac{q}{v} (T_i - T(t)) + k_1 C_a(t) e^{-E/RT(t)} + k_2 q_c(t) \left( 1 - e^{-k_3/q_c(t)} \right) (T_{ci} - T(t)). \tag{41}$$
The CSTR process can be briefly described as follows. Within the CSTR, two kinds of chemicals are mixed and react to produce a product, compound A, at a concentration $C_a(t)$, with the temperature of the mixture being $T(t)$. The reaction is exothermic, producing heat which acts to slow the reaction down. By introducing a coolant flow-rate
$q_c(t)$, the temperature can be varied, and hence the product concentration controlled. $C_{ai}(t)$ is the inlet feed concentration, $q(t)$ the process flow-rate, $T_i(t)$ and $T_{ci}(t)$ the inlet and coolant temperatures, respectively, all of which are assumed constant at nominal values. Likewise $k_o$, $E/R$, $v$, $k_1$, $k_2$ and $k_3$ are thermodynamic and chemical constants relating to this CSTR process. The input and output of this CSTR process are
$$u(t) = q_c(t); \quad y(t) = C_a(t).$$
In this project, the bounded range of the input variable is
$$q_c(t) \in [90, 110] \ \text{(l/min)}.$$
The initial conditions of the CSTR process are assumed to be a nominal steady state where the concentration and tank temperature have the following values:
$$C_a = 0.062 \ \text{(mol/l)}; \quad T = 448.752 \ \text{(K)}$$
corresponding to a constant input of
$$q_i = 90 \ \text{(l/mol)}.$$
The open-loop response of the CSTR process to step changes of the excitation signal in Fig. 10 shows that the process exhibits a high degree of nonlinearity.
Fig. 10. Nonlinear dynamics demonstrated by step response at different operating points. [Panels: $q_c(k)$ (l/min) and $C_a(k)$ (mol/l) versus sample.]
In the simulation study, a sample time of 3 s was chosen for this SISO CSTR process.
4.2.2. NN model development
A random magnitude sequence (RMS) was used as the excitation signal to the CSTR simulation and 2000 samples of input/output data were collected. The data are for off-line model development and model structure selection. The model input and output orders and the input transmission delay were determined in the experiments as $n_u = 3$, $n_y = 3$, $d = 0$. Ten hidden layer nodes were found to be appropriate for the MLP model. Thus, the MLP network model has the following structure:
$$\hat{y}(k) = \hat{g}_{6:10:1}[u(k-1), \ldots, u(k-3), y(k-1), \ldots, y(k-3)].$$
Before training, the data are linearly scaled to values between 0 and 1 by the formula
$$u_s(k) = \frac{u(k) - u_{min}}{u_{max} - u_{min}}; \quad y_s(k) = \frac{y(k) - y_{min}}{y_{max} - y_{min}},$$
where $u_{max}$ and $u_{min}$ are the maximal and minimal values of the input in the data set, and the same applies to the output $y$. The NN model is trained using the EKF algorithm described in Section 2 in the iterative but off-line mode. This initial training is the preparation for on-line training of the model and for the model-based PID control.
4.2.3. The fixed parameter PID controller
A conventional fixed parameter PID controller is designed for comparison. The continuous time PID controller has the following structure:
$$u(t) = K_P \left[ e_T(t) + \frac{1}{T_I} \int_0^t e_T(\tau) \, d\tau + T_D \frac{de_T(t)}{dt} \right]$$
with
$$e_T(t) = y_d(t) - y(t)$$
being the tracking error. The controller parameters $K_P$, $T_I$ and $T_D$ are designed initially using the Ziegler–Nichols method and are then fine-tuned by trial and error to minimize the integrated quadratic tracking error around the middle operating point in the operating region, $C_a = 0.105$. The PID controller parameters are obtained as follows:
$$K_P = 6; \quad T_I = 39; \quad T_D = 9.75.$$
4.2.4. Simulation results
Two simulations have been carried out with the process controlled by the developed self-learning PID control and by the fixed parameter PID control, for set point tracking without and with disturbance, respectively. The tracking reference signal is chosen as a multi-step input distributed over the entire operating space. The tracking performances for a process without disturbance are shown in Figs. 11 and 12.
Fig. 11. Process response with auto-tuning PID controller (without disturbance). [$C_a(k)$ (mol/l) versus sample; desired trajectory and process output.]
Fig. 12. Process response with fixed-parameter PID controller (without disturbance). [$C_a(k)$ (mol/l) versus sample; desired trajectory and process output.]
Comparing Fig. 11 with Fig. 12, it can be seen that the
performance of the developed self-learning PID is superior
to that of the fixed parameter PID. As the fixed parameter PID
controller is designed around the operating point of
$C_a = 0.105$, its performance degrades when the operating
point moves away from that value. For example, when
$C_a = 1.2$, the process response displays a severe oscillation.
This drawback is overcome when the PID parameters have a
self-learning ability: they are adapted to cope with the
different dynamics in different operating regions.
To test the robustness of the control system against
disturbance, an 8% abrupt change of $q_c$ is applied to the
process starting at $k = 250$ to simulate a malfunction of
the actuator.

$$q_c(k) = q_c(k) + 0.08\, q_c(k), \qquad k > 250.$$
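The actuator fault above can be injected into a simulation loop with a few lines of code. The following is a minimal sketch, assuming the fault scales the commanded coolant flow by a fixed fraction for all samples after the fault onset; the function name `apply_actuator_fault` and its parameter names are hypothetical and are not taken from the paper.

```python
def apply_actuator_fault(qc_commanded, k, fault_start=250, fraction=0.08):
    """Return the coolant flow actually delivered to the process at sample k.

    For k > fault_start the commanded flow is increased by `fraction`
    (8% by default), mimicking a persistent actuator malfunction.
    """
    if k > fault_start:
        return qc_commanded * (1.0 + fraction)
    return qc_commanded
```

In a simulation, the controller output would be passed through this function before being applied to the plant model, so the controller itself is unaware of the fault and must compensate through feedback.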
A larger disturbance amplitude cannot be tested because
it would drive the control variable beyond its physical limit.
The system responses are displayed in Figs. 13 and 14.
The disturbance added to the process in this application
keeps a constant value rather than disappearing by itself.
The effects of the disturbance on the tracking performance
are quickly reduced by the compensation of the self-tuning
PID controller (Fig. 13). This is because the adaptive NN
model learns the changed dynamics caused by the
disturbance, and the PID controller parameters are then
tuned based on this model to compensate for the disturbance
effects. The steady state and relative stability remain the
same after the disturbance is compensated. In contrast,
although the fixed parameter PID control can
also recover from the degradation caused by the disturbance, its
relative stability gets worse when the operating point
moves to other regions.
Fig. 13. Tracking response of SISO CSTR with auto-tuning PID controller (with disturbance).
Fig. 14. Tracking response of SISO CSTR with fixed-parameter PID controller (with disturbance).
5. Conclusions
In this paper, a self-learning PID control is proposed for
nonlinear systems with unknown dynamics. An adaptive
neural network model is developed to learn system
dynamics on-line, while the PID controller is tuned based
on the model to minimize the quadratic model tracking
error. As the self-learning algorithm of the PID is derived
using the Lyapunov method, the model tracking error is
guaranteed to converge stably to its minimum.
Owing to the adaptation of the neural model, nonlinear system
dynamics in different operating regions, as well as time varying
dynamics, can be modelled. Therefore, the proposed control
is suitable for nonlinear and time varying systems. Furthermore,
when the system is subject to malfunction of its components
or actuators, the adaptive model can learn the post-fault
dynamics. The self-learning PID controller will then
adjust its parameters to maintain system stability and
recover the performance. The proposed self-learning PID
controller is inherently applicable to MIMO systems, with all
interactions between loops implicitly considered. This
greatly simplifies the additional considerations in applying
conventional PID control, for example, decoupling controllers.
The developed controller is applied to a simulated
MIMO CSTR process for evaluation. The features
described above have been verified by the simulation
results. The method is also compared with a well-designed
fixed parameter PID controller by applying both to a SISO
CSTR process. The superiority of the developed controller
over the fixed parameter PID is clearly demonstrated.
References
Astrom, K. J., & Hagglund, T. (1984). Automatic tuning of simple
regulators with specifications on phase and amplitude margins.
Automatica, 20, 645-651.
Davidson, E. J. (1976). Multivariable tuning regulators: The feed-forward
and robust control of general servomechanism problem. IEEE
Transactions on Automatic Control, 21, 35-47.
Funahashi, K. (1989). On the approximate realization of continuous
mappings by neural networks. Neural Networks, 2, 303-314.
Gawthrop, P. J. (1986). Self-tuning PID controller: Algorithms and
implementation. IEEE Transactions on Automatic Control, 31(3),
201-209.
Gyongy, I. J., & Clarke, D. W. (2006). On the automatic tuning and
adaptation of PID controllers. Control Engineering Practice, 14(2),
149-163.
Halevi, Y., Palmor, Z. J., & Efrati, T. (1997). Automatic tuning of
decentralized PID controllers for MIMO processes. Journal of Process
Control, 7(2), 119-128.
Hang, C. C., Astrom, K. J., & Ho, W. K. (1993). Relay auto-tuning in the
presence of static load disturbance. Automatica, 29(2), 563-564.
Hang, C. C., Astrom, K. J., & Wang, Q. G. (2002). Relay feedback
auto-tuning of process controllers - a tutorial review. Journal of Process
Control, 12(1), 143-162.
Ho, W. K., Hang, C. C., & Cao, L. S. (1995). Tuning of PID controllers based
on gain and phase margin specifications. Automatica, 31(3), 497-502.
Ho, W. K., Hong, Y., Hansson, A., Hjalmarsson, H., & Deng, J. W.
(2003). Relay auto-tuning of PID controllers using iterative feedback
tuning. Automatica, 39(1), 149-157.
Iiguni, Y., Sakai, H., & Tokumaru, H. (1992). A real-time algorithm for a
multi-layered neural network based on the extended Kalman filter.
IEEE Transactions on Signal Processing, 40(4), 959-966.
Kim, S. M., & Han, W. Y. (2006). Induction motor servo drive using
robust PID-like neuro-fuzzy controller. Control Engineering Practice,
14(5), 481-487.
Koivo, H. N., & Tanttu, J. T. (1991). Tuning of PID controller - survey of
SISO and MIMO. Preprints of IFAC international symposium on
intelligent tuning and adaptive control, Singapore.
Lee, T. H., Hang, C. C., Ho, W. K., & Yue, P. K. (1993). Implementation
of a knowledge-based PID auto-tuner. Automatica, 29(4),
1107-1113.
Lightbody, G., & Irwin, G. W. (1997). Nonlinear control structures based
on embedded neural systems models. IEEE Transactions on Neural
Networks, 8(3), 553-567.
Ljung, L., & Soderstrom, T. (1983). Theory and practice of recursive
identification. Cambridge, MA: MIT Press.
Loh, A. P., Hang, C. C., Quek, C. K., & Vasnani, V. U. (1993).
Autotuning of multi-loop proportional-integral controllers using relay
feedback. Industrial and Engineering Chemistry Process Design
Development, 25, 654-660.
Morningred, J. D., Paden, B. E., Seborg, D. E., & Mellichamp, D. A.
(1992). An adaptive nonlinear predictive controller. Chemical
Engineering Science, 47(4), 755-762.
Nishikawa, Y., Sannomya, N., Ohta, T., & Tanaka, H. (1984). A method
for auto-tuning of PID parameters. Automatica, 20(3), 321-332.
Palmor, Z. J., Halevi, Y., & Krasney, N. (1993). Automatic tuning of
decentralized PID controllers for TITO processes. In Proceedings of
12th IFAC world congress (pp. 311-314). Sydney.
Park, J. H., Sung, S. W., & Lee, I. B. (1997). Improved relay auto-tuning
with static load disturbance. Automatica, 33(4), 711-715.
Pentinnen, J., & Koivo, H. N. (1980). Multivariable tuning regulators for
unknown systems. Automatica, 16, 393-398.
Radke, F., & Isermann, R. (1987). A parameter adaptive PID controller
with stepwise parameter optimisation. Automatica, 23(4), 449-457.
Ruano, A. E. B., Fleming, P. J., & Jones, D. I. (1992). Connectionist
approach to PID auto-tuning. IEE Process Control Theory and
Applications, 139(3), 279-285.
Tavakoli, S., Griffin, I., & Fleming, P. J. (2006). Tuning of decentralised
PI(PID) controllers for TITO processes. Control Engineering Practice,
14(9), 1069-1080.
Zgorzelski, P., Unbehauen, H., & Niederlinski, A. (1990). A new simple
decentralized adaptive multivariable regulator and its application to
multivariable plants. Proceedings of eleventh IFAC world congress
(pp. 226-231). Tallin, Estonia.
Zhuang, M., & Atherton, D. P. (1994). PID controller design for a TITO
system. Proceedings of IEE Part D: Control Theory and Applications,
141, 111-120.
