Jan Jantzen
Technical University of Denmark
Oersted-DTU, Automation
Building 326, DK-2800 Kongens Lyngby, Denmark
Phone: +45 4525 3561, fax: +45 4588 1295, email: jj@oersted.dtu.dk
ABSTRACT: The paper presents the fuzzy self-organising controller (SOC). The original controller configuration is
shown and compared to modern model reference adaptive systems. The paper analyses the original configuration, and
a novel approximation to the adaptation mechanism is developed. A simulation study in Simulink demonstrates that the
SOC is able to stabilise a plant having a long deadtime. The paper can be used as textbook material for control students
previously initiated in fuzzy control.
INTRODUCTION
Just as fuzzy logic can be described as 'computing with words rather than numbers', fuzzy control can be described
as 'control with sentences rather than equations'. It is more natural to use sentences, or rules, in, for instance, operator-controlled
plants, with the control strategy written in terms of if-then clauses. If the controller furthermore adjusts the
control strategy without human intervention, it is adaptive.
In daily conversation to adapt means to modify according to changing circumstances, for example: ’they adapted
themselves to the warmer climate’. An adaptive controller is therefore intuitively a controller that can modify its behaviour
after changes in the controlled plant or in the environment. Some objectives for using adaptive control are that
- the character of the plant varies, or
- the character of the disturbances varies.
Adaptive controllers are used in aircraft, for example. The dynamics of an aircraft change with speed, altitude, and
pitch angle; the eigenvalues change as the aircraft moves between different operating points within its flight envelope, a
safe region in the plot of altitude versus speed (Mach number). The controller is aware of the present operating conditions
and adjusts its gains accordingly. Another example is a ship's roll damper, which takes into account the frequency of the sea
waves.
Research in adaptive control started in the early 1950s. Control engineers have tried to agree on a formal definition
of adaptive control, and for example in 1973 a committee under the Institute of Electrical and Electronics Engineers
(IEEE) proposed a vocabulary including the terms 'self-organising control (SOC) system', 'parameter adaptive SOC',
'performance adaptive SOC', and 'learning control system' (Åström & Wittenmark, 1995[3]). These terms never became
widely accepted, but the story does explain why the adaptive fuzzy controller, invented by E. H. Mamdani and his
students in the latter half of the 1970s, is called the self-organising fuzzy controller (SOC).
Lacking a formal definition of adaptive control, we choose to quote the pragmatic definition that Åström & Wittenmark
propose in their book on adaptive control, a primary reference in the field.
Definition (adaptive control) An adaptive controller is a controller with adjustable parameters and a mechanism for
adjusting the parameters (Åström & Wittenmark, 1995[3]).
Despite the lack of a formal definition, an adaptive controller has a distinct architecture, consisting of two loops: a
control loop and a parameter adjustment loop (Fig. 1).
Systems with a dominant time delay are notoriously difficult to control, and the fuzzy self-organising controller (Mamdani
& Baaklini, 1975[6]; Procyk & Mamdani, 1979[7]; Yamazaki & Mamdani, 1982[9]), or SOC for short, was developed
specifically to cope with dead time. To the inventors it was a further development of the original fuzzy controller
Figure 1: Adaptive control system. The inner loop (solid line) is an ordinary feedback control loop around the plant. The
outer loop (dashed line) adjusts the parameters of the controller.
(Assilian & Mamdani, 1974a; 1974b[1][2]). Today it may be classified as a model-reference adaptive system (MRAS), an
adaptive system in which the performance specifications are given by a reference model (Fig. 2). In general the model
returns the desired response ym to a command signal uc. The parameters are changed according to the model error, the
deviation of the plant response from the desired response. Designing the adjustment mechanism such that the closed loop
system remains stable is a challenge.
INNER LOOP
The inner loop is an incremental, digital controller. The change in output CU_n at the current time n
is added to the control signal U_{n-1} from the previous time instant, modelled as a summation in the figure (Fig. 3). The
two inputs to the controller are the error e and the change in error ce; the latter is the derivative of the error. The signals
are multiplied by tuning gains, GE and GCE respectively, before entering the rule base block F. In the original SOC, F
is a lookup table, possibly generated from a linguistic rule base. The table lookup value, called change in output, cu, is
multiplied by the output gain GCU and digitally integrated to become the control signal U. The integrator block can be
omitted; in that case the table value is usually called u (not cu), scaled by a gain GU (not GCU), and used directly as
the control signal U (not CU).
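The inner loop above amounts to a table lookup followed by a digital integration. A minimal sketch in Python, assuming a single shared universe for E and CE and a nearest-point quantiser (both assumptions for illustration, not details from the paper):

```python
def quantise(x, universe):
    """Index of the universe point nearest to x (saturates at the ends)."""
    return min(range(len(universe)), key=lambda i: abs(universe[i] - x))

def soc_step(F, universe, e, ce, U_prev, GE, GCE, GCU, Ts):
    """One sampling step of the incremental controller: U_n = U_{n-1} + CU_n."""
    i = quantise(GE * e, universe)    # E = GE * e   -> row index
    j = quantise(GCE * ce, universe)  # CE = GCE * ce -> column index
    cu = F[i][j]                      # table lookup: change in output
    CU = GCU * cu * Ts                # scale and integrate over one sample
    return U_prev + CU, (i, j)        # new control signal, visited cell
```

Without the integrator, the last lines would instead return GU * F[i][j] directly as the control signal.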
OUTER LOOP
The outer loop monitors error and change in error, and it modifies the table F through a modifier
algorithm M. It uses a performance measure p to decide the magnitude of each change to F. The performance measure
depends on the current values of error and change in error. The performance measures are preset numbers, organised in a
table P the size of F, expressing what is desirable, or rather undesirable, in a transient response. The table P can be built
using linguistic rules, but is often built by hand; see the two examples in Figs. 4 and 5. The same performance table P may
be used with a different plant, without prior knowledge of the plant, since it just expresses the desired transient response.
The controller can start from scratch with an F-table full of zeros, but it will converge faster towards a steady table if F is
primed with sensible numbers to begin with.
Figure 2: Model reference adaptive system, MRAS. The outer loop (dashed line) adjusts the controller parameters such
that the error e_m = y_m - y becomes close to zero.
Figure 3: Self-organising fuzzy controller, SOC. The outer loop adjusts the controller lookup table F according to the
performance measure in P.
            CE
      -6 -5 -4 -3 -2 -1  0  1  2  3  4  5  6
  -6  -6 -6 -6 -6 -6 -6 -6  0  0  0  0  0  0
  -5  -6 -6 -6 -6 -6 -6 -6 -3 -2 -2  0  0  0
  -4  -6 -6 -6 -6 -6 -6 -6 -5 -4 -2  0  0  0
  -3  -6 -5 -5 -4 -4 -4 -4 -3 -2  0  0  0  0
  -2  -6 -5 -4 -3 -2 -2 -2  0  0  0  0  0  0
  -1  -5 -4 -3 -2 -1 -1 -1  0  0  0  0  0  0
E  0  -4 -3 -2 -1  0  0  0  0  0  1  2  3  4
   1   0  0  0  0  0  0  1  1  1  2  3  4  5
   2   0  0  0  0  0  0  2  2  2  3  4  5  6
   3   0  0  0  0  2  3  4  4  4  4  5  5  6
   4   0  0  0  2  4  5  6  6  6  6  6  6  6
   5   0  0  0  2  2  3  6  6  6  6  6  6  6
   6   0  0  0  0  0  0  6  6  6  6  6  6  6
Figure 4: Example of a performance table; note the universes (adapted from Procyk and Mamdani, 1979 [7]).
The modifier algorithm M performs two functions:
1. it records the deviation between the actual state and the desired state, and
2. it corrects the table F accordingly.
The performance table P evaluates the current state and returns a performance measure P(i_n, j_n). Index i_n corresponds
to E_n, such that E_n = Ue(i_n), where Ue is the input universe. Index j_n corresponds to CE_n, such that
CE_n = Uce(j_n), where Uce is the other input universe.
Figures 4 and 5 are examples of early performance tables. Intuitively, a zero performance measure, P(i_n, j_n) = 0,
implies that the state (E_n, CE_n) is satisfactory. If the performance measure is nonzero, that state is unsatisfactory with a
nonzero severity p_n. In the latter case the modifier M assumes that the control signal must be punished by the amount p_n.
The current control signal cannot be held responsible, however, because it takes some time before a control action shows
up in the plant output.
The simple strategy is to go back a number of samples d in time and correct an earlier control signal. The modifier
must therefore know the time lag in the plant. The integer d is comparable to the plant time lag; d is here called the
delay-in-penalty.
It is required that an increase in plant output always calls for an adjustment of the control signal in the same direction,
whether that be an increase or a decrease. The modifier thus assumes that the plant output depends monotonically on the
input.
The precise adjustment rule is
u_{n-d} <- u_{n-d} + p_n    (1)
The arrow is to be interpreted as a programmatic assignment of a new value to the variable u_{n-d}. In terms of the tables F
and P, the adjustment rule is
F(i, j)_{n-d} <- F(i, j)_{n-d} + P(i, j)_n    (2)
The time subscript n denotes the current sample. In words, the rule regards the performance measure as an extra contribution to
the control signal that should have been applied, in order to push the plant output to a state with a zero penalty.
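The adjustment rule (2) then needs only the penalty table and a record of which cell was visited at each sample. A minimal sketch; the history list and the function name are illustrative assumptions:

```python
def modify(F, P, history, i_now, j_now, d):
    """F(i,j)_{n-d} <- F(i,j)_{n-d} + P(i,j)_n, when the state is old enough."""
    p = P[i_now][j_now]                   # penalty for the current state
    if p != 0 and len(history) > d:
        i_old, j_old = history[-(d + 1)]  # cell visited d samples ago
        F[i_old][j_old] += p              # punish the earlier control signal
    history.append((i_now, j_now))        # record the current cell
    return F
```

With d = 2, a nonzero penalty at sample n corrects the cell visited at sample n - 2, as in Example 1 below.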
Example 1 (update scheme) Assume d = 2 and variables as in Table 1 after a step in the setpoint. From t = 1 to t = 4
the plant moves up towards the setpoint. Apparently it follows the desired trajectory, because the performance measure
p is 0. At t = 5, the error changes sign, indicating an overshoot, and the performance table reacts by dictating p_5 = 1.
Since d is 2, the entry to be adjusted in F will be at the position corresponding to t = n - d = 5 - 2 = 3. At that
sampling instant, the state was (E_3, CE_3) = (1, 2) and the element F(i, j)_3 was u_3 = 1. The adjusted entry is
u_3 <- u_3 + p_5 = 1 + 1 = 2, which is to be inserted into F(i, j)_3.
CONSIDERATIONS IN PRACTICE
The previous section describes how the SOC works in theory. In practice additional problems occur, for example how to
tune the gains, how to choose the design parameters, how to stop the adaptation if it misbehaves, and how to
cope with noise. Such problems may seem minor, but they deserve attention, since they may finally stand in the way
of a successful implementation. The SOC works 'surprisingly well', to quote Mamdani, but cases exist where the SOC
becomes impractical.
GE, GCE, and G(C)U: The controller gains must be set near some sensible settings, although the exact choice is less
important than in a conventional fuzzy controller. Imagine for example that the output gain is lowered between
two training sessions. The adjustment mechanism will then compensate by producing an F-table with numbers of
larger magnitude. Even if the input gains were changed, it would still manage to adapt. It is therefore a good idea to
start with a linear F-table and set the gains according to any PID tuning rule (Jantzen, Verbruggen & Ostergaard,
1999[4]). That is a good starting point for the self-organisation.
Target time constant τ: The smaller τ, the faster the desired response. If it is too small, however, the closed loop
system cannot possibly follow the desired trajectory, but the modifier will try anyway. As a result the F-table winds
up, and the consequence is a large overshoot. The lower bound for τ is where this overshoot starts to occur. A plant
with a time constant τ_p and a dead time T_p requires
T_p <= τ <= T_p + τ_p    (6)
A value somewhat smaller than the right hand side of the inequality is often achievable, because the closed loop
system is usually faster than the open loop system.
Delay-in-penalty d: The d should be chosen with due respect to the sample period. The delay should in principle
be the target time constant divided by the sample period and rounded to the nearest integer, d = round(τ/Ts).
The results are usually better, however, with a value somewhat smaller than that.
Adaptation gain Gp: The larger Gp, the faster the F-table builds up, but if it is too large the training becomes
unstable. It is reasonable to choose it such that the magnitude of a large p is less than, say, 1/5 of the maximum
value in the output universe. This rule results in the upper bound
Gp <= 0.2 |F(i, j)|_max / (|e_n + τ ce_n|_max Ts)    (8)
Compared to conventional fuzzy control, the tuning task is shifted from having to tune [GE, GCE, G(C)U] accurately
to tuning [τ, d, Gp] loosely.
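The rules for d and Gp above can be illustrated numerically. Ts = 5 s and the table limit 200 come from the simulation study below; the target time constant of 60 s and the worst-case magnitude of e + τ·ce are assumptions for illustration:

```python
Ts = 5.0                 # sample period [s], as in the simulation study
tau = 60.0               # assumed target time constant [s]

# Delay-in-penalty: target time constant over sample period, rounded.
d = round(tau / Ts)      # 12 samples by the rule of thumb

# Upper bound (8) on the adaptation gain, with an assumed worst case
# for |e_n + tau * ce_n| and the table limit from the simulation.
F_max = 200.0            # |F(i, j)|_max
dev_max = 150.0          # assumed |e_n + tau * ce_n|_max
Gp_max = 0.2 * F_max / (dev_max * Ts)
```

The simulation study itself uses a d well below the rounded rule (d = 4), in line with the remark that somewhat smaller values work better.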
TIME LOCK
The delay-in-penalty d causes a problem, however, when the reference or the disturbances change abruptly.
Consider this case. If the error and the change in error have been near zero for a time period longer than d samples,
then the controller is in an allowed state (the steady state). Suddenly there is a disturbance from outside, the
performance measure p becomes nonzero, and the modifier will modify the F-table d samples back in time. It should not
do so, however, because the state there was acceptable. The next time the controller visits that state, the control signal
will fluctuate. The problem is more general, because it also occurs after step changes in either the reference or the load
on the plant.
A solution is to implement a time-lock (Jespersen, 1981[5]). The time-lock stops the self-organisation for the next d
samples: if it is activated at the sampling instant T_n, self-organisation stops until sampling instant T_{n+d+1}. In order to
trigger the time-lock it is necessary to detect disturbances, abrupt changes in the load, and abrupt changes in the reference.
If these events cannot be measured directly and fed forward into the SOC, it is necessary to try to detect them in the plant
output. If it changes more than a predefined threshold, or if the combination of error and change in error indicates an
abrupt change, then the time-lock is activated.
The time-lock is conveniently implemented by means of a queue with length d + 1 samples. The queue contains index
pairs into the matrix F,
q = [(i, j)_n (i, j)_{n-1} ... (i, j)_{n-d}]    (9)
The element to update in F is indicated by the last element in q. At the next sampling instant, the current index pair is
attached to the front of the queue, while the last index pair is dropped. If an event triggers the time-lock, the queue is
emptied.
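The queue (9) together with the time-lock can be sketched with a double-ended queue; the class name and the counter-based lock below are implementation assumptions, not details from the paper:

```python
from collections import deque

class TimeLockQueue:
    """Queue of visited table cells, with a time-lock on updates."""

    def __init__(self, d):
        self.d = d
        self.q = deque(maxlen=d + 1)  # (i,j)_n, (i,j)_{n-1}, ..., (i,j)_{n-d}
        self.lock = 0                 # samples left with updates frozen

    def trigger(self):
        """Activate the time-lock, e.g. on a detected setpoint step."""
        self.lock = self.d + 1

    def step(self, i, j):
        """Push the current cell; return the cell to update, or None."""
        self.q.appendleft((i, j))     # current pair goes to the front
        if self.lock > 0:             # time-lock active: no update
            self.lock -= 1
            return None
        if len(self.q) == self.q.maxlen:
            return self.q[-1]         # cell visited d samples ago
        return None                   # queue not yet full
```

With d = 2, the first update becomes available only once three index pairs have been recorded, and triggering the lock suspends updates for the next d + 1 samples.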
It is difficult to control, because the dead time of 9 seconds is large compared to the time constants of the plant. The plant
is inserted in the test bench in Fig. 6 (the software can be downloaded from http://fuzzy.iau.dtu.dk/tedlib.nsf). At time
t = 0 the reference changes abruptly from 0 to 1, and after 500 seconds a load of 0.05 units is forced on the system. The
transport delay block in Fig. 6 is the dead time of the plant (9 seconds), and the band-limited white noise block is there in
order to experiment with noise. The simulation environment is Matlab (v. 4.2.c.1) for Windows together with Simulink
(v. 1.3c). The strategy is
1. to design and tune a fuzzy controller with a linear control surface,
2. to start self-organisation without changing the tuning, and
3. to see if self-organisation improved the response.
The experiment is supposed to show qualitatively whether there is an improvement; it is not meant to be an exhaustive
scientific investigation.
Since the test setup includes a load on the system, it is necessary to maintain a nonzero control signal in the steady
state; thus an incremental controller, with an integrator, is necessary. The rule base was initially a linear 21 x 21 lookup
table with interpolation. The system was hand-tuned to respond reasonably; the resulting gains were
GE = 100
GCE = 3000
GCU = 4 x 10^{-5}
The simulation time step was fixed at Ts = 5 seconds, which is also the sampling period.
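A linear initial F-table of the kind used here can be generated directly. The 21-point universes on [-100, 100] and the saturation at ±200 are assumptions chosen to match the axes of Fig. 8, not construction rules stated in the paper:

```python
def linear_table(n=21, lo=-100.0, hi=100.0, out_max=200.0):
    """Linear control surface: cu proportional to E + CE, saturated."""
    universe = [lo + k * (hi - lo) / (n - 1) for k in range(n)]
    F = [[max(-out_max, min(out_max, ue + uce)) for uce in universe]
         for ue in universe]
    return universe, F
```

The resulting plane is the starting point that self-organisation later deforms into the jagged surface of Fig. 8.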
The SOC is able to damp out the oscillation after five training runs (Fig. 7). After training, the control surface has
several jagged peaks; see Fig. 8, where the last trajectory is also shown. During each run the controller visited a number
of cells in the look-up table, sometimes visiting the same cell several times, and the accumulated changes resulted in the
sharp peaks. One might expect jumps in the control signal as a result, but in fact it is rather smooth, thanks to the integrator
at the output end of the incremental controller.
Figure 6: Simulink test bench: setpoint, SOC, transport delay, zero-pole plant 1/(s(s+1)(s+1)), load, and band-limited
white noise blocks.
Figure 7: SOC on test plant: process output (top) and control signal (bottom). Dashed line is before self-organisation,
solid line is after five training runs.
Figure 8: Deformation of the initial control plane after five learning runs (error and change in error axes).
T_dip = Ts · d = 5 · 4 = 20 sec
Thus T_dip is considerably less than the target time constant, one third in fact. With a learning gain
Gp = 2
the updates p to the F-table are never larger than 20, which means that it would take ten updates to reach the limit of the
table entry universe [-200, 200], when starting from zero.
To illustrate the training, the first, second, and fifth runs are plotted in Fig. 9. By visual inspection the second run
is already much better than the first, and the fifth is good, although it has difficulties with the steady state after the load
change.
Obviously, good performance means the response is close to the desired response, and there will be only a few corrections
to F. One measure of how the training is progressing is therefore the number of changes per training run. During a
successful suite of runs this number will decrease to some steady level, rarely zero, and then the training can stop. If
the number of changes starts to increase again, the training is unstable.
It is perhaps even easier to implement the integrated absolute penalty,
IAP = Σ_n |p_n|    (11)
It is related to the magnitude of the error between the actual response and the desired response. The IAP accumulates
the magnitude of both negative and positive penalties, and the smaller the IAP, the better.
The IAP was collected after each run and plotted in Fig. 10, which hints that the training converges towards a constant
level, as is often the case. If the training becomes unstable after a while, for example if the delay-in-penalty is too long or
too short, the curve will start to increase again. This is an indication that the training should stop.
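The IAP (11) and the stopping rule just described fit in two small functions; the names are illustrative assumptions:

```python
def iap(penalties):
    """Integrated absolute penalty of one training run, Eq. (11)."""
    return sum(abs(p) for p in penalties)

def should_stop(iap_history):
    """Stop training once the IAP per run starts to increase again."""
    return len(iap_history) >= 2 and iap_history[-1] > iap_history[-2]
```

Tracking one IAP value per run, as in Fig. 10, keeps the monitoring cost negligible compared to the simulation itself.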
It is also interesting to watch how the penalties progress. Figure 11 shows the plant output from the fifth training run
together with the penalty. The modifier tries hard to improve the responses to the reference change and the load change,
whereas it is relatively quiet around the steady state.
Experiments on laboratory models have shown that other plants can also be stabilised, but the rule modifier is sensitive
to noise on the plant output; it cannot tell whether a bad performance measure is due to noise or a bad state. A poor
signal-to-noise ratio will actually spoil the adaptation.
CONCLUSION
The simulation shows that the SOC is able to stabilise the marginally stable test system. The adjustment mechanism
is simple, but in practice the design is complicated by the time-lock and noise precautions. Although the tuning of
the traditional input and output gains (GE, GCE, G(C)U) is made less cumbersome by the adjustment mechanism,
it introduces other parameters to tune (delay-in-penalty, learning gain, target time constant). The tuning task should
nevertheless be easier.
The SOC was developed for single-loop control, but it has also been applied to multi-loop problems. Compared to
neural network control, it seems to learn faster; a drawback is that it is not a natural multivariable controller.
Figure 9: Process output for three training runs: first (dotted), second (dash-dot), and fifth (solid).
Figure 10: Integrated absolute penalty (IAP) versus training run.
Figure 11: Process output during fifth training run (top) and corresponding penalties (bottom).