
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 3, NO. 2, MAY 1995

A New Approach to Fuzzy-Neural System Modeling


Yinghua Lin, Member, IEEE, and George A. Cunningham III, Member, IEEE

Manuscript received December 10, 1993; revised August 19, 1994. Y. Lin was with the Department of Computer Science, New Mexico Institute of Mining and Technology, Socorro, NM, and is now with the Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545 USA. G. Cunningham is with the Department of Electrical Engineering, New Mexico Institute of Mining and Technology, Socorro, NM 87801 USA.

Abstract—We develop simple but effective fuzzy-rule-based models of complex systems from input-output data. We introduce a simple fuzzy-neural network for modeling systems, and we prove that it can represent any continuous function over a compact set. We introduce "fuzzy curves" and use them to: i) identify significant input variables, ii) determine model structure, and iii) set the initial weights in the fuzzy-neural network model. Our method for input identification is computationally simple, and, since we determine the proper network structure and initial weights in advance, we can train the network rapidly. Viewing the network as a fuzzy model gives insight into the real system, and it provides a method to simplify the neural network.

Fig. 1. Steps for developing the system model. We can use either the fuzzy viewpoint, or the neural viewpoint, or both, at each step. (Step 1, from either viewpoint: extract the significant input variables.)

I. INTRODUCTION

A multilayered neural network can approximate any continuous function on a compact set [7], [12], [21], and a fuzzy system can do this same approximation [4], [16], [26]. Several authors have used neural networks to implement fuzzy models [2], [9], [11], [13]-[15], [24], and a few authors [22], [23] have presented work on building fuzzy or fuzzy-neural models from input-output data. Some work has been published [1], [5], [19], [27] on obtaining the proper network structure and initial weights to reduce training time.

Fuzzy-neural networks can be divided into two main categories. One group of neural networks for fuzzy reasoning uses fuzzy weights in the neural network; examples are given in [9], [13], and [14]. In a second group, the input data are fuzzified in the first or second layer, but the neural network weights are not fuzzy. The fuzzy-neural network discussed in this paper, as well as those in [2], [11], [15], and [24], is in this group. As we detail in Section II, our work differs from both radial basis functions [18], [21], [28] and fuzzy rule table approaches [8], [25].

We summarize the four steps used to create our system model in Fig. 1. We use a simple neural network to implement a fuzzy-rule-based model of a real system from input-output data. We introduce the concept of a "fuzzy curve" and we use fuzzy curves for: i) identification of the significant input variables, ii) estimation of the number of rules needed in the fuzzy model, and iii) determination of the initial weights for the neural network. We use the number of input variables and the number of rules to determine the structure of our neural network. We train the network using back-propagation. Finally, we return to the fuzzy viewpoint to gain insight into the system and to simplify the model. Our model can be viewed as either a fuzzy system, a neural network, or a fuzzy-neural system. We emphasize that we can choose the fuzzy viewpoint, the neural viewpoint, or both, depending on our needs at the time.

Section II describes the architecture of our fuzzy-neural network. Section III introduces fuzzy curves and shows how to obtain the fuzzy curves for any set of input-output data. Section IV describes how we estimate the number of rules, and Section V shows how we compute the initial weights. Section VI discusses the training and simplification of the neural network. Section VII gives a tutorial example, and finally we compare the performance of our model with the performance of several previously published examples.

II. THE ARCHITECTURE OF THE FUZZY-NEURAL NETWORK

Fig. 2 shows the architecture of our four-layer (input, fuzzification, inference, and defuzzification) fuzzy-neural network. Referring to Fig. 2, we see that there are $N$ inputs, with $N$ neurons in the input layer, and $R$ rules, with $R$ neurons in the inference layer. There are $N \times R$ neurons in the fuzzification layer. Hence, once we determine the number of inputs $N$ and the number of rules $R$, we know the structure of the network. The first $N$ neurons (one per input variable) in the fuzzification layer incorporate the first rule, the second $N$ neurons incorporate the second rule, and so on.

Every neuron in the second or fuzzification layer represents a fuzzy membership function for one of the input variables. The activation function used in the fuzzification layer is $f(\mathrm{net}_{ij}) = \exp(-|\mathrm{net}_{ij}|^{l_{ij}})$, where $l_{ij}$ is typically in the range $0.5 \le l_{ij} \le 5$ and initially equals one or two, and $\mathrm{net}_{ij} = w_{ij1} x_i + w_{ij0}$. Hence, the output of the fuzzification layer is

$$\mu_{ij}(x_i) = \exp(-|w_{ij1} x_i + w_{ij0}|^{l_{ij}}) \qquad (1)$$

where $\mu_{ij}$ is the value of the fuzzy membership function of the $i$th input variable corresponding to the $j$th rule. We label the set of weights between the input and the fuzzification layer by $W = \{(w_{ij0}, w_{ij1}) : i = 1, \ldots, N;\ j = 1, \ldots, R\}$.
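To make (1) concrete, the short sketch below (our illustration, not from the paper) evaluates the membership function for several exponents $l$; with $w_{ij1} = 1$ and $w_{ij0} = 0$ it reproduces the family of shapes shown in Fig. 3, from a sharp peak at $l = 0.5$ through a Gaussian at $l = 2$ to a nearly flat-topped, trapezoid-like function for large $l$.

```python
import numpy as np

def membership(x, w1, w0, l):
    """Fuzzification-layer activation (1): exp(-|w1*x + w0|**l)."""
    return np.exp(-np.abs(w1 * x + w0) ** l)

x = np.linspace(-2.0, 2.0, 401)
for l in (0.5, 1.0, 2.0, 5.0):
    mu = membership(x, w1=1.0, w0=0.0, l=l)
    # mu peaks at 1 where w1*x + w0 = 0; larger l flattens the top.
    print(f"l = {l}: mu(0) = {mu[200]:.2f}, mu(1) = {mu[300]:.3f}")
```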



Fig. 2. The architecture of the fuzzy-neural network.

Fig. 3. Ten input fuzzy membership functions generated by $\mu(x) = e^{-|x|^l}$, $x \in [-2, 2]$, $l = 0.5, 1, \ldots, 5$.

The activation function used in the third or inference layer is $f(\mathrm{net}_j) = \mathrm{net}_j$. We use multiplicative inference, so the output of the inference layer is

$$\mu_j(x_1, x_2, \ldots, x_N) = \prod_{i=1}^{N} \mu_{ij}(x_i) \qquad (2)$$

where the $\mu_{ij}$ are from (1). The connecting weights between the third layer and the fourth layer are the central values, $v_j$, of the fuzzy membership functions of the output variable. We label the set of weights $\{v_j\}$ by $V = \{v_j : j = 1, \ldots, R\}$. Each fuzzy rule $r_j$ ($j = 1, 2, \ldots, R$) is of the form: if $x_1$ is $\mu_{1j}$ and $x_2$ is $\mu_{2j}$ and $\ldots$ and $x_N$ is $\mu_{Nj}$, then $y$ is $v_j$. Note that the neural network weights in $V$ and $W$ determine the fuzzy rules. We use weighted sum defuzzification, and the equation for the output is

$$o(x_1, x_2, \ldots, x_N) = \sum_{j=1}^{R} \mu_j(x_1, x_2, \ldots, x_N)\, v_j = \sum_{j=1}^{R} v_j \prod_{i=1}^{N} \exp(-|w_{ij1} x_i + w_{ij0}|^{l_{ij}}). \qquad (3)$$

We show in the appendix that (3) can represent any continuous function $f : R^N \to R$ over a compact set as closely as we desire. Also, if we replace the fuzzification layer by indicator functions for set membership, and the output layer by a Boolean sum, then our model implements the disjunctive normal form in Boolean logic. Hence, this structure is a fuzzy generalization of the crisp disjunctive normal form.

Our model is similar to the radial basis function neural network [18], [21], [28]. The input-output relation of the radial basis function neural network is $o = \sum_j w_j \prod_i \exp(-(w_{ij1} x_i + w_{ij0})^2)$. Our model differs from neural networks using radial basis functions in two ways. First, our neural network represents a fuzzy system, so that we can obtain insight into the real system; second, our input fuzzy membership functions are defined by (1) instead of basic Gaussian functions. We note that the input fuzzy membership functions created by (1) can approximate triangular and trapezoidal membership functions. This is illustrated in Fig. 3, which shows the shape of the input fuzzy membership functions for $l = 0.5, 1, \ldots, 5$. We also find that the fuzzy neural network is easier to train with $y = \sum_j \mu_j v_j$ instead of the normalized form $y = \sum_j \mu_j v_j / \sum_j \mu_j$ in the last layer, yet our model is simple and universal.

Our approach also differs from the conventional fuzzy rule table approaches [8], [25]. In those models, an input space is divided into $K_1 \times K_2 \times \cdots \times K_n$ fuzzy subspaces, where $K_i$, $i = 1, 2, \ldots, n$, is the number of fuzzy subsets for the $i$th input variable. There is a fuzzy rule for each of these subspaces. The main drawback of this approach is that the number of fuzzy rules increases exponentially with respect to the number of inputs $n$. In our model, the number of rules needed is determined by the data itself. Typically, a small number of rules are produced, even with complex data sets.
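The complete forward pass (1)-(3) is compact enough to state in a few lines of NumPy. The sketch below is our own illustration of the architecture (the array names are our assumptions, not the authors' code); each column $j$ of the $(N, R)$ weight arrays holds one rule.

```python
import numpy as np

def forward(x, w1, w0, l, v):
    """Output (3) of the fuzzy-neural network for one input vector x.

    x: (N,) inputs; w1, w0, l: (N, R) fuzzification weights and exponents;
    v: (R,) output centers, one per rule.
    """
    mu_ij = np.exp(-np.abs(w1 * x[:, None] + w0) ** l)  # (1): (N, R) memberships
    mu_j = mu_ij.prod(axis=0)                           # (2): multiplicative inference
    return mu_j @ v                                     # (3): weighted-sum defuzzification

rng = np.random.default_rng(0)
N, R = 2, 4                    # two inputs, four rules -> N*R = 8 fuzzification neurons
w1, w0 = rng.normal(size=(N, R)), rng.normal(size=(N, R))
l, v = np.full((N, R), 2.0), rng.normal(size=R)
print(forward(np.array([0.5, 1.5]), w1, w0, l, v))
```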


III. AN INTRODUCTION TO FUZZY CURVES

Consider a multiple-input, single-output system for which we have input-output data with possible extraneous inputs. We wish to determine the significant inputs, the number of rules $R$, and initial values for the weights in $V$ and $W$, which determine the initial membership functions. We call the input candidates $x_i$ ($i = 1, 2, \ldots, n$) and the output variable $y$. Assume that we have $m$ training data points available and that $x_{ik}$ ($k = 1, 2, \ldots, m$) are the $i$th coordinates of each of the $m$ training points. Table I shows an example with $n = 3$ and $m = 20$.

TABLE I
DATA POINTS USED IN AN INTRODUCTION TO FUZZY CURVES

For each input variable $x_i$, we plot the $m$ data points in $x_i$-$y$ space. Fig. 4 illustrates the data points from Table I in the $x_1$-$y$, $x_2$-$y$, and $x_3$-$y$ spaces. For every point $(x_{ik}, y_k)$ in $x_i$-$y$ space, we draw a fuzzy membership function for the input variable $x_i$ defined by

$$\phi_{ik}(x_i) = \exp\left(-\left(\frac{x_{ik} - x_i}{b}\right)^2\right). \qquad (4)$$

Fig. 4. Data points plotted in $x_1$-$y$, $x_2$-$y$, and $x_3$-$y$ spaces.

Each pair of $\phi_{ik}$ and the corresponding $y_k$ provides a fuzzy rule for $y$ with respect to $x_i$. The rule is represented as "if $x_i$ is $\phi_{ik}(x_i)$, then $y$ is $y_k$." $\phi_{ik}$ is the input variable fuzzy membership function for $x_i$ corresponding to the data point $k$. $\phi_{ik}$ can be any fuzzy membership function, including triangular, trapezoidal, Gaussian, and others. Here we use Gaussians. We typically take $b$ as about 20% of the length of the input interval of $x_i$. For $m$ training data points, we have $m$ fuzzy rules for each input variable. Fig. 5 shows the fuzzy membership functions for the points in Fig. 4. In Fig. 5, we draw the membership functions $\phi_{ik}$ so that the point where $\phi_{ik} = 1$ coincides with the point $(x_{ik}, y_k)$.

Fig. 5. Fuzzy membership functions in $x_1$-$y$, $x_2$-$y$, and $x_3$-$y$ spaces.

We use centroid defuzzification to produce a fuzzy curve $c_i$ for each input variable $x_i$ by

$$c_i(x_i) = \frac{\sum_{k=1}^{m} \phi_{ik}(x_i)\, y_k}{\sum_{k=1}^{m} \phi_{ik}(x_i)}. \qquad (5)$$

Fig. 6 shows the fuzzy curves $c_1$, $c_2$, and $c_3$ for the data in Table I.

Fig. 6. Fuzzy curves $c_1$, $c_2$, and $c_3$.

If the fuzzy curve for a given input is flat, then this input has little influence in the output data and it is not a significant input. For example, if we take a random $x$ as an input for the output $y$ in Table I, then the fuzzy curve for $y$ with respect to $x$ will be nearly flat. If the range of a fuzzy curve $c_i$ is about the range of the output data $y$, then the input candidate $x_i$ is important to the output variable. The fuzzy curve tells us that the output is changing when $x_i$ is changing. We rank the importance of the input variables $x_i$ according to the range covered by their fuzzy curves $c_i$. The ranges of the fuzzy curves in Fig. 6 are 1.32 for $c_1$, 1.24 for $c_2$, and 0.89 for $c_3$. Hence, we deduce that $x_1$ is the most significant input, $x_2$ is second, and $x_3$ is third. We will demonstrate these ideas with an example in Section VII.

IV. DETERMINING THE NUMBER OF RULES

We estimate the number of rules, $R_i$, needed to approximate each fuzzy curve $c_i$ by the maximum and minimum points on the curve. This is a heuristic based on the idea that the fuzzy model will interpolate between the maximum and minimum points. If the maximum and minimum points are far apart, or the curve is not smooth between the maximum and minimum points, we may add rules. As shown in subsequent sections, we only need an approximate number for $R_i$. If we have $N$ fuzzy curves, then we will have $N$ different numbers $R_1, R_2, \ldots, R_N$ corresponding to the $N$ fuzzy curves. To determine the number of rules $R$ needed in the fuzzy neural network, we let $R = \max(R_1, R_2, \ldots, R_N)$.

For the fuzzy curves shown in Fig. 6, we decide that $c_1$ needs about four rules (see the circles in Fig. 6), $c_2$ needs about four rules, and $c_3$ needs about three rules. We reemphasize that we only need approximate numbers. The largest number of rules needed is four, and we set the number of fuzzy rules for the model to $R = 4$.

At this point we have $R$, the number of rules, and $N$, the number of input variables. This means we know the number of neurons in the input layer ($N$), the number of neurons in the inference layer ($R$), and the number of neurons in the fuzzification layer ($N \times R$). Hence, we have determined the structure of the neural network model.
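The following sketch renders Sections III and IV in NumPy, exercised on the test function from Section VII; it is our own reading of the text (the helper names and the extrema-counting heuristic for $R_i$ are our assumptions, not published code). It computes the fuzzy curve (5) on a grid, ranks candidates by curve range, and estimates the number of rules.

```python
import numpy as np

def fuzzy_curve(xi, y, num=101, b_frac=0.2):
    """Fuzzy curve (5) of one input candidate on a grid over its range.

    xi: (m,) samples of the candidate; y: (m,) outputs.
    b is taken as ~20% of the input interval, as in the text.
    """
    b = b_frac * (xi.max() - xi.min())
    grid = np.linspace(xi.min(), xi.max(), num)
    phi = np.exp(-(((xi[None, :] - grid[:, None]) / b) ** 2))  # (4)
    return grid, (phi * y).sum(axis=1) / phi.sum(axis=1)       # (5)

def estimate_rules(c):
    """Rough R_i: the two endpoints plus interior extrema of the curve."""
    d = np.diff(c)
    return 2 + int(np.sum(np.sign(d[:-1]) * np.sign(d[1:]) < 0))

rng = np.random.default_rng(1)
m = 100
x1, x2, x3 = (rng.uniform(0, 3, m) for _ in range(3))  # x3 is a dummy input
y = (2 + x1 ** 1.5 - 1.5 * np.sin(3 * x2)) ** 2        # test system (7), Sec. VII
for name, xi in (("x1", x1), ("x2", x2), ("x3", x3)):
    _, c = fuzzy_curve(xi, y)
    print(f"{name}: range {c.max() - c.min():6.2f}, R_i ~ {estimate_rules(c)}")
# Significant inputs have a large curve range; R is the max of the R_i.
```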


V. SETTING THE INITIAL WEIGHTS

We set the initial weights in $V$ to the centers of the output variable fuzzy membership functions. To do this we divide the range of the desired output data into $R$ intervals, and we set the initial $v_j$ ($j = 1, 2, \ldots, R$) to be the central values of these $R$ intervals. For the data shown in Table I, the value of $y$ varies from 1.8 to 4.9. Using four rules, we divide the interval $1.8 \le y \le 4.9$ into four parts with centers at 2.19, 2.96, 3.74, and 4.51. We assign these central values to the weights $v_j$ in ascending order. Hence, for this example, we make $v_1 = 2.19$, $v_2 = 2.96$, $v_3 = 3.74$, and $v_4 = 4.51$.

We use the fuzzy curves to set the initial weights in $W$. We divide the domain for each fuzzy curve $c_i$ into $R$ intervals corresponding to the $R$ intervals in the output space. For the fuzzy curve $c_i$, we label the centers of the intervals $x_{ij}$ ($j = 1, 2, \ldots, R$). We order the $x_{ij}$ by the value of $c_i$ at the center of each interval. $x_{iR}$ corresponds to the interval containing the largest value of $c_i$, and the interval containing the point $x_{iR}$ is associated with the output interval whose center is at $v_R$. In a similar fashion, $x_{i,R-1}$ is the center of the interval which contains the next largest central point on the curve $c_i$, and $x_{i,R-1}$ is associated with $v_{R-1}$, and so on for $j = R-2, R-3, \ldots, 1$.

The length of the interval over which a rule applies in the domain of $c_i$ is denoted as $\Delta x_i$. We define the initial fuzzy membership function of $x_i$ for rule $j$ to be $\exp(-|(x_{ij} - x_i)/(a\,\Delta x_i)|^{l_{ij}})$, where $a$ is typically in the range $[0.5, 2]$. Hence, referring to (1), we see that the initial weights $w_{ij0}$ and $w_{ij1}$ are $w_{ij0} = x_{ij}/(a\,\Delta x_i)$ and $w_{ij1} = -1/(a\,\Delta x_i)$.

As noted in Section II, our rules are of the form: if $x_1$ is $\mu_{1j}$ and $x_2$ is $\mu_{2j}$ and $\ldots$ and $x_N$ is $\mu_{Nj}$, then $y$ is $v_j$. At this point we have determined the initial input membership functions $\mu_{ij}$ and the central output values $v_j$ from the input-output data.
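A sketch of this initialization (our own reading of Section V; the interval bookkeeping and function names are ours): the $v_j$ are the centers of $R$ equal output intervals in ascending order, and for each input the $R$ domain-interval centers are matched to output intervals by the rank of the fuzzy-curve value at each center, which fixes $w_{ij1} = -1/(a\,\Delta x_i)$ and $w_{ij0} = x_{ij}/(a\,\Delta x_i)$.

```python
import numpy as np

def initial_weights(X, y, R, a=1.0, b_frac=0.2):
    """Initial v (R,), w1 (N, R), w0 (N, R) per Section V."""
    m, N = X.shape
    width = (y.max() - y.min()) / R
    v = y.min() + width * (np.arange(R) + 0.5)       # ascending output centers v_j
    w1, w0 = np.empty((N, R)), np.empty((N, R))
    for i in range(N):
        xi = X[:, i]
        dx = (xi.max() - xi.min()) / R               # rule interval length Delta x_i
        centers = xi.min() + dx * (np.arange(R) + 0.5)
        b = b_frac * (xi.max() - xi.min())
        phi = np.exp(-(((xi[None, :] - centers[:, None]) / b) ** 2))
        c = (phi * y).sum(axis=1) / phi.sum(axis=1)  # fuzzy-curve value at each center
        # The center with the j-th smallest curve value is paired with v_j,
        # so the largest curve value goes with v_R.
        for j, k in enumerate(np.argsort(c)):
            w1[i, j] = -1.0 / (a * dx)
            w0[i, j] = centers[k] / (a * dx)
    return v, w1, w0
```

With an output range of $[1.8, 4.9]$ and $R = 4$, the first lines reproduce (to rounding) the centers 2.19, 2.96, 3.74, and 4.51 used above.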
VI. TRAINING AND SIMPLIFYING THE MODEL

We define the performance index for our model as the mean squared error between the desired and model outputs

$$PI = \frac{1}{m}\sum_{k=1}^{m}\left(o_k^d - o_k\right)^2 \qquad (6)$$

where $o_k^d$ ($k = 1, 2, \ldots, m$) are the actual or desired output values and $o_k$ ($k = 1, 2, \ldots, m$) are the outputs from the model. We train the neural network with a back-propagation technique to modify the variables $v_j$, $w_{ij0}$, $w_{ij1}$, and $l_{ij}$. We choose a maximum number of iterations $I_{\max}$ and some small number $\varepsilon > 0$. The training is continued until the change in the performance index over successive iterations falls to $\varepsilon$ or below, or the number of iterations reaches $I_{\max}$. All of the examples in this paper used $I_{\max} = 5000$. The choice of $\varepsilon$ depends on the problem.

If, after training, the model's performance is not adequate, we increase the number of rules, reset the initial weights, and retrain. If the model's performance is better than we need, and we wish to reduce model complexity, then we decrease the number of rules. This allows us to obtain the simplest model that yields the desired performance. We can afford this rebuilding and retraining because our choice of initial weights makes the number of training steps small.

Training the neural net, however, does not give the final model. As shown in Section VII, we can use the fuzzy membership functions to gain insight into the model and the system. If a fuzzy membership function is always near zero over its input range, then the output of the rule using this fuzzy membership function is always near zero. We can delete this rule and check the model's performance with no retraining. If the performance without the rule is acceptable, then we have a model with one less rule.

If we find a fuzzy membership function that is always near one over its input range, then we can remove this neuron with no change in performance. If the performance is acceptable without the neuron, then we remove it from the model. By dropping unnecessary rules and neurons, we obtain a simple fuzzy-neural model with acceptable performance. All of this is accomplished with no retraining.

VII. A SIMPLE EXAMPLE

We use a nonlinear system with two inputs, $x_1$ and $x_2$, and a single output, $y$, defined by

$$y = \left(2 + x_1^{1.5} - 1.5\sin(3x_2)\right)^2, \qquad 0 \le x_1, x_2 \le 3. \qquad (7)$$

A graph of (7) is shown in Fig. 7.

Fig. 7. The output surface of the nonlinear equation $y = (2 + x_1^{1.5} - 1.5\sin(3x_2))^2$.

We randomly take 100 points from $0 \le x_1, x_2 \le 3$ and obtain 100 input-output data. To illustrate input variable identification, we add random variables, $x_3$ and $x_4$, in the range $[0, 3]$ as dummy inputs. To identify the significant input variables, we draw the four fuzzy curves $c_1$, $c_2$, $c_3$, and $c_4$ for the four input candidates $x_1$, $x_2$, $x_3$, and $x_4$. The four fuzzy curves are shown in Fig. 8. From Fig. 8 we find that the ranges of $c_i$, for $c_1$ to $c_4$, are 28.8, 18.0, 6.0, and 7.5, respectively. From this, we easily and correctly identify $x_1$ and $x_2$ as the significant input variables for the system.

We examine the curves $c_1$ and $c_2$ for $x_1$ and $x_2$, and decide that we need approximately three rules for $x_1$ and five rules for $x_2$ (see the circles in Fig. 8). The largest number of rules is five, and we build a fuzzy-neural model with five rules. At this point, we know that the input layer has two neurons, the inference layer has five neurons, and the fuzzification layer has ten neurons. By setting the initial values for the weights in $V$ and $W$ as described in Section V, we obtain the initial fuzzy-neural model.

We train the model using back-propagation with $\varepsilon = 0.001$ and $I_{\max} = 5000$. With $R = 5$, the model reached 5000 iterations with $PI = 0.00336$. Next, for purposes of illustration, we trained the model with four, six, and seven rules. The results are shown in Table II.
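To illustrate Sections VI and VII together, here is a toy gradient-descent loop on the mean-squared performance index (a minimal sketch of ours, not the authors' program: we fix every $l_{ij}$ at its typical initial value of two rather than training it, normalize the output for stable step sizes, and make no attempt to reproduce the PI values in Table II).

```python
import numpy as np

def forward_batch(X, w1, w0, v):
    a = w1[None] * X[:, :, None] + w0[None]   # (m, N, R) net_ij, with l_ij = 2
    mu_j = np.exp(-a ** 2).prod(axis=1)       # (m, R) rule activations (2)
    return a, mu_j, mu_j @ v                  # model outputs (3)

def train(X, y, w1, w0, v, lr=0.01, i_max=5000, eps=1e-8):
    """Gradient descent on the mean-squared performance index (6)."""
    m, prev = len(y), np.inf
    for _ in range(i_max):
        a, mu_j, o = forward_batch(X, w1, w0, v)
        err = o - y
        pi = (err ** 2).mean()
        if abs(prev - pi) <= eps:             # stop when PI stops improving (Sec. VI)
            break
        prev = pi
        g = (2.0 / m) * err[:, None] * mu_j               # dPI/dv_j summand per sample
        g_a = (g * v[None])[:, None, :] * (-2.0 * a)      # chain rule through (1)
        v -= lr * g.sum(axis=0)
        w1 -= lr * (g_a * X[:, :, None]).sum(axis=0)
        w0 -= lr * g_a.sum(axis=0)
    return v, w1, w0, pi

rng = np.random.default_rng(2)
X = rng.uniform(0, 3, (100, 2))
y = (2 + X[:, 0] ** 1.5 - 1.5 * np.sin(3 * X[:, 1])) ** 2   # system (7)
y = (y - y.min()) / (y.max() - y.min())       # normalized for this illustration
w1, w0 = rng.normal(0, 0.5, (2, 5)), rng.normal(0, 0.5, (2, 5))
v = np.linspace(y.min(), y.max(), 5)
print("final PI:", train(X, y, w1, w0, v)[3])
```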


Fig. 8. The fuzzy curves for the nonlinear system $y = (2 + x_1^{1.5} - 1.5\sin(3x_2))^2$.

TABLE II
MODEL PERFORMANCE AND TRAINING ITERATIONS FOR THE NONLINEAR SYSTEM $y = (2 + x_1^{1.5} - 1.5\sin(3x_2))^2$ USING VARIOUS NUMBERS OF RULES

    Rules   PI
    4       0.00497
    5       0.00336
    6       0.00183
    7       0.00144

We assume that the four-rule performance with $PI = 0.00497$ is adequate, and we plot the fuzzy membership functions of the four-rule model in Fig. 9.

Fig. 9. Four fuzzy rules for the model representing the nonlinear system $y = (2 + x_1^{1.5} - 1.5\sin(3x_2))^2$. Note that the membership function for $x_1$ in rule four is always zero. We can drop this rule with no change in performance. Also, note that in rule two the membership function for $x_2$ is near zero over most of its range. In rule one the membership function for $x_2$ is near one. Thus, we can consider dropping the second rule and deleting the neuron for $x_2$'s membership function in rule one.

From Fig. 9, we see that the membership function for $x_1$ in rule four is always zero. Hence, the fourth rule is not contributing, so we delete this rule and find there is no change in performance: that is, $PI = 0.00497$ with three rules. Fig. 9 also shows us that the membership function for $x_2$ in rule two is close to zero over most of its range. Thus, we may be able to drop the second rule. We also note that the membership function for $x_2$ in rule one is near one over most of its range. Therefore, we may be able to delete this neuron. We delete them and the $PI$ becomes 0.00897. If this performance is acceptable, we leave them out of the model; otherwise they stay in. The fuzzy model with $PI = 0.00897$ is shown in Fig. 10.

Fig. 10. Two-rule fuzzy model for the nonlinear system $y = (2 + x_1^{1.5} - 1.5\sin(3x_2))^2$.
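The pruning test used above is mechanical enough to automate. The sketch below is ours (the 0.05 and 0.95 thresholds are illustrative assumptions, since the paper judges "near zero" and "near one" by inspecting Fig. 9): it scans each trained membership function over its input range and flags rules and neurons that are candidates for deletion; after deleting, one simply re-evaluates the PI with no retraining.

```python
import numpy as np

def prune_candidates(w1, w0, l, x_ranges, lo=0.05, hi=0.95, num=200):
    """Flag rules with an always-near-zero membership (drop the rule) and
    always-near-one memberships (drop the single neuron).
    x_ranges: list of (min, max) per input variable."""
    N, R = w1.shape
    drop_rules, drop_neurons = set(), []
    for j in range(R):
        for i in range(N):
            xs = np.linspace(*x_ranges[i], num)
            mu = np.exp(-np.abs(w1[i, j] * xs + w0[i, j]) ** l[i, j])
            if mu.max() < lo:              # rule j can never fire
                drop_rules.add(j)
            elif mu.min() > hi:            # neuron (i, j) never constrains rule j
                drop_neurons.append((i, j))
    return sorted(drop_rules), drop_neurons
```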

VIII. PERFORMANCE COMPARISONS

Our performance comparisons are summarized in Table III. We see that our models are almost always simpler than previously proposed models, they usually yield equivalent or better performance, and they train very rapidly. We also note that our models are easy to produce. The details of the various comparison tests are enumerated below. We note that each author has used a different performance measure, and we have stated our results in terms of the reference's performance measure. In all cases, the lower the performance measure, the better the performance. In cases where the original author did not quote a performance measure, we use (6). The item numbers refer to the test numbers in Table III.

TABLE III
COMPARISON WITH PRIOR WORK. THE INPUT COLUMN IS THE NUMBER OF INPUT CANDIDATES, SIGI IS THE NUMBER OF SIGNIFICANT INPUTS, RULE IS THE NUMBER OF RULES, NEUR IS THE NUMBER OF NEURONS, ITER IS THE NUMBER OF ITERATIONS TO TRAIN, AND PERF IS A PERFORMANCE MEASURE. THE PERFORMANCE MEASURES DIFFER FROM AUTHOR TO AUTHOR, BUT A LOWER NUMBER MEANS BETTER PERFORMANCE. IN ALL CASES WE HAVE USED THE PERFORMANCE MEASURE OF THE ORIGINAL AUTHORS; HENCE, THE PERFORMANCE FIGURES ARE COMPARABLE ACROSS A LINE, BUT NOT VERTICALLY. A BLANK SPACE INDICATES THAT THE DATA WAS NOT PROVIDED IN THE REFERENCE. IF THE PERFORMANCE WAS NOT GIVEN IN THE ORIGINAL REFERENCE, WE USED (6). REFERENCES AND COMMENTARY ON THIS DATA ARE GIVEN IN SECTION VIII.

1) This is the nonlinear equation $y = (1 + x_1^{-2} + x_2^{-1.5})^2$ taken from [22]; [22] develops fuzzy methods for doing input identification and building fuzzy models from input-output data. In this instance, two random inputs $x_3$ and $x_4$ were inserted to test input identification; [22] produced a two-input, six-rule model with a performance measure of 0.010. We developed a two-input, three-rule, nine-neuron model with a performance measure of 0.0035.

2) This is the Box and Jenkins gas furnace data taken from [3]. The process is a gas furnace with a single input (gas flow rate) $u(t)$ and a single output (CO2 concentration) $y(t)$. As in [22], we consider the variables $y(t-1), \ldots, y(t-4)$ and $u(t-1), \ldots, u(t-6)$ as input candidates. We use 296 data points, and we compare our results to the model done in [22]; [22] developed a four-rule model using the inputs $y(t-1)$, $u(t-4)$, and $u(t-3)$ with a performance measure of 0.190. We determine that the significant inputs, listed in order of importance, are $u(t-5)$, $y(t-1)$, $u(t-6)$, $u(t-3)$, and $y(t-2)$. Our model uses four rules and 25 neurons. We trained on 250 data points and predicted on the next 40. It took 4000 iterations to train. Our performance measure was 0.071 on the training set and 0.261 on the prediction set. Figs. 11 and 12 show the initial and final membership functions, respectively.

Fig. 11. The initial fuzzy rules for the Box and Jenkins gas furnace data (test number two).

3) This is the nonlinear equation $y = 0.2 + 0.8(x + 0.7\sin(2\pi x))$ taken from [19]; [19] develops a method for setting the initial weights in a network to reduce the training time. They develop a model with 22 neurons that trained in 468 steps with a performance measure of 3.19. Our model uses four rules and nine neurons. We train in 2200 steps with a performance measure of 0.987.

4) This data came from the chemical oxygen demand in Osaka Bay example in [6]. A group method of data handling algorithm was used for analysis in [6]. There are five input variables: temperature, transparency, oxygen density, salt density, and filtered COD density. There are 45 data points. The performance measure from [6] on the training data is 3.63, and the performance measure on the checking data is 2.04. We eliminated the salt density variable. Our model uses four rules and 19 neurons with performance measures of 2.76 and 1.78 on the training and checking data, respectively.

5) The Osaka Bay data was also modeled in [23] using a fuzzy-neural method. In contrast to [6], [23] appears to have used both the training data to form the model and the checking data to determine when to stop training. Hence, [23] makes indirect use of the checking data in the model. We use only the training data to form our model.

6) This is data for operating a chemical plant that originally appeared in [22]. It has 70 data points and five input variables: monomer concentration, change in monomer concentration, monomer flow rate, temperature 1, and temperature 2; [22] decided that the two temperature measurements were not significant and produced a six-rule model. The performance measure was not stated in [22]. We also eliminated the temperature inputs and produced a seven-rule model with 28 neurons. Our PI was 0.002245.

7) This is data on daily stock prices for stock A from [22]. It has ten input variables $x_1, \ldots, x_{10}$ and 100 data points. The performance measure is not noted in [22]. We eliminated four of the ten input variables, among them $x_1$, $x_6$, and $x_9$. Our model used six inputs, five rules, and 31 neurons. We trained on 80 data points and predicted on the next 20. Our PI was 0.0175 on the training set and 0.1261 on the prediction set.

Graphs showing the performance of our model for the Box and Jenkins data (test number two) are shown in Fig. 13, the Osaka Bay data (test numbers four and five) in Fig. 14, the chemical plant data (test number six) in Fig. 15, and the stock price data (test number seven) in Fig. 16. The original output is graphed with a solid line and the model's output is graphed with a dotted line in all cases.

IX. APPLICATIONS

These ideas have been applied to two real systems with good results. The first was a petroleum inverse problem with 24 inputs and 18 outputs [20], and the second was a financial prediction problem with approximately 130 inputs [17]. In both cases these methods yielded good results with a substantial savings in time over the previous approaches.

X. CONCLUSIONS

We create simple and effective fuzzy-neural models of complex systems from a knowledge of the input-output data. We introduce the concept of a fuzzy curve and use it to identify the input variables, estimate the number of rules needed by the model, and set the initial weights for the fuzzy-neural network model. Our method quickly and easily identifies the significant input variables. Because the initial structure and weights of the neural network are set properly, we need few training iterations for the neural network to converge. We use the final fuzzy memberships to simplify the fuzzy-neural model. In summary, we can build fuzzy-neural models with simple structure, less training time, and adjustable performance.
Fig. 12. The trained fuzzy rules for the Box and Jenkins gas furnace data (test number two).

Fig. 13. Model performance on the Box and Jenkins gas furnace data (test number two). Actual is shown by the solid line, model by the dotted line.

Fig. 14. Model performance on the Osaka Bay data (test numbers four and five). We train on the first 33 points and predict the next 12. Actual is shown by the solid line, model by the dotted line.

Fig. 15. Model performance on the chemical plant data (test number six). Actual is shown by the solid line, model by the dotted line.

Fig. 16. Model performance on the stock price data (test number seven). We train on the first 80 points and predict the next 20. Actual is shown by the solid line, model by the dotted line.

APPENDIX

Definition: Let $X \subset R^n$ and let $G$ be a family of functions on $X$ with values in $R$. Suppose that for all $x, y \in X$ such that $x \ne y$, there is an $f \in G$ such that $f(x) \ne f(y)$; then we say that $G$ is a separating family of functions on $X$.

Lemma: The function family

$$Z = \left\{ z : z(x) = \prod_{i=1}^{N} \exp(-|w_{ij1} x_i + w_{ij0}|^{l_{ij}}) \right\}$$

is a separating family.

Proof: For any $x^1, x^2 \in X$, if $x^1 \ne x^2$, then at least one coordinate of $x^1$ and $x^2$ is not equal. Without loss of generality, suppose $x_1^1 \ne x_1^2$. Pick any $z$ from $Z$ with $w_{1j1} \ne 0$, $w_{1j0} \ne 0$, and $l_{1j} \ne 0$. If $z(x^1) \ne z(x^2)$, then we are done. If $z(x^1) = z(x^2)$, let

$$z(x^1) = \exp(-|w_{1j1} x_1^1 + w_{1j0}|^{l_{1j}}) \prod_{i=2}^{N} \exp(-|w_{ij1} x_i^1 + w_{ij0}|^{l_{ij}})$$

and

$$z(x^2) = \exp(-|w_{1j1} x_1^2 + w_{1j0}|^{l_{1j}}) \prod_{i=2}^{N} \exp(-|w_{ij1} x_i^2 + w_{ij0}|^{l_{ij}}).$$
If $\exp(-|w_{1j1} x_1^1 + w_{1j0}|^{l_{1j}}) = \exp(-|w_{1j1} x_1^2 + w_{1j0}|^{l_{1j}})$, then let $\hat{w}_{1j0} = w_{1j0} + 1$, so that $\hat{z}(x^1) \ne \hat{z}(x^2)$. If $\exp(-|w_{1j1} x_1^1 + w_{1j0}|^{l_{1j}}) \ne \exp(-|w_{1j1} x_1^2 + w_{1j0}|^{l_{1j}})$, then let $\hat{w}_{1j1} = 0$, so that $\hat{z}(x^1) \ne \hat{z}(x^2)$. Q.E.D.

Stone-Weierstrass Theorem [10]: Let $X$ be a nonvoid compact set and $G$ a separating family of functions on $X$ containing the function 1; then polynomials with real coefficients in functions from $G$ are a dense subalgebra of $C(X)$ in the topology induced by the uniform metric. In other words, for any continuous function $f(x)$ defined on $X$ and for any $\varepsilon > 0$, there exists a polynomial $p(x) = \sum_{j=1}^{R} u_j (g_j(x))^{n_j}$, with $g_j(x) \in G$, such that $|f(x) - p(x)| < \varepsilon$ for all $x \in X$.

Now we are ready to prove our theorem.

Theorem: For any given real continuous function $f$ on a compact set $X \subset R^n$ and arbitrary $\varepsilon > 0$, there exists a polynomial $p(x)$ with real coefficients in the function family $Z$ ($Z$ is defined as in the Lemma) such that

$$|f(x) - p(x)| < \varepsilon \quad \text{for all } x \in X.$$

Proof: By taking $w_{ij1} = w_{ij0} = 0$, we obtain that $Z$ contains the function 1. Then our theorem is a direct consequence of the Stone-Weierstrass Theorem and the Lemma. Q.E.D.

ACKNOWLEDGMENT

The authors acknowledge the careful reviews and helpful suggestions made by the anonymous referees, especially referee B.

REFERENCES

[1] M. G. Bello, "Enhanced training algorithms, and integrated training/architecture selection for multilayer perceptron networks," IEEE Trans. Neural Networks, vol. 3, no. 6, pp. 864-875, Nov. 1992.
[2] H. R. Berenji and P. Khedkar, "Learning and tuning fuzzy logic controllers through reinforcement," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 724-740, 1992.
[3] G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting and Control. San Francisco: Holden Day, 1970.
[4] J. J. Buckley, "Sugeno type controllers are universal controllers," Fuzzy Sets Syst., vol. 53, pp. 299-303, 1993.
[5] G. P. Drago and S. Ridella, "Statistically controlled activation weight initialization (SCAWI)," IEEE Trans. Neural Networks, vol. 3, no. 4, pp. 627-628, July 1992.
[6] S. Fujita and H. Koi, "Application of GMDH to environmental system modeling and management," in Self-Organizing Methods in Modeling: GMDH Type Algorithms, S. J. Farlow, Ed., Statistics: Textbooks and Monographs Ser., vol. 54. New York: Marcel Dekker, 1984, pp. 257-275.
[7] K. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, p. 183, 1989.
[8] P. Y. Glorennec, "Learning algorithms for neuro-fuzzy networks," in Fuzzy Control Systems, A. Kandel and G. Langholz, Eds. New York: CRC Press, 1994, pp. 4-18.
[9] Y. Hayashi, J. Buckley, and E. Czogala, "Fuzzy neural network with fuzzy signals and weights," Int. J. Intell. Syst., vol. 8, pp. 527-537, 1993.
[10] E. Hewitt, Real and Abstract Analysis. Berlin/New York: Springer-Verlag, 1965.
[11] S. Horikawa, T. Furuhashi, and Y. Uchikawa, "On fuzzy modeling using fuzzy neural networks with the back-propagation algorithm," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 801-814, 1992.
[12] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, pp. 359-366, 1989.
[13] H. Ishibuchi, R. Fujioka, and H. Tanaka, "Neural networks that learn from fuzzy if-then rules," IEEE Trans. Fuzzy Systems, vol. 1, no. 2, pp. 85-97, 1993.
[14] H. Ishibuchi, H. Tanaka, and H. Okada, "Fuzzy neural networks with fuzzy weights and fuzzy biases," in Proc. ICNN 93, 1993, pp. 1650-1655.
[15] J. R. Jang, "Self-learning fuzzy controllers based on temporal back propagation," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 714-723, 1992.
[16] B. Kosko, "Fuzzy systems as universal approximators," in Proc. 1992 IEEE Int. Conf. Fuzzy Systems, San Diego, Mar. 1992, pp. 1153-1162.
[17] Y. Lin, "Credit risk modeling with fuzzy systems," Los Alamos National Laboratory, Tech. Rep. LA-CP 94-109, May 1994.
[18] M. T. Musavi, W. Ahmed, K. H. Chan, K. B. Faris, and D. M. Hummels, "On the training of radial basis function classifiers," Neural Networks, vol. 5, pp. 595-603, 1992.
[19] H. Narazaki and A. L. Ralescu, "An improved synthesis method for multilayered neural networks using qualitative knowledge," IEEE Trans. Fuzzy Systems, vol. 1, no. 2, pp. 125-137, 1993.
[20] A. Ouenes, R. Doddi, Y. Lin, G. Cunningham, and N. Saad, "A new approach combining neural networks and simulated annealing for solving petroleum inverse problems," in 4th European Conf. Math. Oil Recovery, Roros, Norway, June 7-10, 1994.
[21] T. Poggio and F. Girosi, "Networks for approximation and learning," Proc. IEEE, vol. 78, no. 9, pp. 1481-1497, Sept. 1990.
[22] M. Sugeno and T. Yasukawa, "A fuzzy-logic-based approach to qualitative modeling," IEEE Trans. Fuzzy Systems, vol. 1, no. 1, pp. 7-31, 1993.
[23] H. Takagi and I. Hayashi, "NN-driven fuzzy reasoning," Int. J. Approximate Reasoning, vol. 5, no. 3, pp. 191-212, 1991.
[24] H. Takagi, N. Suzuki, T. Koda, and Y. Kojima, "Neural networks designed on approximate reasoning architecture and their applications," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 752-760, Sept. 1992.
[25] T. Terano, K. Asai, and M. Sugeno, Fuzzy Systems Theory and Its Applications. New York: Academic Press, 1992, pp. 159-168.
[26] L. X. Wang and J. M. Mendel, "On fuzzy basis functions, universal approximation and orthogonal least-squares learning," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 807-814, Sept. 1992.
[27] L. F. A. Wessels and E. Barnard, "Avoiding false local minima by proper initialization of connections," IEEE Trans. Neural Networks, vol. 3, no. 6, pp. 899-905, Nov. 1992.
[28] L. Xu, A. Krzyzak, and A. Yuille, "On radial basis function nets and kernel regression: Statistical consistency, convergence rates, and receptive field size," Neural Networks, vol. 7, no. 4, pp. 609-628, 1994.


Yinghua Lin (M'94) received the B.S. degree in physics from Zhejiang Normal University, China, in 1982, and the M.S. degree in computer science from Ball State University, IN, in 1991. Currently he is a Ph.D. candidate in the Department of Computer Science at New Mexico Institute of Mining and Technology, Socorro, NM. From 1982 to 1989, he was a faculty member in the Department of Physics at Zhejiang Normal University. Since 1994, he has been a Graduate Research Assistant at Los Alamos National Laboratory, working on system identification and prediction.

George A. Cunningham III (S'82-M'86) received the B.S. degree in engineering from Case Institute of Technology in 1965, the M.A. degree in mathematics in 1972, and the Ph.D. degree in electrical engineering in 1986 from the University of Washington. He has held academic positions in the Computer Science Department at the University of North Florida and the Engineering Department at Harvey Mudd College. He has also worked as an engineer for the Boeing Company and the City of Seattle, as well as an independent CPA. He is currently an Associate Professor of Electrical Engineering at the New Mexico Institute of Mining and Technology. His current research interests are in system modeling and control, and undergraduate engineering education.

