Debate III

That Rule-Based Systems are an
Evolutionary Dead End in the
Development of Intelligent Systems

Vee Khong
230 Debate III

- The first speaker will speak for the motion.

- Rule-based systems are an evolutionary dead end in the realm of intelligent
systems. There are four notions there. That rules can implement intelligent
systems, and we will see what an intelligent system is. Evolution for me
means one thing, something that can be extended. It can grow, it can adapt to
new things. It can be maintained. A dead end - that means that there is a
facet that we can no longer exploit. Intelligent systems means something that
shows intelligent behaviour.
Let us look at intelligence. What is intelligent behaviour? Everybody here is
trying to simulate it, but do we really know what it is? I think one of the first
things that was done with intelligent systems was decision systems. We
pretend that intelligence is some form of decision. Decisions simulated in a
machine. We might go a bit further and say that intelligence is something
that understands. We might say that something that understands must also
be able to explain. We might say that an intelligent system is something that
can carry out some sort of cognitive, perceptive process. By that I mean vision,
perceptual senses, pattern recognition type of capabilities. There is also
another way of looking at intelligent systems - learning. An intelligent system
must be able to learn. I think those are the four major characteristics
associated with intelligence.
The opposition to this motion says that we would be able to carry on all these
intelligent processes using 'if... then...' type rules, and my duty today is to say
that it is impossible. I think it is quite simple to see that it is impossible to use
'if... then...' statements to carry on all these processes. I would grant that
some sort of decision system would be able to be written using 'if... then...'
rules, but when you start getting into more intelligent processes such as
understanding, explaining, cognitive perception, they cannot really do that
type of thing. How would you write, for example, a rule-based system to see, to
understand pictures? Impossible. Do you write a rule for each pixel in your
image? Nobody would ever dream of that. Learning - what learning
techniques do you use today? They are not rule-based but algorithmic
processes, that perhaps produce rules, but are not developed as rule-based
systems. So I think that from an evolutionary point of view, if we say that we
are at the first step of trying to simulate intelligence, rule-based systems are
able in certain circumstances to implement decision type systems, decision
type intelligence extracted from experts' know how, in certain cases for very
limited domains. But for any extension to other intelligent applications, they
are a dead end.
Further, there is no way of maintaining a system based on 'if... then...' rules.
When you get into very complex applications of decision systems today,
decision systems start to show intelligent behaviour at around a hundred rules;
one hundred, two hundred, three hundred rules, and then you can say that it is
perhaps a limited sort of intelligent behaviour. But when you start getting into
cognitive processes, what is the number of rules that you need to implement
in order to simulate a cognitive process? A thousand rules? Two thousand
rules? Will you be able to maintain such a system? No. It is a dead end.
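[The kind of 'if... then...' decision system the speaker concedes is workable in limited domains can be sketched in a few lines. This is an editorial illustration, not any system mentioned in the debate; the facts and rules are invented for the example.]

```python
# Minimal sketch of an 'if... then...' decision system: each rule is a
# (conditions, conclusion) pair, and the engine fires every rule whose
# conditions are all present among the known facts, until nothing changes.
RULES = [
    ({"fever", "cough"}, "suspect_flu"),
    ({"suspect_flu", "high_risk_patient"}, "refer_to_doctor"),
]

def infer(facts):
    """Forward-chain over RULES until no new conclusion can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer({"fever", "cough", "high_risk_patient"}))
```

[A few dozen such rules give exactly the "decision type intelligence extracted from experts' know how" the speaker describes; the maintenance problem he raises appears when the rule list grows into the hundreds.]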

- So what you are saying is that intelligence is a cognitive process and rules
cannot best represent these processes?

- I think that is the major failure, and I would say that another way of saying
this is that we have invented all these other mechanisms, such as neural
networks. People invented neural networks because they think that rule-based
systems are a dead end.

- Do you say that expert systems are a dead end?

- Let us take regular expert systems, such as the diagnostic expert system,
MYCIN, to start with the first one, or maybe the first Schlumberger type of
decision system on geological data surveys. If you tell me that you have to
couple the technique with something else in order for it to be viable, then you
are actually saying that by itself a rule-based system is not sufficient.

- The next speaker will speak against the motion.

- Perception by rule-based systems has been worked on, for example by David
Waltz and David Marr in vision. They are classical AI systems and they work
fine. My second point is that understanding and explaining can be done by
conventional rule-based systems. But explaining or understanding means in
relation to a model. If you have a very weak model, what can you get? It is a
very weak explanation, or shallow understanding. You mentioned complexity
but the complexity of quantitative models is very high. The point I want to
make is that systems like MYCIN or Dipmeter are nearly twenty years old,
and they are what we call shallow expert systems with very crude rules like
'if... condition, then... solution'. But this is not what we are doing today. We
have moved from the quantitative to qualitative models, what we call deep
models. Models with a strong theory of the domain. It is not a criticism, but
what we call deep models based on qualitative reasoning are very slow, very
inefficient. It takes a very long time to get a solution, but there are means of
getting more knowledge. When you have one problem solved, you make a
compiled rule and you get better performance. It is always mentioned that
using this kind of expert system there is no way of learning, but one approach
to learning today is called explanation-based generalisation, and this also uses
deep models.
Today all the systems we have are based on non-monotonic logic, but that is
not exactly rule-based systems. This non-monotonic logic, for example
preference logic, makes a spectrum of models, some called preferred models
according to a set of rules. It gives you all possible models, and interpretation
models give you the preferred ones. As knowledge changes, the preferred
model changes as well. That is a real adaptation: what can be true at one time
is no longer true at another. Also, we can do analogical reasoning. This is a
means for generating new rules, because analogical reasoning is to take a set
of source rules, for example A -> B and if there is a C similar to A, the analogy
between the A and C derives some kind of new rule. This is a new knowledge,
you can test it in your theory, meaning a new model. And one of the main
advantages of qualitative models is that most of the time they are quite simple
and easy to produce while quantitative models need a lot of tuning to get the
right numbers.
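[The analogical step the speaker describes - a source rule A -> B plus a new situation C similar to A yielding a candidate rule C -> B - can be sketched roughly as follows. The similarity measure and the example rules are invented for illustration; a real system would judge similarity against a domain theory.]

```python
# Rough sketch of rule generation by analogy: given a source rule A -> B
# and a new situation C judged similar to A, propose the candidate rule
# C -> B.  Situations are sets of attributes; similarity is a toy
# shared-attribute ratio (Jaccard index).
def similarity(a, b):
    """Fraction of attributes the two situations share."""
    return len(a & b) / len(a | b)

def propose_by_analogy(source_rules, c, threshold=0.5):
    """Return candidate rules C -> B for each source rule A -> B with A similar to C."""
    candidates = []
    for a, b in source_rules:
        if similarity(a, c) >= threshold:
            candidates.append((c, b))  # new, untested knowledge
    return candidates

rules = [({"metal", "thin", "long"}, "conducts_heat")]
new_case = frozenset({"metal", "thin", "short"})
print(propose_by_analogy(rules, new_case))
```

[As the speaker notes, the derived rule is only a hypothesis: it has to be tested against the theory before it becomes part of the model.]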

- The next speaker will put views both for and against.

- I interpret the motion as referring to what we know as first generation expert
systems. Indeed that is an evolutionary dead end. In order to illustrate this, I
will say a little on a distinction made by numerous papers some years ago. We
distinguish between the knowledge level and symbol level. At the knowledge
level we talk about knowledge in terms of hypotheses and solutions, causal
knowledge, rules of thumb and things like that, whereas at the symbol level
we talk about the structures we use to represent this type of model. For
example logic, production rules, or whatever. I think it is very useful to make
this distinction because abstracting knowledge at the knowledge level gives an
insight into what kinds of knowledge there are, and the different uses those
kinds of knowledge can have. Obvious benefits of making this distinction are
that it may lead to a modelling methodology for expert systems, and eventually
even to tools which present the various knowledge types available and the way
they may interact in a particular kind of problem solving. You may identify
particular pieces of knowledge and how they are used in different settings so
that you come to knowledge re-use. It allows for a separation of knowledge in
terms of knowledge which directs the problem solving, and knowledge which
is describing the domain. To put it very simply, this distinction is one of content
versus form. It is the idea of separating knowledge from the way we choose to
represent it.
I don't mean to say that rule languages are no good, because I think a rich
rule language is a very powerful modelling technique, and it should not be
discarded. Even if new things pop up in AI, there is no reason to throw the
baby out with the bath water. However, first generation expert systems
typically equate rules of thumb with production rules. So, the only kind of
knowledge is the rule of thumb, and that makes their behaviour not very
satisfactory, because when starting from an expert system shell, which only
uses production rules, and which is used only to represent rules of thumb, you
will forget about all the other types of knowledge, very useful types of
knowledge, and you do not try to build a model of the problem solving. That
leads to a system which contains private knowledge only, which is one reason
why it will be hard to accept for other users. Such systems are brittle,
because the types of rules built in are meant to solve a very particular type of
problem and cannot handle problems which are slightly beyond the scope of the
system, and they are very hard to maintain because there was never an intent to
describe the knowledge on the right level of abstraction. And the systems will
be very hard to adapt. You have no idea where the rules come from and how
they cooperate, and what the behaviour will be if you add another rule. And so
there are various reasons why this kind of system is unsatisfactory.
Now, in high end applications, you cannot present a user with a system like
that because a user wants to change the system according to his own beliefs
and wants to be able to create a system in part himself rather than being
presented with a system which is finished and which he can only turn on and
turn off. So that is why I feel that the typical first generation expert system is
a dead end, notwithstanding the fact that rule languages will be in use for
much longer.

- Those were remarks for the motion, but I believe rule-based systems can be
very useful as long as they are carefully developed and applied. I will try to
represent the point of view of somebody using these ideas and techniques.
From the usage point of view this concept of rule-based systems has been a big
disappointment compared with the expectations that people clearly had a few
years ago. I w a n t to make one distinction between rule-based as a
programming tool and applications to automate rules, guidelines, procedures
etc. As a programming tool there is not that much distinction between the
rule-based tool and other programming tools. If you try hard enough with
them you can probably manage to programme pretty much anything you
want, even including some examples of MYCIN etc. However, once you get
into the realm of programming complex rules, you can very often get into
serious trouble. If I look at the market, the things that seem to have worked
well for people are invariably small scale systems, with few rules, less than
one hundred, and especially things where the author is the main user. On the
other hand, we have heard of many spectacular failures, such as products for
banks and insurance for example. The consensus with large and complex
systems seems to be that they are giving a lot of trouble. Even the ones that
are still managing to survive do so at great expense and with some difficulty.
They are worth what they are costing, but they are certainly not trivial things
to maintain.
Consider the experience of trying to build a large system. At the beginning we
go and sit with somebody for an interview. One of the earliest things you have
to do is get the information, which is often not readily available. We have to
access a data base somewhere. You have hardly started to develop rules, but
you do have the information. Just doing this already has a big impact on the
users: you are making information available to them, and that alone
could be of big value. And then you go and put rules together.
At some point you reach the stage where you feel the system is viable and
should go by itself. Then you see that people tend to trust blindly that machine
because it appears to be intelligent and know what it is doing. You often
observe that the efficiency of the person starts to degrade compared to how it
was when you were working with them. By the time you have finished your
system very often you have made this information something that the expert
system is using, not the end-user. I think the underlying principle is that
with big systems you hide the underlying information and remove the
context, while with smaller systems, when needed, you
can access the wider context and can compensate for whatever weaknesses
there are in your system. One message that I get out of this is that if you want
to do something viable, you must make sure that you have some way to access
the underlying information so that the users can resolve the discrepancies.
There is another issue on a wider scale which matches the developments we
see in the management field. You see a very big difference when you
interview a worker and when you interview a manager. The manager will
easily give you lots of big and clear rules. The workers will have a
lot fewer of those. And in practice it is the worker who will have to apply and put
things together, and when you take this view, there are many things
that are very important for him that have been missed from the managers'
rules, because the relationship between the workers and the managers shows
the same characteristics as between having only rules and having all the
richer context. The manager himself is somebody who has been removed
from the context that is needed to do a job and tends to abstract the situation
and end up with rules. So in conclusion, I see rules on their own as more
dangerous than helpful. But if they are used as a tool in a wider context then they
prove to be viable, and if used in that way then we can certainly do a good job.

- Am I right in saying that a rule-based system works only if you have the
right knowledge elicitation process to complement it?

- No, I would rather say that they work as long as you leave the person access
to the information you yourself have used and don't try to do a complete job.

- We need to make a big distinction between rules as a computational
paradigm and as a modelling tool for intelligence. You can prove that
anything that you can write in any other way, you can also write with rules.
Of course, you might argue every computational paradigm has its own
properties, and for rules these are things like modularity and not having to
specify the control flow explicitly in advance. These are all good properties
and I think rules as a computational paradigm will always be with us because
if not, then there is something missing in our computer science education.
I think the more important thing is the criticism of rules in knowledge
engineering projects. If you take the whole list of properties that an intelligent
system should have, that is one point of view. The second point of view is
looking at concrete knowledge engineering projects; what is the success of
using rules? I will take these two points of view in turn.
If you evaluate from the viewpoint of knowledge engineering then I think
there are a number of things to say. First of all the failures: these have little to
do with using rules or not using rules. Most of the problems that have been
encountered are either management issues or in terms of the style in which
applications are embedded in a certain existing context. So this is a problem
not just for expert systems, but for all sorts of software today. The idea is that
you can design in your office, analyse your requirements, make this thing and
then go back to the workplace and put it there. It is an idea that does not work.
There is a lot of work going on in software engineering with design
methodologies that incorporate the workers and the process, and all those
things apply to expert systems as well; but they have nothing to do in principle
with whether you use rules or not. They just have to do with how you can
build applications that are really used. In my view ninety percent of the
failures in knowledge engineering have to do with management and are not
technical failures.
Now the second thing is that we have the first generation of systems and from
around 1982 there were papers by a number of people on second generation
expert systems pointing out the limitations here. The basic insight is that you
need to incorporate deeper models, more qualitative models, different types of
reasoning. But again, this has nothing to do with whether you use rules or
not. This has to do with the depth of the models that you make; whether you
think about the methods or do not think about the methods. You can express
all those things in a rule-based computational way.
Another thing which was brought up was the lack of a good design
methodology at a high level. I agree that knowledge level design tools are
needed and in fact there is a lot of very interesting work on knowledge level
modelling which is sweeping through Europe at the moment. So, although
this is still in an early phase, we can already see very concrete tools. They will
allow you to make a knowledge level description and couple it in systematic
ways with an implementation, and the implementation could be object oriented
programming, it could be logic programming, it could be rule-based. So
again, the thesis of an evolutionary dead end sounds very silly in view of all
these things.
From the viewpoint of intelligence, I think there are indeed big issues that
have not been confronted by the classical AI paradigm. Although I don't at all
agree with previous comments in the area of vision: almost all the
sophisticated attempts that I know of to build real computer vision systems have
been rule-based. Also it is not a problem of size; there are rule-based systems
now with ten thousand rules. All this interesting work on learning, and not
just explanation based learning but inductive learning, has been done with
rules. I don't know where you get that idea that all those things are...

- I am not saying they are impossible, I am saying that they are a dead end
from the maintenance point of view.

- As part of the rule-based paradigm, you can have modules, you can have
object oriented structures for your data. Nothing denies this. There is
fascinating research going on which goes beyond this in the sense that much
more fundamental issues are being considered. For example issues like
autonomy, which was not considered at all in classical AI. Autonomy
meaning how can a system that has continued interaction with the world, but
is confronted continuously with new situations, be adaptive and build up
representations that can still cope with changes? That is one issue which I
don't think has to do with rule-based systems, but I would agree that it is not
being addressed by any of the work that has been happening for twenty years
in the classical symbolic paradigm. There is also the issue of evolution in the
sense of where do capabilities come from? Where does knowledge come from?
If you build an inductive learning programme then you still assume that the
person that is using the programme is going to supply the conceptual
framework and is going to supply all the examples. And then an inductive
process starts, which basically generalises, but there is no new information
being created. So, issues like autonomy, evolution, adaptivity and so on, have
not been addressed, and I think they need to be, to build things that are
situated in the work place and can really fit with the rest of the environment.
Some believed in the past that you could just go to an expert, say 'give me a
couple of rules', put those in your system and then have an application. This is
not the case. But who has ever seriously said that? Nobody who has been
seriously working in AI has ever proposed that. There is always a danger of
putting up a straw man that doesn't work.
- What is the alternative to first generation expert systems? It looks like people
are looking for an alternative in the direction of neural networks. I would like
to make a comparison between neural networks and first generation expert
systems. First generation expert systems were usually defined as "you watch
a black box, you see it behave in the real world and then you extract some very
simple rules out of it." The disadvantage was that the scope of your expert
system was very limited and it could only cover the things that you had defined
it to cover. But that was also an advantage in that once you applied it to
something which it couldn't cover, it didn't work and you didn't expect it to
work. Now it looks like people are trying to implement neural networks where
they do exactly the same. They observe a black box, they try to feed it through a
neural network and then it sometimes works. But then there is this insecurity
that they think it might work for something similar, but they don't know why
and they cannot explain why it works or why it doesn't work. It seems to me
that neural networks will have more disadvantages than the first generation
expert systems.

- I agree with you, but the first thing is that I don't see any difference,
technically speaking, between a neural network and a rule-based system
where you have weights for each of the conditions; you have a computation,
you propagate the value of the weights. In terms of a running system there is
no difference whatsoever. So I think it should be clear that computationally
there is no difference. The only thing that is different is that there is a
mechanism for building up, for finding the weights, based on giving a large
number of examples. In that mechanism there are many limitations.
Perhaps we shouldn't go into it, but the result will be something, as you said,
that is basically first generation in the sense that there is no deep model. To
put this as rules versus neural networks comes from trivialising
underlying concepts and I think that is a wrong debate.
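[The run-time equivalence the speaker asserts - a rule with weighted conditions and a threshold computing the same thing as a single threshold neuron - can be illustrated with a toy example. The weights and threshold here are invented for the illustration.]

```python
# Toy illustration of the speaker's point: a rule whose conditions carry
# weights and which fires when the weighted sum passes a threshold is,
# at run time, the same computation as a single threshold neuron.
def weighted_rule(evidence, weights, threshold):
    """Fire the conclusion if the weighted sum of condition values passes the threshold."""
    total = sum(weights[c] * v for c, v in evidence.items())
    return total >= threshold

# "if fever (0.6) and cough (0.5) then flu", firing at 0.8
weights = {"fever": 0.6, "cough": 0.5}
print(weighted_rule({"fever": 1.0, "cough": 1.0}, weights, 0.8))  # True
print(weighted_rule({"fever": 1.0, "cough": 0.0}, weights, 0.8))  # False
```

[What differs, as the speaker says, is only where the weights come from: hand-written in a rule-based system, found from a large number of examples in a neural network.]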

- About first generation rule-based systems, I think the main problem is that
they are Mickey Mouse systems which started with a nice paradigm, the production
rules idea. We use "if... then..." rules and write a linear set of rules and there
is no engine behind it except to go through that set of rules and figure out the
most competent one to fire at the time. It doesn't really help my
programming; in fact, it makes it more unmaintainable.
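[The engine the speaker describes - go through the rule set, figure out the most competent rule to fire at the time, repeat - is the classic recognise-act cycle, sketched below with one simple conflict-resolution strategy (prefer the most specific matching rule). The rules are invented for illustration.]

```python
# Sketch of the recognise-act cycle: match all rules against the facts,
# pick the "most competent" one to fire (here: the most specific, i.e.
# the rule with the most conditions), apply it, and repeat.
def run(rules, facts):
    """rules: list of (conditions, conclusion) pairs; returns the final fact set."""
    facts = set(facts)
    while True:
        # match: rules whose conditions hold and whose conclusion is new
        matched = [r for r in rules if r[0] <= facts and r[1] not in facts]
        if not matched:
            return facts
        # conflict resolution: fire only the most specific matching rule
        conditions, conclusion = max(matched, key=lambda r: len(r[0]))
        facts.add(conclusion)

rules = [
    ({"raining"}, "take_umbrella"),
    ({"raining", "windy"}, "take_coat"),
]
print(run(rules, {"raining", "windy"}))
```

[The speaker's complaint is precisely that this loop is all the "engine" there is: everything else has to be smuggled into the rules themselves.]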
As for their use in vision, my question is: isn't the real challenge in vision the
extraction of knowledge - edge detection, figure-background differentiation,
texture analysis, groupings? You cannot do that in a rule-based system. You
apply rule bases only after that point. And after that point you might as well
use any other paradigms. You could develop input factors for a neural
network if you wanted to.

- I see rules as part of a tool set. We do the pre-processing to get our data in the
shape that we need it for propagation to our system. So the question is
basically: are rules a mature computational paradigm that you can use in
certain contexts but that is limited, one that we can combine with many
different other paradigms in order to make technological advances, or are
rules as a paradigm still capable of evolving? I take the position that they are
capable of evolving as long as you don't stick to one paradigm of rules, one
kind of rules that was invented at some point in time. The idea of technology is
that it changes.

- When I hear about more advanced generations of rule-based systems, that
tends to be even more worrying, because it becomes an even bigger challenge
to retain contact with reality once you have to maintain all of the mechanics.

- A final comment. Rules are very easy to use to express certain forms of
knowledge. They apply to some situations in a very natural way. So you can't
escape from rules.