
Lecture 1 Foundation and history of AI

Contents
1. Definition of AI
2. Goals of AI
3. AI as Science
4. AI as Engineering
5. Areas of AI
6. Foundation of AI
7. History of AI

Definition of Artificial Intelligence


Artificial intelligence (AI) is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents," where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success. Artificial intelligence is thus the area of computer science focused on creating machines that can engage in behaviors that humans consider intelligent.

Goals of Artificial Intelligence


AI seeks to understand the working of the mind in mechanistic terms, just as medicine seeks to understand the working of the body in mechanistic terms.

"The mind is what the brain does." -- Marvin Minsky

The strong AI position is that any aspect of human intelligence could, in principle, be mechanized.

Artificial Intelligence as Science


Intelligence should be placed in the context of biology: Intelligence connects perception to action to help an organism survive. Intelligence is computation in the service of life, just as metabolism is chemistry in the service of life. Intelligence does not imply perfect understanding; every intelligent being has limited perception, memory, and computation. Many points on the spectrum of intelligence-versus-cost are viable, from insects to humans. AI seeks to understand the computations required for intelligent behavior and to produce computer systems that exhibit intelligence.

CSE-304-F Intelligent system

By Kaushal Shakya

Aspects of intelligence studied by AI include perception, motor control, communication using human languages, reasoning, planning, learning, and memory.

Artificial Intelligence as Engineering


How can we make computer systems more intelligent?

o Autonomy: to perform tasks that currently require human operators, without human intervention or monitoring.
o Flexibility in dealing with variability in the environment.
o Ease of use: computers that are able to understand what the user wants from limited instructions in natural languages.
o Learning from experience.

Areas of Artificial Intelligence

Perception
 o Machine vision
 o Speech understanding
 o Touch (tactile or haptic) sensation
Robotics
Natural Language Processing
 o Natural Language Understanding
 o Speech Understanding
 o Language Generation
 o Machine Translation
Planning
Expert Systems
Machine Learning
Theorem Proving
Symbolic Mathematics
Game Playing

Foundation of AI
Although the computer provided the technology necessary for AI, it was not until the early 1950s that the link between human intelligence and machines was really observed. Norbert Wiener was one of the first Americans to make observations on the principle of feedback theory. The most familiar example of feedback theory is the thermostat: it controls the temperature of an environment by gathering the actual temperature of the house, comparing it to the desired temperature, and responding by turning the heat up or down. What was so important about his research into feedback loops was that Wiener theorized that all intelligent behavior was the result of feedback mechanisms, mechanisms that could possibly be simulated by machines. This discovery influenced much of the early development of AI.

In late 1955, Newell and Simon developed the Logic Theorist, considered by many to be the first AI program. The program, representing each problem as a tree model, would attempt to solve it by selecting the branch most likely to lead to the correct conclusion. The impact that the Logic Theorist made on both the public and the field of AI has made it a crucial stepping stone in developing the field.

In 1956 John McCarthy, regarded as the father of AI, organized a conference to draw together the talent and expertise of others interested in machine intelligence for a month of brainstorming. He invited them to New Hampshire for "The Dartmouth Summer Research Project on Artificial Intelligence." From that point on, because of McCarthy, the field would be known as artificial intelligence. Although not a huge success, the Dartmouth conference did bring together the founders of AI, and it served to lay the groundwork for the future of AI research.

Knowledge Expansion

In the seven years after the conference, AI began to pick up momentum. Although the field was still undefined, ideas formed at the conference were re-examined and built upon. Centers for AI research began forming at Carnegie Mellon and MIT, and new challenges were faced: first, creating systems that could efficiently solve problems by limiting the search, as the Logic Theorist had; and second, making systems that could learn by themselves.

In 1957, the first version of a new program, the General Problem Solver (GPS), was tested. The program was developed by the same pair who developed the Logic Theorist. The GPS was an extension of Wiener's feedback principle and was capable of solving a greater range of common-sense problems. A couple of years after the GPS, IBM contracted a team to research artificial intelligence.
Herbert Gelernter spent three years working on a program for solving geometry theorems. While more programs were being produced, McCarthy was busy developing a major breakthrough in AI history. In 1958 McCarthy announced his new development: the LISP language, which is still used today. LISP stands for LISt Processing, and it was soon adopted as the language of choice among most AI developers.

In 1963 MIT received a 2.2-million-dollar grant from the United States government to be used in researching Machine-Aided Cognition (artificial intelligence). The grant, from the Department of Defense's Advanced Research Projects Agency (ARPA), was meant to ensure that the US would stay ahead of the Soviet Union in technological advancements. The project served to increase the pace of development in AI research by drawing computer scientists from around the world, and the funding continued.

The Multitude of Programs

The next few years saw a multitude of programs; one notable example was SHRDLU. SHRDLU was part of the microworlds project, which consisted of research and programming in small worlds (such as one with a limited number of geometric shapes). The MIT researchers, headed by Marvin Minsky, demonstrated that when confined to a small subject matter, computer programs could solve spatial and logic problems. Other programs which appeared during the late 1960s were STUDENT, which could solve algebra story problems, and SIR, which could understand simple English sentences. The result of these programs was a refinement in language comprehension and logic.

Another advancement, in the 1970s, was the advent of the expert system. Expert systems predict the probability of a solution under set conditions. Because of the large storage capacity of computers at the time, expert systems had the potential to interpret statistics and to formulate rules. The applications in the marketplace were extensive, and over the course of ten years, expert systems were introduced to forecast the stock market, aid doctors in diagnosing disease, and direct miners to promising mineral locations. This was made possible by such systems' ability to store conditional rules and a store of information.

During the 1970s many new methods in the development of AI were tested, notably Minsky's frames theory. David Marr also proposed new theories about machine vision, for example, how it would be possible to distinguish an image based on its shading, together with basic information on shapes, color, edges, and texture.
With analysis of this information, frames of what an image might be could then be referenced. Another development during this time was the PROLOG language, proposed in 1972.

During the 1980s AI moved at a faster pace, and further into the corporate sector. In 1986, US sales of AI-related hardware and software surged to $425 million. Expert systems were in particular demand because of their efficiency. Companies such as Digital Equipment Corporation were using XCON, an expert system designed to configure the large VAX computers. DuPont, General Motors, and Boeing relied heavily on expert systems. Indeed, to keep up with the demand for computer experts, companies such as Teknowledge and Intellicorp, specializing in creating software to aid in producing expert systems, were formed. Other expert systems were designed to find and correct flaws in existing expert systems.

The Transition from Lab to Life

The impact of computer technology, AI included, was felt. No longer was computer technology just the province of a select few researchers in laboratories. The personal computer made its debut, along with many technological magazines. Foundations such as the American Association for Artificial Intelligence also started. There was also, with the demand for AI development, a push for researchers to join private companies. 150 companies such as DEC, which employed an AI research group of 700 personnel, spent $1 billion on internal AI groups.

Other fields of AI also made their way into the marketplace during the 1980s. One in particular was machine vision. The work by Minsky and Marr was now the foundation for the cameras and computers on assembly lines performing quality control. Although crude, these systems could distinguish differences in the shapes of objects using black-and-white contrast. By 1985 over a hundred companies offered machine vision systems in the US, and sales totaled $80 million.

The 1980s were not entirely good for the AI industry. In 1986-87 the demand for AI systems decreased, and the industry lost almost half a billion dollars. Companies such as Teknowledge and Intellicorp together lost more than $6 million, about a third of their total earnings. The large losses convinced many research leaders to cut back funding. Another disappointment was the so-called "smart truck" financed by the Defense Advanced Research Projects Agency. The project's goal was to develop a robot that could perform many battlefield tasks. In 1989, due to project setbacks and unlikely success, the Pentagon cut funding for the project.

Despite these discouraging events, AI slowly recovered. New technology was being developed in Japan. Fuzzy logic, first pioneered in the US, has the unique ability to make decisions under uncertain conditions.
Neural networks were also being reconsidered as possible ways of achieving artificial intelligence. The 1980s introduced AI to its place in the corporate marketplace and showed that the technology had real-life uses, ensuring it would be key in the 21st century.

AI Put to the Test

The military put AI-based hardware to the test of war during Desert Storm. AI-based technologies were used in missile systems, heads-up displays, and other advancements. AI has also made the transition to the home. With the popularity of the personal computer growing, the interest of the public has also grown. Applications for the Apple Macintosh and IBM-compatible computers, such as voice and character recognition, have become available. AI technology has also made steadying camcorders simple using fuzzy logic. With a greater demand for AI-related technology, new advancements are becoming available. Inevitably, artificial intelligence has affected, and will continue to affect, our lives.


History of AI
1943     McCulloch & Pitts: Boolean circuit model of brain
1950     Turing's "Computing Machinery and Intelligence"
1956     Dartmouth meeting: "Artificial Intelligence" adopted
1952-69  "Look, Ma, no hands!"
1950s    Early AI programs, including Samuel's checkers program, Newell & Simon's Logic Theorist, Gelernter's Geometry Engine

1965     Robinson's complete algorithm for logical reasoning
1966-73  AI discovers computational complexity; neural network research almost disappears
1969-79  Early development of knowledge-based systems
1980     AI becomes an industry
1986     Neural networks return to popularity
1987     AI becomes a science
1995     The emergence of intelligent agents


Lecture 2 AI Problems and Techniques


Contents
1. AI Problems
2. AI Techniques

AI Problems
The general problem of simulating (or creating) intelligence has been broken down into a number of specific sub-problems. These consist of particular traits or capabilities that researchers would like an intelligent system to display. The traits described below have received the most attention.

Deduction, reasoning, problem solving

Early AI researchers developed algorithms that imitated the step-by-step reasoning that humans were often assumed to use when they solve puzzles, play board games or make logical deductions. For difficult problems, most of these algorithms can require enormous computational resources: most experience a "combinatorial explosion", where the amount of memory or computer time required becomes astronomical when the problem goes beyond a certain size. The search for more efficient problem-solving algorithms is a high priority for AI research.

Human beings solve most of their problems using fast, intuitive judgments rather than the conscious, step-by-step deduction that early AI research was able to model. AI has made some progress at imitating this kind of "sub-symbolic" problem solving: embodied agent approaches emphasize the importance of sensorimotor skills to higher reasoning; neural net research attempts to simulate the structures inside human and animal brains that give rise to this skill.

Knowledge representation

Knowledge representation and knowledge engineering are central to AI research. Many of the problems machines are expected to solve will require extensive knowledge about the world. Among the things that AI needs to represent are: objects, properties, categories and relations between objects; situations, events, states and time; causes and effects; knowledge about knowledge (what we know about what other people know); and many other, less well researched domains. A complete representation of "what exists" is an ontology (borrowing a word from traditional philosophy), of which the most general are called upper ontologies.
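As a deliberately toy illustration of representing objects, categories and relations, the following Python fragment (invented for this text, not part of the original notes) stores facts as subject-relation-object triples and answers a transitive category query:

```python
# Toy knowledge representation: facts as (subject, relation, object) triples.
# The names "tweety", "canary", "bird" are invented example data.
facts = {
    ("tweety", "isa", "canary"),
    ("canary", "isa", "bird"),
    ("bird", "can", "fly"),
}

def isa(x, y):
    """True if x belongs to category y, following isa links transitively."""
    if (x, "isa", y) in facts:
        return True
    return any(isa(z, y) for (a, r, z) in facts if a == x and r == "isa")

print(isa("tweety", "bird"))  # -> True, via canary
```

Even this tiny sketch already runs into the difficulties listed below: the fact that birds fly is only a default, as the qualification problem makes precise.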
Among the most difficult problems in knowledge representation are:


Default reasoning and the qualification problem

Many of the things people know take the form of "working assumptions." For example, if a bird comes up in conversation, people typically picture an animal that is fist sized, sings, and flies. None of these things are true about all birds. John McCarthy identified this problem in 1969 as the qualification problem: for any commonsense rule that AI researchers care to represent, there tend to be a huge number of exceptions. Almost nothing is simply true or false in the way that abstract logic requires. AI research has explored a number of solutions to this problem.

The breadth of commonsense knowledge

The number of atomic facts that the average person knows is astronomical. Research projects that attempt to build a complete knowledge base of commonsense knowledge (e.g., Cyc) require enormous amounts of laborious ontological engineering: they must be built, by hand, one complicated concept at a time. A major goal is to have the computer understand enough concepts to be able to learn by reading from sources like the internet, and thus be able to add to its own ontology.

The subsymbolic form of some commonsense knowledge

Much of what people know is not represented as "facts" or "statements" that they could express verbally. For example, a chess master will avoid a particular chess position because it "feels too exposed", or an art critic can take one look at a statue and instantly realize that it is a fake. These are intuitions or tendencies that are represented in the brain non-consciously and sub-symbolically. Knowledge like this informs, supports and provides a context for symbolic, conscious knowledge. As with the related problem of sub-symbolic reasoning, it is hoped that situated AI or computational intelligence will provide ways to represent this kind of knowledge.

Planning

Intelligent agents must be able to set goals and achieve them. They need a way to visualize the future (they must have a representation of the state of the world and be able to make predictions about how their actions will change it) and be able to make choices that maximize the utility (or "value") of the available choices. In classical planning problems, the agent can assume that it is the only thing acting on the world and it can be certain what the consequences of its actions may be. However, if this is not true, it must periodically check whether the world matches its predictions, and it must change its plan as this becomes necessary, requiring the agent to reason under uncertainty. Multi-agent planning uses the cooperation and competition of many agents to achieve a given goal. Emergent behavior such as this is used by evolutionary algorithms and swarm intelligence.
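Classical planning as described above can be sketched as search over states. The following Python fragment is an illustration only (the action names and facts are invented): states are sets of facts, each action has preconditions, an add list and a delete list, and breadth-first search returns a shortest action sequence that reaches the goal.

```python
from collections import deque

# Each action maps name -> (preconditions, facts added, facts deleted).
ACTIONS = {
    "boil_water": ({"have_water"}, {"hot_water"}, set()),
    "add_tea":    ({"hot_water", "have_tea"}, {"tea_ready"}, {"hot_water"}),
}

def plan(start, goal):
    """Breadth-first search from the start state to any state containing goal."""
    frontier = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:                     # every goal fact holds
            return steps
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:                  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None                               # no plan exists

print(plan({"have_water", "have_tea"}, {"tea_ready"}))
# -> ['boil_water', 'add_tea']
```

Note the classical-planning assumption baked in here: the agent is the only actor, and each action's consequences are certain.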

Learning

Machine learning has been central to AI research from the beginning. Unsupervised learning is the ability to find patterns in a stream of input. Supervised learning includes both classification and numerical regression. Classification is used to determine what category something belongs in, after seeing a number of examples of things from several categories. Regression takes a set of numerical input/output examples and attempts to discover a continuous function that would generate the outputs from the inputs. In reinforcement learning the agent is rewarded for good responses and punished for bad ones. These can be analyzed in terms of decision theory, using concepts like utility. The mathematical analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory.

Natural language processing

Natural language processing gives machines the ability to read and understand the languages that humans speak. Many researchers hope that a sufficiently powerful natural language processing system would be able to acquire knowledge on its own, by reading the existing text available over the internet. Some straightforward applications of natural language processing include information retrieval (or text mining) and machine translation.

Motion and manipulation

The field of robotics is closely related to AI. Intelligence is required for robots to be able to handle such tasks as object manipulation and navigation, with sub-problems of localization (knowing where you are), mapping (learning what is around you) and motion planning.

Perception

Machine perception is the ability to use input from sensors (such as cameras, microphones, sonar and others more exotic) to deduce aspects of the world. Computer vision is the ability to analyze visual input.
A few selected subproblems are speech recognition, facial recognition and object recognition.

Social intelligence

Emotion and social skills play two roles for an intelligent agent. First, it must be able to predict the actions of others, by understanding their motives and emotional states. (This involves elements of game theory and decision theory, as well as the ability to model human emotions and the perceptual skills to detect emotions.) Second, for good human-computer interaction, an intelligent machine also needs to display emotions. At the very least it must appear polite and sensitive to the humans it interacts with. At best, it should have normal emotions itself.
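The classification task described under Learning above can be made concrete with a deliberately tiny Python sketch (illustrative only; the data points and labels are invented): a 1-nearest-neighbour classifier assigns a new point the label of its closest training example.

```python
# Invented training data: two clusters, labelled "A" and "B".
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]

def classify(x):
    """Return the label of the training example nearest to x."""
    dist = lambda p: (p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2
    return min(train, key=lambda ex: dist(ex[0]))[1]

print(classify((1.1, 0.9)))  # near the "A" cluster -> A
print(classify((5.1, 4.9)))  # near the "B" cluster -> B
```

This is supervised learning at its simplest: the "training" consists of nothing more than storing labelled examples, and generalization comes from the distance metric.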


Creativity

A sub-field of AI addresses creativity both theoretically (from a philosophical and psychological perspective) and practically (via specific implementations of systems that generate outputs that can be considered creative, or systems that identify and assess creativity). A related area of computational research is artificial intuition and artificial imagination.

General intelligence

Most researchers hope that their work will eventually be incorporated into a machine with general intelligence (known as strong AI), combining all the skills above and exceeding human abilities at most or all of them. Many of the problems above are considered AI-complete: to solve one problem, you must solve them all. For example, even a straightforward, specific task like machine translation requires that the machine follow the author's argument (reason), know what is being talked about (knowledge), and faithfully reproduce the author's intention (social intelligence). Machine translation, therefore, is believed to be AI-complete: it may require strong AI to be done as well as humans can do it.

AI Techniques
Cybernetics and brain simulation

There is no consensus on how closely the brain should be simulated. In the 1940s and 1950s, a number of researchers explored the connection between neurology, information theory, and cybernetics. Some of them built machines that used electronic networks to exhibit rudimentary intelligence.

Symbolic

When access to digital computers became possible in the middle 1950s, AI research began to explore the possibility that human intelligence could be reduced to symbol manipulation. Symbolic approaches include:
o Cognitive simulation
o Logic based
o "Anti-logic" or "scruffy"
o Knowledge based

Computational intelligence

Interest in neural networks and "connectionism" was revived by David Rumelhart and others in the middle 1980s. These and other sub-symbolic approaches, such as fuzzy systems and evolutionary computation, are now studied collectively by the emerging discipline of computational intelligence.
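The kind of sub-symbolic unit this paragraph refers to can be illustrated with the classic perceptron (a Python sketch with invented parameters, not a description of any particular system): a single threshold unit trained by error correction until it computes logical AND.

```python
# Training data for logical AND: ((x1, x2), target).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]   # weights
b = 0.0          # bias
lr = 0.1         # learning rate (an arbitrary choice)

for _ in range(20):                      # 20 passes is ample for AND
    for (x1, x2), target in data:
        out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        err = target - out               # perceptron error-correction rule
        w[0] += lr * err * x1
        w[1] += lr * err * x2
        b += lr * err

print([1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for (x1, x2), _ in data])
# -> [0, 0, 0, 1]
```

The knowledge the trained unit holds lives entirely in the numbers w and b, with no symbolic rule anywhere: exactly the "sub-symbolic" character the text contrasts with the symbolic approaches above.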


Agent architectures and cognitive architectures

Researchers have designed systems to build intelligent systems out of interacting intelligent agents in a multi-agent system. A system with both symbolic and sub-symbolic components is a hybrid intelligent system, and the study of such systems is artificial intelligence systems integration. A hierarchical control system provides a bridge between sub-symbolic AI at its lowest, reactive levels and traditional symbolic AI at its highest levels, where relaxed time constraints permit planning and world modelling. Rodney Brooks' subsumption architecture was an early proposal for such a hierarchical system.


Lecture 3 and 4 AI Programming Languages: Introduction to LISP


Contents
1. Introduction to AI Languages
2. LISP

Introduction to AI Languages
Programming languages in artificial intelligence (AI) are the major tool for exploring and building computer programs that can be used to simulate intelligent processes such as learning, reasoning and understanding symbolic information in context. Although in the early days of computer language design the primary use of computers was for performing calculations with numbers, it soon became clear that strings of bits could represent not only numbers but also features of arbitrary objects. Operations on such features or symbols could be used to represent rules for creating, relating or manipulating symbols. This led to the notion of symbolic computation as an appropriate means for defining algorithms that process information of any type, and thus could be used for simulating human intelligence. It soon turned out that programming with symbols required a higher level of abstraction than was possible with those programming languages designed especially for number processing, e.g., Fortran.

In AI, the automation or programming of all aspects of human cognition is considered, from its foundations in cognitive science through approaches to symbolic and sub-symbolic AI, natural language processing, computer vision, and evolutionary or adaptive systems. It is inherent to this very complex problem domain that in the initial phase of programming, a specific AI problem can only be specified poorly. Only through interactive and incremental refinement does a more precise specification become possible. This is also due to the fact that typical AI problems tend to be very domain specific; therefore heuristic strategies have to be developed empirically through generate-and-test approaches (also known as rapid prototyping). In this way, AI programming notably differs from standard software engineering approaches, where programming usually starts from a detailed formal specification.
In AI programming, the implementation effort is actually part of the problem specification process. Due to the fuzzy nature of many AI problems, AI programming benefits considerably if the programming language frees the AI programmer from the constraints of too many technical constructions (e.g., low-level construction of new data types, manual allocation of memory). Rather, a declarative programming style is more convenient using built-in high-level data structures (e.g., lists or trees) and operations (e.g., pattern matching) so that symbolic computation is supported on a much more abstract level than would be possible with standard imperative languages, such as Fortran, Pascal or C. Of course, this sort of abstraction does not come for free, since compilation of AI programs on standard von Neumann computers cannot be done as efficiently as for imperative languages. However, once a certain AI problem is understood (at least


partially), it is possible to reformulate it in the form of detailed specifications as the basis for reimplementation in an imperative language.

From the requirements of symbolic computation and AI programming, two new basic programming paradigms emerged as alternatives to the imperative style: the functional and the logical programming style. Both are based on mathematical formalisms, namely recursive function theory and formal logic. The first practical and still most widely used AI programming language is the functional language Lisp, developed by John McCarthy in the late 1950s. Lisp is based on mathematical function theory and the lambda abstraction. A number of important and influential AI applications have been written in Lisp, so we will describe this programming language in some detail in this article. During the early 1970s, a new programming paradigm appeared, namely logic programming on the basis of predicate calculus. The most important logic programming language is Prolog, developed by Alain Colmerauer, Robert Kowalski and Philippe Roussel. Problems in Prolog are stated as facts, axioms and logical rules for deducing new facts. Prolog is mathematically founded on predicate calculus and the theoretical results obtained in the area of automatic theorem proving in the late 1960s.

LISP: LIST Processing language


LISP is an AI language developed in 1958 by J. McCarthy at MIT. Its special focus is on symbolic processing and symbol manipulation:
o Linked list structures
o Programs and functions are also represented as lists

At one point, special LISP computers with basic LISP functions implemented directly in hardware were available (Symbolics Inc., 1980s).

LISP today:
o Many AI programs are now written in C, C++ or Java
o List manipulation libraries are available
o Competitors such as Prolog and Python exist, but LISP keeps its dominance among high-level (AI) programming languages

Current LISP:
o Common Lisp and Scheme are the most widely known general-purpose Lisp dialects
o Common Lisp: interpreter and compiler
o CLOS: object-oriented programming

Syntax: prefix notation. The operator comes first, the arguments follow; e.g. (+ 3 2) adds 3 and 2. Many parentheses are used, and they define lists as well as programs.

Examples:

(a b c d) is a list of 4 elements (atoms) a, b, c, d

(defun factorial (num)
  (cond ((<= num 0) 1)
        (t (* (factorial (- num 1)) num))))

Basic data types:

Symbols: a john 34
Lists: () (a) (a john 34) (lambda (arg) (* arg arg))

For each symbol Lisp attempts to find its value:

> (setq a 10) ;; sets the value of symbol a to 10
10
> a ;; returns the value of a
10

Special symbols:

> t ;; true
T
> nil ;; nil stands for false
NIL
> ( ) ;; an empty list
NIL

Lists represent function calls as well as basic data structures:

> (factorial 3)
6
> (+ 2 4)
6
> (setq a '(john peter 34)) ;; quote means: do not eval the argument
(john peter 34)
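For readers coming from other languages, here is a rough Python transcription of the factorial definition above (an illustration only): each cond clause becomes an if/else branch.

```python
def factorial(num):
    # (cond ((<= num 0) 1) ...) -> the base case
    if num <= 0:
        return 1
    # (t (* (factorial (- num 1)) num)) -> the recursive case
    return factorial(num - 1) * num

print(factorial(5))  # -> 120
```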

> (setq a '((john 1) (peter 2)))
((john 1) (peter 2))

List representation: a singly linked list.

> (setq a '(john peter))
(john peter)
> (car a)
john
> (cdr a)
(peter)

List-building functions:

> (cons 'b nil) ;; quote means: do not eval the argument
(b)
> (setq a (cons 'b (cons 'c nil))) ;; (setq a ...) is shorthand for (set 'a ...)
(b c)
> (setq v (list 'john 34 25))
(john 34 25)
> (setq v (list a 34 25))
((b c) 34 25)
> (append '(1 2) '(2 3))
(1 2 2 3)

List copying:

> (setq foo (list 'a 'b 'c))
(a b c)
> (setq bar (cons 'x (cdr foo)))
(x b c)
> foo
(a b c) ;; cons built a new first cell; foo itself is untouched
> bar
(x b c)

Car and cdr operations are nondestructive.

> (setq bar '(a b c))
(a b c)
> (setq foo (cdr bar))
(b c)
> (rplaca foo 'u) ;; replaces the car component of foo (destructive op)
(u c)
> foo
(u c)
> bar
(a u c) ;; foo shares structure with bar, so bar changes too

> (rplacd foo '(v)) ;; replaces the cdr component of foo (destructive)
(u v)
> bar
(a u v)

The same effect as with rplaca and rplacd can be achieved with setf:

> (setq bar '(a b c))
(a b c)
> (setq foo (cdr bar))
(b c)
> (setf (cadr bar) 'u)
u
> bar
(a u c)
> foo
(u c)

Evaluation rules: a symbol's value is sought and substituted; a quoted value is kept untouched.

> (setq a 12)
12
> (setq b (+ a 4))
16
> (setq b '(+ a 4))
(+ a 4)
> (eval b) ;; explicit evaluation call
16

Some useful functions and predicates:

> (setq a '(1 2 3 4 5))
(1 2 3 4 5)
> (length a) ;; gives the length of the argument list
5
> (atom 'a) ;; checks if the argument is an atom
T
> (atom a)
NIL
> (listp 'a) ;; checks if the argument is a list
NIL
> (listp a)
T

Definition of a function

(defun <f-name> <parameter-list> <body>)

> (defun square (x) (* x x))
SQUARE
> (square 2)
4
> (square (square 2))
16

<body> can be a sequence of function calls; the function returns the value of the last call in the sequence.

> (defun foo (a)
    (setq b (+ a 1))
    (setq c (+ a 2))
    c)
FOO
> (foo 2)
4

Cond statement: it sequentially tests conditions; the call associated with the first true condition is executed.

> (defun abs (a)
    (cond ((> a 0) a)
          (t (- a))))
ABS
> (abs 2)
2
> (abs -3)
3

If statement: (if <test> <then> <else>)

> (defun abs (a) (if (> a 0) a (- a)))
ABS
> (abs 2)
2
> (abs -3)
3

Four equality predicates: =, equal, eq, eql

> (= 2 4/2) ;; used for numerical values only
T

> (setf a '(1 2 3 4))
(1 2 3 4)
> (setf b '(1 2 3 4))
(1 2 3 4)
> (setf c b)
(1 2 3 4)
> (equal a b) ;; equal is true if the two objects are isomorphic
T
> (equal c b)
T
> (eq a b) ;; eq is true if the two arguments point to the same object
NIL
> (eq b c)
T
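For readers more familiar with Python, the distinction between equal and eq maps roughly onto == versus is (an analogy, not an exact correspondence): == compares structure, while is compares object identity.

```python
a = [1, 2, 3, 4]
b = [1, 2, 3, 4]   # same structure as a, but a distinct object
c = b              # another name for the same object as b

print(a == b, a is b)  # -> True False : isomorphic, like (equal a b)
print(c == b, c is b)  # -> True True  : same object, like (eq b c)
```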


Lecture 4, 5 and 6 PROLOG


Contents
1. Introduction
2. PROLOG Program
3. Relations versus Functions
4. Facts and Rules
5. Goals
6. More on Facts
7. More on Rules
8. Unification
9. Lists
10. Tree
11. Order in the Rules
12. Variables in Terms
13. Backtracking

Introduction
Prolog was invented in the early seventies at the University of Marseille. Prolog stands for PROgramming in LOGic. It is a logic language particularly suited to programs that manipulate non-numeric objects. For this reason it is a frequently used language in artificial intelligence, where manipulation of symbols is a common task. Prolog differs from the most common programming languages because it is a declarative language. Traditional programming languages are said to be procedural: the programmer specifies how to solve a problem. In declarative languages the programmer only states the problem, and the language itself finds how to solve it. Although it can be, and often is, used by itself, Prolog complements traditional languages when used together with them in the same application.

Prolog program
Programming in Prolog is very different from programming in a traditional procedural language like Pascal. In Prolog you don't say how the program will work. A Prolog application can be separated into two parts:

The Program
The program, sometimes called the database, is a text file (*.pl) that contains the facts and rules that will be used by the user of the program. It contains all the relations that make up the program.

The Query
When you launch a program you are in query mode. This mode is indicated by the sign ?- at the beginning of the line. In query mode you ask questions about the relations described in the program.

Loading a program
First you have to launch your Prolog compiler; for this report we used SWI-Prolog, which is freeware. When Prolog is launched, the ?- prompt should appear, meaning you are in query mode. The way to load a program depends on your compiler. In SWI-Prolog you can load a program by typing the command [file]. when the file of your program is file.pl. If your compiler is not SWI-Prolog you can also try the command reconsult(file). When you have done this you can use all the facts and rules that are contained in the program. Now let's begin to see what a fact is.

Relations versus Functions


In logic programming we talk about relations rather than functions. Consider the append function in ML:

- fun append (nil, x) = x
    | append (hd::tl, x) = hd :: append (tl, x);
- append ([1, 2], [3, 4]);
val it = [1, 2, 3, 4] : int list

But what if we want to find the result to the following query?

- append ([1, 2], x) = [1, 2, 3, 4]

The desired result would be [3, 4]; we want the system to fill in the missing information. ML won't know that we want to find the value of x in this case. We would have to write another function that dealt with this particular query. Another way to look at this concept involves defining append as a relation(X, Y, Z) over triples of lists:

([],     [1],   [1])
([1],    [],    [1])
([1, 2], [3],   [1, 2, 3])
. . .

where the relation is defined on an infinite set of tuples.


We could then ask a number of questions to see if the desired result exists:

append([], [], [])?            yes
append([], [1], [1])?          yes
append([1, 2], [3], [1, 2, 3])?   yes

What if we ask the following?

append(X, [1], [1])?
append([1], X, [1])?
append([1], [], X)?

In the first case, X is an output where Y and Z are inputs. In the second case, Y is an output where X and Z are inputs. In the third case, Z is the output while X and Y are inputs. These all ask a similar question: is there an X such that the given pattern is in the relation append? The output would be a possible result for X, if one existed. Subsequent queries would return other possible answers until all are exhausted.

The key to logic programming is that you do not need to specify exactly how a result is computed. You simply need to describe the form of the result. It is assumed that the computer system can somehow determine how the result is computed. With this type of language, it is necessary to provide both the relevant information and a method of inference for computing the results. Arguments and results are treated uniformly: there is no distinction between input and output. It is also important to note that Prolog is NOT a typed language.
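The relational reading of append can be mimicked in an ordinary language by enumerating solutions rather than computing a single output. As a rough illustration (a Python sketch, not part of the original notes), the query append(X, Y, Z) with Z known corresponds to generating every split of the list:

```python
def append_solutions(z):
    """Enumerate all pairs (x, y) such that x + y == z,
    mimicking the Prolog query append(X, Y, Z) with Z known."""
    for i in range(len(z) + 1):
        yield z[:i], z[i:]

for x, y in append_solutions(['a', 'b', 'c', 'd']):
    print(x, y)
```

Each pair printed is one answer Prolog would return on successive semi-colons, from X = [] up to X = the whole list.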

Facts and Rules


Logic programming describes information in terms of facts and rules.

Fact
A fact is analogous to a base case in a proof. It is a piece of information that is known and is a special case of a rule. A fact holds without any conditions, i.e. facts are propositions that are assumed to be true. Facts are also known as unconditional Horn clauses. The general case of a Horn clause is:

P if Q1 and Q2 . . . and Qn


where a fact is simply P with no conditional statements Q1, Q2, . . . . Note that Horn clauses CANNOT represent negative information. That is, you cannot ask if a tuple is NOT in a relation.

Rule
A rule describes a series of conditions followed by a conclusion returned if the conditions are met. A rule is analogous to an inductive step in a proof. Rules are also known as conditional Horn clauses. Some Prolog examples:

Fact: append([], X, X).
This says that if you append [] and something, you will obtain that something.

Rule: append([a, b], [c, d], [a, b, c, d]) if append([b], [c, d], [b, c, d]).

In Prolog, capital letters are variables and lower case letters are atoms. A "." (period) is used to end a statement. Another notation convention is described by:

[a, b] = [a | [b]]

where the | operator is equivalent to cons (i.e. it separates the head and tail of a list). The above rule says: if the condition is met, return the appended result of the given lists.

Goals
A goal is simply a question we want answered by the system based on the knowledge in the database of information. Creating facts and rules builds our database of information. We can then state goals and get results back based on the database.

append([], X, X).                                   <----- fact
append([H | X1], Y, [H | Z]) :- append(X1, Y, Z).   <----- rule

This is our database of knowledge. Now we can state goals to see if the information resides in the relation(s) created. Note that the ":-" operator means "if" and tells us we are creating a rule. (Thus, P :- Q means P if Q.)

?- append([a, b], [c, d], [a, b, c, d]).
yes

This goal was found in the relation.


?- append([a, b], [c, d], Z).
Z = [a, b, c, d] ?
yes

We query the system and it returns the first result it comes across in the relation that satisfies the value of Z. The question mark after the result is the system asking if we want to search for more answers to our goal. Pressing enter means that we don't want to search for more answers. The system will respond "yes" or "no", depending on whether more answers exist. Similar to the above query:

?- append(X, [c, d], [a, b, c, d]).
X = [a, b] ?
yes

?- append([a, b], Y, [a, b, c, d]).
Y = [c, d] ?
yes

After obtaining an answer to a goal, if the user responds to the "?" prompt with a semi-colon, the next answer will be returned if any other answer exists. A "no" will be returned if all possibilities have been exhausted.

A fact and a rule:

append([], X, X).
append([H | X], Y, [H | Z]) :- append(X, Y, Z).

Goals:

| ?- append([a], [b, c, d], X).
X = [a, b, c, d] ? ;   <--------- ";" means show more answers
no                     <--------- the system says there are no more answers

| ?- append(X, Y, [a, b, c, d]).
X = [],
Y = [a, b, c, d] ? ;   <--------- show more answers
X = [a],
Y = [b, c, d] ? ;
X = [a, b],
Y = [c, d] ? ;
X = [a, b, c],
Y = [d] ? ;

X = [a, b, c, d],
Y = [] ? ;
no                     <--------- the system says there are no more answers

Another example:

append([], X, X).
append([H | X], Y, [H | Z]) :- append(X, Y, Z).

| ?- append(X, X, [b, c, b, c]).
X = [b, c] ? ;
no

This is pretty straightforward. But what happens when we ask this:

| ?- append([b, c], X, X).
X = [b, c, b, c, b, c, b, c, b, c, b, c, b, c, b . . .

We get an infinite list. Relations can be defined on other relations. For instance, the prefix relation is defined:

prefix(X, Z) :- append(X, Y, Z).

And the suffix relation can be defined:

suffix(X, Z) :- append(Y, X, Z).

More on facts
Prolog is interactive like ML and Scheme. You can also create a file of facts and load them from the command line. Assume that the following facts are in a file called facts.

father(john, mary).
father(sam, john).
father(sam, kathy).

Now, we start Prolog, load the file, and state some goals/queries:

sicstus                  <-------- command to start Prolog
SICStus 2.1 #8: Wed Apr 28 18:33:10 PDT 1993
| ?- [facts].            <-------- load the file: facts
| ?- father(john, mary).
yes

Our interpretation of the father relation father(X, Y) is that X is the father of Y. So, when we query the system whether john is the father of mary, the answer we get back is yes. Indeed, that fact is in our information database because of the file we loaded. Now, we want to know if sam has any children:

| ?- father(sam, X).
X = john ? ;             <-------- are there any more?
X = kathy ? ;
no

Sam has two children: john and kathy. Who is the father of mary?

| ?- father(X, mary).
X = john ? ;
no

Given the same fact file from above, how are these goals evaluated:

| ?- father(X, john), father(X, kathy).
X = sam ? ;
no

| ?- father(X, john), father(X, Y).
X = sam,
Y = john ? ;
X = sam,
Y = kathy ? ;
no

The goals are stated in terms of two sub-goals which are "and"-ed together. The system sequentially searches the database of information from the top down. One pointer for each sub-goal traces through the database until matches that satisfy both sub-goals are reached.

In tracing the second example, we start out by trying to satisfy the first sub-goal father(X, john). Starting at the top of the database, we look for a pattern that matches the sub-goal. The second item in the database, father(sam, john), matches. Thus, X = sam now. Next, we try to satisfy the second sub-goal father(X, Y), which is actually father(sam, Y) now. We again begin tracing through the database looking for matches. The first match encountered is father(sam, john). At this point, the entire goal has been satisfied and the result X = sam, Y = john is returned.

We enter a semi-colon which tells the system we want to look for another possible solution that satisfies our goal. The system continues tracing through the file attempting to find a match to the goal father(sam, Y) . Another match is found at father(sam, kathy) . The goal has been satisfied again and the result X = sam, Y = kathy is returned. When prompted for another possibility, the system continues searching for a match to the second sub-goal. There is no more information left in the database at this point. However, the pointer to the first sub-goal is still at father(sam, john) . We haven't exhausted all possibilities for the first sub-goal yet. Therefore, we continue searching through the database looking for a match to the goal father(X, john) . None is found, so we are done. "No" is returned. If another match had been found for the first sub-goal, we would look at the second sub-goal again starting from the beginning of the database.

Following from the same example above, what happens if we do the following:

| ?- father(X, john), father(X, Y), Y \== john.
X = sam,
Y = kathy ? ;
no

The operator \== tells the system to ignore any matches where Y equals john. Thus, we eliminate the first solution we had in the previous example.

Another example:

father(john, mary).
father(sam, john).
father(sam, kathy).

| ?- father(X, Y), father(X, Z).
X = john,
Y = mary,
Z = mary ? ;
X = sam,
Y = john,
Z = john ? ;
X = sam,

Y = john,
Z = kathy ? ;
X = sam,
Y = kathy,
Z = kathy ? ;
no

More on Rules
A rule is defined in the following way:

<term> :- <term1>, <term2>, . . . , <termn>
  ^               ^
 HEAD         CONDITIONS

The head is the conclusion, and the conditions are considered to be "and"-ed together. Note: a fact is just a special rule with no conditions.

Example:

father(john, mary).     <------ facts
father(sam, john).
father(sam, kathy).

grandpa(X, Y) :- father(X, Z), father(Z, Y).     <------ rule

| ?- grandpa(X, Y).
X = sam,
Y = mary ? ;
no

Without going into too much detail, the first condition of the rule will essentially match on anything in the given database of information. The important step is the second condition, as we try to satisfy the goal. We essentially are looking for a match where one person is both a father and a child (variable Z) with respect to two other people. These two other people are then returned as the grandpa and grandchild. For each value of Z we encounter in the first condition, we must then match that Z value to the other position in the father relation for the goal to be satisfied.

Unification
Two terms unify if they have a common instance (term) U between them. Deduction in Prolog is based on unification.


Unification is similar to pattern matching. However, there is a difference because pattern matching can only happen one way, from left to right. Unification can match both ways; it depends on where the variables and the atoms are.
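The two-way matching described here can be sketched in a few lines of ordinary code. Below is a Python illustration (not part of the original notes; the term representation is an assumption made for the sketch): compound terms are tuples such as ('+', 2, 3), capitalized strings are variables, and unification either builds up a substitution or fails.

```python
def is_var(t):
    """Prolog-style convention: names starting with a capital are variables."""
    return isinstance(t, str) and t[:1].isupper()

def unify(t1, t2, subst=None):
    """Tiny unification sketch: compound terms are tuples like ('+', 2, 3),
    atoms are lowercase strings or numbers, variables are capitalized strings."""
    if subst is None:
        subst = {}
    if is_var(t1) and t1 in subst:      # one-step dereference (enough here)
        t1 = subst[t1]
    if is_var(t2) and t2 in subst:
        t2 = subst[t2]
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None   # e.g. 5 = 2 + 3 fails: a number cannot match a compound term

print(unify(('+', 2, 3), ('+', 2, 'Y')))        # {'Y': 3}
print(unify(5, ('+', 2, 3)))                    # None
print(unify(('f', 'X', 'b'), ('f', 'a', 'Y')))  # {'X': 'a', 'Y': 'b'}
```

Note that, unlike "is", nothing is evaluated: ('+', 2, 3) matches 5 in no way, just as the goal 5 = 2 + 3 fails in Prolog.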

| ?- X is 2+3.
X = 5 ?

X is instantiated to the value 5. The expression 2+3 is evaluated because we are using the operator "is", which tells us to evaluate 2+3 and make X that value. To unify, we use the operator =.

| ?- X = 2 + 3.
X = 2+3 ?

Here X is unified with the term 2+3 itself (a tree with + at the root and 2 and 3 as leaves); the expression is not evaluated.

| ?- 5 = 2 + 3.
no

This goal is NOT unifiable: the number 5 cannot match the compound term 2 + 3.

| ?- 2 + 3 = 2 + Y.
Y = 3 ?

This goal is unifiable. We are increasing the information of Y to equal 3.

| ?- f(X, b) = f(a, Y).
X = a,
Y = b ?

This goal is unifiable because we can match X to a and Y to b.

Another example:

| ?- 2*X = Y*(3+Y).
X = 3 + 2,
Y = 2 ?

The Y on the left is unified with the 2 on the right. The X on the left is then unified with (3 + Y) on the right, which is actually (3 + 2) after the unification of Y with 2.

Infinite loop examples:

| ?- X = X + 2.
X =
Prolog interruption (h for help)? a
{Execution aborted}

What happened? The X on the left is infinitely matched to X + 2 on the right.

| ?- X = 2 + X.
X = 2+(2+(2+(2+(2+(2+(2+(2+(2+(2+(2+(2+(2+(2+(
Prolog interruption (h for help)? a
{Execution aborted}

What happened? The same thing happened as above, except the system knew what to do with each instance of 2+( that it found. The problem was that the unification for X still went on infinitely.

Other unification examples:

| ?- X is 2 + 3, X = 5.
X = 5 ?

| ?- X is 2 + 3, X = 2 + 3.
no

| ?- 2 + 3 is X.
ERROR.

| ?- X is 5, 6 is X + 1.
X = 5 ?

| ?- X = 5, 6 = X + 1.
no

| ?- 6 is X + 1, X = 5.
ERROR.

| ?- Y = 2, 2*X is Y*(Y+3).
no

| ?- Y = 2, Z is Y*(Y+3).
Y = 2,
Z = 10 ?

Lists
Notation: [a, b, c] = [a, b, c | []] = [a | [b, c]] This simply shows how lists are constructed with a head and a tail.

Here, we try to unify two lists:

| ?- [H | T] = [a, b, c].
H = a,
T = [b, c] ?

| ?- [a | T] = [H, b, c].
H = a,
T = [b, c] ?

The heads and tails are matched and the values are output.

Trees
Here is a definition for a binary search tree in ML:

- datatype bintree = empty | node of int * bintree * bintree;
- fun member (k, empty) = false
    | member (k, node(n, s, t)) =
        if k < n then member (k, s)
        else if k > n then member (k, t)
        else true;

How might you define a binary tree in Prolog?

member(K, node(K, _, _)).                          <------ fact
member(K, node(N, S, _)) :- K < N, member(K, S).   <------ rules
member(K, node(N, _, T)) :- K > N, member(K, T).

Note: the underscore "_" is used to match anything. The fact succeeds when the key is found at the current node. The first rule only worries about one side of the tree and searches that side if the value we are searching for is less than the current node. Similarly, the second rule only worries about the other side of the tree and searches that side if the value we are searching for is greater than the current node. This is a nice use of this language. It isn't even necessary to declare a special datatype to handle the tree; you simply write facts and rules to handle the data in the way you want to traverse the trees. The rest is handled by the system through deduction.

More Examples

Starting with the following fact and rule:


member(M, [M | _]).
member(M, [_ | T]) :- member(M, T).

The fact checks whether the item is the first element of the list. The rule checks to see if the item M is in the rest of the list, i.e. T. What does the following return?

| ?- member(a, [a]).
yes

This matches the fact. What about this:

| ?- member(a, [b]).
no

This fails at the fact because the first item in the list is b, not a. The condition in the rule also fails because the tail of the list [b] is the empty list, which can satisfy neither the fact nor the rule. Thus, there is no match in the relation.

Order in the rules


The order of rules and goals is important to the end result. Consider the following examples:

overlap(X, Y) :- member(M, X), member(M, Y).

| ?- overlap(Z, [a, b, c, d]), member(Z, [1, 2, c, d]).
(infinite computation)

| ?- X = [1, 2, 3], member(a, X).
no

| ?- member(a, X), X = [1, 2, 3].
(infinite computation)

The most interesting thing to note here is the difference between the answers to the second and third queries. The second query simply answers "no", but the third query never returns. Why does this happen? The best way to examine what happens is to look at the associated relation tree, which shows the structure of the possible solutions:

                     member(a, X)
                    /            \
one      ---> X = [a | _]     X = [_ | T]    <--- another possibility
possibility                   ? member(a, T)
                             /              \
                    T = [a | _]          T = [_ | T']
                    X = [_, a, _]        ? member(a, T')
                                        /               \
                               T' = [a | _]          T' = [_ | T'']
                               X = [_, _, a, _]      ? member(a, T'')
                                                          . . .

When we know X, as in the second query, we can compare a to each element in the list. No matches occur, so "no" is returned. When we don't know X, as in the third query, the system keeps looking for elements in X to compare to a . . . it never finds any, but it keeps searching indefinitely. In essence, it constructs the equivalent of a search tree and traverses it until it finds something that satisfies the goal. Notice how the order of the expressions in the goal made a difference in how the result was deduced.

There is a design issue when considering the ordering of facts and rules. Traversing relation trees could be implemented either in a depth-first manner or a breadth-first manner. Prolog uses depth-first "traversal" because each sub-goal must be satisfied before any subsequent sub-goals are satisfied. The ideal method would use breadth-first "traversal" where each sub-goal is examined in parallel. However, breadth-first requires a large amount of memory and resources. This is why depth-first was used in Prolog. Unfortunately, this method does have its drawbacks, as demonstrated by the following:

f(X) :- f(X).
f(1).

| ?- f(1).
(infinite computation)

       ? f(1)
       /     \
    f(1)     yes     <------ two possibilities
    /   \
  f(1)  yes
  /  \
 .    .

You might expect this goal to simply succeed. In fact, if this were true logic, it SHOULD succeed. However, because Prolog is implemented in a depth-first manner, we never get to the second possibility on any level; we just keep expanding the recursive branch. Thus, it traverses the tree forever.

Variables in terms


Variables in terms allow you to increase information using unification.

| ?- L = [a, b | X].
L = [a, b | X] ? ;
no

| ?- L = [a, b | X], X = [C, Y].
L = [a, b, C, Y],
X = [C, Y] ? ;
no

Notice how the information about L is increased by unifying X with the list [C, Y]. Unification of a variable representing the end of the list is similar to an assignment to that variable.

Backtracking
Backtracking simply refers to reconsidering sub-goals that were previously proven/satisfied. A new solution is found by beginning the search where the previous search for that sub-goal stopped. Considering the possibility tree, backtracking refers to retracing your steps, following branches back up to previously untraversed branches. A similar example to the one in the previous section:

f(X) :- fail.
f(1).

     ? f(1)
     /     \
  fail     yes

Steps followed:
1. Trace from f(1) to fail.
2. Trace back from fail to f(1). This is the backtracking step.
3. Trace from f(1) to yes.


Lecture 8, 9 and 10 Problem Spaces and Searches


Contents
1. Formal Description of a Problem
2. The Monkey & Bananas Problem
3. Missionaries and Cannibals
4. Graphs/Trees
5. 8 Puzzle Problem
6. Search
7. Consequences of Search
8. Forward vs Backward Search
9. Depth-First Search
10. Breadth-First Search

Formal Description of a Problem


In AI, we will formally define a problem as:
o a space of all possible configurations, where each configuration is called a state; thus, we use the term state space
o an initial state
o one or more goal states
o a set of rules/operators which move the problem from one state to the next

In some cases, we may enumerate all possible states (see the monkey & bananas problem below), but usually such an enumeration would be overwhelmingly large, so we only generate the portion of the state space we are currently examining.
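The four ingredients above map directly onto a small data structure. As an illustration (a Python sketch with invented names, not part of the original notes), a problem can be packaged as an initial state, a goal test, and a successor function:

```python
from collections import namedtuple

# A generic problem definition: initial state, goal test, and
# a successor function returning (action, next_state) pairs.
Problem = namedtuple("Problem", ["initial", "is_goal", "successors"])

# Toy example: count from 0 up to 3 using a single "add 1" operator.
counting = Problem(
    initial=0,
    is_goal=lambda s: s == 3,
    successors=lambda s: [("add1", s + 1)] if s < 3 else [],
)

state = counting.initial
plan = []
while not counting.is_goal(state):
    action, state = counting.successors(state)[0]
    plan.append(action)
print(plan)  # ['add1', 'add1', 'add1']
```

Solving a problem then means finding a sequence of operators (the plan) leading from the initial state to a goal state.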

The Monkey & Bananas Problem


A monkey is in a cage and bananas are suspended from the ceiling; the monkey wants to eat a banana but cannot reach them. In the room are a chair and a stick. If the monkey stands on the chair and waves the stick, he can knock a banana down to eat. What are the actions the monkey should take?


Initial state: monkey on ground with empty hand, bananas suspended
Goal state: monkey eating
Actions:
o climb chair / get off chair
o grab X
o wave X
o eat X

Missionaries and Cannibals


3 missionaries and 3 cannibals are on one side of the river, with a boat that can take exactly 2 people across the river:
o How can we move the 3 missionaries and 3 cannibals across the river,
o with the constraint that the cannibals never outnumber the missionaries on either side of the river (lest the cannibals start eating the missionaries!)?

We can represent a state as a 6-item tuple:

(a, b, c, d, e, f)
a/b = number of missionaries/cannibals on the left shore
c/d = number of missionaries/cannibals in the boat

e/f = number of missionaries/cannibals on the right shore

where a + b + c + d + e + f = 6 and a >= b unless a = 0, c >= d unless c = 0, and e >= f unless e = 0.

Legal operations (moves) are:
o 0, 1, 2 missionaries get into the boat
o 0, 1, 2 missionaries get out of the boat
o 0, 1, 2 cannibals get into the boat
o 0, 1, 2 cannibals get out of the boat
o boat sails from the left shore to the right shore
o boat sails from the right shore to the left shore

Drawing the state space will be left as a homework problem.
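The safety constraint above is easy to express as a predicate on the 6-tuple. As a rough sketch (Python; the function name is invented here), a state is legal when in each location the missionaries are not outnumbered, unless there are no missionaries there:

```python
def is_safe(state):
    """state = (a, b, c, d, e, f): missionaries/cannibals on the
    left shore, in the boat, and on the right shore."""
    a, b, c, d, e, f = state
    if a + b + c + d + e + f != 6:
        return False
    # In each location, missionaries must not be outnumbered
    # unless there are no missionaries there at all.
    return all(m >= c_ or m == 0 for m, c_ in ((a, b), (c, d), (e, f)))

print(is_safe((3, 3, 0, 0, 0, 0)))  # initial state: True
print(is_safe((1, 2, 0, 0, 2, 1)))  # missionaries outnumbered on the left: False
```

A search over the state space would generate successor states with the moves listed above and discard any state failing this test.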

Graphs/Trees
We often visualize a state space (or a search space) as a graph. A tree is a special form of graph where every node has 1 parent and 0 to many children; in a general graph, no parent/child relationship is implied. Some problems will use trees, others can use graphs. Here is an example of representing a situation as a graph. Consider the city of Konigsberg, where there are 2 shores, 2 islands and 7 bridges; a graph with one node per landmass and one edge per bridge shows the connectivity. The question asked in this problem was: is there a single path that takes you to both shores and islands and crosses every bridge exactly once?


By representing the problem as a graph, it is easier to solve. The answer, by the way, is no: the graph has four nodes whose degree is an odd number, and the problem, finding an Euler path, is only solvable if a graph has exactly 0 or 2 nodes whose degrees are odd.
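The odd-degree criterion can be checked mechanically. Below is a small Python sketch (not part of the original notes; the node names are invented labels for the two shores and two islands) that counts odd-degree vertices of the Konigsberg multigraph:

```python
def has_euler_path(edges):
    """edges: list of (u, v) pairs; returns True when the multigraph
    has 0 or 2 odd-degree nodes (the Euler-path condition).
    Connectivity is assumed and not checked in this sketch."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    return odd in (0, 2)

# The seven bridges of Konigsberg: shores A, B and islands C, D.
konigsberg = [("A", "C"), ("A", "C"), ("A", "D"),
              ("B", "C"), ("B", "C"), ("B", "D"),
              ("C", "D")]
print(has_euler_path(konigsberg))  # all four nodes have odd degree: False
```

All four landmasses have odd degree (3, 3, 5, 3), so no single walk can cross every bridge exactly once.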

8 Puzzle Problem

The 8 puzzle search space consists of 9!/2 reachable states (181440).

Problem Characteristics
o Is the problem decomposable?
  o If yes, the problem becomes simpler to solve, because each lesser problem can be tackled and the solutions combined together at the end.
o Can solution steps be undone or ignored?
  o A game, for instance, often does not allow steps to be undone (can you take back a chess move?).
o Is the problem's universe predictable?
  o Will applying the action result in the state we expect? For instance, in the monkey and bananas problem, waving the stick on a chair does not guarantee that a banana will fall to the ground!
o Is a good solution absolute or relative?
  o For instance, do we care how many steps it took to get there?
o Is the desired solution a state or a path?
  o Is the problem solved by knowing the steps, or by reaching the goal?
o Is a large amount of knowledge absolutely required?
o Is problem solving interactive?

Search
Given a problem expressed as a state space (whether explicitly or implicitly) with operators/actions, an initial state and a goal state, how do we find the sequence of operators needed to solve the problem? This requires search. Formally, we define a search space as [N, A, S, GD] where:

N = set of nodes or states of a graph
A = set of arcs (edges) between nodes that correspond to the steps in the problem (the legal actions or operators)
S = a nonempty subset of N that represents start states
GD = a nonempty subset of N that represents goal states

Our problem becomes one of traversing the graph from a node in S to a node in GD. We can use any of the numerous graph traversal techniques for this, but in general they divide into two categories:
o brute force: unguided search
o heuristic: guided search

Consequences of Search
The 8-puzzle has over 180,000 reachable states. What about the 15-puzzle? A brute force search means trying all possible states blindly until you find the solution; for a problem requiring n moves where each move has m choices, there are on the order of m^n possible states. Two forms of brute force search are depth-first search and breadth-first search.

A guided search examines a state and uses some heuristic (usually a function) to determine how good that state is (how close you might be to a solution) to help determine what state to move to:
o hill climbing
o best-first search
o A/A* algorithm
o Min-Max

While a good heuristic can reduce the complexity to something tractable, there is no guarantee, so any form of search remains exponential in the worst case.

Forward vs Backward Search


The common form of reasoning starts with data and leads to conclusions. For instance, diagnosis is data-driven: given the patient's symptoms, we work toward disease hypotheses. We often think of this form of reasoning as forward chaining through rules. Backward search reasons from goals to actions: planning and design are often goal-driven, i.e. backward chaining.

Depth-first Search

Starting at node A, our search gives us: A, B, E, K, S, L, T, F, M, C, G, N, H, O, P, U, D, I, Q, J, R
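The tree figure this traversal refers to is not reproduced here, but depth-first search itself is short to state in code. A hedged Python sketch (the example graph and node names are invented for illustration, not the figure's):

```python
def depth_first_search(graph, start, goal):
    """Iterative depth-first search; returns a path from start
    to goal, or None. graph maps a node to its children."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Push children in reverse so the leftmost child is explored first.
        for child in reversed(graph.get(node, [])):
            stack.append(path + [child])
    return None

graph = {"A": ["B", "C", "D"], "B": ["E", "F"], "C": ["G"],
         "E": ["K", "L"], "G": ["N"]}
print(depth_first_search(graph, "A", "N"))  # ['A', 'C', 'G', 'N']
```

Note how the search plunges down one branch completely (A, B, E, K, ...) before backing up to try siblings, exactly as in the listed node order.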


Example:

Traveling Salesman Problem


Breadth-First Search

Starting at node A, our search would generate the nodes in alphabetical order from A to U. Example
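Breadth-first search differs only in expanding the shallowest node next: with a queue in place of the depth-first stack, nodes come out level by level, which is why the generation order is alphabetical in the figure. A hedged Python sketch (example graph invented for illustration):

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """Breadth-first search; returns a shortest path (fewest arcs)
    from start to goal, or None. graph maps a node to its children."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()          # FIFO: shallowest node first
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None

graph = {"A": ["B", "C", "D"], "B": ["E", "F"], "C": ["G"],
         "E": ["K", "L"], "G": ["N"]}
print(breadth_first_search(graph, "A", "N"))  # ['A', 'C', 'G', 'N']
```

Unlike depth-first search, the first goal found is guaranteed to lie on a path with the fewest arcs.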


Lecture 11, 12 and 13 Heuristic search techniques


Contents
1. Definition of Heuristic
2. Heuristic Search
3. Hill Climbing
4. Simulated Annealing
5. Best First Search
6. The A* Algorithm
7. AND-OR Graphs
8. The AO* Algorithm

Definition
A heuristic is an operationally effective nugget of information on how to direct search in a problem space. Heuristics are only approximately correct. Their purpose is to minimize search on average.

Heuristic Search
A heuristic is a method that

might not always find the best solution but usually finds a good solution in a reasonable time. By sacrificing completeness it increases efficiency. Heuristic methods are useful in solving tough problems which
o could not be solved any other way, or
o whose solutions would take an infinite time or a very long time to compute.

The classic example of heuristic search methods is the travelling salesman problem.

Heuristic Search Methods

Generate and Test Algorithm
1. Generate a possible solution, which can either be a point in the problem space or a path from the initial state.
2. Test to see if this possible solution is a real solution by comparing the state reached with the set of goal states.
3. If it is a real solution, return. Otherwise repeat from 1.

This method is basically a depth-first search, as complete solutions must be created before testing. It is often called the British Museum method, as it is like looking for an exhibit at random. A heuristic is needed to sharpen up the search. Consider the problem of four 6-sided cubes, where each side of each cube is painted in one of four colours. The four cubes are placed next to one another, and the problem lies in arranging them so that all four colours are displayed whichever way the 4 cubes are viewed. The problem can only be solved if there are at least four sides coloured in each colour, and the number of options tested can be reduced using heuristics, e.g. if the most popular colour is hidden by the adjacent cube.
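The generate-and-test loop is trivial to sketch. Below is a hedged Python illustration (the candidate generator and goal test are invented stand-ins, not the cube problem from the notes): candidates are generated blindly and the first one passing the test is returned.

```python
from itertools import product

def generate_and_test(candidates, is_solution):
    """Blind generate-and-test: try candidates one by one until
    the goal test succeeds; return the first real solution."""
    for candidate in candidates:
        if is_solution(candidate):
            return candidate
    return None            # search space exhausted, no solution

# Toy goal: find a pair (x, y) of digits with x * y == 12 and x < y.
candidates = product(range(10), repeat=2)
print(generate_and_test(candidates, lambda p: p[0] * p[1] == 12 and p[0] < p[1]))
```

The weakness is plain: nothing steers the generator, so in the worst case every candidate is enumerated, which is exactly why a heuristic is needed to sharpen the search.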

Hill climbing
Here the generate-and-test method is augmented by a heuristic function which measures the closeness of the current state to the goal state.
1. Evaluate the initial state. If it is a goal state, quit; otherwise the current state is the initial state.
2. Select a new operator for this state and generate a new state.
3. Evaluate the new state:
   o if it is closer to the goal state than the current state, make it the current state
   o if it is no better, ignore it
4. If the current state is the goal state or no new operators are available, quit. Otherwise repeat from 2.

In the case of the four cubes, a suitable heuristic is the sum of the number of different colours on each of the four sides, and the goal state is 16 (four on each side). The set of rules is simply: choose a cube and rotate it through 90 degrees. The starting arrangement can either be specified or chosen at random.
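The numbered steps above can be sketched directly. This is a hedged Python illustration (the toy problem, maximizing the number of 1s in a bit string by flipping one bit at a time, is invented here and is not the cube problem):

```python
def hill_climb(state, neighbors, score):
    """Simple hill climbing: repeatedly move to a strictly better
    neighbor; stop at a (possibly only local) maximum."""
    while True:
        best = max(neighbors(state), key=score, default=state)
        if score(best) <= score(state):
            return state            # no improving move: done
        state = best

# Toy problem: maximize the number of 1s in a 5-bit tuple.
def flips(s):
    """All states reachable by flipping exactly one bit."""
    return [s[:i] + (1 - s[i],) + s[i+1:] for i in range(len(s))]

print(hill_climb((0, 1, 0, 1, 0), flips, sum))  # climbs to (1, 1, 1, 1, 1)
```

Because only strictly improving moves are taken, the climb halts at the first peak it reaches, which motivates the plateau and false-foothill discussion in the next section.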

Simulated Annealing
This is a variation on hill climbing, and the idea is to include a general survey of the scene to avoid climbing false foothills. The whole space is explored initially, and this avoids the danger of being caught on a plateau or ridge and makes the procedure less sensitive to the starting point. There are two additional changes: we go for minimization rather than seeking maxima, and we use the term objective function rather than heuristic. It becomes clear that we are valley descending rather than hill climbing.

The title comes from the metallurgical process of heating metals and then letting them cool until they reach a minimal-energy steady final state. The probability that the metal will jump to a higher energy level is given by p = e^(-dE/kT), where dE is the change in energy and k is Boltzmann's constant. The rate at which the system is cooled is called the annealing schedule. In the search analogue, dE is the change in the value of the objective function, and kT is replaced by T, a type of temperature. An example of a problem suitable for such an algorithm is the travelling salesman.

The simulated annealing algorithm is based upon the physical process which occurs in metallurgy, where metals are heated to high temperatures and are then cooled. The rate of cooling clearly affects the finished product. If the rate of cooling is fast, such as when the metal is quenched in a large tank of water, the structure present at high temperatures persists at low temperature and large crystal structures exist, which in this case is equivalent to a local minimum. On the other hand, if the rate of cooling is slow, as in an air-based method, then a more uniform crystalline structure exists, equivalent to a global minimum. The probability of making a large uphill move is lower than that of a small move, and the probability of making large moves decreases with temperature. Downward moves are allowed at any time.

1. Evaluate the initial state. If it is a goal state then quit; otherwise make this initial state the current state and proceed.
2. Set BEST_STATE to the current state.
3. Set the temperature, T, according to the annealing schedule.
4. Repeat until a solution is found or there are no more new operators:
   1. Select a new operator, generate a new state, and compute dE, the difference between the values of the current and new states.
   2. If this new state is a goal state, then quit.
   3. If it is better than the current state, set BEST_STATE to this state and make it the current state.
   4. If it is not better, make it the current state with probability p' = e^(-dE/T). This involves generating a random number in the range 0 to 1 and comparing it with p'; if the random number is less than p', accept the new state.
   5. Revise T according to the annealing schedule.
5. Return BEST_STATE as the answer.
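The probabilistic acceptance rule is the heart of the method. Below is a hedged Python sketch (the objective, the move generator, and the cooling schedule are invented toy choices, not from the notes), minimizing f(x) = x^2 over the integers:

```python
import math
import random

def simulated_annealing(state, objective, neighbor,
                        t0=10.0, cooling=0.95, steps=500):
    """Minimize objective(): always accept improving moves, accept
    worsening moves with probability exp(-dE / T), and cool T each step."""
    random.seed(0)                      # deterministic run, for illustration
    best = state
    t = t0
    for _ in range(steps):
        new = neighbor(state)
        d_e = objective(new) - objective(state)
        if d_e < 0 or random.random() < math.exp(-d_e / t):
            state = new                 # downhill, or lucky uphill move
            if objective(state) < objective(best):
                best = state
        t *= cooling                    # annealing schedule
    return best

best = simulated_annealing(
    state=40,
    objective=lambda x: x * x,
    neighbor=lambda x: x + random.choice([-3, -2, -1, 1, 2, 3]),
)
print(best)
```

Early on, when T is high, uphill moves are often accepted (the broad survey of the scene); as T falls, the search settles into a valley and behaves like plain descent.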

Best First Search


Best first search is a combination of depth first and breadth first search. Depth first search is good because a solution can be found without computing all nodes; breadth first search is good because it does not get trapped in dead ends. Best first search allows us to switch between paths, gaining the benefit of both approaches. At each step the most promising node is chosen. If one of the nodes chosen generates nodes that are less promising, it is possible to choose another node at the same level, and in effect the search changes from depth to breadth. If on analysis these are no better, the previously unexpanded node and branch are not forgotten: the search reverts to the descendants of the first choice and proceeds, backtracking as it were. This process is very similar to steepest-ascent hill climbing, but in hill climbing, once a move is chosen, the rejected alternatives are never reconsidered, whereas in best first search they are saved to enable

revisits if an impasse occurs on the apparent best path. Also, the best available state is selected in best first search even if its value is worse than the value of the node just explored, whereas in hill climbing progress stops if there are no better successor nodes.

The best first search algorithm operates on an OR graph, which avoids the problem of node duplication. Each node keeps a parent link giving the best node from which it was reached, and links to all its successors. In this way, if a better path to a node is found later, the improvement can be propagated down to its successors. This method requires two lists of nodes:

OPEN is a priority queue of nodes that have been evaluated by the heuristic function but have not yet been expanded into successors. The most promising nodes are at the front.

CLOSED holds nodes that have already been generated; these must be stored because a graph, rather than a tree, is being searched.

Heuristics

In order to find the most promising nodes, a heuristic function f' is used, where f' is an approximation to f made up of two parts, g and h'. g is the cost of going from the initial state to the current node; in this context g is simply the number of arcs traversed, each arc being treated as having unit weight. h' is an estimate of the cost of getting from the current node to the goal state. f' is therefore an estimate of the cost of getting from the initial state to the goal state through the current node. Both g and h' are positive valued.

Best First

The best first algorithm is a simplified form of the A* algorithm. From A* we note that f' = g + h', where g is a measure of the cost of going from the initial node to the current node, and h' is an estimate of the cost of reaching a solution from the current node. Thus f' is an estimate of the cost of going from the initial node to a solution through the current node. As an aid, we take the cost of going from one node to the next to be a constant 1.
Best First Search Algorithm:
1. Start with OPEN holding the initial state.
2. Pick the best node on OPEN.
3. Generate its successors.
4. For each successor do:
   o If it has not been generated before, evaluate it, add it to OPEN, and record its parent.
   o If it has been generated before, change the parent if this new path is better, and in that case update the cost of getting to any successor nodes.
5. If a goal is found or no nodes are left in OPEN, quit; else return to step 2.
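The algorithm above can be sketched in Python. Here the evaluation is done by h' alone (the fuller f' = g + h' variant is the A* algorithm described later); the graph, the heuristic values, and all function names are illustrative assumptions, not from the text.

```python
import heapq

def best_first_search(start, goal, successors, h):
    """Greedy best first search: always expand the OPEN node with the
    lowest heuristic estimate h(node).  `successors(n)` lists neighbours."""
    open_list = [(h(start), start)]       # OPEN: priority queue ordered by h
    parent = {start: None}                # parent links; doubles as CLOSED record
    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:                  # goal found: rebuild path via parents
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for succ in successors(node):
            if succ not in parent:        # not generated before
                parent[succ] = node
                heapq.heappush(open_list, (h(succ), succ))
    return None                           # OPEN exhausted, no solution

# Toy graph: S -> {A, B}, A -> G, B -> G, with heuristic estimates to G.
graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['G'], 'G': []}
h = {'S': 2, 'A': 1, 'B': 3, 'G': 0}
path = best_first_search('S', 'G', lambda n: graph[n], lambda n: h[n])
```

Because A looks more promising than B (h = 1 versus 3), the search expands A first and reaches the goal through it.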


The A* Algorithm
Best first search is a simplified A*.
1. Start with OPEN holding the initial nodes.
2. Pick the BEST node on OPEN, i.e. the node for which f = g + h' is minimal.
3. If BEST is the goal node, quit and return the path from the initial node to BEST.
4. Otherwise remove BEST from OPEN, generate all of BEST's children, label each with its path from the initial node, add them to OPEN, and return to step 2.
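A minimal A* sketch in Python follows. The step costs, heuristic values, and names are illustrative assumptions; the heuristic used is admissible (it never overestimates), so the returned path is optimal.

```python
import heapq

def a_star(start, goal, successors, h):
    """A* search with f(n) = g(n) + h(n).  `successors(n)` yields
    (neighbour, step_cost) pairs; h should not overestimate the true
    remaining cost if the returned path is to be optimal."""
    g = {start: 0}                          # best known cost from start
    parent = {start: None}
    open_list = [(h(start), start)]         # OPEN, ordered by f = g + h
    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:                    # rebuild the path via parent links
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        for succ, cost in successors(node):
            new_g = g[node] + cost
            if succ not in g or new_g < g[succ]:   # better path: update parent
                g[succ] = new_g
                parent[succ] = node
                heapq.heappush(open_list, (new_g + h(succ), succ))
                # stale heap entries are tolerated (lazy deletion)
    return None, float('inf')

# Toy weighted graph; the cheapest route S-A-B-G costs 4.
graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 1), ('G', 6)], 'B': [('G', 2)], 'G': []}
h = {'S': 4, 'A': 3, 'B': 2, 'G': 0}       # admissible estimates
path, cost = a_star('S', 'G', lambda n: graph[n], lambda n: h[n])
```

When a cheaper route to B is found through A (cost 2 instead of 4), the parent link and cost are updated, exactly as in step 4 of the best first algorithm.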

Graceful decay of admissibility: if h' rarely overestimates h by more than d, then the A* algorithm will rarely find a solution whose cost is more than d greater than the cost of the optimal solution.

And-Or Graphs
And-Or graphs are useful for certain problems where the solution involves decomposing the problem into smaller problems, which we then solve.

Here the alternatives often involve branches where some or all must be satisfied before we can progress. For example if I want to learn to play a Frank Zappa guitar solo I could (Fig. 2.2.1)

Transcribe it from the CD. OR Buy the ``Frank Zappa Guitar Book'' AND Read it from there.


Note the use of arcs to indicate that one or more nodes must all be satisfied before the parent node is achieved. To find solutions using an And-Or graph, the best first algorithm is used as a basis, with a modification to handle the sets of nodes linked by AND arcs. Plain best first search is inadequate here: it cannot deal with the AND links well, because the promise of a node may depend on a whole group of successors being solvable rather than on any single one.

AO* Algorithm
1. Initialise the graph to the start node.
2. Traverse the graph, following the current best path and accumulating the nodes on it that have not yet been expanded or solved.
3. Pick one of these nodes and expand it. If it has no successors, assign it the value FUTILITY; otherwise compute f' for each of its successors.
4. If f' is 0, mark the node as SOLVED.
5. Propagate the newly computed values back up the graph, revising f' for each ancestor to reflect its successors.
6. Wherever possible follow the most promising routes, and if all the required successors of a node are marked SOLVED, mark that node SOLVED as well.
7. If the start node is SOLVED, or its value exceeds FUTILITY, stop; otherwise repeat from step 2.
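The core idea, solving a node either via any one OR alternative or via all members of an AND group, can be sketched with a small recursive solver. This is a sketch of the And-Or decomposition only: it assumes an acyclic graph and omits the cost values f' and the back-propagation that full AO* performs. The graph encoding and the Frank Zappa example labels are illustrative.

```python
def solve(node, graph):
    """Solve an And-Or graph given as:
    graph[node] -> list of alternatives (OR), each alternative being a
    list of subgoals that must ALL be solved (AND).  A node with no
    entry in `graph` is a primitive action, assumed directly solvable.

    Returns a nested plan, or None if the node cannot be solved.
    Assumes the graph is acyclic (no recursion guard).
    """
    if node not in graph:                          # leaf: directly achievable
        return node
    for alternative in graph[node]:                # OR: any alternative suffices
        subplans = [solve(sub, graph) for sub in alternative]
        if all(p is not None for p in subplans):   # AND: all subgoals solved
            return (node, subplans)
    return None

# The guitar-solo example: transcribe it, OR (buy the book AND read it).
graph = {
    'learn solo': [['transcribe from CD'],
                   ['buy guitar book', 'read solo from book']],
}
plan = solve('learn solo', graph)
```

The solver returns the first workable alternative; AO* improves on this by choosing among alternatives using the estimated costs f'.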


Lecture 14 and 15 Game Playing


Contents
1. Introduction to Game Playing
2. Minimax Algorithm
3. Alpha-beta pruning

Introduction to Game Playing


Game playing has been a major topic of AI since the very beginning. Besides the attraction of the topic to people, this is also because of its close relation to "intelligence" and its well-defined states and rules. The AI technique most commonly used in games is search. In other problem-solving activities, state changes are caused solely by the actions of the agent. In multi-agent games, however, the state also depends on the actions of other agents, who usually have different goals. The special situation that has been studied most is the "two-person zero-sum game", in which the two players have exactly opposite goals. (Not all competitions are zero-sum!) There are perfect information games (such as Chess and Go) and imperfect information games (such as Bridge, and games in which dice are used). Given sufficient time and space, an optimal solution can usually be obtained for the former by exhaustive search, though not for the latter. However, for most interesting games such a solution is usually far too inefficient to be of practical use.

MINIMAX Algorithm
Minimax (sometimes minmax) is a decision rule used in decision theory, game theory, statistics, and philosophy for minimising the possible loss while maximising the potential gain. Alternatively, it can be thought of as maximising the minimum gain (maximin). Originally formulated for two-player zero-sum game theory, covering both the case where players take alternate moves and that where they make simultaneous moves, it has also been extended to more complex games and to general decision making in the presence of uncertainty.

Game theory

In the theory of simultaneous games, a minimax strategy is a mixed strategy which is part of the solution to a zero-sum game. In zero-sum games, the minimax solution is the same as the Nash equilibrium. The minimax theorem states:


For every two-person, zero-sum game with finite strategies, there exists a value V and a mixed strategy for each player, such that (a) given player 2's strategy, the best payoff possible for player 1 is V, and (b) given player 1's strategy, the best payoff possible for player 2 is -V.

Equivalently, player 1's strategy guarantees him a payoff of V regardless of player 2's strategy, and similarly player 2 can guarantee himself a payoff of -V. The name minimax arises because each player minimises the maximum payoff possible for the other; since the game is zero-sum, he thereby also maximises his own minimum payoff. This theorem was established by John von Neumann, who is quoted as saying "As far as I can see, there could be no theory of games without that theorem ... I thought there was nothing worth publishing until the Minimax Theorem was proved".

Example

The following example of a zero-sum game, where A and B make simultaneous moves, illustrates minimax solutions. Suppose each player has three choices, and consider the payoff matrix for A displayed below:

                B chooses B1   B chooses B2   B chooses B3
A chooses A1        +3             -2             +2
A chooses A2        -1              0             +4
A chooses A3        -4             -3             +1

Assume the payoff matrix for B is the same matrix with the signs reversed (i.e. if the choices are A1 and B1 then B pays 3 to A). Then the minimax choice for A is A2, since the worst possible result is then having to pay 1, while the simple minimax choice for B is B2, since the worst possible result is then no payment. However, this solution is not stable: if B believes A will choose A2, then B will choose B1 to gain 1; then if A believes B will choose B1, then A will choose A1 to gain 3; and then B will choose B2; and eventually both players will realise the difficulty of making a choice. So a more stable strategy is needed.
Some choices are dominated by others and can be eliminated: A will not choose A3, since either A1 or A2 will produce a better result no matter what B chooses; B will not choose B3, since B1 or B2 will produce a better result no matter what A chooses. A can then avoid having to make an expected payment of more than 1/3 by choosing A1 with probability 1/6 and A2 with probability 5/6, no matter what B chooses. B can ensure an expected gain of at least 1/3 by using the randomised strategy of choosing B1 with probability 1/3 and B2 with probability 2/3, no matter what A chooses. These mixed minimax strategies are stable and cannot be improved.
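The arithmetic behind these mixed strategies can be checked mechanically. The sketch below uses exact fractions to compute each player's expected payoff against every pure reply of the opponent; the value of the game comes out as -1/3 (A expects to pay 1/3).

```python
from fractions import Fraction as F

# Payoff matrix for A (rows A1..A3, columns B1..B3), as in the example.
payoff = [[ 3, -2,  2],
          [-1,  0,  4],
          [-4, -3,  1]]

# Mixed strategies from the text: A plays A1 w.p. 1/6 and A2 w.p. 5/6;
# B plays B1 w.p. 1/3 and B2 w.p. 2/3.  Dominated A3 and B3 get weight 0.
p_a = [F(1, 6), F(5, 6), F(0)]
p_b = [F(1, 3), F(2, 3), F(0)]

# A's expected payoff against each pure choice of B ...
a_vs_each_b = [sum(p_a[i] * payoff[i][j] for i in range(3)) for j in range(3)]
# ... and against each pure choice of A, given B's mixed strategy.
b_vs_each_a = [sum(p_b[j] * payoff[i][j] for j in range(3)) for i in range(3)]
```

Against B1 and B2 alike, A's expected payoff is exactly -1/3, and whichever row A picks, B's mix holds A to at most -1/3: neither player can improve unilaterally, which is the stability claimed above.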

Alpha-beta pruning
Alpha-beta pruning is a search algorithm which seeks to reduce the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm, commonly used for machine playing of two-player games (Tic-tac-toe, Chess, Go, etc.). It stops completely evaluating a move as soon as at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further.

Alpha-beta pruning is a sound optimisation in that it does not change the score of the result of the algorithm it optimises.

Improvements over naive minimax

[Figure: an illustration of alpha-beta pruning. The grayed-out subtrees need not be explored (when moves are evaluated from left to right), since we know that the group of subtrees as a whole yields the value of an equivalent subtree or worse, and as such cannot influence the final result. The max and min levels represent the turns of the player and the adversary, respectively.]

The benefit of alpha-beta pruning lies in the fact that branches of the search tree can be eliminated. The search time can in this way be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the branch and bound class of algorithms. The optimisation reduces the effective depth to slightly more than half that of simple minimax if the nodes are evaluated in an optimal or near-optimal order (best choice for the side on move ordered first at each node).

With an (average or constant) branching factor of b and a search depth of d plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is O(b*b*...*b) = O(b^d), the same as for a simple minimax search. If the move ordering for the search is optimal (meaning the best moves are always searched first), the number of leaf node positions evaluated is about O(b*1*b*1*...*b) for odd depth and O(b*1*b*1*...*1) for even depth, i.e. O(b^(d/2)). The effective branching factor is thus reduced to its square root, or, equivalently, the search can go twice as deep with the same amount of computation. The explanation of b*1*b*1*... is that all the first player's moves must be studied to find the best one, but for each of them, only the best second player's move is needed to refute all but the first (and best) first player move; alpha-beta ensures that no other second player moves need be considered.
If b = 40 (as in chess) and the search depth is 12 plies, the ratio between optimal and pessimal sorting is a factor of nearly 40^6, or about 4 billion.
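The pruning rule can be made concrete with a short recursive sketch. The tree encoding (nested lists, numeric leaves) and the helper names are illustrative assumptions; the cut-off condition beta <= alpha is the standard one.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning.  `children(node)` lists successor
    positions (empty at a leaf); `evaluate(node)` scores a leaf from the
    maximizing player's point of view."""
    succ = children(node)
    if depth == 0 or not succ:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if beta <= alpha:      # beta cut-off: MIN will never allow this line
                break
        return value
    else:
        value = math.inf
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if beta <= alpha:      # alpha cut-off: MAX already has better
                break
        return value

# Tiny tree: MAX at the root, MIN at [3, 5] and [2, 9]; leaves are scores.
seen = []                          # records which leaves were actually evaluated

def children(n):
    return n if isinstance(n, list) else []

def evaluate(n):
    seen.append(n)
    return n

value = alphabeta([[3, 5], [2, 9]], 10, -math.inf, math.inf,
                  True, children, evaluate)
# value == 3; the first subtree sets alpha = 3, so after seeing the
# leaf 2 in the second subtree (beta = 2 <= alpha), the leaf 9 is pruned.
```

Deleting the two `break` lines turns this back into plain minimax, which would visit every leaf.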


[Figure: an animated pedagogical example of alpha-beta pruning.]

Normally during alpha-beta search, the subtrees are temporarily dominated either by a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to refute it), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. Since the number of positions searched decreases exponentially with each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, and sorting the positions at depths near the root node is relatively cheap, as there are so few of them. In practice, the move ordering is often determined by the results of earlier, smaller searches, for example through iterative deepening.

The algorithm maintains two values, alpha and beta, which represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of, respectively. Initially alpha is negative infinity and beta is positive infinity. As the recursion progresses, this window becomes smaller. When beta becomes less than alpha, the current position cannot be the result of best play by both players and hence need not be explored further. Additionally, the algorithm can be trivially modified to return an entire principal variation in addition to the score. Some more aggressive algorithms, such as MTD(f), do not so easily permit such a modification.

