
2011

ARTIFICIAL INTELLIGENCE

R.SHANTHI VICTORIA ., B.E., M.E


TAMIL NADU COLLEGE OF ENGINEERING COIMBATORE.
Designed according to Anna University Coimbatore
syllabus FOR IV YEAR CSE STUDENTS.

11/1/2011
ARTIFICIAL INTELLIGENCE

LTPMC
3 1 0 100 4
UNIT I Introduction and Problem Solving I 9

Artificial Intelligence: Definition-Turing Test-Relation with other Disciplines-History of AI


Applications- Agent: Intelligent Agent-Rational Agent - Nature of Environments-Structure of
Agent.-Problem Solving Agent - Problems: Toy Problems and Real-world Problems-Uninformed
Search Strategies: BFS, DFS, DLS, IDS, Bidirectional Search -Comparison of uninformed search
strategies.

UNIT II Problem Solving II: 9


Informed Search Strategies-Greedy best-first search-A* search-Heuristic functions-Local search
Algorithms and Optimization problems - Online Search Agent-Constraint Satisfaction Problems-
Backtracking Search for CSPs Local Search for Constraint Satisfaction Problems-Structure of
Problems -Adversarial Search-Optimal Decision in Games-Alpha-Beta Pruning-Imperfect Real Time
Decisions-Games that Include an Element of Chance.

UNIT III Knowledge Representation 9

First-Order Logic-Syntax and Semantics of First-Order-Logic-Using First-Order-Logic-Knowledge


Engineering in First-Order-Logic.- Inference in First-Order-Logic- Inference rules-Unification and
Lifting-Forward Chaining-Backward Chaining-Resolution.

UNIT IV Learning 9

Learning from Observations-Forms of Learning-Learning Decision Trees-Ensemble Learning-A Logical
Formulation of Learning-Knowledge in Learning-Explanation Based Learning-Learning using
Relevance Information-Inductive Logic Programming.

UNIT V Applications 9

Communication-Communication as action-A formal grammar for a fragment of English-
Syntactic Analysis-Augmented Grammars-Semantic Interpretation-Ambiguity and
Disambiguation-Discourse Understanding-Grammar Induction.
Perception-Image Formation-Early Image Processing Operations-Extracting Three Dimensional
Information-Object Recognition-Using Vision for Manipulation and Navigation.
Total:45

TEXT BOOKS:

1. Stuart Russell, Peter Norvig, Artificial Intelligence A Modern Approach, 3rd Edition, Pearson
Education / Prentice Hall of India 2010(yet to be published).
2. Nils J. Nilsson, Artificial Intelligence: A new Synthesis, Harcourt Asia Pvt. Ltd,2003.
REFERENCES:
1. Elaine Rich and Kevin Knight, Artificial Intelligence, 2nd Edition, Tata McGraw-
Hill, 2003.
2. Patrick Henry Winston, Artificial Intelligence, Pearson Education / PHI, 2004.

UNIT-1

INTRODUCTION AND PROBLEM SOLVING I


DEFINITION:
Artificial Intelligence is the study of how to make computers do things at which, at the moment,
people are better.

SOME DEFINITIONS OF AI

Building systems that think like humans

The exciting new effort to make computers think machines with minds, in the full
and literal sense -- Haugeland, 1985

The automation of activities that we associate with human thinking, such as
decision-making, problem solving, learning ... -- Bellman, 1978

Building systems that act like humans

The art of creating machines that perform functions that require intelligence when
performed by people -- Kurzweil, 1990

The study of how to make computers do things at which, at the moment, people
are better -- Rich and Knight, 1991

Building systems that think rationally

The study of mental faculties through the use of computational models --


Charniak and McDermott, 1985

The study of the computations that make it possible to perceive, reason, and act --
Winston, 1992

Building systems that act rationally

A field of study that seeks to explain and emulate intelligent behavior in terms of
computational processes -- Schalkoff, 1990

The branch of computer science that is concerned with the automation of


intelligent behavior -- Luger and Stubblefield, 1993

TURING TEST
It was proposed by Alan Turing in 1950. According to this test, a computer could be considered to be
thinking only when a human interviewer, conversing with both an unseen human being and an
unseen computer, could not determine which is which.

Description:

Two human beings, one computer.

The computer would need to possess the following capabilities:

Natural language processing: to enable it to communicate successfully in English

Knowledge representation: to store what it knows or hears
Automated reasoning: to use the stored information to answer questions and to draw new
conclusions
Machine learning: to adapt to new circumstances and to detect and extrapolate patterns.

To pass the total Turing test, the computer will need,

Computer vision: to perceive objects


Robotics: to manipulate objects and move about

Thinking and Acting Humanly

Acting humanly

"If it looks, walks, and quacks like a duck, then it is a duck

The Turing Test

Interrogator communicates by typing at a terminal with TWO other agents.


The human can say and ask whatever s/he likes, in natural language. If the
human cannot decide which of the two agents is a human and which is a
computer, then the computer has achieved AI

this is an OPERATIONAL definition of intelligence, i.e., one that gives an


algorithm for testing objectively whether the definition is satisfied

Thinking humanly: cognitive modeling

Develop a precise theory of mind, through experimentation and introspection, then


write a computer program that implements it

Example: GPS - General Problem Solver (Newell and Simon, 1961)

trying to model the human process of problem solving in general

Thinking Rationally- The laws of thought approach

Capture ``correct'' reasoning processes

A loose definition of rational thinking: Irrefutable reasoning process

How do we do this

Develop a formal model of reasoning (formal logic) that always leads to


the right answer

Implement this model

How do we know when we've got it right?

when we can prove that the results of the programmed reasoning are
correct

soundness and completeness of first-order logic

Example:

Ram is a student of III year CSE. All students of III year CSE are good students.

Therefore, Ram is a good student.

Acting Rationally

Act so that desired goals are achieved

The rational agent approach (this is what we'll focus on in this course)

Figure out how to make correct decisions, which sometimes means thinking
rationally and other times means having rational reflexes

correct inference versus rationality

reasoning versus acting; limited rationality

RELATION WITH OTHER DISCIPLINES:


- Expert Systems

- Natural Language Processor

- Speech Recognition

- Robotics

- Computer Vision

- Intelligent Computer-Aided Instruction

- Data Mining

- Genetic Algorithms

Philosophy - logic, methods of reasoning, mind as physical system, foundations of learning,
language, rationality

Mathematics - formal representation and proof, algorithms, computation, (un)decidability,
(in)tractability, probability

Economics - utility, decision theory

Neuroscience - physical substrate for mental activity

Psychology - phenomena of perception and motor control, experimental techniques

Computer engineering - building fast computers

Control theory - design of systems that maximize an objective function over time

Linguistics - knowledge representation, grammar

HISTORY OF AI:
1943 McCulloch & Pitts: Boolean circuit model of brain

1950 Turing's "Computing Machinery and Intelligence"

1956 Dartmouth meeting: "Artificial Intelligence" adopted

1952-69 "Look, Ma, no hands!"

1950s Early AI programs, including Samuel's checkers


program, Newell & Simon's Logic Theorist,
Gelernter's Geometry Engine

1965 Robinson's complete algorithm for logical reasoning

1966-73 AI discovers computational complexity


Neural network research almost disappears

1969-79 Early development of knowledge-based systems

1980-- AI becomes an industry

1986-- Neural networks return to popularity

1987-- AI becomes a science

1995-- The emergence of intelligent agents

INTELLIGENT AGENT:
Agent = perceive + act

Thinking

Reasoning

Planning

Agent: entity in a program or environment capable of generating action.

An agent uses perception of the environment to make decisions about actions to take.

The perception capability is usually called a sensor.

The actions can depend on the most recent perception or on the entire history (percept
sequence).

An agent is anything that can be viewed as perceiving its environment through sensors and acting
upon the environment through actuators.

Ex: Robotic agent

Human agent

Fig: An agent interacts with its environment: sensors receive percepts from the environment, the
agent program decides on an action, and actuators carry the action out on the environment.
Agents interact with environment through sensors and actuators.

Fig: The vacuum-cleaner world with two locations, A and B.

Percept sequence                     Action

[A, clean]                           Right

[A, dirty]                           Suck

[B, clean]                           Left

[B, dirty]                           Suck

[A, clean], [A, clean]               Right

[A, clean], [A, dirty]               Suck

Fig: Partial tabulation of a simple agent function for the vacuum-cleaner world

Agent Function
The agent function is a mathematical function that maps a sequence of perceptions into
action.

The function is implemented as the agent program.

The part of the agent taking an action is called an actuator.

environment -> sensors -> agent function -> actuators -> environment
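For illustration only (not part of the original notes), the agent function tabulated above can be
implemented directly as a lookup table keyed by the percept sequence. The following is a minimal
Python sketch; the class name and table fragment are our own.

# Minimal sketch of a table-driven agent for the two-square vacuum world.
# The table maps percept sequences (tuples of percepts) to actions; it is only
# a fragment, matching the tabulation shown above.

TABLE = {
    (("A", "clean"),): "Right",
    (("A", "dirty"),): "Suck",
    (("B", "clean"),): "Left",
    (("B", "dirty"),): "Suck",
    (("A", "clean"), ("A", "clean")): "Right",
    (("A", "clean"), ("A", "dirty")): "Suck",
}

class TableDrivenVacuumAgent:
    def __init__(self, table):
        self.table = table
        self.percepts = []              # the percept sequence (history)

    def __call__(self, percept):
        self.percepts.append(percept)
        # Look up the whole history; None if the table has no entry for it.
        return self.table.get(tuple(self.percepts))

agent = TableDrivenVacuumAgent(TABLE)
print(agent(("A", "clean")))    # -> Right
print(agent(("A", "dirty")))    # -> Suck

The table grows exponentially with the length of the percept sequence, which is exactly the
limitation discussed under table-driven agents below.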

RATIONAL AGENT:
A rational agent is one that can take the right decision in every situation.

Performance measure: a set of criteria/test bed for the success of the agent's behavior.

The performance measures should be based on the desired effect of the agent on the
environment.

Rationality:
The agent's rational behavior depends on:

the performance measure that defines success

the agent's knowledge of the environment

the action that it is capable of performing

The current sequence of perceptions.

Definition: for every possible percept sequence, the agent is expected to take an action that
will maximize its performance measure.

Agent Autonomy:
An agent is omniscient if it knows the actual outcome of its actions. Not possible in practice.

An environment can sometimes be completely known in advance.

Exploration: sometimes an agent must perform an action to gather information (to increase
perception).

Autonomy: the capacity to compensate for partial or incorrect prior knowledge (usually by
learning).

NATURE OF ENVIRONMENTS:
Task environment the problem that the agent is a solution to.

Includes

Performance measure

Environment

Actuator

Sensors

Agent Type: Taxi driver
  Performance measure: safe, fast, legal, comfortable trip, maximize profits
  Environment: roads, other traffic, pedestrians, customers
  Actuators: steering, accelerator, brake, signal, horn
  Sensors: camera, sonar, GPS, speedometer, keyboard, etc.

Agent Type: Medical diagnosis system
  Performance measure: healthy patient, minimize costs, lawsuits
  Environment: patient, hospital, staff
  Actuators: screen display (questions, tests, diagnoses, treatments, referrals)
  Sensors: keyboard (entry of symptoms, findings, patient's answers)

Properties of Task Environment:


Fully Observable (vs. Partly Observable)

Agent sensors give the complete state of the environment at each point in time

Sensors detect all the aspects that are relevant to the choice of action.

An environment might be partially observable because of noisy and inaccurate
sensors, or because parts of the state are simply missing from the sensor data.

Deterministic (vs. Stochastic)

The next state of the environment is completely determined by the current state and
the action executed by the agent

Strategic environment (if the environment is deterministic except for the actions of
other agents)

Episodic (vs. Sequential)

The agent's experience can be divided into episodes, each episode consisting of what the agent
perceives and what action it takes

The next episode does not depend on the previous episode

In a sequential environment, the current decision will affect all future states

Static (vs. Dynamic)

The environment doesn't change while the agent is deliberating

Semi dynamic

Discrete (vs. Continuous)

Depends on the way time is handled in describing states, percepts and actions

Chess game : discrete

Taxi driving : continuous

Single Agent (vs. Multi Agent)

Competitive, cooperative multi-agent environments

Communication is a key issue in multi agent environments.

Partially Observable:

Ex: An automated taxi cannot see what other drivers are thinking.

Stochastic:

Ex: taxi driving is clearly stochastic in this sense, because one can never predict the behavior
of the traffic exactly.

Semi dynamic:

If the environment itself does not change with the passage of time, but the agent's
performance score does, the environment is called semi dynamic.

Single Agent Vs multi agent:

An agent solving a cross word puzzle by itself is clearly in a single agent environment.

An agent playing chess is in a two agent environment.

Example of Task Environments and Their Classes

STRUCTURE OF AGENT:

Simple Agents:

Table-driven agents: the function consists in a lookup table of actions to be taken for every
possible state of the environment.

If the environment has n variables, each with t possible states, then the table
size is t^n.

Only works for a small number of possible states for the environment.

Simple reflex agents: deciding on the action to take based only on the current perception
and not on the history of perceptions.

Based on the condition-action rule:

(if (condition) action)

Works if the environment is fully observable

Four types of agents:


1. Simple reflex agent

2. Model based reflex agent

3. goal-based agent

4. utility-based agent

Simple reflex agent


Definition:

SRA works only if the correct decision can be made on the basis of only the current percept
that is only if the environment is fully observable.

Characteristics

no plan, no goal

do not know what they want to achieve

do not know what they are doing

Condition-action rule

If condition then action

Ex: medical diagnosis system.

Algorithm Explanation:

Interpret Input:

Function generates an abstracted description of the current state from the percept.

RULE- MATCH:

Function returns the first rule in the set of rules that matches the given state description.

RULE - ACTION:

The selected rule is executed as action of the given percept.
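For illustration only, a simple reflex agent for the vacuum world can be written directly from
condition-action rules applied to the current percept. The rule set and function names below are a
minimal sketch of our own, mirroring the INTERPRET-INPUT / RULE-MATCH / RULE-ACTION steps above.

# Sketch of a simple reflex agent: the action depends only on the current
# percept, via condition-action rules (no percept history, no internal state).

RULES = [
    (lambda loc, status: status == "dirty", "Suck"),
    (lambda loc, status: loc == "A",        "Right"),
    (lambda loc, status: loc == "B",        "Left"),
]

def simple_reflex_vacuum_agent(percept):
    loc, status = percept                  # INTERPRET-INPUT
    for condition, action in RULES:        # RULE-MATCH
        if condition(loc, status):
            return action                  # RULE-ACTION

print(simple_reflex_vacuum_agent(("A", "dirty")))   # -> Suck
print(simple_reflex_vacuum_agent(("A", "clean")))   # -> Right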

Model-Based Reflex Agents:


Definition:

An agent which combines the current percept with the old internal state to generate
updated description of the current state.

If the world is not fully observable, the agent must remember observations about the parts
of the environment it cannot currently observe.

This usually requires an internal representation of the world (or internal state).

Since this representation is a model of the world, we call this model-based agent.

Ex: Braking problem

characteristics

Reflex agent with internal state

Sensor does not provide the complete state of the world.

must keep its internal state

Updating the internal world

requires two kinds of knowledge

How world evolves

How the agent's actions affect the world

Algorithm Explanation:

UPDATE-STATE: This is responsible for creating the new internal state description.

Goal-based agents:
The agent has a purpose and the action to be taken depends on the current state and on
what it tries to accomplish (the goal).

In some cases the goal is easy to achieve. In others it involves planning, sifting through a
search space for possible solutions, developing a strategy.

Characteristics

Action depends on the goal. (consideration of future)

e.g. path finding

Fundamentally different from the condition-action rule.

Search and Planning

Solving car-braking problem?

Yes, possible but not likely natural.

Appears less efficient.

Utility-based agents
If one state is preferred over the other, then it has higher utility for the agent

Utility-Function (state) = real number (degree of happiness)

The agent is aware of a utility function that estimates how close the current state is to the agent's
goal.

Characteristics

to generate high-quality behavior

Map the internal states to real numbers.

(e.g., game playing)

Looking for higher utility value utility function

Learning Agents
Agents capable of acquiring new competence through observations and actions.

Learning agent has the following components

Learning element

Suggests modification to the existing rule to the critic

Performance element

Collection of knowledge and procedures for selecting the driving actions

Choice depends on Learning element

Critic

Observes the world and passes information to the learning element

Problem generator

Identifies certain areas of behavior that need improvement and suggests experiments

Agent Example

A file manager agent.

Sensors: commands like ls, du, pwd.

Actuators: commands like tar, gzip, cd, rm, cp, etc.

Purpose: compress and archive files that have not been used in a while.

Environment: fully observable (but partially observed), deterministic (strategic), episodic,


dynamic, discrete.

Agent vs. Program

Size an agent is usually smaller than a program.

Purpose an agent has a specific purpose while programs are multi-functional.

Persistence an agent's life span is not entirely dependent on a user launching and quitting
it.

Autonomy an agent doesn't need the user's input to function.

Problem Solving Agents


Problem solving agent

A kind of goal based agent

Finds sequences of actions that lead to desirable states.

Formulate Goal, Formulate Problem

Search

Execute

PROBLEMS

Four components of problem definition


Initial state that the agent starts in

Possible Actions

Uses a Successor Function

Returns <action, successor> pair

State Space the state space forms a graph in which the nodes are states
and arcs between nodes are actions.

Path

Goal Test which determine whether a given state is goal state

Path cost function that assigns a numeric cost to each path.

Step cost

Problem formulation is the process of deciding what actions and states to consider, given a
goal

Path:
A path in the state space is a sequence of states connected by a sequence of actions.

The sequence of steps done by intelligent agent to maximize the performance measure:

Goal Formulation: based on the current situation and the agents performance measure, it is
the first step in problem solving.

Problem Formulation: it is the process of deciding what actions and states to consider, given
a goal.

Search: the process of looking for different sequence.

Solution: A search algorithm takes a problem as input and returns a solution in the form of
an action sequence.

Execution: Once a solution is found, the actions it recommends can be carried out called execution
phase.

Solutions

A Solution to the problem is the path from the initial state to the final state

Quality of solution is measured by path cost function

Optimal Solution has the lowest path cost among other solutions

An Agent with several immediate options of unknown value can decide what to do by
first examining different possible sequences of actions that lead to a state of known
value, and then choosing the best sequence Searching Process

Input to Search : Problem

Output from Search : Solution in the form of Action Sequence

A Problem solving Agent, Assuming the environment is

Static

Observable

Discrete

Deterministic

Example

A Simplified Road Map of Part of Romania

Explanation:

On holiday in Romania; currently in Arad

Flight leaves tomorrow from Bucharest

Formulate goal:

be in Bucharest

Formulate problem:

states: various cities

actions: drive between cities

Find solution:

sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest

TOY PROBLEM

Example-1 : Vacuum World

Problem Formulation
States

2 x 2^2 = 8 states

Formula: n * 2^n states (for n locations)

Initial State

Any one of 8 states

Successor Function

Legal states that result from three actions (Left, Right, Suck)

Goal Test

All squares are clean

Path Cost

Number of steps (each step costs a value of 1)

State Space for the Vacuum World.

Labels on Arcs denote L: Left, R: Right, S: Suck

Example-2 : The 8-Puzzle

States : Location of Tiles

Initial State : One of States

Successor Function : Move blank left, Right, Up, down

Goal Test : Shown in Fig. Above

Path Cost : 1 for each step

Eight puzzle is from a family of sliding block puzzles

NP Complete

8-puzzle has 9!/2 = 181,440 states

15-puzzle has approx. 1.3 * 10^12 states

24-puzzle has approx. 1 * 10^25 states

Example-3 : The 8-Queens Problem

Place eight queens on a chess board such that no queen can attack another queen

No path cost because only the final state counts!

Incremental formulations

Complete state formulations

States : Any arrangement of 0 to 8 queens on the board (naive incremental formulation); or
arrangements of n queens, one per column in the leftmost n columns, with no queen attacking
another (improved formulation).

Initial state : No queens on the board

Successor function: Add a queen to any empty square (naive formulation; 64*63*...*57 = 1.8*10^14
possible sequences to investigate); or add a queen to any square in the leftmost empty column such
that it is not attacked by any other queen (improved formulation; only 2057 sequences to
investigate).

Goal Test: 8 queens on the board and none are attacked.

SOME MORE REAL-WORLD PROBLEMS


Route finding

Touring (traveling salesman)

Logistics

VLSI layout

Robot navigation

Learning

Robotic assembly:
States: real-valued coordinates of robot joint angles and of the parts of the object to be assembled.

Actions: continuous motion of robot joint.

Goal test: complete assembly

Path cost: time to execute.

Route-finding
Find the best route between two cities given the type & condition of existing roads & the driver's
preferences

Used in

computer networks

automated travel advisory systems

airline travel planning systems

path cost

money

seat quality

time of day

type of airplane

Traveling Salesman Problem (TSP)


A salesman must visit N cities.

Each city is visited exactly once and finishing the city started from.

There is usually an integer cost c (a, b) to travel from city a to city b.

However, the total tour cost must be minimum, where the total cost is the sum of the individual
cost of each city visited in the tour.

Given a road map of n cities, find the shortest tour which visits every city on the map exactly
once and then return to the original city (Hamiltonian circuit)

(Geometric version):

A complete graph of n vertices (on a unit square)

Distance between any two vertices: Euclidean distance

n!/(2n) legal tours

Find one legal tour that is shortest

It's an NP-complete problem: no one has found a really efficient way of solving such problems for
large n. It is closely related to the Hamiltonian-cycle problem.

VLSI layout
The decision of placement of silicon chips on breadboards is very complex. (or standard
gates on a chip).

This includes

cell layout

channel routing

The goal is to place the chips without overlap.

Finding the best way to route the wires between the chips becomes a search problem.

Searching for Solutions to VLSI Layout

Generating action sequences

Data structures for search trees

Generating action sequences

What do we know?

define a problem and recognize a solution

Finding a solution is done by a search in the state space

Maintain and extend a partial solution sequence

UNINFORMED SEARCH STRATEGIES


Uninformed strategies use only the information available in the problem definition

Also known as blind searching

Uninformed search methods:

Breadth-first search

Uniform-cost search

Depth-first search

Depth-limited search

Iterative deepening search

BREADTH-FIRST SEARCH

Definition:

The root node is expanded first, then all the successors of the root node are expanded, then their successors, and so on.

Expand the shallowest unexpanded node

Place all new successors at the end of a FIFO queue

Implementation:
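The implementation figure is not reproduced here; as a minimal Python sketch (ours, not the
textbook's code), breadth-first search with a FIFO queue can be written as follows. The example
graph at the end is assumed for illustration only.

from collections import deque

def breadth_first_search(graph, start, goal):
    """Return a list of nodes from start to goal, or None.
    graph is a dict mapping each node to a list of its successors."""
    if start == goal:
        return [start]
    frontier = deque([[start]])        # FIFO queue of paths
    explored = {start}
    while frontier:
        path = frontier.popleft()      # expand the shallowest node
        for successor in graph[path[-1]]:
            if successor in explored:
                continue
            new_path = path + [successor]
            if successor == goal:
                return new_path
            explored.add(successor)
            frontier.append(new_path)  # new successors go to the end of the queue
    return None

# Small example graph (assumed for illustration only).
graph = {"S": ["A", "B", "C"], "A": ["D"], "B": ["G"], "C": ["G"],
         "D": ["G"], "G": []}
print(breadth_first_search(graph, "S", "G"))   # -> ['S', 'B', 'G']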


Properties of Breadth-First Search

Complete

Yes if b (max branching factor) is finite

Time

1 + b + b^2 + ... + b^d + b(b^d - 1) = O(b^(d+1))

exponential in d

Space

O(b^(d+1))

Keeps every node in memory

This is the big problem; an agent that generates nodes at 10 MB/sec will produce
about 860 GB in 24 hours

Optimal

Yes (if cost is 1 per step); not optimal in general

Lessons from Breadth First Search


The memory requirements are a bigger problem for breadth-first search than is execution
time

Exponential-complexity search problems cannot be solved by uninformed methods for any
but the smallest instances

Ex: Route finding problem

Given:

Task: Find the route from S to G using BFS.

Step1:

Step 2:

Step3:

Step4:

Answer : The path is found at the 2nd depth level: S-B-G (or) S-C-G.

Time complexity

1 + b + b^2 + ... + b^d + b(b^d - 1) = O(b^(d+1))

DEPTH-FIRST SEARCH OR BACK TRACKING SEARCH:

Definition:
Expand one node to the depth of the tree. If dead end occurs, backtracking is done to the next
immediate previous node for the nodes to be expanded

Expand the deepest unexpanded node

Unexplored successors are placed on a stack until fully explored

Nodes are enqueued in LIFO (last-in, first-out) order; that is, a stack data
structure is used to order the nodes.

It has modest memory requirement.

It needs to store only a single path from the root to a leaf node, along with remaining
unexpanded sibling nodes for each node on a path

Back track uses less memory.

Implementation:
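The implementation figure is omitted here; the following is a minimal Python sketch (ours) of
depth-first search with an explicit LIFO stack, on the same assumed example graph as in the BFS
sketch above.

def depth_first_search(graph, start, goal):
    """Return a path from start to goal using a LIFO stack, or None.
    graph is a dict mapping each node to a list of its successors."""
    frontier = [[start]]               # stack of paths (LIFO)
    explored = set()
    while frontier:
        path = frontier.pop()          # expand the deepest node
        node = path[-1]
        if node == goal:
            return path
        if node in explored:
            continue
        explored.add(node)
        for successor in graph[node]:
            if successor not in explored:
                frontier.append(path + [successor])
    return None

# Same example graph as above (assumed for illustration).
graph = {"S": ["A", "B", "C"], "A": ["D"], "B": ["G"], "C": ["G"],
         "D": ["G"], "G": []}
print(depth_first_search(graph, "S", "G"))   # this sketch returns ['S', 'C', 'G']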

Properties of Depth-First Search
Complete

No: fails in infinite-depth spaces, spaces with loops

Modify to avoid repeated spaces along path

Yes: in finite spaces

Time

O(b^m)

Not great if m is much larger than d

But if the solutions are dense, this may be faster than breadth-first search

Space

O(bm) -- linear space

Optimal

No

When search hits a dead-end, can only back up one level at a time even if the problem
occurs because of a bad operator choice near the top of the tree. Hence, only does
chronological backtracking

Advantage:
If more than one solution exists, or the number of levels is high, then DFS is best because
exploration is done on only a small portion of the whole search space.

Disadvantage:
Not guaranteed to find a solution.

Example: Route finding problem


Given problem:

Task: Find a route from A to B

Step 1:

Step 2:

Step 3:

A B C

Step 4:

A B C

Answer: The path found at the 3rd depth level is S-A-D-G

DEPTH-LIMITED SEARCH

Definition:
A cut off (Maximum level of the depth) is introduced in this search technique to overcome the
disadvantage of Depth First Search. The cut off value depends on the number of states.

DLS can be implemented as a simple modification to the general tree search algorithm or the
recursive DFS algorithm.

DLS imposes a fixed depth limit on a dfs.

A variation of depth-first search that uses a depth limit

Alleviates the problem of unbounded trees

Search to a predetermined depth l (ell)

Nodes at depth l have no successors

Same as depth-first search if l = infinity

Can terminate for failure and cutoff

Two kinds of failure

Standard failure: indicates no solution

Cut off: indicates no solution within the depth limit
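A minimal recursive Python sketch (ours) of depth-limited search, distinguishing standard failure
from cutoff. The example graph is only an assumed reading of the five-state map used in the
example further below.

def depth_limited_search(graph, node, goal, limit):
    """Return a path to goal, 'cutoff' if the depth limit was hit, or None (failure)."""
    if node == goal:
        return [node]
    if limit == 0:
        return "cutoff"                    # no solution within the depth limit
    cutoff_occurred = False
    for successor in graph[node]:
        result = depth_limited_search(graph, successor, goal, limit - 1)
        if result == "cutoff":
            cutoff_occurred = True
        elif result is not None:
            return [node] + result
    return "cutoff" if cutoff_occurred else None

# Assumed five-state map for illustration.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["E"], "E": []}
print(depth_limited_search(graph, "A", "E", 4))   # -> ['A', 'B', 'D', 'E']
print(depth_limited_search(graph, "A", "E", 1))   # -> 'cutoff'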

Properties of Depth-Limited Search


Complete

Yes, if l >= d

Time

1 + b + b^2 + ... + b^l

O(b^l)

Space

O(bl)

Optimal

No if l > d

Advantage:
Cut off level is introduced in DFS Technique.

Disadvantage:
No guarantee to find the optimal solution.

E.g.: Route finding problem


Given (map with five states):

      A
    B   C
    D   E

The number of states in the given map is five, so it is possible to reach the goal state at a maximum
depth of four. Therefore the cut-off value is four.

Task: find a path from A to E.

Steps 1-4: the tree rooted at A is expanded one level deeper at each step, up to the depth limit
(step-by-step figures omitted).

Answer: Path = A-B-D-E, Depth = 3

ITERATIVE DEEPENING SEARCH (OR) DEPTH-FIRST ITERATIVE DEEPENING (DFID):

Definition:
Iterative deepening depth-first search is a strategy that sidesteps the issue of choosing the
best depth limit by trying all possible depth limits.

Uses depth-first search

Finds the best depth limit

Gradually increases the depth limit; 0, 1, 2, until a goal is found

Iterative Lengthening Search:


The idea is to use increasing path-cost limit instead of increasing depth limits. The resulting
algorithm called iterative lengthening search.

Implementation:
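The implementation figure is not reproduced here; a minimal Python sketch (ours) of iterative
deepening simply repeats a depth-limited search with limits 0, 1, 2, ... The example graph is our
reading of the route-finding example further below and is an assumption.

def iterative_deepening_search(graph, start, goal, max_depth=50):
    """Try depth-limited searches with limits 0, 1, 2, ... until a goal is found."""
    def dls(node, limit):
        # Recursive depth-limited search; returns a path, "cutoff", or None.
        if node == goal:
            return [node]
        if limit == 0:
            return "cutoff"
        cutoff = False
        for succ in graph[node]:
            result = dls(succ, limit - 1)
            if result == "cutoff":
                cutoff = True
            elif result is not None:
                return [node] + result
        return "cutoff" if cutoff else None

    for limit in range(max_depth + 1):
        result = dls(start, limit)
        if result != "cutoff":
            return result              # either a path, or None (standard failure)

graph = {"A": ["B", "C", "F"], "B": ["D"], "C": ["E"], "D": ["E"],
         "E": ["G"], "F": ["G"], "G": []}
print(iterative_deepening_search(graph, "A", "G"))   # -> ['A', 'F', 'G']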

Properties of Iterative Deepening Search:
Complete

Yes

Time : N(IDS) = (d)b + (d-1)b^2 + ... + (1)b^d

O(b^d)

Space

O(bd)

Optimal

Yes if step cost = 1

Can be modified to explore uniform cost tree

Advantages:
This method is preferred for large state space and when the depth of the search is not
known.

Memory requirements are modest.

Like BFS it is complete

Disadvantages:
Many states are expanded multiple times.

Lessons from Iterative Deepening Search

If branching factor is b and solution is at depth d, then nodes at depth d are generated once,
nodes at depth d-1 are generated twice, etc.

Hence b^d + 2b^(d-1) + ... + db <= b^d / (1 - 1/b)^2 = O(b^d).

If b=4, then the worst case is 1.78 * 4^d, i.e., 78% more nodes searched than exist at
depth d (in the worst case).

Faster than BFS even though IDS generates repeated states

BFS generates nodes up to level d+1

IDS only generates nodes up to level d

In general, iterative deepening search is the preferred uninformed search method when
there is a large search space and the depth of the solution is not known

Example: Route finding problem


Given: a map with states A, B, C, D, E, F and G (figure omitted), in which A connects to B, C and F.

Task: Find a path from A to G.

Limit=0: only A is examined.

Limit=1: A is expanded to B, C and F.

Limit=2: the path A-F-G is found.

The candidate paths to G are A-B-D-E-G (found at limit 4), A-C-E-G (found at limit 3) and
A-F-G (found at limit 2).

Answer: Since this is an IDS tree, the path found at the lowest depth limit, A-F-G, is selected as
the solution path.

BI-DIRECTIONAL SEARCH

Definition:
It is a strategy that simultaneously searches both the directions (i.e) forward from the initial state
and backward from the goal state and stops when the two searches meet in the middle.

Alternate searching from the start state toward the goal and from the goal state toward the
start.

Stop when the frontiers intersect.

Works well only when there are unique start and goal states.

Requires the ability to generate predecessor states.

Can (sometimes) lead to finding a solution more quickly.

Properties of Bidirectional Search:


1. Time Complexity: O(b^(d/2))

2. Space Complexity: O(b^(d/2))

3. Complete: Yes

4. Optimal: Yes

Advantages:
Reduce time complexity and space complexity

Disadvantages:
The space requirement is the most significant weakness of bi-directional search.

If the two searches do not meet at all, complexity arises in the search technique. In the backward
search, calculating predecessors is a difficult task. If more than one goal state exists, then
explicit multiple-state searches are required.

Ex: Route Finding Problem


Given: a map with states A, B, C, D and E (figure omitted).

Task: Find a path from A to E

Search forward from A: expands B and C.

Search backward from E: expands D and C.

The two frontiers meet at C.

Answer: Solution path is A-C-E.

COMPARING UNINFORMED SEARCH STRATEGIES

Completeness

Will a solution always be found if one exists?

Time

How long does it take to find the solution?

Often represented as the number of nodes searched

Space

How much memory is needed to perform the search?

Often represented as the maximum number of nodes stored at once

Optimal

Will the optimal (least cost) solution be found?

Time and space complexity are measured in

b maximum branching factor of the search tree

m maximum depth of the state space

d depth of the least cost solution

UNIT-2
PROBLEM SOLVING-II
INFORMED SEARCH STRATEGIES:

Heuristic / Informed

It uses additional information about nodes (heuristics) that have not yet been explored to decide
which nodes to examine next

Use problem specific knowledge

Can find solutions more efficiently than search strategies that do not use domain specific
knowledge.

find solutions even when there is limited time available

General approach of informed search:

Best-first search: node is selected for expansion based on an evaluation function f(n)

Idea: evaluation function measures distance to the goal.

Choose node which appears best

Best First Search algorithms differs in the evaluation function

Evaluation function incorporate the problem specific knowledge in the form of h(n)

h(n) heuristic function , a component of f(n)

Estimated cost of cheapest path to the goal node

h(n) = 0, if n is the goal node

Implementation:

The fringe is a queue sorted in decreasing order of desirability.

Special cases: greedy search, A* search

GREEDY BEST-FIRST SEARCH

Expands the node that is closest to the goal

Consider route finding problem in Romania

Use of hSLD, Straight Line Distance Heuristic

Evaluation function f(n) = h(n) (heuristic), estimate of cost from n to goal

Definition:

A best-first search that uses h(n) to select the next node to expand is called greedy search.
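No code is given in the original notes, so the following is a minimal Python sketch (ours) of greedy
best-first search, expanding the node with the smallest heuristic value h(n). The graph and
straight-line distances follow the abbreviated Romania example used below.

import heapq

def greedy_best_first_search(graph, h, start, goal):
    """Expand the node whose heuristic value h(n) is smallest."""
    frontier = [(h[start], [start])]        # priority queue ordered by h(n)
    visited = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for successor, _step_cost in graph[node]:
            heapq.heappush(frontier, (h[successor], path + [successor]))
    return None

# Abbreviated Romania map: edges with step costs, and straight-line distances to B.
graph = {"A": [("Z", 75), ("T", 118), ("S", 140)],
         "S": [("A", 140), ("F", 99), ("R", 80)],
         "F": [("S", 99), ("B", 211)],
         "R": [("S", 80), ("P", 97)],
         "P": [("R", 97), ("B", 101)],
         "T": [("A", 118)], "Z": [("A", 75)], "B": []}
h = {"A": 366, "B": 0, "F": 178, "P": 98, "R": 193, "S": 253, "T": 329, "Z": 374}
print(greedy_best_first_search(graph, h, "A", "B"))   # -> ['A', 'S', 'F', 'B'] (cost 450)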

Ex:

Given the following abbreviated map (figure omitted), with step costs:

A-Z = 75, A-T = 118, A-S = 140, S-F = 99, S-R = 80, F-B = 211, R-P = 97, P-B = 101

f(n) = h(n)

Straight-line distance to B:

A = 366   R = 193
B = 0     S = 253
F = 178   T = 329
P = 98    Z = 374

Solution:

From the given graph and estimated costs, the goal state is B, starting from A.

Apply the evaluation function h(n) to find a path from A to B.

1. Start at A.

2. Expanding A gives Z (h=374), S (h=253) and T (h=329).

3. S is selected for the next level of expansion since h(S) is minimum when comparing S, T and Z.
Expanding S gives F (h=178), A (h=366) and R (h=193).

4. F is selected for the next level of expansion since h(F) is minimum. Expanding F gives S (h=253)
and B (h=0).

From F the goal state B is reached. Therefore the path from A to B using greedy search is A-S-F-B = 450
(i.e.) (140+99+211).

For the problem of finding route from Arad to Burcharest

Greedy search example:

Assume that we want to use greedy search to solve the problem of travelling from Arad to
Bucharest.

The initial state=Arad

Arad

Sibiu(253)

Timisoara(329) Zerind(374)

The first expansion step produces:

Sibiu, Timisoara and Zerind

Greedy best-first will select Sibiu.

If Sibiu is expanded we get:

Arad, Fagaras, Oradea and Rimnicu Vilcea

Greedy best-first search will select: Fagaras

If Fagaras is expanded we get:

Sibiu and Bucharest

Goal reached !!

Yet not optimal (see Arad, Sibiu, Rimnicu Vilcea, Pitesti)

GREEDY SEARCH, EVALUATION:

Completeness: NO (cfr. DF-search)

Check on repeated states

Minimizing h(n) can result in false starts, e.g. Iasi to Fagaras.

Properties of greedy best-first search:

Complete? No - can get stuck in loops, e.g., Iasi -> Neamt -> Iasi -> Neamt

Time? O(b^m), but a good heuristic can give dramatic improvement

Space? O(b^m) -- keeps all nodes in memory

Optimal? No

A* SEARCH

A better form of best-first search

f(n)=g(n) + h(n)

g(n) the cost to reach the node

h(n) the estimated cost of the cheapest path from n to the goal

f(n) the estimated cost of the cheapest solution through the node n

h(n) Admissible Heuristic / Consistent (in case of graph search)

Admissible heuristics

An admissible heuristic never overestimates the cost to reach the goal from a given state,
i.e., it is optimistic

Example: hSLD(n) (never overestimates the actual road distance)

Theorem: If h(n) is admissible, A* using TREE-SEARCH is optimal

A* Search

Definition:

A best first search using f as an evaluation function and an admissible h function is known as A*
search.

f(n)=g(n)+h(n)
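A minimal Python sketch (ours, not the textbook's implementation) of A* search with
f(n) = g(n) + h(n), on the same abbreviated map and straight-line-distance heuristic as in the
greedy sketch above:

import heapq

def a_star_search(graph, h, start, goal):
    """Expand the node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, [start])]        # entries are (f, g, path)
    best_g = {start: 0}
    while frontier:
        f, g, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path, g
        for successor, cost in graph[node]:
            new_g = g + cost
            if new_g < best_g.get(successor, float("inf")):
                best_g[successor] = new_g
                heapq.heappush(frontier,
                               (new_g + h[successor], new_g, path + [successor]))
    return None, float("inf")

graph = {"A": [("Z", 75), ("T", 118), ("S", 140)],
         "S": [("A", 140), ("F", 99), ("R", 80)],
         "F": [("S", 99), ("B", 211)],
         "R": [("S", 80), ("P", 97)],
         "P": [("R", 97), ("B", 101)],
         "T": [("A", 118)], "Z": [("A", 75)], "B": []}
h = {"A": 366, "B": 0, "F": 178, "P": 98, "R": 193, "S": 253, "T": 329, "Z": 374}
print(a_star_search(graph, h, "A", "B"))   # -> (['A', 'S', 'R', 'P', 'B'], 418)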

Ex: Given,

Step costs (figure omitted):

A-Z = 75, A-T = 118, A-S = 140, S-F = 99, S-R = 80, F-B = 211, R-P = 97, P-B = 101

Straight-line distance to B:

A = 366   R = 193
B = 0     S = 253
F = 178   T = 329
P = 98    Z = 374

Solution:

From the given graph and estimated cost the goal state is estimated as B from A.

Apply the evaluation function f(n)=g(n)+h(n) to find a path from A to B.

1. Start at A: f(A) = 0 + 366 = 366.

2. Expanding A gives Z (f = 75 + 374 = 449), S (f = 140 + 253 = 393) and T (f = 118 + 329 = 447).

3. S is selected for the next level of expansion since f(S) is minimum when comparing S, T and Z.
Expanding S gives F (f = 239 + 178 = 417), A (f = 280 + 366 = 646) and R (f = 220 + 193 = 413).

How to calculate g(n):

Ex:

A-S-A: 140 + 140 = 280

A-S-F: 140 + 99 = 239

4. R is selected for the next level of expansion since f(R) is minimum when comparing to A and F.
Expanding R gives S (f = 553) and P (f = 317 + 98 = 415).

5. P is selected for the next level of expansion since f(P) is minimum. Expanding P gives
R (f = 414 + 193 = 607) and B (f = 418 + 0 = 418).

From P, goal state B is reached. Therefore the path from A to B using A* search is A-S-R-P-B = 418
(i.e.) (140+80+97+101).

A* search example:

Find Bucharest starting at Arad

f(Arad) = c(??,Arad)+h(Arad)=0+366=366

Expand Arad and determine f(n) for each node

f(Sibiu)=c(Arad,Sibiu)+h(Sibiu)=140+253=393

f(Timisoara)=c(Arad,Timisoara)+h(Timisoara)=118+329=447

f(Zerind)=c(Arad,Zerind)+h(Zerind)=75+374=449

Best choice is Sibiu

Expand Sibiu and determine f(n) for each node

f(Arad)=c(Sibiu,Arad)+h(Arad)=280+366=646

f(Fagaras)=c(Sibiu,Fagaras)+h(Fagaras)=239+176=415

f(Oradea)=c(Sibiu,Oradea)+h(Oradea)=291+380=671

f(Rimnicu Vilcea)=c(Sibiu,Rimnicu Vilcea)+h(Rimnicu Vilcea)=220+193=413

Best choice is Rimnicu Vilcea

Expand Rimnicu Vilcea and determine f(n) for each node

f(Craiova)=c(Rimnicu Vilcea, Craiova)+h(Craiova)=366+160=526

f(Pitesti)=c(Rimnicu Vilcea, Pitesti)+h(Pitesti)=317+100=417

f(Sibiu)=c(Rimnicu Vilcea,Sibiu)+h(Sibiu)=300+253=553

Best choice is Fagaras

Expand Fagaras and determine f(n) for each node

f(Sibiu)=c(Fagaras, Sibiu)+h(Sibiu)=338+253=591

f(Bucharest)=c(Fagaras,Bucharest)+h(Bucharest)=450+0=450

Best choice is Pitesti !!!

Expand Pitesti and determine f(n) for each node

f(Bucharest)=c(Pitesti,Bucharest)+h(Bucharest)=418+0=418

Best choice is Bucharest !!!

Optimal solution (only if h(n) is admissible)

Note values along optimal path !!

Optimality of A* (proof)

Suppose some suboptimal goal G2 has been generated and is in the fringe. Let n be an
unexpanded node in the fringe such that n is on a shortest path to an optimal goal G.

Suppose suboptimal goal G2 in the queue.

Let n be an unexpanded node on a shortest path to an optimal goal G.

f(G2 ) = g(G2 ) since h(G2 )=0

> g(G) since G2 is suboptimal

>= f(n) since h is admissible

Since f(G2) > f(n), A* will never select G2 for expansion

If GRAPH-SEARCH discards new paths to a repeated state, the previous proof breaks down.

Solution:

Add extra bookkeeping i.e. remove more expensive of two paths.

Ensure that optimal path to any repeated state is always first followed.

Extra requirement on h(n): consistency (monotonicity)

Consistency:

A heuristic is consistent if

h(n) <= c(n, a, n') + h(n')

If h is consistent, we have

f(n') = g(n') + h(n')
      = g(n) + c(n, a, n') + h(n')
      >= g(n) + h(n)
      = f(n)

i.e. f(n) is nondecreasing along any path.

Theorem: If h(n) is consistent, A* using GRAPH-SEARCH is optimal

Optimality of A* (more useful)

A* expands nodes in order of increasing f value

Contours can be drawn in state space

Uniform-cost search adds circles.

f-contours are gradually added:

1) nodes with f(n) < C*

2) some nodes on the goal contour (f(n) = C*)

Contour i has all nodes with f = f_i, where f_i < f_{i+1}.

Contours of equal f-cost

A* Characteristics:

A* expands no nodes with f(n)>C*

These nodes are said to be pruned

This pruning still guarantees optimality

Expands nodes in the increasing order of costs

A* is optimally efficient

For a given heuristic, A* finds optimal solution with the fewest number of nodes
expansion

Any algorithm that doesn't expand nodes with f(n) < C* has the risk of missing an
optimal solution

For most problems the number of nodes with costs lesser than C* is exponential

Properties of A* Search:

Completeness: YES

Time complexity: (exponential with path length)

Space complexity:(all nodes are stored)

Optimality: YES

Cannot expand fi+1 until fi is finished.

A* expands all nodes with f(n)< C*

A* expands some nodes with f(n)=C*

A* expands no nodes with f(n)>C*

Memory Bounded Heuristic Search:

2 types of Memory Bounded algorithm

IDA*

MA*

Memory bounded heuristic search:

The memory requirement of A* is reduced by combining the heuristic function with iterative
deepening, resulting in the IDA* algorithm.

IDA* search:

Space complexity: O(bd)

Time complexity: difficult to characterize; depends on the number of different values that the
heuristic function can take on

Optimality:yes

Completeness:Yes

Disadvantages: It will require more storage space in complex domains, i.e., each contour may include
only one new state beyond the previous contour. To avoid this, we increase the f-cost limit by a
fixed amount E on each iteration, so that the total number of iterations is proportional to 1/E.
Such an algorithm is called E-admissible.

function IDA*(problem) returns a solution sequence

inputs: problem, a problem

local variables: f-limit, the current f-cost limit; root, a node

root <- MAKE-NODE(INITIAL-STATE[problem])

f-limit <- f-COST(root)

loop do

    solution, f-limit <- DFS-CONTOUR(root, f-limit)

    if solution is non-null then return solution

    if f-limit = infinity then return failure

end


Iterative-deepening A* (IDA*)

uses f-value (g + h) as the cutoff

Keep increase the cutoff for each iteration by the smallest amount, in excess of the
current cutoff

Recursive best-first search

RECURSIVE BEST FIRST SEARCH ALGORITHM:

1. After expanding A, S and R, the current best leaf (P) has a value that is worse than the best
alternative path (through F).

(Search-tree figure omitted: f-limit = infinity at A; Z = 449, S = 393, T = 447; under S: A = 646,
F = 415, R = 413; under R: P = 417, S = 553.)

2. The f-limit value of each recursive call is shown on top of each current node.

3. After expanding R, the condition f[best] > f-limit (417 > 415) is true, so it returns f[best] to
that node.

4. After unwinding back to S and expanding F:

(Search-tree figure omitted: under S, F is expanded to S = 591 and B = 450; R now carries its
backed-up value 417.)

Hints:

f[best] > f-limit is true.

Here f[best] is 450, which is greater than the f-limit of 417. Therefore it returns and unwinds
with the f[best] value to that node.

After switching back to R and expanding P:

(Search-tree figure omitted: under R, P is expanded to B = 418 and R = 607; F now carries its
backed-up value 450.)

The best alternative path through T costs at least 447; therefore the path through R and P is
considered the best one.

Keeps track of the f-value of the best-alternative path available.

If the current f-value exceeds this alternative f-value, then backtrack to the alternative path.

Upon backtracking change f-value to best f-value of its children.

Re-expansion of this result is thus still possible.

Example:

RBFS evaluation

RBFS is a bit more efficient than IDA*

Still excessive node generation (mind changes)

Like A*, optimal if h(n) is admissible

Space complexity is O(bd).

IDA* retains only one single number (the current f-cost limit)

Time complexity difficult to characterize

Depends on accuracy if h(n) and how often best path changes.

IDA* and RBFS suffer from using too little memory.

Search techniques (algorithms) which use all available memory are:

1.MA*(memory bounded A*)

2.SMA*(simplified MA*)

(simplified) memory-bounded A*:

Use all available memory.

I.e. expand best leafs until available memory is full

When full, SMA* drops worst leaf node (highest f-value)

Like RBFS, back up the forgotten node's value to its parent

What if all leafs have the same f-value?

Same node could be selected for expansion and deletion.

SMA* solves this by expanding newest best leaf and deleting oldest worst leaf.

SMA* is complete if solution is reachable, optimal if optimal solution is reachable.

Properties of SMA* search:

1.Complete:Yes

If the available memory is sufficient to store the deepest solution path

2.Time&space complexity:

Depends on the number of nodes available in memory

3.Optimal:If enough memory is available to store the deepest solution path, otherwise it returns the
best solution that can be reached with available memory

Advantage : SMA* uses only the available memory

Disadvantage: if enough memory is not available, it leads to suboptimal solutions.

(SMA* example search tree omitted: root A with f = 0+12 = 12 and successors including B and
G with f = 8+5 = 13.)

RBFS (mimics depth-first search, but backtracks if the current path is not promising and a
better path exists.

advantage: limited size of open list

disadvantage: excessive node regeneration)

Learning to Search:

The idea is to search at the meta-level state space. Each state here is a search tree. The goal is to
learn from different search strategies to avoid exploring useless parts of the tree.

Several fixed strategies BFS, Greedy Best First Search

Could an agent learn how to search better?

Answer: Yes.

Reason: Metalevel state space

Each state in metalevel state space capture the internal state of a program that is searching
in an object level state space

HEURISTIC FUNCTIONS:

Heuristic function

A key component of the Best First Search algorithm is the heuristic function, denoted h(n)

h(n)= Estimated cost of the cheapest path from node n to a goal node

Heuristic functions are the most common form in which additional knowledge of the problem
imparted to the search algorithm.

A good heuristic function can reduce the search process.

h(n) = estimated cost of the cheapest path from node n to goal node.

If n is goal then h(n)=0

Effective branching factor:

Ex: Depth=5

N=52

Effective branching factor=1.91

Assume the branching factor=2 depth=5

N= 1+2+(2)^2+(2)^3+(2)^4+(2)^5

= 1+2+4+8+16+32= 63

From the above example, the number of nodes generated with the effective branching factor (52) is
smaller than the number generated with the actual branching factor (63).

That is,

Difference =63-52 = 11

E.g for the 8-puzzle:

Avg. solution cost is about 22 steps (branching factor +/- 3)

Exhaustive search to depth 22: 3.1 x 10^10 states.

Common candidates:

h1 = the number of misplaced tiles

h1(s)=8

h2 = the sum of the distances of the tiles from their goal positions (manhattan distance). (or) city
block distance

h2(s)=3+1+2+2+2+3+3+2=18

The effect of heuristic accuracy on performance:

Effective branching factor b*

Is the branching factor that a uniform tree of depth d would have in order to contain
N+1 nodes.

N + 1 = 1 + b* + (b*)^2 + ... + (b*)^d

Measure is fairly constant for sufficiently hard problems.

Can thus provide a good guide to the heuristics overall usefulness.

A good value of b* is 1.

Heuristic quality and dominance

Q: Is h2 always better than h1?

A: Yes.

Reason: h2(n) >= h1(n) for all n

1200 random problems with solution lengths from 2 to 24.

If h2(n) >= h1(n) for all n (both admissible) then h2 dominates h1 and is better for search

Inventing admissible heuristics

Admissible heuristics can be derived from the exact solution cost of a relaxed version of the
problem:

Relaxed problem:

Relaxed problem: A problem with fewer restrictions on the action is called Relaxed Problem.

Ex: 8-puzzle problem

Relaxed 8-puzzle for h1 : a tile can move anywhere

As a result, h1(n) gives the shortest solution

Relaxed 8-puzzle for h2 : a tile can move to any adjacent square.

As a result, h2(n) gives the shortest solution.

The optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the
real problem.

Admissible heuristics can also be derived from the solution cost of a subproblem of a given
problem.

This cost is a lower bound on the cost of the real problem.

Pattern databases store the exact solution cost for every possible subproblem instance.

Disjoint pattern databases: The problem can be divided up in such a way that each move affects
only one subproblem.

The complete heuristic is constructed using the patterns in the DB

Another way to find an admissible heuristic is through learning from experience:

Experience = solving lots of 8-puzzles

An inductive learning algorithm can be used to predict costs for other states that
arise during search.

LOCAL SEARCH ALGORITHM AND OPTIMIZATION PROBLEMS:

Local search algorithm and optimization problem:

Definition: Local search algorithm operates using a single current state and generally moves only to
neighbours of that state.

A landscape has

Location defined by the state


Elevation defined by the value of the heuristic cost function (or) objective function.
Global minimum - finds the lowest valley.
Global maximum finds the highest peak.

A complete local search always finds a goal if one exists.


An optimal algorithm always finds a global minimum/maximum

Applications:

1. Integrated circuit design


2. Factory floor layout
3. Job shop scheduling
4. Automatic programming
5. Vehicle routing
6. Telecommunication network optimization

Local Search Algorithm Types:

Hill climbing search


Simulated annealing
Local beam search
Genetic algorithm

They are useful for solving optimization problems

Aim is to find a best state according to an objective function

e.g. survival of the fittest as a metaphor for optimization

Many optimization problems do not fit the standard search model outlined in
chapter 3

E.g. There is no goal test or path cost in Darwinian evolution

State space landscape

Local search= use single current state and move to neighboring states.

Advantages:

Use very little memory

Find often reasonable solutions in large or infinite state spaces.

Hill-climbing search

is a loop that continuously moves in the direction of increasing value

It terminates when a peak is reached.

Hill climbing does not look ahead of the immediate neighbors of the current state.

Hill-climbing chooses randomly among the set of best successors, if there is more than one.

Hill-climbing a.k.a. greedy local search

Algorithm:

function HILL-CLIMBING(problem) returns a state that is a local maximum

input: problem, a problem

local variables: current, a node

                 neighbor, a node

current <- MAKE-NODE(INITIAL-STATE[problem])

loop do

    neighbor <- a highest-valued successor of current

    if VALUE[neighbor] <= VALUE[current] then return STATE[current]

    current <- neighbor
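A minimal Python sketch (ours) of this loop, maximizing a simple one-dimensional objective over
integer states; the objective function and the one-step neighbourhood are illustrative assumptions.

def hill_climbing(initial, value, neighbors):
    """Keep moving to the highest-valued neighbor until no neighbor is better."""
    current = initial
    while True:
        best = max(neighbors(current), key=value)
        if value(best) <= value(current):
            return current            # local maximum (or plateau) reached
        current = best

# Illustrative objective with a local maximum at x=2 and a global maximum at x=8.
def value(x):
    return -(x - 2) ** 2 + 5 if x < 5 else -(x - 8) ** 2 + 20

def neighbors(x):
    return [x - 1, x + 1]             # move one step left or right

print(hill_climbing(0, value, neighbors))   # -> 2 (stuck on the local maximum)
print(hill_climbing(6, value, neighbors))   # -> 8 (reaches the global maximum)

Starting from 0 the sketch gets stuck on the local maximum at x = 2, which is exactly the drawback
(local maxima, ridges, plateaux) discussed below.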

Example:

8-queens problem (complete-state formulation).

Successor function: move a single queen to another square in the same column.

Heuristic function h(n): the number of pairs of queens that are attacking each other (directly
or indirectly).


a) shows a state of h=17 and the h-value for each possible successor.

b) A local minimum in the 8-queens state space (h=1).

Drawbacks

Ridge = a sequence of local maxima that is difficult for greedy algorithms to navigate

Plateaux = an area of the state space where the evaluation function is flat.

Gets stuck 86% of the time.

Hill-climbing variations:

Stochastic hill-climbing

Random selection among the uphill moves.

The selection probability can vary with the steepness of the uphill move.

First-choice hill-climbing

cfr. stochastic hill climbing by generating successors randomly until a better one is
found.

Random-restart hill-climbing - it conducts a series of hill-climbing searches from randomly
generated initial states, stopping when a goal is found.

Tries to avoid getting stuck in local maxima.

Simulated annealing Search:

An algorithm which combines hill climbing with a random walk, to yield both efficiency and completeness.

Escape local maxima by allowing bad moves.

Idea: allow some bad moves, but gradually decrease their size and frequency.

Origin; metallurgical annealing-the process of gradually cooling a liquid until it freezes.

Ex:

Bouncing ball analogy:

Shaking hard (= high temperature).

Shaking less (= lower the temperature).

Properties of simulated annealing search

If T decreases slowly enough, best state is reached.

Applied for VLSI layout, airline scheduling, etc.(applications)

Algorithm:

function SIMULATED-ANNEALING(problem, schedule) returns a solution state

input: problem, a problem

       schedule, a mapping from time to temperature

local variables: current, a node

                 next, a node

                 T, a temperature controlling the probability of downward steps

current <- MAKE-NODE(INITIAL-STATE[problem])

for t <- 1 to infinity do

    T <- schedule[t]

    if T = 0 then return current

    next <- a randomly selected successor of current

    dE <- VALUE[next] - VALUE[current]

    if dE > 0 then current <- next

    else current <- next only with probability e^(dE/T)

Algorithm Explanation:

dE - this variable is introduced to calculate the probability of accepting a worse move.

T - it is introduced to determine the probability; it measures the temperature.

VALUE - it corresponds to the total energy of the atoms in the material.

schedule - it determines the rate at which the temperature is lowered.
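A minimal Python sketch (ours) of the loop above, with an assumed exponential cooling schedule and
the same illustrative objective as in the hill-climbing sketch:

import math
import random

def simulated_annealing(initial, value, neighbors, schedule, steps=10000):
    """Accept better moves always; accept worse moves with probability e^(dE/T)."""
    current = initial
    for t in range(1, steps + 1):
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random.choice(neighbors(current))
        dE = value(nxt) - value(current)
        if dE > 0 or random.random() < math.exp(dE / T):
            current = nxt
    return current

# Illustrative one-dimensional objective with its global maximum at x = 8.
def value(x):
    return -(x - 2) ** 2 + 5 if x < 5 else -(x - 8) ** 2 + 20

def neighbors(x):
    return [x - 1, x + 1]

schedule = lambda t: 10 * (0.99 ** t)       # assumed exponential cooling
random.seed(0)
print(simulated_annealing(0, value, neighbors, schedule))   # typically ends near x = 8

Because bad moves are occasionally accepted while the temperature is high, the sketch usually
escapes the local maximum at x = 2 that trapped plain hill climbing.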

The K-Means Algorithm

1. Choose a value for K, the total number of clusters.

2. Randomly choose K points as cluster centers.

3. Assign the remaining instances to their closest cluster center.

4. Calculate a new cluster center for each cluster.

5. Repeat steps 3 and 4 until the cluster centers do not change.
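As an illustration only, the steps above can be written as the following minimal Python sketch
(ours); it is run on the six (X, Y) instances of Table 3.6 below with K = 2.

import math
import random

def k_means(points, k, seed=1):
    random.seed(seed)
    centers = random.sample(points, k)              # step 2: random initial centers
    while True:
        # Step 3: assign each instance to its closest cluster center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Step 4: recompute each cluster center as the mean of its points.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:                  # step 5: stop when centers stop changing
            return centers, clusters
        centers = new_centers

# Instances from Table 3.6.
points = [(1.0, 1.5), (1.0, 4.5), (2.0, 1.5), (2.0, 3.5), (3.0, 2.5), (5.0, 6.0)]
centers, clusters = k_means(points, k=2)
print(centers)
print(clusters)

Different random initial centers can give different final clusterings (with different squared
errors), which is what Table 3.7 below illustrates.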

Table 3.6 K-Means Input Values

Instance X Y

1 1.0 1.5
2 1.0 4.5
3 2.0 1.5
4 2.0 3.5
5 3.0 2.5
6 5.0 6.0

Table 3.7 Several Applications of the K-Means Algorithm (K = 2)

Outcome 1: centers (2.67, 4.67) with cluster points 2, 4, 6 and (2.00, 1.83) with cluster points
1, 3, 5; squared error 14.50

Outcome 2: centers (1.5, 1.5) with cluster points 1, 3 and (2.75, 4.125) with cluster points
2, 4, 5, 6; squared error 15.94

Outcome 3: center (1.8, 2.7) with cluster points 1, 2, 3, 4, 5

(Scatter plot of the K-Means input instances omitted.)

General Considerations

Requires real-valued data.

We must select the number of clusters present in the data.

Works best when the clusters in the data are of approximately equal size.

Attribute significance cannot be determined.

Lacks explanation capabilities.

Local beam search

Keep track of k states instead of one

Initially: k random states

Next: determine all successors of k states

If any of successors is goal finished

Else select k best from successors and repeat.

Major difference with random-restart search

Information is shared among k search threads.

Can suffer from lack of diversity.

Stochastic variant: choose k successors at random, with probability proportional to the successor's value.

Genetic algorithms

A successor state is generated by combining two parent states

Operations

Crossover (2 parents -> 2 children)

Mutation (one bit)

Basic structure

Create population

Perform crossover & mutation (on fittest)

Keep only fittest children
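As an illustration only (the bit-string representation, the count-of-ones fitness function and all
parameter values are our own assumptions), the basic structure above can be sketched in Python as
follows.

import random

def genetic_algorithm(fitness, length=12, pop_size=20, generations=100, p_mut=0.05):
    random.seed(2)
    # Create the initial population of random bit strings.
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Select parents with probability proportional to fitness (roulette selection).
        weights = [fitness(ind) + 1e-9 for ind in population]
        children = []
        for _ in range(pop_size):
            mom, dad = random.choices(population, weights=weights, k=2)
            cut = random.randint(1, length - 1)        # crossover point
            child = mom[:cut] + dad[cut:]
            for i in range(length):                    # mutation: flip a bit
                if random.random() < p_mut:
                    child[i] = 1 - child[i]
            children.append(child)
        # Keep only the fittest individuals from parents + children.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)

count_ones = lambda bits: sum(bits)                    # toy fitness function
print(genetic_algorithm(count_ones))                   # typically all (or nearly all) 1s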

Children carry parts of their parents data

Only good parents can reproduce

Children are at least as good as parents?

No, but worse children don't last long

Large population allows many current points in search

Can consider several regions (watersheds) at once

Fitness Function:

Value of expression assignment

Hard to judge the quality

Number of clauses that pattern satisfies

Representation

Children (after crossover) should be similar to parent, not random

Binary representation of numbers isn't good - what happens when you crossover in
the middle of a number?

Need reasonable breakpoints for crossover (e.g. between R, xcenter and ycenter
but not within them)

Cover

Population should be large enough to cover the range of possibilities

Information shouldn't be lost too soon

Mutation helps with this issue

General Considerations

Global optimization is not a guarantee.

The fitness function determines the complexity of the algorithm.

They can explain their results, provided the fitness function is understandable.

Transforming the data to a form suitable for genetic learning can be a


challenge.

The General Form:

An Example of Crossover

The CNF-satisfaction Problem:

ONLINE SEARCH AGENT:

Offline search vs. online search

Offline search agents

Compute a solution before setting foot in the real world

Online search agents

Interleave computation and action

E.g. takes an action and then observes environments and then computes the
next action

Necessary for an exploration problem

States and actions are unknown

E.g. robot in a new building, or labyrinth

online search problems

Agents have access to:

ACTIONS(s)- returns a list of actions allowed in states

c(s,a,s') - this step-cost cannot be used until the agent knows that s' is the outcome

GOAL-TEST(s)

The agent cannot access the successors of a state except by actually trying all the actions in
that state

Assumptions

The agent can recognize a state that it has visited before

Actions are deterministic

Optionally, an admissible heuristic function

Objective: reach goal while minimizing cost

Safely Explorable space goal can be reached from any reachable state (no dead ends)

Performance of Online Search

Competitive Ratio: the total path cost actually traversed vs. the cost the agent would traverse if it
knew the state space in advance.

best achievable CR is often unbounded
(very high, possibly infinite)

adversary argument

Better to evaluate performance relative to size of state space, rather than depth of
shallowest goal

Robot Maze

A simple maze problem: the agent starts at S and must reach G, but knows nothing of the
environment.

Adversary Arguments:

Imagine an adversary that constructs state space as agent is moving through it and can
adjust unexplored space to create worst case behavior.

The space that would be created by this adversary is a space that would yield worst-case
behavior.

Competitive Ratio:

This defines the comparison between the total pathcost with shortest complete
exploration pathcost. This ratio should be as small as possible.

Adversaries

Fig(a):

Two state spaces that might lead an online search agent into a dead end. Any given agent will
fail in at least one of these spaces.

Fig(b):

A two-dimensional environment that can cause an online search agent to follow an arbitrarily inefficient route to the goal.

Fig:

An environment in which a random walk will take exponentially many steps to find the goal.

88
online search agents:

An online algorithm can expand only a node that it physically occupies

Offline algorithms can expand any node in the fringe

Same principle as DFS

Online DFS

function ONLINE-DFS-AGENT(s') return an action

input: s', a percept identifying the current state

static: result, a table of the next state, indexed by action and state, initially empty

unexplored, a stack that lists, for each visited state, the actions not yet tried

unbacktracked, a stack that lists, for each visited state, the predecessor states to which the agent has not yet backtracked

s, a, the previous state and action, initially null

if GOAL-TEST(s') then return stop

if s' is a new state then unexplored[s'] ← ACTIONS(s')

if s is not null then do

result[a, s] ← s'

add s to the front of unbacktracked[s']

if unexplored[s'] is empty then

if unbacktracked[s'] is empty then return stop

else a ← an action b such that result[b, s'] = POP(unbacktracked[s'])

else a ← POP(unexplored[s'])

s ← s'

return a

Online DFS algorithm Exploration:

o Result[a, s] records the state resulting from executing action a in state s.

o If an action from the current state has not been explored, then it is explored and that action a is returned.

o If no unexplored action exists from the current state, the agent backtracks: it chooses an action b that leads back to the most recent unbacktracked predecessor and returns it as action a.

o If the agent has run out of states to which it can backtrack, then its search is complete.
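A compact Python rendering of ONLINE-DFS-AGENT above, assuming deterministic actions and an environment that is safely explorable with reversible moves; the actions() and goal_test() callbacks are supplied by the caller, and returning None stands for "stop".

class OnlineDFSAgent:
    # s1 below plays the role of s' in the pseudocode
    def __init__(self, actions, goal_test):
        self.actions, self.goal_test = actions, goal_test
        self.result = {}          # result[(a, s)] = state observed after doing a in s
        self.unexplored = {}      # per state: actions not yet tried
        self.unbacktracked = {}   # per state: predecessors not yet backtracked to
        self.s, self.a = None, None

    def __call__(self, s1):
        if self.goal_test(s1):
            return None
        if s1 not in self.unexplored:
            self.unexplored[s1] = list(self.actions(s1))
        if self.s is not None:
            self.result[(self.a, self.s)] = s1
            self.unbacktracked.setdefault(s1, []).insert(0, self.s)
        if not self.unexplored[s1]:
            if not self.unbacktracked.get(s1):
                return None                       # nowhere left to explore or backtrack to
            target = self.unbacktracked[s1].pop(0)
            # choose an action already known to lead back to that predecessor
            self.a = next(b for b in self.actions(s1)
                          if self.result.get((b, s1)) == target)
        else:
            self.a = self.unexplored[s1].pop()
        self.s = s1
        return self.a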

Online DFS, example

Task:

From S to reach G.

90
GOAL-TEST((1,1))?

s != G thus false

(1,1) a new state?

false

s is null?

false (s=(1,2))

result[DOWN,(1,2)] ← (1,1)

UB[(1,1)] = {(1,2)}

UX[(1,1)] empty?

False

a=RIGHT, s=(1,1)

return a

91
GOAL-TEST((2,1))?

s != G thus false

(2,1) a new state?

True, UX[(2,1)]={RIGHT,UP,LEFT}

s is null?

false (s=(1,1))

result[RIGHT,(1,1)] ← (2,1)

UB[(2,1)]={(1,1)}

UX[(2,1)] empty?

False

a=LEFT, s=(2,1)

return a

92
Online local search

Hill-climbing is already online

One state is stored.

Bad performance due to local maxima

Random restarts are impossible (the agent cannot transport itself to a new start state).

Solution1: Random walk introduces exploration

Selects one of the available actions at random, giving preference to actions not yet tried

can produce exponentially many steps

An environment in which a random walk will take exponentially many steps to find the goal

Random Walks

Instead of random restart, we can consider random walks for online search

select a possible action at random, give precedence to untried actions

eventually finds the goal with probability 1 in a finite space

Solution 2: Add memory to hill climber

Store current best estimate H(s) of cost to reach goal

H(s) is initially the heuristic estimate h(s)

Afterward updated with experience (see below)

Learning real-time A* (LRTA*)

From the current position of the agent, each available action a is evaluated by its estimated cost c(s, a, s') + H(s'). Here the two choices give 1 + 9 = 10 and 1 + 2 = 3; the minimum is 3, so the agent moves right.

Learning real-time A*(LRTA*)

Build a map of the environment using the result table.

The table H is initially empty; when the process starts, H(s) is initialized to h(s) for each new state and is then updated as the agent gains experience in the state space.
Each state's estimate is updated to the minimum of c(s, a, s') + H(s') over all possible actions, and the minimizing action is returned.

function LRTA*-COST(s, a, s', H) return a cost estimate

if s' is undefined then return h(s)

else return c(s, a, s') + H[s']

function LRTA*-AGENT(s') return an action

input: s', a percept identifying the current state

static: result, a table of next state, indexed by action and state, initially empty

H, a table of cost estimates indexed by state, initially empty

s, a, the previous state and action, initially null

if GOAL-TEST(s') then return stop

if s' is a new state (not in H) then H[s'] ← h(s')

unless s is null

result[a, s] ← s'

H[s] ← min over b ∈ ACTIONS(s) of LRTA*-COST(s, b, result[b, s], H)

a ← an action b in ACTIONS(s') that minimizes LRTA*-COST(s', b, result[b, s'], H)

s ← s'

return a
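A short Python sketch of LRTA*-AGENT above, assuming unit step costs and a caller-supplied heuristic h; the class and parameter names are illustrative, and None again stands for "stop".

class LRTAStarAgent:
    def __init__(self, actions, goal_test, h, cost=lambda s, a, s1: 1):
        self.actions, self.goal_test, self.h, self.cost = actions, goal_test, h, cost
        self.result = {}      # result[(a, s)] = observed next state
        self.H = {}           # learned cost-to-goal estimates
        self.s, self.a = None, None

    def lrta_cost(self, s, a, s1):
        if s1 is None:                        # outcome unknown: optimistic estimate h(s)
            return self.h(s)
        return self.cost(s, a, s1) + self.H[s1]

    def __call__(self, s1):
        if self.goal_test(s1):
            return None                       # stop
        if s1 not in self.H:
            self.H[s1] = self.h(s1)
        if self.s is not None:
            self.result[(self.a, self.s)] = s1
            # update H[s] with the best one-step lookahead estimate
            self.H[self.s] = min(self.lrta_cost(self.s, b, self.result.get((b, self.s)))
                                 for b in self.actions(self.s))
        # move via the action that currently looks cheapest
        self.a = min(self.actions(s1),
                     key=lambda b: self.lrta_cost(s1, b, self.result.get((b, s1))))
        self.s = s1
        return self.a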

CONSTRAINT SATISFACTION PROBLEMS (CSPS)(refer pg no 79)

Standard search problem:

state is a "black box" - any data structure that supports a successor function, a heuristic function, and a goal test

CSP:

state is defined by variables Xi with values from domain Di

goal test is a set of constraints specifying allowable combinations of values for subsets of variables

Simple example of a formal representation language

Allows useful general-purpose algorithms with more power than standard search algorithms

Arc consistency:

Arc refers to a directed arc in the constraint graph.


Arc consistency checking can be applied either as a preprocessing step before the beginning of the search process, or as a propagation step after each assignment during search.
The process must be applied repeatedly until no more inconsistencies remain.
Refer idea of arc consistency in Xerox
Explain AC-3 algorithm.

95
Path consistency:

3-consistency means that any consistent assignment to a pair of adjacent variables can always be extended to a third neighbouring variable; this is also called path consistency.

K-consistency:

Stronger forms of propagation can be defined using the notion of K-consistency.
A CSP is K-consistent if, for any set of K-1 variables and for any consistent assignment to those variables, a consistent value can always be assigned to any Kth variable.

Example: Map-Coloring

Variables WA, NT, Q, NSW, V, SA, T

Domains Di = {red,green,blue}

Constraints: adjacent regions must have different colors

e.g., WA ≠ NT, or (WA,NT) in {(red,green),(red,blue),(green,red),(green,blue),(blue,red),(blue,green)}

96
Solutions are complete and consistent assignments, e.g., WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = green
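The map-colouring CSP above can be written down directly; this small Python snippet is just one possible encoding of the variables, domains and binary difference constraints (the adjacency list and helper names are illustrative).

variables = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
domains = {v: {"red", "green", "blue"} for v in variables}
adjacent = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"),
            ("SA", "Q"), ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]

def neighbors(var):
    return [y if x == var else x for x, y in adjacent if var in (x, y)]

def consistent(var, value, assignment):
    # adjacent regions must have different colours
    return all(assignment.get(n) != value for n in neighbors(var))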

Constraint graph

Binary CSP: each constraint relates two variables

Constraint graph: nodes are variables, arcs are constraints

Refer pg 80 for explanation

97
Varieties of CSPs

Discrete variables

finite domains:

n variables, domain size d ⇒ O(d^n) complete assignments

e.g., Boolean CSPs, including Boolean satisfiability (NP-complete)

infinite domains:

integers, strings, etc.

e.g., job scheduling, variables are start/end days for each job

need a constraint language, e.g., StartJob1 + 5 ≤ StartJob3

Continuous variables

e.g., start/end times for Hubble Space Telescope observations

linear constraints solvable in polynomial time by linear programming

Varieties of constraints:

Unary constraints involve a single variable,

e.g., SA ≠ green

Binary constraints involve pairs of variables,

e.g., SA ≠ WA

Higher-order constraints involve 3 or more variables,

e.g., cryptarithmetic column constraints

Example: Cryptarithmetic

98
Variables: F, T, U, W, R, O, X1, X2, X3

Domains: {0,1,2,3,4,5,6,7,8,9}

Constraints: Alldiff (F,T,U,W,R,O)

O + O = R + 10·X1

X1 + W + W = U + 10·X2

X2 + T + T = O + 10·X3

X3 = F, T ≠ 0, F ≠ 0

where X1, X2, X3 are auxiliary variables representing the digit (0 or 1) carried over into the next column. Higher-order constraints can be represented by a constraint hypergraph.

CSP:

CSPs are mathematical problems in which one must find states or objects that satisfy a number of constraints or criteria. A constraint is a restriction on the feasible solutions of an optimization problem.

Example

N-queen problem
Cross word puzzle
Map coloring problem

Real-world CSPs

Assignment problems

e.g., who teaches what class

Timetabling problems

e.g., which class is offered when and where?

Transportation scheduling

Factory scheduling

Notice that many real-world problems involve real-valued variables

Standard search formulation (incremental)

Let's start with the straightforward approach, then fix it

99
States are defined by the values assigned so far

Initial state: the empty assignment { }

Successor function: assign a value to an unassigned variable that does not conflict with
current assignment

fail if no legal assignments

Goal test: the current assignment is complete

1. This is the same for all CSPs

2. Every solution appears at depth n with n variables


use depth-first search

3. Path is irrelevant, so can also use complete-state formulation

4. b = (n − l)·d at depth l, hence n!·d^n leaves

The result in (4) is grossly pessimistic, because the order in which values are assigned to variables does not matter. There are only d^n complete assignments.

BACKTRACKING SEARCH FOR CSPS

Backtracking search:

It is a depth-first search that chooses a value for one variable at a time and backtracks when a variable has no legal values left to assign.

variable and value ordering


propagating information through constraints
forward checking
constraint propagation
handling special constraint

intelligent backtracking: looking backward

Variable assignments are commutative, i.e.,

[ WA = red then NT = green ] same as [ NT = green then WA = red ]

Only need to consider assignments to a single variable at each node

100
b = d and there are d^n leaves

Depth-first search for CSPs with single-variable assignments is called backtracking search

Backtracking search is the basic uninformed algorithm for CSPs

Can solve n-queens for n ≈ 25

Algorithm explanation:

var ← SELECT-UNASSIGNED-VARIABLE(VARIABLES[csp], assignment, csp)

SELECT-UNASSIGNED-VARIABLE simply selects the next unassigned variable in the order given by the list VARIABLES[csp]
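A plain Python sketch of backtracking search for a CSP represented like the map-colouring snippet earlier; as in the notes, the unassigned variable is simply taken in list order, and the consistency check is supplied by the caller.

def backtracking_search(variables, domains, consistent):
    def backtrack(assignment):
        if len(assignment) == len(variables):
            return assignment                                     # complete, consistent assignment
        var = next(v for v in variables if v not in assignment)   # next unassigned variable, in order
        for value in domains[var]:
            if consistent(var, value, assignment):
                assignment[var] = value
                result = backtrack(assignment)
                if result is not None:
                    return result
                del assignment[var]                               # undo and try the next value
        return None                                               # no legal value left: backtrack
    return backtrack({})

# e.g. backtracking_search(variables, domains, consistent) with the map-colouring snippet above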

Intelligent backtracking: looking backward

Chronological backtracking:

Backing up to the preceding variable and trying a different value for it is called chronological backtracking.

Ex: map coloring problem

101
Variable: WA, NT, SA, Q, NSW, V, T

Domain d={red, green, blue}

Generate the partial assignment

{Q=red, NSW=green, V=blue, T=red}

Try next variable SA

We see that every value violates a constraint. Chronological backtracking backs up to T and changes its colour, but this cannot solve the problem, so a new approach is introduced here:

identify the set of variables that caused the failure; this set is called the conflict set.

Conflict-directed backjumping:

When a branch of the search fails, backtrack to one of the variables in the set that caused the failure (the conflict set). The conflict set for a variable X is the set of previously assigned variables that are connected to X by constraints. A backtracking algorithm that uses conflict sets defined in this way is called conflict-directed backjumping.

Constraint Learning:

It actually modifies the CSP by adding a new constraint that is induced from these conflicts.

Backjumping: this method backtracks to the most recent variable in the conflict set

Ex: Back jumping would jump over Tasmania and try a new value for V

Backtracking example

103
104
Improving backtracking efficiency

General-purpose methods can give huge gains in speed:

Which variable should be assigned next?

In what order should its values be tried?

Can we detect inevitable failure early?

Variable and value ordering:

Variable and value ordering

MRV/most constraint value


Degree heuristic
Least constraining value

Most constrained variable

Most constrained variable / minimum remaining values / fail-first heuristic, called the MRV heuristic

choose the variable with the fewest legal values

a.k.a. minimum remaining values (MRV) heuristic

105
Tie-breaker among most constrained variables

Most constraining variable:

choose the variable with the most constraints on remaining variables

Degree heuristic:

The MRV heuristic doesn't help at all in choosing the first region to color in Australia, because initially every region has three colors.
In this case, the degree heuristic is used.
It reduces the branching factor on future choices by selecting the variable that is involved in the largest number of constraints on other unassigned variables.
Useful as a tie-breaker.

Least constraining value

Given a variable, choose the least constraining value:

the one that rules out the fewest values in the remaining variables

Combining these heuristics makes 1000 queens feasible

Propagating information through constraints:

Forward checking

106
Idea:

Keep track of remaining legal values for unassigned variables

Terminate search when any variable has no legal values

107
WA    NT    Q     NSW   V     SA    T
RGB   RGB   RGB   RGB   RGB   RGB   RGB
R     GB    RGB   RGB   RGB   GB    RGB
R     B     G     RB    RGB   B     RGB
R     B     G     RB    B     (empty)  RGB

The progress of a map-coloring search with forward checking. WA=red is assigned first; then forward checking deletes red from the domains of the neighboring variables NT and SA. After Q=green, green is deleted from the domains of NT, SA and NSW. After V=blue, blue is deleted from the domains of NSW and SA, leaving SA with no legal values.

Constraint propagation

Forward checking propagates information from assigned to unassigned variables, but doesn't provide early detection for all failures:

NT and SA cannot both be blue!

Constraint propagation repeatedly enforces constraints locally

108
Constraint propagation:
consistency techniques:

Arc consistency (2-consistency)

Node consistency (1-consistency)
Path consistency (3-consistency) and K-consistency

Handling special constraint:

Alldiff constraint
Resource constraint
Bounds propagation

Intelligent backtracking: looking backward

Chronological backtracking
Conflict set
Back jumping
Conflict-directed back jumping
Constraint learning

Handling special constraint:

Alldiff: all variables involved must have distinct values.

With m variables and n possible values, if m > n then the constraint cannot be satisfied.

Resource constraint/ atmost constraint/higher order constraint

Consistency is achieved by deleting the maximum value of any domain if it is not consistent with the minimum values of the other domains.

Ex: personnel assigned to tasks PA1, PA2, PA3, PA4

Constraint: no more than 10 personnel are assigned in total

So Atmost(10, PA1, PA2, PA3, PA4)

If each domain is {3,4,5,6}

Result: the Atmost constraint cannot be satisfied (the minimum total 3+3+3+3 = 12 already exceeds 10).

Suppose instead each domain is {2,3,4,5,6}

Then the values 5 and 6 can be deleted from each domain.

Bound propagation:

Domains are represented by a lower bound and an upper bound and are managed by bounds propagation.
Used in practical constraint problems.
EX:

Two flights=271,272

Planes have capacity=165,385

Initial domains:

Flight271 ∈ [0, 165]

Flight272 ∈ [0, 385]

Additional constraint:

The two flights together must carry 420 people, so Flight271 + Flight272 ∈ [420, 420]

Reduce the domains to
Flight271 ∈ [35, 165]
Flight272 ∈ [255, 385]

Finally, we say that a CSP is bounds consistent if, for every variable X and for both the lower-bound and upper-bound values of X, there exists some value of Y that satisfies the constraint between X and Y, for every variable Y.

Arc consistency

Simplest form of propagation makes each arc consistent

110
X → Y is consistent iff

for every value x of X there is some allowed y


111
If X loses a value, neighbors of X need to be rechecked

Arc consistency detects failure earlier than forward checking

Can be run as a preprocessor or after each assignment

Arc consistency algorithm AC-3

112
Time complexity: O(n²d³), where n is the number of variables and d is the maximum variable domain size, because:

At most O(n²) arcs

Each arc can be inserted into the agenda (TDA set) at most d times

Checking consistency of each arc can be done in O(d²) time
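Since the AC-3 figure is not reproduced here, the following Python sketch shows one common way to code it, assuming the domains are given as sets, neighbors(X) returns the variables sharing a constraint with X, and constraint(Xi, x, Xj, y) says whether a pair of values is allowed; these names are assumptions, not fixed by the notes.

from collections import deque

def ac3(variables, domains, neighbors, constraint):
    queue = deque((xi, xj) for xi in variables for xj in neighbors(xi))
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, constraint):
            if not domains[xi]:
                return False                      # an empty domain: no solution
            for xk in neighbors(xi):
                if xk != xj:
                    queue.append((xk, xi))        # re-check arcs pointing at xi
    return True

def revise(domains, xi, xj, constraint):
    removed = False
    for x in set(domains[xi]):
        # delete x if no value y in the domain of xj satisfies the constraint between xi and xj
        if not any(constraint(xi, x, xj, y) for y in domains[xj]):
            domains[xi].discard(x)
            removed = True
    return removed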

Generalized Arc Consistency Algorithm

Three possible outcomes:

1. One domain is empty => no solution

2. Each domain has a single value => unique solution

3. Some domains have more than one value => there may or may not be a solution

If the problem has a unique solution, GAC may end in state (2) or (3); otherwise, we would
have a polynomial-time algorithm to solve UNIQUE-SAT

UNIQUE-SAT or USAT is the problem of determining whether a formula known to have either zero or one satisfying assignments has zero or has one. Although this problem seems easier than general SAT, if there is a practical algorithm to solve it, then all problems in NP can be solved just as easily [Wikipedia; L.G. Valiant and V.V. Vazirani, NP is as Easy as Detecting Unique Solutions, Theoretical Computer Science, 47 (1986), 85-94].

LOCAL SEARCH FOR CSPS:

Use complete state formulation.


Initial state - assign a value to every variable
Successor function - works by changing the value of one variable at a time

Hill-climbing, simulated annealing typically work with "complete" states, i.e., all variables assigned

To apply to CSPs:

allow states with unsatisfied constraints

operators reassign variable values

Variable selection: randomly select any conflicted variable

Value selection by min-conflicts heuristic:

choose value that violates the fewest constraints

i.e., hill-climb with h(n) = total number of violated constraints

113
Algorithm:

function MIN-CONFLICTS(csp, max_steps) return solution or failure

inputs: csp, a constraint satisfaction problem

max_steps, the number of steps allowed before giving up

current ← an initial complete assignment for csp

for i = 1 to max_steps do

if current is a solution for csp then return current

var ← a randomly chosen, conflicted variable from VARIABLES[csp]

value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)

set var = value in current

return failure

Example: 4-Queens

States: 4 queens in 4 columns (4^4 = 256 states)

Actions: move queen in column

Goal test: no attacks

Evaluation: h(n) = number of attacks

Given random initial state, can solve n-queens in almost constant time for arbitrary n with
high probability (e.g., n = 10,000,000)
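A direct Python transcription of the MIN-CONFLICTS pseudocode above, specialised to n-queens with one queen per column (value = row); the conflict count is the usual queens-attack test, and the step limit is an illustrative choice.

import random

def conflicts(col, row, assignment):
    # number of other queens attacking square (col, row)
    return sum(1 for c, r in assignment.items()
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts_queens(n, max_steps=100000):
    current = {c: random.randrange(n) for c in range(n)}     # initial complete assignment
    for _ in range(max_steps):
        conflicted = [c for c in current if conflicts(c, current[c], current) > 0]
        if not conflicted:
            return current                                   # current is a solution
        var = random.choice(conflicted)                      # a randomly chosen conflicted variable
        current[var] = min(range(n), key=lambda r: conflicts(var, r, current))
    return None                                              # failure

print(min_conflicts_queens(8))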

Min-conflicts example 2

114
Use of min-conflicts heuristic in hill-climbing

Min-conflicts example 3

A two-step solution for an 8-queens problem using min-conflicts heuristic

At each stage a queen is chosen for reassignment in its column

The algorithm moves the queen to the min-conflict square breaking ties randomly

Advantages of local search

The runtime of min-conflicts is roughly independent of problem size.

It can solve the million-queens problem in roughly 50 steps.

Local search can be used in an online setting.

Backtracking search requires more time

Summary

CSPs are a special kind of problem:

115
states defined by values of a fixed set of variables

goal test defined by constraints on variable values

Backtracking = depth-first search with one variable assigned per node

Variable ordering and value selection heuristics help significantly

Forward checking prevents assignments that guarantee later failure

Constraint propagation (e.g., arc consistency) does additional work to constrain values and detect inconsistencies

Iterative min-conflicts is usually effective in practice

STRUCTURE OF PROBLEMS:

Independent sub problems.


Connected Components.
Tree Algorithm.
Constraint graph can be reduced in two ways.
1. Removing nodes.
2. Collapsing nodes together.
Removing nodes.
1. Definition.
2. Cut set conditioning.
3. Cycle cut set.
4. Algorithm.
Tree decomposition.
1. Definition.
2. It satisfy 3 requirement.
3. Example.
4. Solution.
5. Complexity.
6. Tree width.

116
How can the problem structure help to find a solution quickly?

Subproblem identification is important:

Coloring Tasmania and coloring the mainland are independent subproblems

Identifiable as connected components of the constraint graph.

Improves performance

Suppose each problem has c variables out of a total of n.

Worst-case solution cost is O((n/c)·d^c), i.e. linear in n

Instead of O(d^n), exponential in n

E.g. n = 80, c = 20, d = 2

2^80 ≈ 4 billion years at 1 million nodes/sec

4 · 2^20 ≈ 0.4 seconds at 1 million nodes/sec

The complete tree algorithm runs in O(nd²) time.

Tree-structured CSPs

Tree:

Any two variables are connected by at most one path.

117
Theorem: if the constraint graph has no loops then the CSP can be solved in O(nd²) time

Compare with a general CSP, where the worst case is O(d^n)

In most cases subproblems of a CSP are connected as a tree

DEFINITION:

Any tree-structured CSP can be solved in time linear in the number of variables.

ALGORITHM:

Choose a variable as root, order variables from root to leaves such that every node's parent precedes it in the ordering (label the variables X1 to Xn).

For j from n down to 2, apply REMOVE-INCONSISTENT-VALUES(Parent(Xj),Xj)

For j from 1 to n assign Xj consistently with Parent(Xj )

Steps:

1. The CSP is made directionally arc consistent, so the assignment of values in step 3 requires no backtracking.
2. Arc consistency is applied in reverse order so that deletions do not affect the consistency of arcs already processed.
3. The complete algorithm runs in O(nd²) time.
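A Python sketch of the tree-CSP algorithm above, assuming the variables are already given in a root-to-leaf order with a parent map, the domains are sets, and the binary constraints are supplied as an allowed(parent_value, child_value) test; all of these names are assumptions made for the example.

def tree_csp_solver(order, parent, domains, allowed):
    # Backward pass: for j from n down to 2, remove parent values with no consistent child value
    for xj in reversed(order[1:]):
        xp = parent[xj]
        domains[xp] = {vp for vp in domains[xp]
                       if any(allowed(vp, vc) for vc in domains[xj])}
        if not domains[xp]:
            return None                                   # no solution
    # Forward pass: assign each variable consistently with its parent; no backtracking is needed
    assignment = {order[0]: next(iter(domains[order[0]]))}
    for xj in order[1:]:
        assignment[xj] = next(v for v in domains[xj]
                              if allowed(assignment[parent[xj]], v))
    return assignment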

118
A general constraint graph can be reduced to a tree in two ways. They are:

1. Removing nodes - cutset conditioning

2. Collapsing nodes together - tree decomposition.

Cut set conditioning:

Finding the smallest cycle cutset is NP-hard, but several efficient approximation algorithms are known for this task. The overall algorithmic approach is called cutset conditioning.

Removing nodes:

The first approach involves assigning values to some variables so that the remaining variables form a tree.

Tree width:

Tree decomposition techniques transform the CSP into a tree of subproblems and are efficient if the tree width of the constraint graph is small.

Nearly tree-structured CSPs

Conditioning:

Instantiate a variable, prune its neighbors' domains

Cutset conditioning:

Instantiate (in all ways) a set of variables such that the remaining constraint graph is
a tree

119
Tree decomposition:

Three requirements that must be satisfied

Solve each problem independently

If any one has no solution, the entire problem has no solution

Build general solution

CSPs with constraint graphs of bounded tree width w are solvable in polynomial time: O(n·d^(w+1))

Finding the tree decomposition with minimal tree width is NP-hard

120
Summary

CSPs are a special kind of problem:

states defined by values of a fixed set of variables

goal test defined by constraints on variable values

Backtracking = depth-first search with one variable assigned per node

Variable ordering and value selection heuristics help significantly

Forward checking prevents assignments that guarantee later failure

Constraint propagation (e.g., arc consistency) does additional work to constrain values and detect inconsistencies

Tree-structured CSPs can be solved in linear time

Iterative min-conflicts is usually effective in practice

GAMES:

multiagent environment.
Game.

121
Game theory
1. call.
2. History of Games.
Alpha-beta pruning.
Evaluation functions.

Pruning:

It allows us to ignore the portions of the search tree that make no difference to the final
choice.

Heuristic evaluation function:

It allows us to approximate the true utility of a state without doing a complete search.

Optimal decision in games:

Games component.
Optimal strategy
1. Strategy.
2. Minimax value.
3. Minimax decision.
The minimax algorithm.
1. Minimax algorithm
2. Backed up.

Optimal decisions in multiplayer games

ADVERSARIAL SEARCH (often known as games):

What are, why study games?

Games are a form of multi-agent environment

What do other agents do and how do they affect our success?

Competitive multi-agent environments give rise to adversarial search (games)

Problem solving agent is not alone any more

Multiagent, conflict

Default: deterministic, turn-taking, two-player, zero sum game of perfect information

Perfect info. vs. imperfect, or probability

Why study games?

Fun; historically entertaining

Interesting because they are hard

122
Easy to represent; agents are restricted to a small number of actions

Games vs. search problems

Search no adversary

Solution is (heuristic) method for finding goal

Heuristics and CSP techniques can find optimal solution

Evaluation function: estimate of cost from start to goal through the given node

Examples: path planning, scheduling activities

Games adversary

Solution is a strategy: specifies move for every possible opponent reply

Time limits force an approximate solution

Evaluation function: goodness of game position

Examples: chess, checkers, Othello, backgammon

Historical Contributions

computer considers possible lines of play

Babbage, 1846

algorithm for perfect play

Zermelo, 1912, Von Neumann, 1944

finite horizon, approximate evaluation

Zuse, 1945, Wiener, 1948, Shannon, 1950

first chess program

Turing, 1951

machine learning to improve evaluation accuracy

Samuel, 1952-57

pruning to allow deeper search

McCarthy, 1956

123
Types of Games

Optimal Decisions in Games:

Game Tree search Or Game formalization

Formally define a two-person game as:

Two players, called MAX and MIN.

Alternate moves

At end of game winner is rewarded and loser penalized.

Game has

Initial State: board position and player to go first

: e.g. board configuration of chess

Successor Function: returns (move, state) pairs

All legal moves from the current state

Resulting state

Terminal Test: Is the game finished?

Utility function: gives a numerical value for terminal states,

e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe

MAX uses search tree to determine next move

124
Initial state plus legal moves define the game tree. Tic-tac-toe: game tree (2-player, deterministic, turns)

Optimal Strategy:

Find the contingent strategy for MAX assuming an infallible MIN opponent.

Assumption: Both players play optimally !!

Given a game tree, the optimal strategy can be determined by using the minimax value of
each node:

MINIMAX-VALUE(n) =

UTILITY(n)  if n is a terminal node

max over s ∈ Successors(n) of MINIMAX-VALUE(s)  if n is a MAX node

min over s ∈ Successors(n) of MINIMAX-VALUE(s)  if n is a MIN node
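A recursive Python sketch of the MINIMAX-VALUE definition above; the game interface used here (terminal_test, utility, successors returning (move, state) pairs, and to_move) is an assumption made for the example, not something fixed by the notes.

def minimax_value(state, game):
    if game.terminal_test(state):
        return game.utility(state)
    values = [minimax_value(s, game) for _, s in game.successors(state)]
    return max(values) if game.to_move(state) == "MAX" else min(values)

def minimax_decision(state, game):
    # pick the move whose resulting state has the best minimax value for MAX
    return max(game.successors(state),
               key=lambda move_state: minimax_value(move_state[1], game))[0]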

Two-Ply Game Tree

125
Minimax maximizes the worst-case outcome for max

What if MIN does not play optimally?

126
Definition of optimal play for MAX assumes MIN plays optimally:
maximizes worst-case outcome for MAX.

But if MIN does not play optimally, MAX will do even better. [proven.]

Minimax Algorithm

Perfect play for deterministic games

Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play

E.g., 2-ply game:

Two-Ply Game Tree

127
Minimax maximizes the worst-case outcome for max

Minimax-Decision:

Generates all the possible moves, computes their minimax values, and returns the move with the best value.

Minimax-value:

Returns the utility value of a state; how it is computed depends on whether the state is:

a terminal state,
a state in which it is MAX's turn to move, or
a state in which it is MIN's turn to move.

128
Properties of minimax

Complete? Yes (if tree is finite)

Optimal? Yes (against an optimal opponent)

Time complexity? O(b^m)

Space complexity? O(bm) (depth-first)

For chess, b ≈ 35, m ≈ 100 for "reasonable" games


exact solution completely infeasible

Even tic-tac-toe is much too complex to diagram here, although it's small enough to
implement.

Optimal Decision in Multiplayer games:

Games allow more than two players

Single minimax values become vectors

129
ALPHA-BETA PRUNING:

Alpha-Beta Pruning:

Pruning.
Alpha-Beta pruning.
Example.
Two parameters.
Effectiveness of alpha beta.
Alpha-Beta algorithm.
Transposition.
Transposition table.

Example:

Two ply game tree

130
The terminals nodes on the bottom level are already labeled with their utility value.

The first MIN node, labeled B, has three successors with values 3, 12 and 8, so its minimax value is 3.

Similarly, the other two MIN nodes have minimax value 2.

The root node is a MAX node; its successors have minimax values 3, 2 and 2, so its minimax value is 3.

Steps are explained by the following figure:

(a)refer the textbook page 168.

The outcome is that we can identify the minimax decision without ever evaluating two of the leaf nodes.

Another way:

Simplifying the formula for MINIMAX-VALUE: let the two unevaluated successors of node C have values x and y, and let z be the minimum of x and y. The value of the root node is given by

MINIMAX-VALUE(root) = max(min(3,12,8), min(2,x,y), min(14,5,2))

= max(3, min(2,x,y), 2)

= max(3, z, 2)   where z = min(2,x,y) ≤ 2

= 3

In other words the value of the root and hence the minimax decision are
independent of the values of pruned leaves x and y.

Pruning can be applied to trees of any depth, and it is often possible to prune entire subtrees rather than just leaves.
The algorithm updates the values of alpha and beta as it goes along and prunes the remaining branches at a node as soon as they are known to be irrelevant.
The effectiveness of alpha-beta pruning is highly dependent on the order in which the successors are examined.

Transposition:

Different permutations of the same move sequence can end up in the same position.

Ex: [a1,b1,a2,b2] and [a1,b2,a2,b1] both end up in the same position.
It is worthwhile to store the evaluation of this position in a hash table the first time it is encountered, so that we don't have to recompute it on subsequent occurrences.
Transposition table:
The hash table of previously seen positions is traditionally called the transposition table.

Why is it called α–β?

α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX.

If v is worse than α, MAX will avoid it, so that branch can be pruned.

β is defined similarly for MIN.

Pruning

Minimax problem: the number of game states is exponential in the number of moves.

132
Alpha-beta pruning

Remove branches that don't influence the final decision

The α–β algorithm:
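Since the figure containing the α–β procedure is not reproduced in these notes, the following Python sketch shows the standard form of alpha-beta search, using the same assumed game interface as the minimax sketch earlier.

import math

def alpha_beta_search(state, game):
    def max_value(state, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = -math.inf
        for _, s in game.successors(state):
            v = max(v, min_value(s, alpha, beta))
            if v >= beta:
                return v            # prune: MIN already has a better option elsewhere
            alpha = max(alpha, v)
        return v

    def min_value(state, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = math.inf
        for _, s in game.successors(state):
            v = min(v, max_value(s, alpha, beta))
            if v <= alpha:
                return v            # prune: MAX already has a better option elsewhere
            beta = min(beta, v)
        return v

    # return the move whose MIN-value is best for MAX
    return max(game.successors(state),
               key=lambda move_state: min_value(move_state[1], -math.inf, math.inf))[0]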

Example:

Properties of α–β

Pruning does not affect final result

Good move ordering improves effectiveness of pruning

With "perfect ordering", time complexity = O(b^(m/2))

This doubles the depth of search that can be carried out for a given level of resources.

A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

IMPERFECT REAL TIME DECISIONS:

Imperfect Real time Decisions:

Evaluation function
Cut off test
Evaluation functions
1. Features.
2. Expected value.
3. Material value.
4. Weighted linear function.
Cutting off search
1. Quiescence.

138
2. Quiescence search.
3. Horizon effect.
4. Singular extensions.
5. Forward pruning.

Features:

Most evaluation functions work by calculating various features of the state.

Ex: chess game

Features such as the number of pawns each side has define categories of positions with different chances of a win, loss or draw.

Expected value or weighted average:

The expected value can be determined for each category, resulting in an evaluation function that works for any state.
The evaluation function need not return actual expected values, as long as the ordering of the states is preserved.

Ex: experience suggests that 72% of the positions in a category lead to a win (+1), 20% to a loss (-1), and 8% to a draw (0), so the expected value is

(0.72 × +1) + (0.20 × −1) + (0.08 × 0) = 0.52

Cutting off search:

To perform a cutoff test, the evaluation function should be applied only to positions that are quiescent, i.e. positions whose value will not swing wildly in the near future of the search tree. Delaying the cutoff until such a position is reached is known as waiting for quiescence.

Quiescence search:

A search which is restricted to consider only certain types of moves, such as capture moves, that quickly resolve the uncertainties in the position.

Suppose we have 100 secs and can explore 10^4 nodes/sec ⇒ 10^6 nodes per move

Standard approach:

139
cutoff test:

e.g., depth limit (perhaps add quiescence search)

evaluation function

= estimated desirability of position

Evaluation function

An evaluation function or static evaluator is used to evaluate the goodness of a game position.

Contrast with heuristic search where the evaluation function was a non-negative
estimate of the cost from the start node to a goal and passing through the given
node

The zero-sum assumption allows us to use a single evaluation function to describe the
goodness of a board with respect to both players.

f(n) >> 0: position n good for me and bad for you

f(n) << 0: position n bad for me and good for you

f(n) near 0: position n is a neutral position

f(n) = +infinity: win for me

f(n) = -infinity: win for you

140
Min-Max

A typical evaluation function is a linear function in which some set of coefficients is used to weight a
number of "features" of the board position.

For chess, typically linear weighted sum of features

Eval(s) = w1·f1(s) + w2·f2(s) + ... + wn·fn(s)

e.g., w1 = 9 with f1(s) = (number of white queens) − (number of black queens), etc.

141
"Material": some measure of which pieces one has on the board.

A typical weighting for each type of chess piece is shown in the textbook. Other types of features try to encode something about the distribution of the pieces on the board.
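A tiny Python illustration of the weighted linear Eval(s) above, using a pure material feature; the piece weights (the conventional 1/3/3/5/9 values) and the counting interface are illustrative assumptions.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}   # assumed material weights

def material_eval(white_counts, black_counts):
    # weighted linear sum: w_p * (number of white pieces of type p - number of black pieces of type p)
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in PIECE_VALUES.items())

print(material_eval({"Q": 1, "P": 8}, {"Q": 0, "P": 8}))   # 9: White is a queen ahead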

Cutting off search

MinimaxCutoff is identical to MinimaxValue except

1. Terminal? is replaced by Cutoff?

2. Utility is replaced by Eval

Change:

if TERMINAL-TEST(state) then return UTILITY(state)

into

if CUTOFF-TEST(state,depth) then return EVAL(state)

Introduces a fixed-depth limit depth

The depth is selected so that the amount of time used will not exceed what the rules of the game allow.

When cutoff occurs, the evaluation is performed.

Does it work in practice?

b^m = 10^6, b = 35 ⇒ m ≈ 4

4-ply lookahead is a hopeless chess player!

4-ply ≈ human novice

8-ply ≈ typical PC, human master

12-ply ≈ Deep Blue, Kasparov

The key idea is that the more look ahead we can do, that is, the deeper in the tree we can look, the
better our evaluation of a position will be, even with a simple evaluation function. In some sense, if
we could look all the way to the end of the game, all we would need is an evaluation function that
was 1 when we won and -1 when the opponent won.

it seems to suggest that brute-force search is all that matters.

And Deep Blue is brute indeed... It had 256 specialized chess processors coupled into a 32 node
supercomputer. It examined around 30 billion moves per minute. The typical search depth was
13ply, but in some dynamic situations it could go as deep as 30.

Horizon problem:

When the program is facing a move by the opponent that causes serious damage and is ultimately unavoidable, a fixed-depth search can only push the damage beyond its horizon, not prevent it.

Ex: Consider a search tree rooted at a MAX node A with successors B, C and D (the tree diagrams from the original figures are omitted here).

When one level of successors is generated below B, B appears to have a bad value.

When one more level of successors is generated below E and F, the situation settles down and B is retained as a good move.

The time spent waiting for B's value to settle in this way is called waiting for quiescence.

Singular Extensions:

Used to avoid the horizon effect without adding too much search cost.

Forward pruning:

Some moves at a given node are pruned immediately without further consideration.

Imperfect ,Real-time Decisions

The minimax algorithm generates the entire game search space, whereas the alpha-beta algorithm allows us to prune large parts of it. However, alpha-beta still has to search all the way to terminal states for at least a portion of the search space. Shannon's 1950 paper, Programming a Computer for Playing Chess, proposed that programs should cut off the search earlier and apply a heuristic evaluation function to states in the search, effectively turning nonterminal nodes into terminal leaves. The basic idea is to alter minimax or alpha-beta in two ways:
(1) the utility function is replaced by a heuristic evaluation function EVAL, which gives an estimate of the position's utility, and
(2) the terminal test is replaced by a cutoff test that decides when to apply EVAL.

Realtime decisions

What if you don't have enough time to explore the entire search tree?

We cannot search all the way down to terminal state for all decision sequences

Use a heuristic to approximate (guess) eventual terminal state

Evaluation Function

The heuristic that estimates expected utility

Cannot take too long (otherwise recurse to get answer)

It should preserve the ordering among terminal states

otherwise it can cause bad decision making

Define features of game state that assist in evaluation

what are features of chess?

Truncating minimax search

When do you recurse or use evaluation function?

Cutoff-Test (state, depth) returns 1 or 0

When 1 is returned, use evaluation function

Cutoff beyond a certain depth

Cutoff if state is stable (more predictable)

Cutoff moves you know are bad (forward pruning)

Benefits of truncation

Comparing Chess

Using minimax 5 ply

Average Human 6-8 ply

145
Using alpha-beta 10 ply

Intelligent pruning 14 ply

Games that include an element of chance:


Position evaluation in games with chance nodes
Complexity of expectiminimax
Card games

Practical issues:

GAMES THAT INCLUDE AN ELEMENT OF CHANCE:

Backgammon is a two-player game with uncertainty.

Players roll dice to determine what moves to make.

White has just rolled 5 and 6 and has four legal moves:

5-10, 5-11

5-11, 19-24

146
5-10, 10-16

5-11, 11-16

Such games are good for exploring decision making in adversarial problems involving skill
and luck.

Backgammon: move all of one's pieces off the board

Branches leading from each chance node denote the possible dice rolls

Labeled with roll and the probability

147
[1,1], [6,6] chance 1/36, all other chance 1/18

Possible moves (5-10,5-11), (5-11,19-24),(5-10,10-16) and (5-11,11-16)

Cannot calculate definite minimax value, only expected value

Decision-Making in Non-Deterministic Games

Probable state tree will depend on chance as well as moves chosen

Add "chance" nodes to the max and min nodes.

Compute expected values for chance nodes.

Game Trees with Chance Nodes

Chance nodes (shown as circles) represent random events

For a random event with N outcomes, each chance node has N distinct children; a
probability is associated with each

(For 2 dice, there are 21 distinct outcomes)

Use minimax to compute values for MAX and MIN nodes

Use expected values for chance nodes

For chance nodes over a max node, as in C:

expectimax(C) = Σi ( P(di) × maxvalue(i) )

For chance nodes over a min node:

expectimin(C) = Σi ( P(di) × minvalue(i) )

148
Meaning of the evaluation function

Dealing with probabilities and expected values means we have to be careful about the
meaning of values returned by the static evaluator.

Note that a relative-order preserving change of the values would not change the decision
of minimax, but could change the decision with chance nodes.

Linear transformations are OK

149
Expected minimax value

EXPECTED-MINIMAX-VALUE(n) =

UTILITY(n)  if n is a terminal node

max over s ∈ Successors(n) of EXPECTED-MINIMAX-VALUE(s)  if n is a MAX node

min over s ∈ Successors(n) of EXPECTED-MINIMAX-VALUE(s)  if n is a MIN node

Σ over s ∈ Successors(n) of P(s) × EXPECTED-MINIMAX-VALUE(s)  if n is a chance node

These equations can be backed-up recursively all the way to the root of the game tree.
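A recursive Python sketch of the recurrence above, using the same assumed game interface as the earlier minimax sketch, with an additional assumed method chance_outcomes(state) that yields (probability, outcome_state) pairs at chance nodes.

def expectiminimax(state, game):
    if game.terminal_test(state):
        return game.utility(state)
    player = game.to_move(state)
    if player == "MAX":
        return max(expectiminimax(s, game) for _, s in game.successors(state))
    if player == "MIN":
        return min(expectiminimax(s, game) for _, s in game.successors(state))
    # chance node: probability-weighted average over the possible outcomes
    return sum(p * expectiminimax(s, game) for p, s in game.chance_outcomes(state))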

Position evaluation with chance nodes:

Left, A1 is best

Right, A2 is best

150
Outcome of evaluation function (hence the agent behavior) may change when values are
scaled differently.

Behavior is preserved only by a positive linear transformation of EVAL.

Summary

Games illustrate several important points about AI

Perfection is unattainable ⇒ must approximate

Good idea to think about what to think about

Uncertainty constrains the assignment of values to states

Games are to AI as Grand Prix racing is to automobile design

Glossary of Terms

A constraint satisfaction problem is a problem in which the goal is to choose a value for
each of a set of variables, in such a way that the values all obey a set of constraints.

A constraint is a restriction on the possible values of two or more variables. For example, a
constraint might say that A=a is not allowed in conjunction with B=b.

Backtracking search is a form of DFS in which there is a single representation of the state
that gets updated for each successor, and then must be restored when a dead end is
reached.

A directed arc from variable A to variable B in a CSP is arc consistent if, for every value in the
current domain of A, there is some consistent value of B.

Backjumping is a way of making backtracking search more efficient by jumping back more
than one level when a dead end is reached.

Min-conflicts is a heuristic for use with local search on CSP problems. The heuristic says that,
when given a variable to modify, choose the value that conflicts with the fewest number of
other variables.

155
UNIT-3

KNOWLEDGE REPRESENTATION

FIRST ORDER LOGIC:

Pros and cons of propositional logic:

Propositional logic is declarative

Propositional logic allows partial/disjunctive/negated information

(unlike most data structures and databases)

Propositional logic is compositional:

meaning of B1,1 ∧ P1,2 is derived from meaning of B1,1 and of P1,2

Meaning in propositional logic is context-independent

(unlike natural language, where meaning depends on context)

Propositional logic has very limited expressive power

(unlike natural language)

E.g., cannot say "pits cause breezes in adjacent squares"

except by writing one sentence for each square

First-order logic

Whereas propositional logic assumes the world contains facts,

first-order logic (like natural language) assumes the world contains

Objects, which are things with individual identities

Properties of objects that distinguish them from other objects

Relations that hold among sets of objects

Functions, which are a subset of relations where there is only one value for any
given input

Examples:

Objects: Students, lectures, companies, cars ...

Relations: Brother-of, bigger-than, outside, part-of, has-color, occurs-after, owns,


visits, precedes, ...

Properties: blue, oval, even, large, ...

156
Functions: father-of, best-friend, second-half, one-more-than ...

Logics

Logics are characterized by what they commit to as "primitives".

Ontology

a specification of a conceptualization

A description of the objects and relationships that can exist

Propositional logic had only true/false relationships

First-order logic has many more relationships

The ontological commitment of languages is different

How much can you infer from what you know?

Temporal logic defines additional ontological commitments because


of timing constraints

Higher-order logic

First-order logic is first because you relate objects (the first-order entities that actually exist in the
world)

There are 10 chickens: chickens.number = 10

There are 10 ducks: ducks.number = 10

You cannot build relationships between relations or functions

There are as many chickens as ducks: chickens.number = ducks.number

the number of objects belonging to a group must be a property of the group, and
not the objects themselves

Cannot represent Leibniz's law: If x and y share all properties, x is y

Another characterization of a logic

Epistemological commitments

The possible states of knowledge permitted with respect to each fact

In first-order logic, each sentence is a statement that is

True, false, or unknown

157
Formal structure of first-order logic

Models of first-order logic contain:

A set of objects (its domain)

Alice, Alice's left arm, Bob, Bob's hat

Relationships between objects

Represented as tuples

Sibling (Alice, Bob), Sibling (Bob, Alice)

On head (Bob, hat)

Person (Bob), Person (Alice)

Some relationships are functions if a given object is related to exactly one


object in a certain way

Alice -> Alice's left arm

Names of things are arbitrary

Knowledge base adds meaning

Number of possible domain elements is unbounded

Number of models is unbounded

Checking enumeration by entailment is impossible

158
Logic               | What Exists in the World           | Knowledge States
Propositional       | facts                              | true/false/unknown
First-Order         | facts, objects, relations          | true/false/unknown
Temporal            | facts, objects, relations, times   | true/false/unknown
Probability Theory  | facts                              | degree of belief 0..1
Fuzzy               | facts                              | degree of truth 0..1

SYNTAX AND SEMANTICS OF FOL:

Syntax

Rules for constructing legal sentences in the logic

Which symbols we can use (English: letters, punctuation)

How we are allowed to combine symbols

Semantics

How we interpret (read) sentences in the logic

Assigns a meaning to each sentence

Example: All lecturers are seven foot tall

A valid sentence (syntax)

And we can understand the meaning (semantics)

This sentence happens to be false (there is a counterexample)

159
MODELS FOR FOL:

Example:

160
