
COMP 424 - Artificial Intelligence Lecture 5: Game Playing

Instructors:

Joelle Pineau (jpineau@cs.mcgill.ca) Sylvie Ong (song@cs.mcgill.ca)

Class web page: www.cs.mcgill.ca/~jpineau/comp424

Search approaches for different domains


Standard search
deterministic actions, fully observable state of the world
BFS, DFS, best-first search, A* search.

Searching under uncertainty


non-deterministic actions, fully observable state of the world
AND-OR search.

deterministic actions, un-/partially-observable state of the world


search in belief space

All of the above are single-agent problems. Today: adversarial search, i.e. games with 2 players.
COMP-424: Artificial intelligence 2 Joelle Pineau

Overview
Why games?
Minimax search
Evaluation functions
Alpha-beta pruning
State-of-the-art game playing programs


Game playing
One of the oldest, most well-studied domains in AI! Why?
People like games! And are good at playing them.
Many games are hard:
  State spaces are very large and complicated. E.g. chess has a branching factor of about 35 and games go to about 50 moves per player, so about $35^{100}$ nodes in the search tree!
  Sometimes there is stochasticity and imperfect information.
  Real-time constraints (e.g. fixed amount of time between moves).
Clear, clean description of the environment.
Easy performance indicator.


Human or computer - who is better?


Checkers:
1994: Chinook (U. of Alberta) beat world champion Marion Tinsley, ending a 40-year reign.

Othello:
1997: Logistello (NEC Research) beat the human world champion. Today: world champions refuse to play AI computer programs (because they are too good).

Chess:
1997: Deep Blue (IBM) beat world champion Garry Kasparov.

Backgammon:
TD-Gammon (IBM) is world champion amongst humans and computers.

Go:
Human champions refuse to play the top AI player (because it is too weak).

Bridge:
Still out of reach for AI players because of the coordination issue.


Types of games
Perfect vs Imperfect information
Perfect: See the exact state of the game. E.g. chess, backgammon, checkers, go, othello
Imperfect: Information is hidden. E.g. scrabble, bridge, most card games

Deterministic vs Stochastic
Deterministic: Change in state is fully determined by player move. E.g. chess
Stochastic: Change in state is partially determined by chance. E.g. backgammon


Game playing as search


Consider 2-player zero-sum games with deterministic actions and perfect information. Can we formulate them as a search problem?
State:
Operators:
Goal:
Cost:

We want to find a strategy (i.e. way of picking moves) that wins the game.

Adversarial search definition


S0: The initial state of the game.
Player(s): Which player has the next move in a state.
Actions(s): Set of legal moves for that state.
Results(s,a): Transition model, defining the result of a move.
Terminal-Test(s): True if the game is over, False otherwise.
Utility(s,p): Final numerical value for a game that ends in terminal state s for player p.
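The six components above can be written down for Tic-Tac-Toe as a minimal Python sketch. The board encoding (a tuple of 9 cells), the helper `winner`, and the utility values +1 / 0 / -1 are assumptions made for illustration, not part of the lecture.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]      # 9 cells, each "X", "O" or " " (assumed encoding)
S0: State = (" ",) * 9       # S0: the empty board

# The 8 winning lines: rows, columns, diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def player(s: State) -> str:
    """Player(s): X moves first, so X is to move when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"

def actions(s: State) -> List[int]:
    """Actions(s): indices of the empty cells."""
    return [i for i, c in enumerate(s) if c == " "]

def result(s: State, a: int) -> State:
    """Results(s, a): the board after the player to move marks cell a."""
    return s[:a] + (player(s),) + s[a + 1:]

def winner(s: State) -> Optional[str]:
    """Hypothetical helper: the mark that completed a line, if any."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None

def terminal_test(s: State) -> bool:
    """Terminal-Test(s): someone has won, or the board is full."""
    return winner(s) is not None or " " not in s

def utility(s: State, p: str) -> int:
    """Utility(s, p): +1 if p won, -1 if p lost, 0 for a draw (assumed values)."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

Because the game is zero-sum, `utility(s, "X")` is always the negative of `utility(s, "O")`.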


Game search challenge


Not quite the same as simple searching. There is an opponent! The opponent is malicious!
The opponent is trying to make things good for itself, and bad for us.
We have to simulate the opponent's decisions.

Key idea:
Define a max player (who wants to maximize its utility)
and a min player (who wants to minimize it).


Example: Game Tree for Tic-Tac-Toe


Minimax search
Expand the complete search tree, until terminal states have been reached and their utilities computed.
Go back up from the leaves towards the current state of the game.
At each min node: back up the worst value among the children.
At each max node: back up the best value among the children.

In detail:


Minimax evaluation at each node


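\[
\text{Minimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) = \text{True} \\
\max_{a \in \text{Actions}(s)} \text{Minimax}(\text{Results}(s,a)) & \text{if Player}(s) = \text{Max-player} \\
\min_{a \in \text{Actions}(s)} \text{Minimax}(\text{Results}(s,a)) & \text{if Player}(s) = \text{Min-player}
\end{cases}
\]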


Properties of Minimax search


Complete?
If the game tree is finite.

Optimal?
Against an optimal opponent (otherwise we don't know).

Time complexity?
$O(b^m)$

Space complexity?
$O(b^m)$ if we re-use the search tree between moves
$O(bm)$ if we use DFS

Why not use Minimax to solve chess? In chess: $b \approx 35$, $m \approx 100$, so an exact solution is impossible!
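To get a feel for the size of $b^m$ with the chess figures quoted above, here is a short Python check. It only does arithmetic on the slide's numbers.

```python
import math

b, m = 35, 100                 # branching factor and depth quoted for chess
nodes = b ** m                 # Python integers are exact at any size

print(len(str(nodes)))                 # 155 decimal digits
print(round(m * math.log10(b), 1))     # 154.4, i.e. about 10^154 nodes
```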


Coping with resource limitations


Suppose we have 100 seconds to make a move, and we can search $10^4$ nodes per second.
Can only search $10^6$ nodes. (Or even fewer, if we spend time deciding which nodes to search.)

Possible approach:
Use a cutoff test (e.g. based on depth limit).
Use an evaluation function for the nodes where we cut off the search.


Cutting the search effort


Use evaluation function to evaluate non-terminal nodes.
Helps us make a decision without searching until the end of the game.

Minimax cutoff algorithm: Same as standard Minimax, except stop at some maximum depth m and use the evaluation function on those nodes.


Minimax with cutoff


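\[
\text{Minimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-Test}(s) = \text{True} \\
\max_{a \in \text{Actions}(s)} \text{Minimax}(\text{Results}(s,a)) & \text{if Player}(s) = \text{Max-player} \\
\min_{a \in \text{Actions}(s)} \text{Minimax}(\text{Results}(s,a)) & \text{if Player}(s) = \text{Min-player}
\end{cases}
\]

\[
\text{H-Minimax}(s,d) =
\begin{cases}
\text{EVAL}(s) & \text{if Cutoff-Test}(s,d) = \text{True} \\
\max_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Results}(s,a), d+1) & \text{if Player}(s) = \text{Max-player} \\
\min_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Results}(s,a), d+1) & \text{if Player}(s) = \text{Min-player}
\end{cases}
\]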
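A minimal Python sketch of H-Minimax follows. The toy game is an assumption chosen to keep the example short: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The evaluation function is deliberately crude (0 at every non-terminal state), and the `limit` parameter of the cutoff test is a hypothetical name.

```python
from typing import List, Tuple

State = Tuple[int, bool]   # (stones left, True if Max is to move)

def actions(s: State) -> List[int]:
    return [k for k in (1, 2, 3) if k <= s[0]]

def result(s: State, a: int) -> State:
    return (s[0] - a, not s[1])

def terminal_test(s: State) -> bool:
    return s[0] == 0

def utility(s: State) -> int:
    # Whoever took the last stone won, so the player "to move" in a
    # terminal state has lost. Values are from Max's point of view.
    return -1 if s[1] else 1

def eval_fn(s: State) -> int:
    # EVAL: exact at terminal states, 0 ("no opinion") everywhere else.
    return utility(s) if terminal_test(s) else 0

def cutoff_test(s: State, d: int, limit: int) -> bool:
    return terminal_test(s) or d >= limit

def h_minimax(s: State, d: int, limit: int) -> int:
    if cutoff_test(s, d, limit):
        return eval_fn(s)
    values = [h_minimax(result(s, a), d + 1, limit) for a in actions(s)]
    return max(values) if s[1] else min(values)

print(h_minimax((5, True), 0, limit=2))   # 0: the win is beyond the horizon
print(h_minimax((5, True), 0, limit=3))   # 1: one ply deeper, Max sees the win
```

With a depth limit of 2 the search stops before the winning line is visible and falls back on the uninformative evaluation. A better evaluation function would give a useful answer at the same depth.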

Evaluation functions
An evaluation function v(s) represents the goodness of a board state (e.g. chance of winning from that position).

If the features of the board can be evaluated independently, use a weighted linear function:
$v(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$ (where $s$ is the board state)

This function can be given by the designer or learned from experience.


Example: Chess

[Figure: two chess positions. First: Black to move, White slightly better. Second: White to move, Black winning.]

Linear evaluation function: $v(s) = w_1 f_1(s) + w_2 f_2(s)$

$w_1 = 9$, $w_2 = 3$
$f_1(s)$ = (# white queens) - (# black queens)
$f_2(s)$ = (# white pawns) - (# black pawns)
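The weighted linear function above can be sketched in Python as follows. The weights and features are the ones given on the slide. The board representation (a dictionary of piece counts with keys such as `"white_queens"`) and the sample position are assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

Board = Dict[str, int]   # assumed representation: piece counts by name

def f1(s: Board) -> int:
    """(# white queens) - (# black queens)"""
    return s["white_queens"] - s["black_queens"]

def f2(s: Board) -> int:
    """(# white pawns) - (# black pawns)"""
    return s["white_pawns"] - s["black_pawns"]

def linear_eval(weights: Sequence[float],
                features: Sequence[Callable[[Board], int]],
                s: Board) -> float:
    """v(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)"""
    return sum(w * f(s) for w, f in zip(weights, features))

# Hypothetical position: queens are level, White is two pawns down.
position = {"white_queens": 1, "black_queens": 1,
            "white_pawns": 5, "black_pawns": 7}
print(linear_eval([9, 3], [f1, f2], position))   # -6
```

A positive value favours White and a negative value favours Black. Adding a feature only requires one more function and one more weight.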


How precise should the evaluation fn be?


The evaluation function is only approximate, and is usually better if we are close to the end of the game.
The move chosen is the same if we apply a monotonic transformation to the evaluation function.

Only the order of the numbers matters: payoffs in deterministic games act as an ordinal utility function.
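The invariance claim above can be checked on a small example. In this Python sketch the tree (nested lists of leaf values) and the two transformations are assumptions for illustration: an increasing transformation leaves the chosen move unchanged, while a non-monotonic one can change it.

```python
from typing import Callable, List, Union

Tree = Union[int, List["Tree"]]

def best_move(tree: List[Tree], transform: Callable[[int], int] = lambda v: v) -> int:
    """Index of the root move Max picks when leaves are scored by transform(value)."""
    def value(node: Tree, is_max: bool) -> int:
        if isinstance(node, int):
            return transform(node)
        vals = [value(child, not is_max) for child in node]
        return max(vals) if is_max else min(vals)

    child_values = [value(child, False) for child in tree]   # children are min nodes
    return child_values.index(max(child_values))

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(best_move(tree))                         # 0
print(best_move(tree, lambda v: v ** 3 - 7))   # 0: increasing transform, same move
print(best_move(tree, lambda v: -v))           # 1: order reversed, different move
```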


Minimax cutoff in Chess


How many moves ahead can we search in Chess?
$10^6$ nodes with $b = 35$ allows us to search about 4 moves ahead!

Is this useful?
4 moves ahead: novice player
8 moves ahead: human master, typical PC
12 moves ahead: Deep Blue, Kasparov

Key idea:
Search few lines of play, but search them deeply. Need pruning!
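The "4 moves ahead" figure on this slide follows from the node budget and the branching factor. A short Python sketch of the arithmetic, using only the numbers quoted in the lecture (100 seconds per move, $10^4$ nodes per second, $b = 35$):

```python
import math

seconds_per_move = 100
nodes_per_second = 10 ** 4
b = 35

budget = seconds_per_move * nodes_per_second   # 10^6 nodes
depth = math.log(budget) / math.log(b)         # solve b^depth = budget

print(budget)            # 1000000
print(round(depth, 2))   # a little under 4
print(b ** 3, b ** 4)    # 42875 1500625
```

A full depth-3 tree fits comfortably in the budget, while a full depth-4 tree is slightly too large, hence "about 4 moves ahead".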


α-β Pruning example
[Figure: a two-ply game tree, revealed step by step over six slides. The root (max) node has three min-node children. Leaves under the first child: 3, 12, 8. Under the second child: 2, with the remaining leaves pruned (marked X). Under the third child: 14, 5, 2. Backed-up value at the root: 3.]

α-β Pruning
Simple idea: if a path looks worse than what we already have, discard it.
Standard technique for deterministic, perfect information games.
Algorithm is like Minimax, but keeps track of the best leaf value for our player (α) and the best one for the opponent (β).
If the best move at a node cannot change (regardless of what we would find by searching) then no need to search further!
Provably correct (i.e. won't cut off a good branch) given the evaluation function.

A closer look at our example


Properties of α-β pruning
Pruning does not affect the final result! But it can greatly increase efficiency!
Good move ordering is key to the effectiveness of pruning.
With perfect ordering, time complexity is $O(b^{m/2})$.
Means double the search depth, for the same resources. In chess: this is the difference between a novice and an expert player.

With bad move ordering, time complexity is $O(b^m)$.
Means nothing was pruned.

The evaluation function can be used to order the nodes.
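The effect of move ordering can be seen by counting how many leaves α-β evaluates on the same tree under two orderings. This Python sketch assumes a small hand-made tree (nested lists of leaf values), and the two orderings were picked by hand rather than by an evaluation function.

```python
import math
from typing import List, Tuple, Union

Tree = Union[int, List["Tree"]]

def alphabeta_count(node: Tree, is_max: bool,
                    alpha: float = -math.inf,
                    beta: float = math.inf) -> Tuple[float, int]:
    """Return (minimax value, number of leaves evaluated)."""
    if isinstance(node, int):
        return node, 1
    best = -math.inf if is_max else math.inf
    leaves = 0
    for child in node:
        value, n = alphabeta_count(child, not is_max, alpha, beta)
        leaves += n
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:      # remaining children cannot change the result
            break
    return best, leaves

# The same 9-leaf tree, with children listed in two different orders.
good_order = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]   # best moves first
bad_order = [[6, 4, 2], [14, 5, 2], [12, 8, 3]]    # best moves last

print(alphabeta_count(good_order, True))   # (3, 5)
print(alphabeta_count(bad_order, True))    # (3, 9)
```

Both orderings give the same value, but with the best moves first only 5 of the 9 leaves are evaluated, while with the best moves last nothing is pruned.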


α-β pruning demonstrates the value of reasoning about which computations are important!

Forward pruning
Another simple idea:
Only explore n best moves for current state (according to the evaluation function).

Unlike α-β pruning, this can lead to a sub-optimal solution.
Can be very efficient with a good evaluation function.
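A small Python sketch of forward pruning, showing how it can miss the best move. The tree (nested lists of leaf values) and the static evaluation `first_leaf`, which scores a position by looking only at its first line of play, are hypothetical and chosen to make the failure visible.

```python
from typing import Callable, List, Union

Tree = Union[int, List["Tree"]]

def first_leaf(node: Tree) -> int:
    """Hypothetical static evaluation: follow the first move to a leaf."""
    while not isinstance(node, int):
        node = node[0]
    return node

def forward_pruned_minimax(node: Tree, is_max: bool, n: int,
                           eval_fn: Callable[[Tree], int]) -> int:
    """Minimax that only explores the n children ranked best by eval_fn."""
    if isinstance(node, int):
        return node
    # Max prefers high evaluations, Min prefers low ones.
    ranked = sorted(node, key=eval_fn, reverse=is_max)[:n]
    values = [forward_pruned_minimax(c, not is_max, n, eval_fn) for c in ranked]
    return max(values) if is_max else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(forward_pruned_minimax(tree, True, 3, first_leaf))   # 3: nothing pruned
print(forward_pruned_minimax(tree, True, 2, first_leaf))   # 3
print(forward_pruned_minimax(tree, True, 1, first_leaf))   # 2: sub-optimal
```

With n = 1 the search is lured by the 14 at the start of the third branch and never examines the branch worth 3. The result is only as good as the evaluation used to rank the moves.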


Deep Blue (IBM)


Specialized chess processor, special-purpose memory architecture.
Very sophisticated evaluation function (expert features, tuned weights).
Database of standard openings/closings.
Uses a version of α-β pruning (with undisclosed improvements).
Can search up to 40-deep in some branches.

Can search over 30 billion positions (depth ~ 14) per move.


Chinook (Schaeffer, U. of Alberta)


Best checkers player (since 1990s).
Plain α-β search, performed on standard PCs.
Evaluation function based on expert features of the board.
Opening database, HUGE endgame database (39 trillion positions!)
Only a few moves in the middle of the game are actually searched.
Since 2007, Chinook can play perfectly.


Logistello (Buro, U. of Alberta)


Best Othello player since 1997.
Thinks during the opponent's time.
Smaller search space than Chess (~5-15 legal moves).

α-β search with a linear evaluation function.
Evaluation function had to be developed from scratch:
hand-selected features and weights tuned by learning.

Database of openings, continuously updated.


Why is Go so hard?
Computer Go players are at Master level for the 9x9 board, but at advanced amateur level for the full 19x19 board.
Branching factor > 300, so search is very difficult.
Difficult to write a good evaluation function until the endgame.
Top programs, such as MoGo, avoid alpha-beta search and instead use Monte-Carlo rollouts:
Sample random moves in the first few iterations.
Guide the sampling process to prefer moves that have led to wins in previous samples (but don't prune! i.e. still consider all moves).
Use knowledge-based rules to select particular moves when certain patterns are detected.
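To give a flavour of Monte-Carlo rollouts, here is a deliberately simplified Python sketch. It is not MoGo's method: it uses plain uniform random playouts with a fixed number of samples per move, with no guided sampling and no pattern rules. The toy game (take 1 to 3 stones, whoever takes the last stone wins) and all function names are assumptions for illustration.

```python
import random
from typing import Dict, List

def legal_moves(n: int) -> List[int]:
    return [k for k in (1, 2, 3) if k <= n]

def rollout(n: int, my_turn: bool, rng: random.Random) -> bool:
    """Play uniformly random moves to the end of the game.
    Returns True if we are the one who takes the last stone."""
    while True:
        n -= rng.choice(legal_moves(n))
        if n == 0:
            return my_turn          # whoever just moved has won
        my_turn = not my_turn

def monte_carlo_move(n: int, rollouts_per_move: int = 200, seed: int = 0) -> int:
    """Pick the move with the highest win rate over random playouts."""
    rng = random.Random(seed)
    win_rate: Dict[int, float] = {}
    for a in legal_moves(n):        # every move is sampled: nothing is pruned
        if n - a == 0:
            win_rate[a] = 1.0       # taking the last stone wins immediately
            continue
        wins = sum(rollout(n - a, False, rng) for _ in range(rollouts_per_move))
        win_rate[a] = wins / rollouts_per_move
    return max(win_rate, key=lambda a: win_rate[a])

print(monte_carlo_move(3))   # 3: taking all three stones wins on the spot
```

No evaluation function is needed: a position is scored only by how often random play from it ends in a win. Stronger programs spend more samples on moves that have done well so far instead of sampling every move equally.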


Summary
Understand how to define game playing as a search problem.
Be able to implement the Minimax algorithm with alpha-beta pruning.
Understand the use of an evaluation function.
Read all of Ch.5 in detail.
