Instructors:
All the above are 1-agent problems. Today: Adversarial search (games with 2 players)
COMP-424: Artificial intelligence 2 Joelle Pineau
Overview
Why games?
Minimax search
Evaluation functions
Alpha-beta pruning
State-of-the-art game playing programs
Game playing
One of the oldest, most well-studied domains in AI! Why?
People like them! And are good at playing them.
Many games are hard:
State spaces are very large and complicated. E.g. Chess has a branching factor of about 35 and games go to about 50 moves per player, so about $35^{100}$ nodes in the search tree!
Sometimes there is stochasticity and imperfect information.
Real-time constraints (e.g. fixed amount of time between moves).
Othello:
1997: Logistello (NEC research) beat the human world champion.
Today: world champions refuse to play the AI computer program (because it's too good).
Chess:
1997: Deep Blue (IBM) beat world champion Garry Kasparov.
Backgammon:
TD-Gammon (IBM) is world champion amongst humans and computers
Go:
Human champions refuse to play the top AI player (because it's too weak).
Bridge:
Still out of reach for AI players because of coordination issue.
Types of games
Perfect vs Imperfect information
Perfect: See the exact state of the game. E.g. chess, backgammon, checkers, go, othello
Imperfect: Information is hidden. E.g. scrabble, bridge, most card games
Deterministic vs Stochastic
Deterministic: Change in state is fully determined by player move. E.g. chess
Stochastic: Change in state is partially determined by chance. E.g. backgammon
We want to find a strategy (i.e. way of picking moves) that wins the game.
Key idea:
Define a max player (who wants to maximize its utility) and a min player (who wants to minimize it).
Minimax search
Expand the complete search tree, until terminal states have been reached and their utilities computed.
Go back up from the leaves towards the current state of the game:
At each min node: back up the worst value among the children.
At each max node: back up the best value among the children.
In detail:
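A minimal Python sketch of the backup procedure, assuming the game tree is given explicitly as nested lists whose leaves are utilities for the max player. The representation and the name `minimax` are illustrative assumptions, not taken from the lecture.

```python
def minimax(node, is_max):
    """Return the minimax value of `node`.

    Assumption for illustration: a node is either a number (terminal
    utility, from the max player's point of view) or a list of children.
    """
    if not isinstance(node, list):      # terminal state: utility is known
        return node
    values = [minimax(child, not is_max) for child in node]
    # A max node backs up the best child value, a min node the worst.
    return max(values) if is_max else min(values)


# Two-ply example: max moves at the root, min at the next level.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3
```

The three min nodes back up 3, 2 and 2, so the max player's best guaranteed outcome at the root is 3.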
Optimal?
Against an optimal opponent (otherwise we don't know).
Time complexity?
$O(b^m)$
Space complexity?
$O(b^m)$ if we re-use the search tree between moves
$O(bm)$ if we use DFS
Why not use Minimax to solve chess?
In chess: $b \approx 35$, $m \approx 100$, so an exact solution is impossible!
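To get a feel for the size of this number, a quick check in Python (the values of `b` and `m` are the rough chess estimates above):

```python
import math

b, m = 35, 100                        # branching factor and depth in plies
nodes = b ** m                        # Python integers have arbitrary precision
print(len(str(nodes)))                # 155 decimal digits
print(round(m * math.log10(b), 1))    # 154.4, i.e. roughly 10^154 nodes
```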
Possible approach:
Use a cutoff test (e.g. based on depth limit).
Use an evaluation function for the nodes where we cut off the search.
Minimax cutoff algorithm: Same as standard Minimax, except stop at some maximum depth m and use the evaluation function on those nodes.
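\[
\text{H-Minimax}(s,d) =
\begin{cases}
\text{EVAL}(s) & \text{if Cutoff-Test}(s,d) = \text{True} \\
\max_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Results}(s,a),\, d+1) & \text{if Player}(s) = \text{Max-player} \\
\min_{a \in \text{Actions}(s)} \text{H-Minimax}(\text{Results}(s,a),\, d+1) & \text{if Player}(s) = \text{Min-player}
\end{cases}
\]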
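A sketch of the depth-limited recursion on a toy game, to show where the cutoff test and the evaluation function plug in. The game (a pile of stones, take 1 to 3 per move, whoever takes the last stone wins), the state encoding `(stones, max_to_move)` and the placeholder heuristic that returns 0 for undecided positions are all assumptions made for illustration.

```python
def actions(state):
    stones, _ = state
    return [k for k in (1, 2, 3) if k <= stones]


def result(state, a):
    stones, max_to_move = state
    return (stones - a, not max_to_move)


def evaluate(state):
    """Exact utility at terminal states, a crude guess elsewhere."""
    stones, max_to_move = state
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if max_to_move else 1
    return 0  # placeholder heuristic (assumption): "unknown"


def cutoff_test(state, d, max_depth):
    return state[0] == 0 or d >= max_depth


def h_minimax(state, d, max_depth):
    if cutoff_test(state, d, max_depth):
        return evaluate(state)
    values = [h_minimax(result(state, a), d + 1, max_depth)
              for a in actions(state)]
    return max(values) if state[1] else min(values)


print(h_minimax((5, True), 0, 2))    # 0: the win is beyond the search horizon
print(h_minimax((5, True), 0, 10))   # 1: a full search finds the winning line
```

With a shallow cutoff the result is only as good as the evaluation function, which is the motivation for the next section.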
Evaluation functions
An evaluation function v(s) represents the goodness of a board state (e.g. chance of winning from that position).
If the features of the board can be evaluated independently, use a weighted linear function:
$v(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$ (where $s$ is the board state)
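As an illustration, a weighted linear evaluation based only on chess material. The feature choice (piece-count differences), the conventional 1/3/3/5/9 piece values and the dictionary-based state are assumptions for this sketch, not the lecture's evaluation function.

```python
# Illustrative weights w_i: the conventional material values.
WEIGHTS = {"pawn": 1.0, "knight": 3.0, "bishop": 3.0, "rook": 5.0, "queen": 9.0}


def material_features(state):
    """f_i(s): piece-count difference (ours minus theirs) per piece type."""
    return {p: state["ours"].get(p, 0) - state["theirs"].get(p, 0)
            for p in WEIGHTS}


def evaluate(state):
    """v(s) = sum_i w_i * f_i(s)."""
    f = material_features(state)
    return sum(WEIGHTS[p] * f[p] for p in WEIGHTS)


s = {"ours":   {"pawn": 8, "rook": 2, "queen": 1},
     "theirs": {"pawn": 7, "knight": 1, "rook": 2, "queen": 1}}
print(evaluate(s))  # -2.0: one pawn up (+1), one knight down (-3)
```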
Example: Chess
Only the order of the numbers matters: payoffs in deterministic games act as an ordinal utility function.
Is this useful?
4 moves ahead: novice player
8 moves ahead: human master, typical PC
12 moves ahead: Deep Blue, Kasparov
Key idea:
Search few lines of play, but search them deeply. Need pruning!
α-β Pruning example
[Figure: a two-ply game tree built up step by step over six slides. Leaf values shown: 3, 12, 8 under the first min node; 2 under the second, with its remaining leaves pruned (marked X); 14, 5, 2 under the third. Backed-up value at the root: 3.]
α-β Pruning
Simple idea: if a path looks worse than what we already have, discard it.
Standard technique for deterministic, perfect information games.
Algorithm is like Minimax, but keeps track of the best leaf value for our player (α) and the best one for the opponent (β).
If the best move at a node cannot change (regardless of what we would find by searching) then no need to search further!
Provably correct (i.e. won't cut off a good branch) given the evaluation function.
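A minimal sketch of α-β pruning over an explicit tree, assuming nested lists with leaf utilities for the max player. The optional `visited` list is a hypothetical helper that records which leaves were actually evaluated. The example uses the leaf values 3, 12, 8 / 2, 4, 6 / 14, 5, 2, where 4 and 6 are assumed stand-ins for leaves that end up pruned.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot matter.

    alpha: best value found so far for the max player on this path.
    beta:  best value found so far for the min player on this path.
    """
    if not isinstance(node, list):          # leaf: utility is known
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:               # min would never allow this node
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:                   # max already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, True, visited=seen))  # 3
print(seen)                                 # [3, 12, 8, 2, 14, 5, 2]
```

Once the second min node sees the leaf 2, its value can be at most 2, which is already worse for max than the 3 from the first branch, so the leaves 4 and 6 are never evaluated.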
Properties of α-β pruning
Pruning does not affect the final result! But it can greatly increase efficiency!!
Good move ordering is key to the effectiveness of pruning.
With perfect ordering, time complexity is $O(b^{m/2})$.
This means double the search depth, for the same resources.
In chess: this is the difference between a novice and an expert player.
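A short worked calculation of why the exponent $m/2$ doubles the reachable depth, assuming a fixed budget of $N$ node expansions ($N$ and $m'$ are symbols introduced here for illustration):

```latex
\[
b^{m} = N \;\Rightarrow\; m = \log_b N,
\qquad
b^{m'/2} = N \;\Rightarrow\; m' = 2\log_b N = 2m .
\]
% Example: b = 35, N = 35^4 = 1{,}500{,}625 \approx 1.5 \times 10^{6}.
% Plain Minimax reaches m = 4; alpha-beta with perfect ordering reaches m' = 8.
```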
α-β pruning demonstrates the value of reasoning about which computations are important!
Forward pruning
Another simple idea:
Only explore n best moves for current state (according to the evaluation function).
Unlike α-β pruning, this can lead to a sub-optimal solution.
Can be very efficient with a good evaluation function.
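A small sketch of forward pruning, assuming each node is a pair `(static_value, children)` where `static_value` is the evaluation function's estimate for that position (for leaves, the true utility). The node format and the function name are hypothetical. The example is built so that the static estimates are misleading, which shows how the result can become sub-optimal.

```python
def forward_pruned_minimax(node, is_max, n):
    """Minimax that only explores the n most promising children of each node."""
    static_value, children = node
    if not children:
        return static_value
    # Rank children by static evaluation from the mover's point of view:
    # highest first for max, lowest first for min. Keep only the n best.
    ranked = sorted(children, key=lambda c: c[0], reverse=is_max)
    values = [forward_pruned_minimax(c, not is_max, n) for c in ranked[:n]]
    return max(values) if is_max else min(values)


# Move A looks good statically (5) but is really worth 1;
# move B looks worse (2) but is really worth 4.
A = (5, [(1, []), (6, [])])
B = (2, [(4, []), (7, [])])
root = (0, [A, B])

print(forward_pruned_minimax(root, True, n=1))  # 1: B was pruned away
print(forward_pruned_minimax(root, True, n=2))  # 4: same as full Minimax
```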
α-β search with a linear evaluation function.
Evaluation function had to be developed from scratch:
hand-selected features, and weights tuned by learning.
Why is Go so hard?
Computer Go players are at Master level for the 9x9 board, but at advanced amateur level for the full 19x19 board.
Branching factor > 300, so search is very difficult.
Difficult to write a good evaluation function until the endgame.
Top programs, such as MoGo, avoid alpha-beta search and instead use Monte-Carlo rollouts:
Sample random moves in the first few iterations.
Guide the sampling process to prefer moves that have led to wins in previous samples (but don't prune! i.e. still consider all moves).
Use knowledge-based rules to select particular moves when certain patterns are detected.
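A toy sketch of rollout-based move selection, far simpler than a real Go program. It assumes a stone-taking game (take 1 to 3 stones, whoever takes the last stone wins) and uses the UCB1 rule as one possible way to prefer moves that have won before while still sampling every move. The choice of UCB1, the game and all names are assumptions for illustration, and the knowledge-based pattern rules are left out.

```python
import math
import random


def legal_moves(stones):
    return [k for k in (1, 2, 3) if k <= stones]


def rollout(stones, our_turn, rng):
    """Play uniformly random moves to the end; True if we take the last stone."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return our_turn
        our_turn = not our_turn


def ucb1(wins, visits, t):
    """Win rate plus an exploration bonus; untried moves come first."""
    if visits == 0:
        return math.inf
    return wins / visits + math.sqrt(2 * math.log(t) / visits)


def choose_move(stones, n_samples=3000, seed=0):
    rng = random.Random(seed)
    moves = legal_moves(stones)
    wins = {m: 0 for m in moves}
    visits = {m: 0 for m in moves}
    for t in range(1, n_samples + 1):
        # No move is ever discarded; promising ones are just sampled more.
        m = max(moves, key=lambda mv: ucb1(wins[mv], visits[mv], t))
        remaining = stones - m
        won = True if remaining == 0 else rollout(remaining, False, rng)
        visits[m] += 1
        wins[m] += won
    return max(moves, key=lambda mv: visits[mv])  # most-sampled move


print(choose_move(5))
```

From 5 stones, taking 1 leaves the opponent with 4, which is a lost position under perfect play. Random rollouts also favor that move (a win rate of about 2/3 against 1/2 for the alternatives), so it should collect the most samples.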
Summary
Understand how to define game playing as a search problem.
Be able to implement the Minimax algorithm with alpha-beta pruning.
Understand the use of an evaluation function.
Read all of Ch.5 in detail.