Heiko Hotz
Contents
1 Introduction
  1.1 Game Theory – What is it?
  1.2 Game Theory – Where is it applied?
2 Definitions
  2.1 Normal Form Games
  2.2 Extensive Form Games
  2.3 Nash Equilibrium
    2.3.1 Best Response
    2.3.2 Localizing a Nash Equilibrium in a Payoff Matrix
  2.4 Mixed Strategies
3 Games
  3.1 Prisoner's Dilemma (PD)
    3.1.1 Other Interesting Two-person Games
  3.2 The Ultimatum Game
  3.3 Public Good Game
  3.4 Rock, Paper, Scissors
4 Evolutionary Game Theory
5 Applications
  5.1 Evolution of cooperation
  5.2 Biodiversity
A Mathematical Derivation
  A.1 Normal form games
  A.2 Nash equilibrium and best response
  A.3 Evolutionary stable strategies (ESS)
  A.4 Replicator equation
  A.5 Evolutionary stable state of the Hawk-Dove game
B Program Code
  B.1 Spatial PD
  B.2 Hawk-Dove game
1 Introduction
Despite the deep insights he gained from game theory's applications to economics, von Neumann was mostly interested in applying his methods to politics and warfare, an interest perhaps stemming from his favorite childhood game, Kriegspiel, a chess-like military simulation. He used his methods to model the Cold War interaction between the U.S. and the USSR, picturing them as two players in a zero-sum game. He sketched out a mathematical model of the conflict from which he deduced that the Allies would win, applying some of the methods of game theory to his predictions.

There are many more applications in the sciences already mentioned, as well as in fields such as sociology, philosophy, psychology and cultural anthropology. It is not possible to list them all in this paper; more information can be obtained from the references at the end.

3. A payoff function, which assigns a certain payoff to each player depending on his strategy and the strategies of the other players (e.g. in the Prisoner's Dilemma, the time each of the players has to spend in prison).

The payoff function assigns each player a certain payoff depending on his strategy and the strategies of the other players. If the number of players is limited to two and their sets of strategies consist of only a few elements, the outcome of the payoff function can be represented in a matrix, the so-called payoff matrix, which shows the two players, their strategies and their payoffs.

Example:

Player1\Player2    L       R
U                  1, 3    2, 4
D                  1, 0    3, 3
node that represents the start of the game. Any node that has only one edge connected to it is a terminal node and represents the end of the game (and also a strategy profile). Every non-terminal node belongs to a player in the sense that it represents a stage in the game at which it is that player's move. Every edge represents a possible action that can be taken by a player. Every terminal node has a payoff for every player associated with it. These are the payoffs for every player if the combination of actions required to reach that terminal node is actually played.

Example:

strategy choices and the corresponding payoffs constitute a Nash equilibrium. John Nash showed in 1950 that every game with a finite number of players and a finite number of strategies has at least one mixed-strategy Nash equilibrium.

2.3.1 Best Response

The best response is the strategy (or strategies) which produces the most favorable immediate outcome for the current player, taking the other players' strategies as given. With this definition, we can now determine the Nash equilibrium in a normal form game very easily by using the payoff matrix. The formal proof that this procedure leads to the desired result is given in appendix A.2.
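The procedure just described — mark each player's best responses in the payoff matrix and look for a cell in which both players' responses are marked — can be sketched as follows. This is an illustration, not part of the paper's program code; the payoff values are the Prisoner's Dilemma values used in 2.3.2 and 3.1:

```python
# Find all pure-strategy Nash equilibria of a two-player game by
# marking best responses, as described in the text.
def pure_nash_equilibria(rows, cols, u1, u2):
    """u1[r][c], u2[r][c]: payoffs of players 1 and 2; returns NE cells."""
    equilibria = []
    for r in rows:
        for c in cols:
            # r must be a best response of player 1 against c ...
            best1 = max(u1[rr][c] for rr in rows)
            # ... and c a best response of player 2 against r
            best2 = max(u2[r][cc] for cc in cols)
            if u1[r][c] == best1 and u2[r][c] == best2:
                equilibria.append((r, c))
    return equilibria

u1 = {"C": {"C": 3, "D": 0}, "D": {"C": 5, "D": 1}}  # row player
u2 = {"C": {"C": 3, "D": 5}, "D": {"C": 0, "D": 1}}  # column player
print(pure_nash_equilibria("CD", "CD", u1, u2))  # -> [('D', 'D')]
```

Applied to the matching-pennies matrix of section 2.4, the same function returns an empty list: that game has no pure-strategy Nash equilibrium.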
that player 2 plays D (the best response in this case is D, since C gains payoff 0 whereas D gains payoff 1), then we do the same for player 2 for a given strategy of player 1 and we get (best responses marked with an asterisk):

        C        D
C       3, 3     0, 5*
D       5*, 0    1*, 1*

The Nash equilibrium is then determined by the matrix element in which both players marked their best responses. Thus, the strategy pair that constitutes a Nash equilibrium is defection by both players, because if either player changed his strategy to C while the other stays with D, he would get a lower payoff.

2.4 Mixed Strategies

Consider the following payoff matrix, which corresponds to the game Matching Pennies:

Player1\Player2    Head      Tail
Head               1*, -1    -1, 1*
Tail               -1, 1*    1*, -1

The best responses are already marked, and it is obvious that there is no matrix cell in which both players marked their best response.

What do game theorists make of a game without a Nash equilibrium? The answer is that there are more ways to play the game than are represented in the matrix. Instead of simply choosing Head or Tail, a player can just flip the coin to decide what to do. This is an example of a mixed strategy, which simply means a particular way of choosing randomly among the different strategies.

The mixed strategy equilibrium of the matching pennies game is well known: each player should randomize 50-50 between the two alternatives.

Mixed strategy equilibrium points out an aspect of Nash equilibrium that is often confusing for beginners. Nash equilibrium does not require a positive reason for playing the equilibrium strategy. In matching pennies, the two players are indifferent: they have no positive reason to randomize 50-50 rather than doing something else. However, it is only an equilibrium if they both happen to randomize 50-50. The central thing to keep in mind is that Nash equilibrium does not attempt to explain why players play the way they do. It merely proposes a way of playing so that no player would have an incentive to play differently.

If, for example, player 1 chooses to play Head with a probability of 80 % and Tail with a probability of 20 %, then player 2 will eventually anticipate the opponent's strategy. Hence he will play Tail every time. This will lead to a positive payoff of 0.8 · 1 + 0.2 · (−1) = 0.6 per game for player 2, and correspondingly a payoff of −0.6 for player 1. The best one can do in such a fair zero-sum game is a payoff of 0, and this is achieved by playing 1/2 Head + 1/2 Tail.

Mathematically speaking, a mixed strategy is just a linear combination of the pure strategies.

3 Games

Now I want to present some of the most studied games of game theory.

3.1 Prisoner's Dilemma (PD)

We have already used the payoff matrix of the Prisoner's Dilemma in 2.3.2 to localize the Nash equilibrium of this game, and now I want to tell the story behind this famous dilemma:

Two suspects are arrested by the police.
The police have insufficient evidence for
a conviction, and, having separated both prisoners, an officer visits each of them to offer the same deal: if one testifies for the prosecution against the other and the other remains silent, the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both stay silent, the police can sentence both prisoners to only six months in jail for a minor charge. If each betrays the other, each will receive a two-year sentence. Each prisoner must make the choice of whether to betray the other or to remain silent. However, neither prisoner knows for sure what choice the other prisoner will make. So the question this dilemma poses is: How will the prisoners act?

We will use the following abbreviations: To testify means to betray the other suspect and thus to defect (D); to remain silent means to cooperate (C) with the other suspect. And for the sake of clarity, we want to use positive numbers in the payoff matrix.

        C           D
C       R=3, R=3    S=0, T=5
D       T=5, S=0    P=1, P=1

• R is a Reward for mutual cooperation. Therefore, if both players cooperate then both receive a reward of 3 points.

• If one player defects and the other cooperates then the defector receives the Temptation to defect payoff (5 in this case) and the other player (the cooperator) receives the Sucker payoff (zero in this case).

• If both players defect then they both receive the Punishment for mutual defection payoff (1 in this case).

As we have already seen, the logical move for both players is defection (D). The dilemma lies in the fact that the best result for player 1 and player 2 as a group (R = 3 for both) can't be achieved.

In defining a PD, certain conditions have to hold. The values we used above to demonstrate the game are not the only values that could have been used, but they do adhere to the conditions listed below.

Firstly, the order of the payoffs is important. The best a player can do is T (temptation to defect). The worst a player can do is to get the sucker payoff, S. If the two players cooperate then the reward for that mutual cooperation, R, should be better than the punishment for mutual defection, P. Therefore, the following must hold:

T > R > P > S.

In repeated interactions, another condition is additionally required: players should not be allowed to get out of the dilemma by taking it in turns to exploit each other. Or, to be a little more pedantic, the players should not play the game so that they end up half the time being exploited and the other half of the time exploiting their opponent. In other words, an even chance of being exploited or doing the exploiting is not as good an outcome as both players mutually cooperating. Therefore, the reward for mutual cooperation should be greater than the average of the payoffs for the temptation and the sucker. That is, the following must hold:

R > (S + T)/2

3.1.1 Other Interesting Two-person Games

Depending on the order of R, T, S, and P, we can have different games. Most are
trivial, but two games stand out:

• Chicken (T > R > S > P)

        C           D
C       R=2, R=2    S=1, T=3
D       T=3, S=1    P=0, P=0

Example: Two drivers with something to prove drive at each other on a narrow road. The first to swerve loses face among his peers (the chicken). If neither swerves, however, the obvious worst case will occur.

• Stag Hunt (R > T > P > S)

        C           D
C       R=3, R=3    S=0, T=2
D       T=2, S=0    P=1, P=1

Example: Two hunters can either jointly hunt a stag or individually hunt a rabbit. Hunting stags is quite challenging and requires mutual cooperation. Both need to stay in position and not be tempted by a running rabbit. Hunting stags is most beneficial for society but requires a lot of trust among its members. The dilemma exists because each hunter is afraid of the other's defection; hence the game is also called the trust dilemma.

3.2 The Ultimatum Game

Imagine you and a friend of yours are walking down the street, when suddenly a stranger stops you and wants to play a game with you: He offers you 100 $ and you have to agree on how to split this money. You, as the proposer, make an offer to your friend, the responder. If he accepts your offer, the deal goes ahead. If your friend rejects, neither player gets anything. The stranger will take back his money and the game is over.

Obviously, rational responders should accept even the smallest positive offer, since the alternative is getting nothing. Proposers, therefore, should be able to claim almost the entire sum. In a large number of human studies, however, conducted with different incentives in different countries, the majority of proposers offer 40 to 50 % of the total sum, and about half of all responders reject offers below 30 %.

3.3 Public Good Game

A group of 4 people are given $ 200 each to participate in a group investment project. They are told that they can keep any money they do not invest. The rules of the game are that every $ 1 invested will yield $ 2, but these proceeds are distributed to all group members. If everyone invested, each would get $ 400. However, if only one person invested, that "sucker" would take home a mere $ 100. Thus, the assumed Nash equilibrium could be the combination of strategies in which no one invests any money. And we can show that this is indeed the Nash equilibrium.

We will not display this game in a payoff matrix, since each player's set of strategies is too big (the strategy sn is given by the amount of money that player n wants to contribute, e.g. s1 = 10 means that player 1 invests $ 10). Nevertheless, this is a game in normal form and therefore it has a payoff function for each player. The payoff function for, let's say, player 1 is given by

P = 2 · (s1 + s2 + s3 + s4) / 4 − s1
  = 2 · (s2 + s3 + s4) / 4 − 0.5 · s1

But this means that every investment s1 of player 1 will diminish his payoff. Therefore, a rational player will choose
4 Evolutionary Game Theory
Thus a mixed strategy with a probability V/C of playing Hawk and a probability 1 − V/C of playing Dove is evolutionarily stable, i.e. it cannot be invaded by players playing one of the pure strategies Hawk or Dove.

4.3 The Replicator Dynamics

As mentioned before, the main difference between EGT and classical game theory is the investigation of dynamic processes. In EGT, we are interested in the dynamics of a population, i.e. how the population evolves over time.

Let us now consider a population consisting of n types, and let xi(t) be the frequency of type i. Then the state of the population is given by the vector x(t) = (x1(t), . . . , xn(t)).

We now want to postulate a law of motion for x(t). If individuals meet randomly and then engage in a symmetric game with payoff matrix A, then (Ax)i is the expected payoff for an individual of type i and x^T Ax is the average payoff in the population state x.

The evolution of x over time is described by the replicator equation:

ẋi = xi [(Ax)i − x^T Ax]    (1)

The replicator equation describes a selection process: more successful strategies spread in the population. A derivation of the replicator equation is given in appendix A.4.

4.4 ESS and Replicator Dynamics

We have seen that D is no ESS at all and that for V > C, H is an ESS. We have also seen that for V < C the ESS is a mixed ESS with a probability of V/C for playing H and a probability of 1 − V/C for playing D.

By setting ẋi = 0, we obtain the evolutionarily stable states of a population. A population is said to be in an evolutionarily stable state if its genetic composition is restored by selection after a disturbance, provided the disturbance is not too large.

If this equation is applied to the Hawk-Dove game, the result is the following: For V > C, the only evolutionarily stable state is a population consisting of hawks. For V < C, a mixed population with a fraction V/C of hawks and a fraction 1 − V/C of doves is evolutionarily stable. This result is derived in appendix A.5.

At this point, one may see little difference between the two concepts of evolutionary game theory. We have confirmed that, for the Hawk-Dove game and for V > C, the strategy Hawk is the only ESS. Since this state is also the only stable equilibrium under the replicator dynamics, the two notions fit together quite neatly: the only stable equilibrium under the replicator dynamics occurs when everyone in the population follows the only ESS. In general, though, the relationship between ESSs and stable states of the replicator dynamics is more complex than this example suggests.

If only two pure strategies exist, then given a (possibly mixed) evolutionarily stable strategy, the corresponding state of the population is a stable state under the replicator dynamics. (If the evolutionarily stable strategy is a mixed strategy S, the corresponding state of the population is the state in which the proportion of the population following the first strategy equals the probability assigned to the first strategy by S, and the remainder follow the second strategy.) However, this can fail to be true if more than two pure strategies exist.
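The convergence of the replicator dynamics to this evolutionarily stable state can be checked numerically. The following sketch integrates equation (1) for the Hawk-Dove payoff matrix with a simple Euler scheme; Python is used for brevity (the paper's own code is NetLogo), and the values V = 2 and C = 4 are assumed example values with V < C:

```python
import numpy as np

# Numerical sketch of the replicator equation (1) for the Hawk-Dove game.
V, C = 2.0, 4.0
A = np.array([[(V - C) / 2, V],      # Hawk vs. Hawk, Hawk vs. Dove
              [0.0,         V / 2]]) # Dove vs. Hawk, Dove vs. Dove

x = np.array([0.1, 0.9])  # initial state: 10 % hawks, 90 % doves
dt = 0.01
for _ in range(10_000):
    fitness = A @ x        # (Ax)_i: expected payoff of each type
    avg = x @ fitness      # x^T Ax: average payoff in the population
    x = x + dt * x * (fitness - avg)

print(round(x[0], 3))  # fraction of hawks -> 0.5, i.e. V/C
```

Starting from 10 % hawks, the hawk fraction settles at V/C = 0.5, in agreement with the derivation in appendix A.5.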
The connection between ESSs and stable states under an evolutionary dynamical model is weakened further if we do not model the dynamics by the replicator dynamics.

In 5.1 we use a local interaction model in which each individual plays the Prisoner's Dilemma with his or her neighbors. Nowak and May, using a spatial model in which local interactions occur between individuals occupying neighboring nodes on a square lattice, showed that stable population states for the Prisoner's Dilemma depend upon the specific form of the payoff matrix.

5 Applications

5.1 Evolution of cooperation

As mentioned before, the evolution of cooperation is a fundamental problem in biology because unselfish, altruistic actions apparently contradict Darwinian selection. Nevertheless, cooperation is abundant in nature, ranging from microbial interactions to human behavior. In particular, cooperation has given rise to major transitions in the history of life. Game theory, together with its extensions to an evolutionary context, has become an invaluable tool to address the evolution of cooperation. The most prominent mechanisms of cooperation are direct and indirect reciprocity and spatial structure.

The mechanisms of reciprocity can be investigated very well with the Ultimatum game and also with the Public Good game. But the prime example for investigating spatially structured populations is the Prisoner's Dilemma.

Investigations of spatially extended systems have a long tradition in condensed matter physics. Among the most important features of spatially extended systems is the emergence of phase transitions. Their analysis can be traced back to the Ising model. The application of methods developed in statistical mechanics to interactions in spatially structured populations has turned out to be very fruitful. Interesting parallels between non-equilibrium phase transitions and spatial evolutionary game theory have added another dimension to the concept of universality classes.

We have already seen that the Nash equilibrium of the PD is to defect. But to overcome this dilemma, we consider spatially structured populations where individuals interact and compete only within a limited neighborhood. Such limited local interactions enable cooperators to form clusters, so that individuals along the boundary can outweigh their losses against defectors by gains from interactions within the cluster. Results for different population structures in the PD are discussed and related to condensed matter physics.

This problem has been investigated by Martin Nowak (Nature, 359, pp. 826-829, 1992). I programmed this scenario based on the investigations of Nowak. The program is written in NetLogo; the program code is given in appendix B.

5.2 Biodiversity

One of the central aims of ecology is to identify mechanisms that maintain biodiversity. Numerous theoretical models have shown that competing species can coexist if ecological processes such as dispersal, movement, and interaction occur over small spatial scales. In particular, this may be the case for nontransitive communities, that is, those without strict competitive hierarchies. The classic non-transitive system involves a community of three competing species satisfying a relationship similar to the children's game rock-paper-scissors,
gies current generation according to the rule:

xi(t + Δt) = xi(t) · (Ax)i / (x^T Ax) · Δt

for x^T Ax ≠ 0. Thus

xi(t + Δt) − xi(t) = xi(t) · [(Ax)i − x^T Ax] / (x^T Ax) · Δt

This yields the differential equation for Δt → 0:

ẋi = xi [(Ax)i − x^T Ax] / (x^T Ax)    (2)

for i = 1, . . . , n, with ẋi denoting the derivative of xi with respect to time.

The simplified equation

ẋi = xi [(Ax)i − x^T Ax]    (3)

has the same trajectories as (2), since every solution x(t) of (2) yields, according to the time transformation

t(s) = ∫_{s0}^{s} x(t)^T A x(t) dt,

a solution y(s) := x(t(s)) of (3). Equation (3) is called the replicator equation.

A.5 Evolutionary stable state of the Hawk-Dove game

We want to show that the replicator dynamics and the ESS yield the same result.

Since Hawk is denoted with x1, we will use the first component of the vector Ax. The second term, x^T Ax, delivers

x^T Ax = (p^2/2)(V − C) + p · V · (1 − p) + (V/2)(1 − p)^2

Thus:

ṗ = p [ (p/2)(V − C) + V(1 − p) − (p^2/2)(V − C) − p · V · (1 − p) − (V/2)(1 − p)^2 ]
  = p [ (C/2) p^2 − (1/2)(V + C) p + V/2 ]
  = (C/2) · p [ p^2 − ((V + C)/C) p + V/C ]

For the population to be in an evolutionarily stable state, we set the change of the population per time to zero, so that there is no change in time. This gives:

ṗ = 0
⇒ p [ p^2 − ((V + C)/C) p + V/C ] = 0

This is certainly true for p = 0; this is the trivial solution. Two other solutions can be obtained by evaluating the term in the brackets:

p^2 − ((V + C)/C) p + V/C = 0

This gives

p1,2 = (V + C)/(2C) ± √( (V^2 + 2VC + C^2)/(4C^2) − V/C )
     = (V + C)/(2C) ± √( (V^2 − 2VC + C^2)/(4C^2) )
     = (V + C)/(2C) ± (V − C)/(2C)

Thus:

p1 = 1,  p2 = V/C

p1 = 1 is another trivial solution; thus the only relevant result is p2 = V/C.
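The two roots can be checked numerically, here with the assumed example values V = 2 and C = 4 (so that V < C):

```python
# Quick numerical check of the roots derived above. The bracket of the
# last equation vanishes at p = 1 and at p = V/C.
V, C = 2.0, 4.0

def bracket(p):
    """The factor p^2 - ((V + C)/C) p + V/C from the derivation."""
    return p**2 - ((V + C) / C) * p + V / C

print(bracket(1.0))    # -> 0.0
print(bracket(V / C))  # -> 0.0
```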
B Program Code
B.1 Spatial PD
globals [movie_on?]
patches-own [num_D z z_prev score d score_h neighbor_h]
to setup
ca
;;1/3 will be defectors (z=1, red); 2/3 cooperators (z=0; blue)
ask patches [ifelse ((random 3) < 1) [set pcolor red set z 1][set pcolor blue set z 0]]
;;d=delta*(2.0*(random(1.0)-1.0))
ask patches [set d delta * (2.0 * (random-float 1.0) - 1.0)]
set movie_on? false
end
to single-D
ca
ask patches [set pcolor blue set z 0]
ask patches [set d delta * (2.0 * (random-float 1.0) - 1.0)]
ask patch 0 0 [set pcolor red set z 1]
set movie_on? false
end
to go
play-game
update
if (movie_on?) [movie-grab-view]
end
to play-game
  ask patches [set z_prev z]
  ask patches
  [
    ; num_D = number of the 8 neighbors that are defectors
    set num_D nsum z
    ifelse z = 1
    [
      ; defector: T per cooperating neighbor (8 - num_D),
      ; P per defecting neighbor plus self-interaction (num_D + 1)
      set score (T * (8 - num_D)) + (P * (num_D + 1))
    ]
    [
      ; cooperator: R per cooperating neighbor plus itself (9 - num_D),
      ; S per defecting neighbor
      set score (R * (9 - num_D)) + (S * num_D)
    ]
  ]
end
to update
  ; find the neighbor with the highest payoff
  ask patches
  [
    ; neighbor_h = the neighbor with the highest score
    set neighbor_h max-one-of neighbors [score]
    ; score_h = highest score among the neighbors
    set score_h score-of neighbor_h
  ]
  ask patches
  [
    ; the ifelse below is partly reconstructed: the extracted listing
    ; lost the comparison of the neighbor's score with the patch's own
    ifelse score_h > score
    [
      ifelse z_prev = 0
      [
        ; if this patch is a cooperator:
        ; if the neighbor with the highest score is a defector:
        ; become defector (z=1, yellow); else: stay cooperator
        ifelse (z_prev-of neighbor_h) = 1 [set pcolor yellow set z 1][set pcolor blue]
      ]
      [
        ; if this patch is a defector (reconstructed branch):
        ; if the neighbor with the highest score is a cooperator:
        ; become cooperator (z=0, green); else: stay defector
        ifelse (z_prev-of neighbor_h) = 0 [set pcolor green set z 0][set pcolor red]
      ]
      set d delta * (2.0 * (random-float 1.0) - 1.0)
    ]
    [
      ; if own score is the highest:
      ; if cooperator: set color blue, otherwise red
      ifelse z = 0 [set pcolor blue][set pcolor red]
    ]
  ]
end
to perturb
  if mouse-down?
  [
    ; patch (not patch-at) is needed here: mouse-xcor and mouse-ycor
    ; are absolute coordinates
    ask patch mouse-xcor mouse-ycor [set z 1 - z ifelse z = 0 [set pcolor blue][set pcolor red]]
    ; need to wait for a while; otherwise the procedure runs
    ; several times after a single mouse click
    wait 0.5
  ]
end
to movie_start
movie-start "out.mov"
set movie_on? true
end
to movie_stop
  movie-close
  set movie_on? false
end
B POGRAM CODE 19
; breed and variable declarations (missing from the extracted listing)
breeds [hawks doves]
turtles-own [energy]

to setup
  ca
  set-default-shape hawks "hawk"
  set-default-shape doves "butterfly"
  createH n_hawks
  createD n_doves
end
to go
ask turtles
[
move
fight
reproduce
]
do-plot
end
to createH [num_hawks]
create-custom-hawks num_hawks
[
set color red
set size 1.0
setxy random-xcor random-ycor
]
end
to createD [num_doves]
create-custom-doves num_doves
[
set color white
set size 1.0
setxy random-xcor random-ycor
]
end
to fight
  ; body not recovered in this extract: hawks and doves meet and
  ; adjust their energy according to the Hawk-Dove payoffs
end
to reproduce
if energy > reproduce_limit
[
set energy (energy / 2)
hatch 1 [rt random 360 fd 1]
]
if energy < 0
[die]
end
to do-plot
  set-current-plot "ratio"
  set-current-plot-pen "ratio"
  if any? turtles
  [plot count hawks / count turtles]
end
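The fight procedure is not fully recovered in this listing; the underlying Hawk-Dove contest it implements can be sketched as follows (an illustration in Python, with assumed example values V = 2 and C = 4):

```python
import random

# One Hawk-Dove contest: V is the value of the resource, C the cost of
# an escalated fight; both are assumed example values with V < C.
V, C = 2.0, 4.0
rng = random.Random(0)

def contest(a, b):
    """Return the payoff pair for two players with strategies 'H' or 'D'."""
    if a == "H" and b == "H":
        # both escalate: the winner takes V, the loser bears the cost C,
        # so the expected payoff of each player is (V - C) / 2
        return (V, -C) if rng.random() < 0.5 else (-C, V)
    if a == "H":      # the hawk takes the resource from the dove
        return (V, 0.0)
    if b == "H":
        return (0.0, V)
    # two doves share the resource
    return (V / 2, V / 2)

print(contest("D", "D"))  # -> (1.0, 1.0)
```

In a full simulation these payoffs would be added to each turtle's energy, which the reproduce procedure above then uses.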
References

• Maynard Smith, J. Evolution and the Theory of Games, Cambridge University Press (1982).
• Hauert, C. and Szabó, G. American Journal of Physics, 73, pp. 405-414 (2005).
• Nowak, M., Page, K. and Sigmund, K. Science, 289, pp. 1773-1775 (2000).
• Kerr, B., Riley, M. A., Feldman, M. W. and Bohannan, B. J. M. Nature, 418, pp. 171-174 (2002).
• http://cse.stanford.edu/class/sophomore-college/projects-98/game-theory/neumann.html
• http://plato.stanford.edu/entries/game-evolutionary/
• http://www.economist.com/printedition/displayStory.cfm?Story_ID=1045223
• http://www.dklevine.com/general/cogsci.htm