Sie sind auf Seite 1von 20

Chapter II: Monte Carlo Simulations

Modelling and Simulation, M.Biehl, 2005

1. Random Walks
space/time discrete processes
continuous formulation
2. Diffusion Limited Aggregation (DLA)
Witten Sander algorithm
fractal properties of the aggregate
the growth site probability distribution
electrostatic analagon, Dielectric Breakdown Model (DBM)
3. Simulated Annealing and Monte Carlo (MC) simulations
Stochastic optimization
The Travelling Salesperson Problem (TSP)
MC simulations in statistical physics

II.3. Stochastic Optimization


conventional optimization problems can be solved by standard methods
e.g. linear or quadratic programming problems of the type
n
minimize f (~x) subject to {gi(~x) = 0}ki=1, {hj (~x) 0}m
for
~
x

I
R
j=1

by exact or numerical methods, e.g. gradient based search

difficult, hard, complex optimization problems: (one or several of the features)


highly non-linear functions f
f with many local minima
many variables, i.e. high-dimensional vectors ~x
n
discrete variables, e.g. ~x {0, 1}
standard methods (gradient based, etc.) often fail,
search for and identification of global minima is very difficult

An example: The Travelling Salesperson Problem n


o
consider a map with N cities at given (irregular) positions ~ri = (xi, yi)T
i=1,...,N
and distances d(i, j) = |~ri ~rj |
Fig. 1:
Left: 50 cities, at random positions in [0, 1] [0, 1]
Right: a path connecting all
cities in randomized order

Optimization Problem:
find the shortest round trip which visits every city exactly once
the length ` of a trip is a function of the vector ~o = (o1, o2, . . . , oN )T
which contains each i (1 i N ) exactly once and defines the order
PN
in which the cities are visited:
`(~o) = k=1 d(ok , ok+1) + d(oN , o1)

for N cities, one expects ` to be on the order N

wherever numbers are given, they correspond to `/ N .

difficulties:
the number of possible (different) paths is (N 1)!/2, grows like N N for large N
exhaustive search of all paths is unrealistic for large N

for given {~ri}, the function `(~o) can have many local minima (see below)
greedy algorithms (steepest descent) will fail

Heuristic strategy: stochastic descent procedure


0.

start with a random path ~o(0) with length `(0)

1.

given ~o(t) with `(t), suggest a small change, e.g.:


select two cities j and k randomly
cut the segment of the path between j and k
re-insert the segment in reverse order

here, ` increases from left to right

2.

calculate the potential new path length `(t + 1)


if `(t + 1) < `(t) accept the move (descent algorithm)
else, let the path unchanged

iterate 1,2 until the path does not change anymore

Matlab code: stochastic optimization (TSP)


%
%
%
%
%
%

function anneal = tsp(n,maxsteps,temp,met)


%
tsp(n,ms,temp,method) tries to find the shortest path
that connects n randomly placed cities
method=1 (2) corresponds to Metropolis (threshold) algorithm
ms*100 is the total number of performed steps
temp is the initial temperature, after each 100 steps it
is decreased by 1%.
temps = temp;
% intial temperature
lt = zeros(1,ceil(maxsteps));
tt = 1:ceil(maxsteps);

close all;
initialize random number generator and draw coordinates
rand(state,0); cities = rand(n,2);
order = [1:n]; op = path(order,cities);
for jstep=1:ceil(maxsteps);
lower temperature by 0.1 percent
temp = temp*0.999;
for ins = 1:100
j = ceil(rand*n); len = ceil(rand*(n/2));
cand = reverse(order,j,len);
evaluate change of path length
diff = delta(order,cities,j,j+len);
np
= op + diff;
met=1: threshold, met=2: metropolis
if ( (met==1 && (rand<exp(-diff/temp))||(diff<0)) || ...
(met==2 && diff<temp))
order = cand;
op = np;
end
end
rescale length of path by sqrt(n) for output purposes

lt(jstep) = op/sqrt(n);
curlen = path(order,cities)/sqrt(n);
plot map, cities and path
figure(1); plotcities(order,cities);
title([n =,num2str(n,%3.0f),
...

t =,num2str(jstep*100,%8.0f), ...

l =,num2str(curlen,%4.4f), ...

T =,num2str(temp,%6.6f)],
...
fontsize,16);
if (met==1)
xlabel([Metropolis algorithm, annealing],fontsize,16);
else
xlabel([Threshold algorithm, ...

T(0)=,num2str(temps,%4.4f)], ...
fontsize,16);
end
end
plot evolution of length versus iteration step
figure(2); plot(0,0); hold on;
plot(tt,lt,k.);
title([n =,num2str(n,%3.0f),
...

l =,num2str(curlen,%4.4f), ...

T =,num2str(temps,%4.4f)],
...
fontsize,16);
if (met==1)
xlabel([Metropolis steps / 100],fontsize,16);
else
xlabel([Threshold steps /100],fontsize,16);
end
ylabel([l],fontsize,16);

______________________________________________________________

function t = path(order,cities)
% function path(...) returns the length
% of the trip when visiting the cities
% in the order specified by order

piece = cities(order(jl),:)-cities(order(jj),:);
p1 = sqrt(piece*piece);
piece = cities(order(kr),:)-cities(order(kk),:);
p2 = sqrt(piece*piece);
piece = cities(order(jl),:)-cities(order(kk),:);
p1new = sqrt(piece*piece);
piece = cities(order(kr),:)-cities(order(jj),:);
p2new = sqrt(piece*piece);

trip = 0;
% cities 1 to n and n to 1
for i=1:length(order)-1
piece = cities(order(i+1),:)-cities(order(i),:);
trip = trip + sqrt(piece*piece);
diff = p2new- p2 + p1new - p1;
end
piece = cities(order(1),:)-cities(order(length(order)),:) ;
______________________________________________________________
trip = trip + sqrt(piece*piece);
function kt = plotcities(order,cities)
t = trip;
______________________________________________________________ % function plotcities(order,cities) displays
% the cities and the current path in [0,1]x[0,1]
function diff = delta(order,cities,j,k)
ordcit = cities; nn = length(order);
citx = zeros(nn+1,1); city = zeros(nn+1,1);
% function delta(...) determines the change
% of the path length when the sub-path between
hold off; plot(0,0); box on;
% j and k is reversed
axis square; hold on;
nn = length(order); jj = j; kk=k;
for i=1:nn
jl = jl-1; kr = kk+1;
citx(i) = cities(order(i),1);
city(i) = cities(order(i),2);
if (kk>nn)
end
kk = kk-nn;
citx(nn+1) = cities(order(1),1);
end
city(nn+1) = cities(order(1),2);
if (jl ==0)
jl=nn;
plot(citx,city,k.,MarkerSize,15);
end
plot(citx,city);
if (kr ==nn+1)
figure(1);
kr=1;
end

Fig. 4: TSP with 100 random cities; evolution of the path when applying the descent
algorithm. After about 20000 steps a stationary configuration is found. The lower right
graph displays the monotonic decrease of `.

Fig. 6: Stationary paths (not verified) of the descent alg., i.e. local minima,
for the same set of cities, but with different initialization of the randomized search
procedure. Although the values of ` are similar, paths can differ significantly.

idea: allow for changes that (temporarily) increase `(~o)


hope: algorithm can leave local minima and get closer to the global minimum
Metropolis algorithm : (probabilistic acceptance, statistical physics)
...
2. calculate the potential new path length `(t + 1) and ` = `(t + 1) `(t)
if ` < 0, accept the move
else, accept it with a probability P (`) = exp(`/T )
...
The temperature parameter T controlls the acceptance of ` > 0:
T =0
T

only decreasing moves are accepted, descent version


all moves are accepted, no minimization at all

Note:
The Metropolis algorithm is frequently used to simulate physical systems at real temperature T , see next section.

Fig7: Metropolis algorithm with T = const. Example paths after 100000 iteration
steps (upper) and evolution of ` (lower). The temperature controlls the equilibrium
value of `, also fluctuations decrease with decreasing T . Note the different scalings of
`-axes in the plots.

Simulated Annealing
start with relatively high temperature, large fluctuations
decrease T during the iterations, i.e. cool down or anneal the system
problem: the decrease of T must be
slow enough, so that local minima can be left
fast enough to achieve solutions in realisting computing time
attempt: start with, say, T = 0.1,

after each 100 steps, reduce T by 0.1%


Fig. 8: Annealing procedure
for N = 100 cities, starting
temperature T = 0.1. Left:
the path after 800000 steps.
Right: the evolution of ` in
the iteration. Note that the
achieved value is significantly
lower than for the descent algorithm (T = 0).

Deterministic acceptance:
simpler (faster) rule for acceptance of moves:
...
2. calculate the potential new path length `(t + 1) and ` = `(t + 1) `(t)
if ` < T , accept the move
...
requires less computational effort (no random number, no if(...))
yields (here) comparable or better results, if T is reduced as in simulated annealing
is not as well-founded theoretically; for instance, T is not a temperature

Fig. 9: Threshold algorithm


for N = 100 cities, starting
threshold was T = 0.12. Left:
the path after 100000 steps.
Right: the evolution of `.

3.2 MC simulations and statistical physics


Stationary distribution of the Metropolis algorithm
consider N (discrete) degrees of freedom ~x, e.g. ~x {1, +1}N
the Metropolis algorithm is a special case of a Markov process:
it generates a sequence o configurations
where ~xt only depends on ~xt1

~x0, ~x1, ~x2, . . . , ~xt . . .

a configuration ~xt = ~y changes into ~xt+1 = ~z with transition probability w(~


y~
z)
Note: w(~y ~z) may be zero for many transitions!
We assume: every configuration ~x can be transformed in any other ~z
by a sequence of changes with non-zero transition probability
Example TSP:
w(~y ~z) 6= 0 only if ~z can be obtained from ~x by inversion of a subpath, but
every possible configuration can be reached by a sequence of allowed changes

Let Pt(~x) be the probability of finding configuration ~x at time t


In the next step:
Pt+1(~x) = Pt(~x)

w(~x ~z) Pt(~x) +

~z

w(~z ~x) Pt(~z)

~z

condition for stationarity, i.e. Pt(~x) P(~x) with t :


X

w(~x ~z) P(~x) =

~z

w(~z ~x) P(~z)

~z

one possibility to satisfy the condition is detailed balance:


w(~x ~z) P(~x) = w(~z ~x) P(~z) i.e.

Note: detailed balance


does not uniquely determine w(~x ~z)
is not a necessary condition for stationarity

P(~z)
w(~x ~z)
=
w(~z ~x)
P(~x)

Metropolis algorithm:
(

w(~x ~z) =
(

w(~z ~x) =

configurations ~x with cost function (energy)

eH/T

if H = H(~z) H(~x) < 0


else

e+H/T
1

if H < 0
else

H(~x)

H(~z)/T
e
w(~x ~z)
= eH/T = H(~x)/T

w(~z ~x)
e

Metropolis dynamics yields the stationary distribution


eH(~x)/T

P(~
x) = P

~
x

eH(~x)/T

Gibbs-Boltzmann probability ...

describes the prob. of finding a configuration ~x of a physical system at temperature T


(see Appendix)

A popular example: Ising-Spin systems


N
~
degrees of freedom: S {1, +1} , i.e. Si = 1, i = 1, 2, . . . N .
magnetic moments or discrete spins (E. Ising, 1920)

energy function:

~ =
H(S)

Jij Si Sj

i,j(j<i)

pairwise interaction Jij contributes Si Jij Sj to the total energy


short-range models:
long-range models:

e.g. Jij 6= 0 only for neighboring sites in a lattice


e.g. Jij =
6 0 for all pairs (i, j 6= i)

Example systems:
Ising-Ferromagnet: Jij > 0 favors aligned pairs, i.e. Si = Sj
Anti-Ferromagnet:

Jij < 0 favors pairs with Si = Sj

Ising-Spinglass:

~
e.g. random Jij from N (0, 1) frustration, complex H(S)

Hopfield-network:

~
formal neurons Si, synapses Jij stabilize memories S

oP
=1

Appendix: statistical physics in a nutshell


consider a system with N degrees of freedom: e.g.
where the function

~x IRN

gives the energy of a configuration ~x

H(~
x)

Gibbs-Boltzmann statistics:
in a system at constant temperature T = 1/ one observes the config. ~x with
e H(~x)
P (~x) =
Z
(

with the normalization Z =

( 0)
0 ( )

dN x e H(~x)

(partition function)

energy is irrelevant, every ~x is equally probable


only ~x with minimal energy Hmin has non-zero P (~x)

the mean energy is controlled by T, :


thermal average

hHi =

1
d x H(~x) P (~x) =
Z
N

H(~x)

d x H(~x) e

Note: in the case of discrete variables, e.g. ~x {+1, 1} we have to replace integrals

=
ln Z

d x . . . by sums

X
~x

...

define

F = 1 ln Z

and consider:

1
1
hHi F = hHi + ln Z =

Z
1 1
=
Z
free energy

dN x eH(~x)

microcanonical entropy
re-write

Z =

"

1
d x H(~x) + ln Z eH(~x)

H(~x)

ln

1 Z N
=
d x P (~x) ln P (~x)

F = hHi T
=

entropy

dN x P (~
x) ln P (~
x)

(consider systems with fixed energy)

H(~x)

d xe

(canonical entropy)

dE

Z
|

d x [E H(~x)] e
{z

(E)

phase space volume of configurations with energy E:


define microcanonical entropy

S(E) = ln (E)

(E) =

dN x [E H(~x)]

energy and entropy are (for large N ) extensive quantities:


Z =

dN x eH(~x) =

dE (E) eE =

E = e N and S = s N.

dE eN (es(e))

saddle point integration:


for N the integral
1
lim
ln
N N

dz eN f (z)

dz eN f (z) is dominated by the maximum of f :

ln N
= f (zo) + O
N

with


f
z

z=zo

= 0

partition sum and free energy in the thermodynamic limit N :


F = ln Z = Eo S(Eo)

where Eo is given by


S(E)
E

E=Eo

= 0

(this condition defines )

compare with previous page:


F = ln Z = hHi

i.e. Eo hHi

and

equivalence of microcanonical and canonical picture

Interpretation:
Large systems at constant temperature (in equilibrium) display macroscopic properties (here: energy) which minimize the free energy.
more precisely: ... minimize the free energy functional F (E) = E T S.
The resulting F = F (Eo) is the free energy of the system.

T = 0:
T :
T finite:

F = E, energy is minimized
energy irrelevant, entropy is maximized
competition between energy and entropy
Lagrange parameter controlls the importance of E

Remark: treatment of other macroscopic quantities, e.g. A(~x)


Z
Z
consider modified energy
H(~x) = H(~x) + A(~x),
Z =
dN x eH(~x) =
dN x eH(~x) A(~x)
Z
1
N
H
(~
x
)

thermal average in the modfied system:


hAi =
d x A(~x) e
=
ln Z

... in the undisturbed system:

hAi = lim hAi


0

(Lagrange parameter or auxiliary field )