Abstract
Here practical aspects of conducting research via computer simulations are discussed. The following issues are addressed: software engineering, object-oriented software development, programming style, macros, makefiles, scripts, libraries, random numbers, testing, debugging, data plotting, curve fitting, finite-size scaling, information retrieval, and preparing presentations.
Because of the limited space, usually only short introductions to the specific areas are given and references to more extensive literature are cited. All examples of code are in C/C++.
Contents
1 Software Engineering
3 Programming Style
4 Programming Tools
Taken from the book: A.K. Hartmann and H. Rieger, Optimization Algorithms in Physics, (Wiley-VCH, Berlin, Weinheim 2001), ISBN 3-527-40307-8, with permission of Wiley-VCH, see http://www.wiley-vch.de. This document may be distributed freely in electronic and non-electronic form, provided that no changes are performed to it.
5 Libraries
6 Random Numbers
Software Engineering
When you are creating a program, you should never just start writing the code. In this way only tiny software projects such as scripts can be completed successfully. Otherwise your code will probably be very inflexible and contain several hidden errors which are very hard to find. If several people are involved in a project, it is obvious that a considerable amount of planning is necessary.
But even when you are programming alone, which is not unusual in physics, the first step you should undertake is just to sit down and think for a while. This will save you a lot of time and effort later on. To emphasize the need for structuring in the software development process, the art of writing good programs is usually called software engineering. There are many specialized books in this field, see e.g. Refs. [1, 2]. Here just the steps that should be undertaken to create a sophisticated software development process are stated. The following descriptions refer to the usual situation you find in physics: one or a few people are involved in the project. How to manage the development of big programs involving many developers is explained in the literature.
You should write down which problem you would like to solve. Drawing diagrams is always helpful! Discuss your problem with others and tell them how you would like to solve it. In this context many questions may appear, here some examples are given:
- What is the input you have to supply? In case you have only a few [...]
[...] finished. You should foresee later extensions of the program and set up everything in a way it can be reused easily.
- Do you have existing programs available which can be included into the software project? If you have implemented your previous projects in the above mentioned fashion, it is very likely that you can recycle some code. But this requires experience and is not very easy to achieve at the beginning. But over the years you will have a growing library of programs which enables you to finish future software projects much quicker.
- Has somebody else created a program which you can reuse? Sometimes you can rely on external code like libraries. Examples are the Numerical Recipes [3] and the LEDA library [4] which are covered in Sec. 5.
- Which algorithms are known? Are you sure that you can solve the problem at all? Many other techniques have been invented already. You should always search the literature for solutions which already exist. How searches can be simplified by using electronic data bases is covered more deeply in Sec. 9.
Sometimes it is necessary to invent new methods. This part of a project may be the most time consuming.
Once you have identified the basic objects in your systems, you have to think about how to represent them in the code. Sometimes it is sufficient to define some struct types in C (or simple classes in C++). But usually you will need to design a large set of data structures, referencing each other in a complicated way.
A sophisticated design of the data structures will lead to a better organized program, usually it will even run faster. For example, consider a set of vertices of a graph. Then assume that you have several lists L_i each containing elements referencing the vertices of degree i. When the graph is altered in your program and thus the degrees of the vertices change, it is sometimes necessary to remove a vertex from one list and insert it into another. In this case you will gain speed, when your vertex data structures also contain pointers to the positions where they are stored in the lists. Hence, removing and inserting vertices in the lists will take only a constant amount of time. Without these additional pointers, the insert and delete operations have to scan partially through the lists to locate the elements, leading to a linear time complexity of these operations.
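The idea of the degree lists can be illustrated by a minimal sketch in C. All names used here (vertex_t, degree_lists_t, dl_insert(), dl_remove(), dl_set_degree(), MAX_DEGREE) are hypothetical and chosen only for this illustration. As a simplification, each vertex is itself the element of a doubly linked list, so the pointers prev and next play the role of the "pointers to the position in the list".

```c
#include <stddef.h>

#define MAX_DEGREE 8                  /* assumed upper bound of the degree */

/* a vertex which knows its own position in the list of its degree */
typedef struct vertex
{
  int            id;                  /* label of the vertex             */
  int            degree;              /* current degree                  */
  struct vertex *prev, *next;         /* neighbors within list L[degree] */
} vertex_t;

/* L[i] points to the first vertex of degree i (NULL = empty list) */
typedef struct
{
  vertex_t *L[MAX_DEGREE+1];
} degree_lists_t;

/** inserts vertex 'v' at the front of the list of its degree: O(1) **/
void dl_insert(degree_lists_t *dl, vertex_t *v)
{
  v->prev = NULL;
  v->next = dl->L[v->degree];
  if(v->next != NULL)
    v->next->prev = v;
  dl->L[v->degree] = v;
}

/** removes vertex 'v' from the list of its degree: O(1), no scan **/
void dl_remove(degree_lists_t *dl, vertex_t *v)
{
  if(v->prev != NULL)
    v->prev->next = v->next;
  else
    dl->L[v->degree] = v->next;       /* 'v' was the first element */
  if(v->next != NULL)
    v->next->prev = v;
  v->prev = NULL;
  v->next = NULL;
}

/** changes the degree of 'v' and moves it to the matching list **/
void dl_set_degree(degree_lists_t *dl, vertex_t *v, int new_degree)
{
  dl_remove(dl, v);
  v->degree = new_degree;
  dl_insert(dl, v);
}
```

Neither dl_remove() nor dl_insert() contains a loop, so changing the degree of a vertex takes constant time, independent of the length of the lists.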
Again, you should perform the design of the data structures in a way that later extensions are facilitated. For example when treating lattices of Ising spins, you should use data structures which are independent of the dimension or even of the structure of the lattice, an example is given in Sec. 4.1.
When you are using external libraries, usually they have some data types included. The above mentioned LEDA library has many predefined data types like arrays, stacks, lists or graphs. You can have e.g. arrays of arbitrary objects, for example arrays of strings. Furthermore, it is possible to combine the data types in complicated ways, e.g. you can define a stack of graphs having strings attached to the vertices.
After setting up the basic data types, you should think about which basic and complex operations, i.e. which subroutines, you need to manipulate the objects of your simulation. Since you have already thought a lot about your problem, you have a good overview, which operations may occur. You should break down the final task "perform simulation" into small subtasks, this means you use a top down approach in the design process. It is not possible to write a program in a sequential way as one code. For the actual implementation, a bottom up approach is recommended. This means you should start with the most basic operations. Later on you can use them to create more complicated operations. As always, you should define the subroutines in a way that they can be applied in a flexible way and extensions are easy to perform.
But it is not necessary to identify all basic operations at the beginning. During the development of the code, new applications may arise, which lead to the need for further operations. Also it may be required to change or extend the data structures defined before. However, the more you think in advance, the less you need to change the program later on.
As an example, the problem of finding ground states in Ising spin glasses via simulated annealing is considered. Some of the basic operations are:
- Set up the data structures for storing the realizations of the interaction [...]
[...] calculation of the energy of one spin in the example above. In this case, such operations can be written directly in the code, or a macro (see Sec. 4.1) can be used.
Distributing work
In case several people are involved in a project, the next step is to split up the work between the coworkers. If several types of objects appear in the program design, a natural approach is to make everyone responsible for one or several types of objects and the related operations. The code should be broken up into several modules (i.e. source files), such that every module is written by only one person. This makes the implementation easier and also helps testing the code (see below). Nevertheless, the partitioning of the work requires much care, since quite often some modules or data types depend on others. For this reason, the actual implementation of a data type should be hidden. This means that all interactions should be performed through exactly defined interfaces which do not depend on the internal representation, see also Sec. 2 on object-oriented programming.
When several people are editing the same files, which is usually necessary later on, even when initially each file was created by only one person, then you should use a source-code management system. It prevents several people from performing changes on the same file in parallel, which would cause a lot of trouble. Additionally, a source-code management system enables you to keep track of all changes made. An example of such a system is the Revision Control System (RCS), which is freely available through the GNU project [5] and part of the free operating system Linux.
Testing
[...] always try to find special and rare cases as well when testing a subroutine. Consider for example a procedure which inserts an element into a list. Then not only inserting in the middle of the list, but also at the beginning, at the end and into an empty list must be tested. Also, it is strongly recommended to read your code carefully once again before considering it finished. In this way many bugs can be found easily which otherwise must be tracked down by intensive debugging.
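The list example can be turned into a small test routine. The following sketch is only an illustration: the type elem_t and the subroutines list_insert(), list_get(), list_free() and list_insert_test() are hypothetical names, and list_get() uses -1 as a marker for "no such position", which assumes that only non-negative values are stored.

```c
#include <stdlib.h>

typedef struct elem
{
  int          value;
  struct elem *next;
} elem_t;

/** inserts 'value' at position 'pos' (0 = front) of 'list';    **/
/** positions beyond the end append. Returns the (new) head;    **/
/** the list is left unchanged if no memory is available        **/
elem_t *list_insert(elem_t *list, int pos, int value)
{
  elem_t *el, **link;

  el = (elem_t *) malloc(sizeof(elem_t));
  if(el == NULL)
    return(list);
  el->value = value;
  link = &list;                        /* pointer which will point to 'el' */
  while((pos > 0) && (*link != NULL))
  {
    link = &((*link)->next);
    pos--;
  }
  el->next = *link;
  *link = el;
  return(list);
}

/** returns the value at position 'pos', or -1 if it does not exist **/
int list_get(elem_t *list, int pos)
{
  while((pos > 0) && (list != NULL))
  {
    list = list->next;
    pos--;
  }
  if(list == NULL)
    return(-1);
  return(list->value);
}

/** deletes all elements of 'list' **/
void list_free(elem_t *list)
{
  elem_t *next;

  while(list != NULL)
  {
    next = list->next;
    free(list);
    list = next;
  }
}

/** tests list_insert() for all four cases; returns 1 if all passed **/
int list_insert_test(void)
{
  int ok;
  elem_t *l = NULL;

  l = list_insert(l, 0, 10);                   /* into an empty list */
  ok = (list_get(l, 0) == 10) && (list_get(l, 1) == -1);
  l = list_insert(l, 0, 5);                    /* at the beginning   */
  ok = ok && (list_get(l, 0) == 5) && (list_get(l, 1) == 10);
  l = list_insert(l, 2, 20);                   /* at the end         */
  ok = ok && (list_get(l, 2) == 20) && (list_get(l, 3) == -1);
  l = list_insert(l, 1, 7);                    /* in the middle      */
  ok = ok && (list_get(l, 0) == 5) && (list_get(l, 1) == 7)
          && (list_get(l, 2) == 10) && (list_get(l, 3) == 20);
  list_free(l);
  return(ok);
}
```

Keeping such a test routine in the module allows you to repeat the tests quickly whenever list_insert() is changed.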
The actual debugging of the code can be performed by placing print instructions at selected positions in the code. But this approach is quite time consuming, because you have to modify and recompile your program several times. Therefore, it is advisable to use debugging tools like a source-code debugger and a program for checking the memory management. More about these tools can be found in Sec. 7. But usually you also need special operations which are not covered by an available tool. You should always write a procedure which prints out the current instance of the system that is simulated, e.g. the nodes and edges of a graph or the interaction constants of an Ising system. This facilitates the types of tests, which are described in the following.
After the raw operation of the subroutines has been verified, more complex tests can be performed. When e.g. testing an optimization routine, you should compare the outcome of the calculation for a small system with the result which can be obtained by hand. If the outcome is different from the expected result, the small size of the test system allows you to follow the execution of the program step by step. For each operation you should think about the expected outcome and compare it with the result originating from the running program.
Furthermore, it is very useful to compare the outcome of different methods applied to the same problem. For example, you know that there must be something wrong, in case an approximation method finds a better value than your "exact" algorithm. Sometimes analytical solutions are available, at least for special cases. Another approach is to use invariants. For example, when performing a Molecular Dynamics simulation of an atomic/molecular system (or a galaxy), energy and momentum must be conserved; only numerical rounding errors should appear. These quantities can be recorded very easily. If they change in time there must be a bug in your code. In this case, usually the formulas for the energy and the force are not compatible or the integration subroutine has a bug.
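A test based on an invariant may look like the following sketch. It is not taken from a Molecular Dynamics code. Instead, a single particle in a harmonic potential is integrated with the velocity-Verlet scheme, which is an assumption made for this illustration, as are all names (force(), potential(), total_energy(), max_energy_deviation()). Note that for a finite time step the energy shows a small bounded fluctuation in addition to rounding errors; a drift or a large deviation indicates a bug.

```c
/** force of a harmonic spring, F(x) = -dV/dx **/
double force(double k, double x)
{
  return(-k*x);
}

/** potential energy V(x), must be compatible with force() **/
double potential(double k, double x)
{
  return(0.5*k*x*x);
}

/** total energy of a particle of mass 'm' **/
double total_energy(double m, double k, double x, double v)
{
  return(0.5*m*v*v + potential(k, x));
}

/** performs 'steps' velocity-Verlet steps of size 'dt' and returns **/
/** the largest relative deviation of the total energy from its     **/
/** initial value (the initial energy must be non-zero)             **/
double max_energy_deviation(double m, double k, double x, double v,
                            double dt, int steps)
{
  int t;
  double e0, a, a_new, dev, max_dev;

  e0 = total_energy(m, k, x, v);
  a = force(k, x)/m;
  max_dev = 0.0;
  for(t=0; t<steps; t++)
  {
    x += v*dt + 0.5*a*dt*dt;               /* new position     */
    a_new = force(k, x)/m;                 /* new acceleration */
    v += 0.5*(a + a_new)*dt;               /* new velocity     */
    a = a_new;
    dev = (total_energy(m, k, x, v) - e0)/e0;
    if(dev < 0.0)
      dev = -dev;
    if(dev > max_dev)
      max_dev = dev;
  }
  return(max_dev);
}
```

If, for instance, the factor 0.5 were forgotten in potential(), the energy and the force would no longer be compatible and the recorded deviation would become large, which is exactly the kind of bug this test is meant to reveal.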
You should test each procedure directly after writing it. Many developers have experienced that the larger the interval between implementation and tests is, the lower the motivation becomes for performing tests, resulting in more undetected bugs.
The final stage of the testing process occurs when several modules are integrated into one large running program. In the case where you are writing the code alone, not many surprises should appear, if you have performed many tests on the single modules. If several people are involved in the project, at this stage many errors occur. But in any case, you should always remember: there is probably no program, unless very small, which is bug free. You should know the following important result from theoretical computer science [6]: it is impossible to invent a general method, which can prove automatically that a given program obeys a given specification. Thus, all tests must be designed to match the current code.
In case a program is changed or extended several times, you should always keep the old versions, because it is quite common that by editing new bugs are introduced. In that case, you can compare your new code with the older version. Please note that editors like emacs only keep the second latest version as backup, so you have to take care of this problem yourself unless you use a source-code management system, where you are lucky, because it keeps all older versions automatically.
For C programmers, it is always advisable to apply the -Wall (warning level: all) option. Then several bugs already show up during the compiling process, for example the common mistake to use '=' in comparisons instead of '==', or the access to uninitialized variables (see footnote 2).
In C++, some bugs can be detected by defining variables or parameters as const, when they are considered to stay unchanged in a block of code or subroutine. Here again, already the compiler will complain, if attempts to alter the value of such a variable are made.
This part finishes with a warning: never try to save time when performing tests. Bugs which appear later on are much, much harder to find and you will have to spend much more time than you have "saved" before.
Writing documentation
This part of the software development process is very often disregarded, especially in the context of scientific research, where no direct customers exist. But even if you are using your own code, you should write good documentation. It should consist of at least three parts:
[...] beginning of each module, in front of each subroutine or each self-defined data structure, for blocks of the code and for selected lines. Additionally, meaningful names for the variables are crucial. Following these rules makes later changes and extension of the program much more straightforward. You will find more hints on how a good programming style can be achieved in Sec. 3.
- On-line help: You should include a short description of the program, its parameters and its options in the main program. It should be printed, when the program is called with the wrong number/form of the parameters, or when the option -help is passed. Even when you are the author of the program, after it has grown larger it is quite hard to remember all options and usages.
(Footnote 2: But this is not true for some C++ compilers when combining with option -g.)
- External documentation: This part of the documentation process is important, when you would like to make the program available to other users or when it grows really complex. Writing good instructions is really a hard job. When you remember how often you have complained about the instructions for a video recorder or a word processor, you will understand why there is a high demand for good authors of documentation in industry.
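The on-line help mentioned above can be realized with a few lines of C. The following sketch assumes a hypothetical simulation program which expects exactly two parameters, a system size L and a temperature T; the names print_usage() and need_help() are likewise made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/** prints a short description of the program 'prog', its **/
/** parameters and its options to the stream 'out'        **/
void print_usage(FILE *out, const char *prog)
{
  fprintf(out, "usage: %s [-help] <L> <T>\n", prog);
  fprintf(out, "  <L>    linear system size (integer)\n");
  fprintf(out, "  <T>    temperature (real number)\n");
  fprintf(out, "  -help  print this text and exit\n");
}

/** returns 1 if the help text should be printed: either the **/
/** option -help was passed or the number of parameters is   **/
/** not equal to 'num_param'; returns 0 otherwise            **/
int need_help(int argc, char *argv[], int num_param)
{
  int t;

  for(t=1; t<argc; t++)
    if(strcmp(argv[t], "-help") == 0)
      return(1);
  return(argc-1 != num_param);
}
```

In main() one would typically start with a test like if(need_help(argc, argv, 2)), followed by a call to print_usage(stderr, argv[0]) and a termination of the program.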
- How long will the different runs take? You should perform simula[...]
The steps given do not usually occur in linear order. It is quite common that after you have written a program and performed some simulations, you are not satisfied with the performance or new questions arise. Then you start to define new problems and the program will be extended. It may also be necessary to extend the data structures, when e.g. new attributes of the simulated models have to be included. It is also possible that a nasty bug is still hidden in the program, which is found later on during the actual simulations and becomes obvious by results which cannot be explained. In this case changes cannot be circumvented either.
In other words, the software development process is a cycle which is traversed several times. As a consequence, when planning your code, you should always keep this in mind and set up everything in a flexible way, so that extensions and code recycling can be performed easily.
2
[...] structures, together with the methods which access/alter the content of the objects. The syntax of the class definition depends on the programming language you use. Since implementational details are not relevant here, the reader is referred to the literature.
When you take the viewpoint of a pure object-oriented programmer, then all programs can be organized as collections of objects calling methods of each other. This is derived from the structure the real world has: it is a large set of interacting objects. But for writing good programs it is as in real life, taking an orthodox position imposes too many restrictions. You should take the best of both worlds, the object-oriented and the procedural world, depending on the actual problem.
Data capsuling
[...] When using them later on, they just appear as a black box fulfilling some duties.
- You can change the implementation later on without the need to change the rest of the program. Changes of the implementation may be useful e.g. when you want to increase the performance of the code or to include new features.
- Furthermore, you can have flexible data structures: several different types of implementations may coexist. Which one is chosen depends on the requirements. An example are graphs which can be implemented via arrays, lists, hash tables or in other ways. In the case of sparse graphs, the list implementation has a better performance. When the graph is almost complete, the array representation is favorable. Then you only have to provide the basic access methods, such as inserting/removing/testing vertices/edges and iterating over them, for the different internal representations. Therefore, higher-level algorithms like computing a spanning tree can be written in a simple way to work with all internal implementations. When using such a class, the user just has to specify the representation he wants, the rest of the program is independent of this choice.
- Last but not least, software debugging is made easier. Since you have only defined ways the data can be changed, undesired side [...]
Inheritance
This inheritance of methods to lower level classes is an example of operator overloading. It just means that you can have methods for different classes having the same name, sometimes the same code applies to several classes. This applies also to classes, which are not connected by inheritance. For example you can define how to add integers, real numbers, complex numbers or larger objects like lists, graphs or documents. In languages like C or Pascal you can define subroutines to add numbers and subroutines to add graphs as well, but they must have different names. In C++ you can define the operator "+" for all different classes. Hence, the operator-overloading mechanism of object-oriented languages is just a tool to make the code more readable and more clearly structured.
Software reuse
[...] e.g. treating lists, you can include them in other programs as well. This is easy, because later on you do not have to care about the implementation. With a class designed in a flexible way, much time can be saved when realizing new software projects.
As mentioned before, for object-oriented programming you do not necessarily have to use an object-oriented language. It is true that they are helpful for the implementation and the resulting programs will look slightly more elegant and clear, but you can program everything with a language like C as well. In C an object-oriented style can be achieved very easily. As an example a class histo implementing histograms is outlined, which are needed for almost all types of computer simulations as evaluation and analysis tools.
First you have to think about the data you would like to store. That is the histogram itself, i.e. an array table of bins. Each bin just counts the number of events which fall into a small interval. To achieve a high degree of flexibility, the range and the number of bins must be variable. From this, the width delta of each bin can be calculated. For convenience delta is stored as well. To count the number of events which are outside the range of the table, the entries low and high are introduced. Furthermore, statistical quantities like mean and variance should be available quickly and with high accuracy. Thus, several summarized moments sum of the distribution are stored separately as well. Here the number of moments _HISTO_NOM_ is defined as a macro, converting this macro to a variable is straightforward. All together, this leads to the following C data structure:
#define _HISTO_NOM_
Here, the postfix _t is used to stress the fact that the name histo_t denotes a type. The bins are double variables, which allows for more general applications. Please note that it is still possible to access the internal structures from outside, but it is not necessary and not recommended. In C++, you could prevent this by declaring the internal variables as private. Nevertheless, everything can be done via special subroutines. First of all one must be able to create and delete histograms, please note that some simple error-checking is included in the program:
All histogram objects are created dynamically by calling histo_new(), this corresponds to a call of the constructor or new in C++. The objects are addressed via pointers. Whenever a method, i.e. a procedure in C, of the histo class is called, the first argument will always be a pointer to the corresponding histogram. This looks slightly less elegant than writing histo.method() in C++, but it is really the same. When avoiding direct access, the realization using C is perfectly equivalent to C++ or other object-oriented languages. Inheritance can be implemented, by including pointers to histo_t objects in other type definitions. When these higher level objects are created, a call to histo_new() must be included, while a call to histo_delete(), corresponding to the destructor in C++, is necessary, to implement a correct deletion of the more complex objects.
As a final example, the procedures for inserting an element into the table and calculating the mean are presented. It is easy to figure out how other subroutines for e.g. calculating the variance/higher moments or printing a histogram can be realized. The complete library can be obtained for free [10].
/** inserts a 'number' into a histogram 'his'. **/
void histo_insert(histo_t *his, double number)
{
  int t;
  double value;

  value = 1.0;
  for(t=0; t< _HISTO_NOM_; t++)
  {
    his->sum[t] += value;                      /* raw statistics */
    value *= number;
  }
  if(number < his->from)                       /* insert into histogram */
    his->low++;
  else if(number > his->to)
    his->high++;
  else if(number == his->to)
    his->table[his->n_bask-1]++;
  else
    his->table[(int) floor( (number - his->from) / his->delta)]++;
}
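Since the listings of the data structure and of the create/delete subroutines are not reproduced here, the following self-contained sketch shows how the pieces might fit together. It is a reconstruction under assumptions, not the library code of Ref. [10]: the field names follow those used in histo_insert() above, but the layout of histo_t, the value 3 for _HISTO_NOM_, the argument lists of histo_new() and histo_delete(), and the body of histo_mean() are guesses made for this illustration. To avoid a dependence on the math library, the bin index is obtained by a cast, which is equivalent to floor() here because the argument is never negative.

```c
#include <stdlib.h>

#define _HISTO_NOM_ 3                /* assumed: moments 0, 1 and 2 */

typedef struct
{
  double  from, to;                  /* range of the histogram           */
  double  delta;                     /* width of the bins                */
  int     n_bask;                    /* number of bins                   */
  double *table;                     /* the bins                         */
  int     low, high;                 /* number of events out of range    */
  double  sum[_HISTO_NOM_];          /* sums of 1, number, number^2, ... */
} histo_t;

/** creates a histogram with 'n_bask' bins for the range      **/
/** ['from','to']; returns NULL for wrong parameters or if    **/
/** no memory is available                                    **/
histo_t *histo_new(double from, double to, int n_bask)
{
  histo_t *his;
  int t;

  if((to <= from) || (n_bask <= 0))
    return(NULL);
  his = (histo_t *) malloc(sizeof(histo_t));
  if(his == NULL)
    return(NULL);
  his->table = (double *) malloc(n_bask*sizeof(double));
  if(his->table == NULL)
  {
    free(his);
    return(NULL);
  }
  his->from = from;
  his->to = to;
  his->n_bask = n_bask;
  his->delta = (to - from)/n_bask;
  his->low = 0;
  his->high = 0;
  for(t=0; t<n_bask; t++)
    his->table[t] = 0.0;
  for(t=0; t<_HISTO_NOM_; t++)
    his->sum[t] = 0.0;
  return(his);
}

/** deletes the histogram 'his' **/
void histo_delete(histo_t *his)
{
  if(his != NULL)
  {
    free(his->table);
    free(his);
  }
}

/** inserts a 'number' into the histogram 'his' **/
void histo_insert(histo_t *his, double number)
{
  int t;
  double value = 1.0;

  for(t=0; t<_HISTO_NOM_; t++)
  {
    his->sum[t] += value;                      /* raw statistics */
    value *= number;
  }
  if(number < his->from)
    his->low++;
  else if(number > his->to)
    his->high++;
  else if(number == his->to)
    his->table[his->n_bask-1]++;
  else
    his->table[(int) ((number - his->from)/his->delta)]++;
}

/** returns the mean of all inserted numbers (0 if there are none) **/
double histo_mean(histo_t *his)
{
  if(his->sum[0] == 0.0)
    return(0.0);
  return(his->sum[1]/his->sum[0]);
}
```

Please note that in this sketch, as in histo_insert() above, the moments are updated before the range is checked, so the mean also includes the events counted in low and high.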
Programming Style
The code should be written in a style that enables the author, and other people as well, to understand and modify the program even years later. Here some principles you should follow are briefly stated. Just a general style of description is given. Everybody is free to choose his/her own style, as long as it is precise and consistent.
Split your code into several modules. This has several advantages:
- When you perform changes, you have to recompile only the modules [...]
To keep your program logically structured, you should always put data structures and implementations of the operations in separate files. In C/C++ this means you have to write the data structures in a header (.h) file and the code into a source code (.c/.cpp) file.
Try to find meaningful names for your variables and subroutines. Then, during the programming process it is much easier to remember their meanings, which helps a lot in avoiding bugs. Additionally, it is not necessary to look up the meaning frequently. For local variables like loop counters, it is sufficient and more convenient to have short (e.g. one letter) names.
In the beginning this might seem to take additional time (writing e.g. 'kinetic_energy' for a variable instead of 'x10'). But several months after you have written the program, you will appreciate your effort, when you read the line
kinetic_energy += 0.5*atom[i].mass*atom[i].veloc*atom[i].veloc;
instead of
x10 += 0.5*x34[i].a*x34[i].b*x34[i].b;
You should use proper indentation of your lines. This helps a great deal in recognizing the structure of a program. Many bugs are caused by misaligned braces forming a block of code. Furthermore, you should place at most one command per line of code. The reader will probably agree that [...]
for(i=0; i<number_nodes; i++)
{
  degree[i] = 0;
  for(j=0; j<number_nodes; j++)
    if(edge[i][j] > 0)
      degree[i]++;
}
Avoid jumping to other parts of a program via the "goto" command. This is bad style originating from programming in assembler or BASIC. In modern programming languages, for every logical programming construct there are corresponding commands. "Goto" commands make a program harder to understand and much harder to debug if it does not work as it should.
In case you want to break out of a loop, you can use a while/until loop with a flag that indicates if the loop is to be stopped. In C, if you are lazy, you can use the commands break or continue.
Do not use global variables. At first sight the use of global variables may seem tempting: you do not have to care about parameters for subroutines, everywhere the variables are accessible and everywhere they have the same name. Programming is done much faster.
But later on you will have a bad time: many bugs are created by improper use of global variables. When you want to check for a definition of a variable you have to search the whole list of global variables, instead of [...]
Finally, an issue of utmost importance: Do not be economical with comments in your source code! Most programs, which may appear logically structured when writing them, will be a source of great confusion when being read some weeks later. Every minute you spend on writing reasonable comments you will save later on several times over. You should consider different types of comments.
[...] its name, what the module does, who wrote it and when it was written. It is a useful practice to include a version history, which lists the changes that have been performed. A module comment might look like this:
/*********************************************************/
/*** Functions for spin glasses.                       ***/
/*** 1. loading and saving of configurations           ***/
/*** 2. initialization                                 ***/
/*** 3. evaluation functions                           ***/
/***                                                   ***/
/*** A.K. Hartmann January 1996                        ***/
/*** Version 7.0 03.07.2000                            ***/
/***                                                   ***/
/*********************************************************/
/*** Vers. History:                                    ***/
/*** 1.0 feof-check in lsg_load...() included 02.03.96 ***/
/*** 2.0 comment for cs2html added            12.05.96 ***/
/*** 3.0 lsg_load_bond_n() added              03.03.97 ***/
/*** 4.0 lsg_invert_plane() added             12.08.98 ***/
/*** 5.0 lsg_write_gen() added                15.09.98 ***/
/*** 6.0 lsg_energy_B_hom() added             20.11.98 ***/
/*** 7.0 lsg_frac_frust() added               03.07.00 ***/
- Line comments: They are the lowest level comments. Since you are using (hopefully) sound names for data types, variables and subroutines, many lines should be self explanatory. But in case the meaning is not obvious, you should add a small comment at the end of a line, for example:
C(t, SOURCE) = cap_s2t[t];
Aligning all comments to the right makes a code easier to read. Please avoid unnecessary comments like
counter++;                                     /* increase counter */
The line containing C(t, SOURCE) is an example of the application of a macro. This subject is covered in the following section.
4 Programming Tools
Using Macros
Macros are shortcuts for code sequences in programming languages. Their primary purpose is to allow computer programs to be written more quickly. But the main benefit comes from the fact that a more flexible software development becomes possible. By using macros appropriately, programs become better structured, more generally applicable and less error-prone. Here it is explained how macros are defined and used in C, a detailed introduction can be found in C textbooks such as Ref. [11]. Other high-level programming languages exhibit similar features.
In C a macro is constructed via the #define directive. Macros are processed in the preprocessing stage of the compiler. This directive has the form
#define name definition
You can use the same sorts of names for macros as for variables. It is convention to use only upper-case letters for macros. A macro can be deleted via the #undef directive.
When scanning the code, the preprocessor just replaces literally every occurrence of a macro by its definition. If you have for example the expression 2.0*PI*omega in your code, the preprocessor will convert it into 2.0*3.1415926536*omega. You can use macros also in the definition of other macros. But macros are not replaced in strings, i.e. printf("PI"); will print PI and not 3.1415926536 when the program is running.
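The literal replacement can be seen in a small sketch. The definition of PI is inferred from the expansion quoted above; the macro TWO_PI and the subroutines scaled(), scaled2() and write_name() are hypothetical names introduced only for this illustration.

```c
#include <stdio.h>

#define PI     3.1415926536
#define TWO_PI (2.0*PI)              /* a macro used inside another macro */

/** the preprocessor turns the expression into 2.0*3.1415926536*omega **/
double scaled(double omega)
{
  return(2.0*PI*omega);
}

/** same result, via the macro TWO_PI which itself contains PI **/
double scaled2(double omega)
{
  return(TWO_PI*omega);
}

/** writes the two characters "PI" into 'buffer': macros are not **/
/** replaced inside strings; returns the number of characters    **/
int write_name(char *buffer)
{
  return(sprintf(buffer, "PI"));
}
```

The parentheses in the definition of TWO_PI make sure that the macro behaves like a single value, also when it appears e.g. as a divisor.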
It is possible to test for the (non)existence of macros using the #ifdef and #ifndef directives. This allows for conditional compiling or for platform-independent code, such as e.g. in
#ifdef UNIX
...
#endif
#ifdef MSDOS
...
#endif
After the body of the header file has been read the first time during a compilation process, the macro _MYFILE_H_ is defined, thus the body will never be read again.
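The mechanism around _MYFILE_H_ is usually called an include guard. The following sketch imitates, within one file, what the compiler sees when a hypothetical header file myfile.h is included twice; the content of the header (the type point_t) is an assumption made for this illustration.

```c
/* first #include "myfile.h": _MYFILE_H_ is not yet defined, */
/* so the body is read and the macro becomes defined         */
#ifndef _MYFILE_H_
#define _MYFILE_H_

typedef struct
{
  double x, y;
} point_t;

#endif /* _MYFILE_H_ */

/* second #include "myfile.h": the body is skipped; without  */
/* the guard, the repeated definition of point_t would cause */
/* an error during compilation                               */
#ifndef _MYFILE_H_
#define _MYFILE_H_

typedef struct
{
  double x, y;
} point_t;

#endif /* _MYFILE_H_ */
```

This situation occurs frequently in practice, because a header file is often included directly by a module and again indirectly via other header files.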
So far, macros are just constants. You will benefit from their full power when using macros with arguments. They are given in parentheses after the name of the macro, such as e.g. in [...]
You do not have to worry more than usual about the names you choose for the arguments, there cannot be a conflict with other variables of the same name, because they are replaced by the expression you provide when a macro is used, e.g. MIN(4*a, b-32) will be expanded to (4*a)<(b-32) ? (4*a):(b-32).
The arguments are used in parentheses () in the macro, because the comparison < must have the lowest priority, regardless which operators are included in the expressions that are supplied as actual arguments. Furthermore, you should take care of unexpected side effects. Macros do not behave like functions. For example when calling MIN(a++,b++) the variable a or b may be increased twice when the program is executed. Usually it is better to use inline functions (or sometimes templates in C++) in such cases. But there are many applications of macros, which cannot be replaced by inline functions, like in the following example, which closes this section.
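The side effect can be demonstrated with a short sketch. The definition of MIN used here is inferred from the expansion quoted above and should be read as an assumption; min_int() and the two demonstration subroutines are hypothetical names.

```c
#define MIN(x,y) ( (x)<(y) ? (x):(y) )

/** function version: each argument is evaluated exactly once **/
static inline int min_int(int x, int y)
{
  return((x < y) ? x : y);
}

/** returns the value of 'a' after result = MIN(a++, b++) has been  **/
/** evaluated, starting from a=1, b=5; 'result' receives the value  **/
/** of the macro                                                    **/
int macro_side_effect(int *result)
{
  int a = 1, b = 5;

  *result = MIN(a++, b++);   /* expands to (a++)<(b++) ? (a++):(b++) */
  return(a);                 /* 'a' has been increased twice         */
}

/** same as above, but using the function min_int() **/
int function_side_effect(int *result)
{
  int a = 1, b = 5;

  *result = min_int(a++, b++);
  return(a);                 /* 'a' has been increased once */
}
```

With the macro, the smaller argument is evaluated a second time when the result is selected, so the macro returns the already increased value, while the function returns the original one.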
[...] direction). A spin at the boundary may interact with fewer neighbors when free boundary conditions are assumed. With periodic boundary conditions (pbc), all spins have exactly 4 neighbors. In this case, a spin at the boundary interacts also with the nearest mirror images, i.e. with the sites that are neighbors if you consider the system repeated in each direction. For a 10 x 10 system spin 5, which is in the first row, interacts with spins 5 + 1 = 6, 5 - 1 = 4, 5 + 10 = 15 and through the pbc with spin 95, see Fig. 1. The spin in the upper left corner, spin 1, interacts with spins 2, 11, 10 and 91. In a program pbc can be realized by performing all calculations modulo L (for the ±x-directions) and modulo L^2 (for the ±y-directions), respectively.
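The direct calculation of the neighbors can be sketched as follows. The subroutine pbc_neighbor() is a hypothetical name; it assumes that the sites are numbered row by row beginning with 1 and uses the convention r=0: +x, r=1: -x, r=2: +y, r=3: -y. Instead of working on the site number modulo L and L^2, the sketch first converts the number into a column x and a row y and takes both modulo L, which leads to the same neighbors.

```c
/** returns the neighbor of site 'i' (numbered 1..L*L, row by row) **/
/** in direction 'r' on an L x L square lattice with periodic      **/
/** boundary conditions; r=0: +x, r=1: -x, r=2: +y, r=3: -y        **/
int pbc_neighbor(int i, int r, int L)
{
  int x, y;

  x = (i-1) % L;                          /* column 0..L-1 */
  y = (i-1) / L;                          /* row    0..L-1 */
  switch(r)
  {
    case 0:
      x = (x+1) % L;
      break;
    case 1:
      x = (x+L-1) % L;
      break;
    case 2:
      y = (y+1) % L;
      break;
    case 3:
      y = (y+L-1) % L;
      break;
  }
  return(y*L + x + 1);
}
```

For L=10 this yields the neighbors 6, 4, 15 and 95 for spin 5, and 2, 10, 11 and 91 for spin 1, in agreement with the example given in the text.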
This way of realizing the neighbor relations in a program has several disadvantages:
- You have to write the code everywhere where the neighbor relation is needed. This makes the source code larger and less clear.
- When switching to free boundary conditions, you have to include further code to check whether a spin is at the boundary.
- Your code works only for one lattice type. If you want to extend the program to lattices of higher dimension you have to rewrite the code or provide extra tests/calculations.
- Even more complicated would be an extension to different lattice structures such as triangular or face-centered cubic. This would make the program look even more confusing.
An alternative is to write the program directly in a way it can cope with almost arbitrary lattice types. This can be achieved by setting up the neighbor relation in one special initialization subroutine (not discussed here) and storing it in an array next[]. Then, the code outside the subroutine remains the same for all lattice types and dimensions. Since the code should work for all possible lattice dimensions, the array next is one dimensional. It is assumed that each site has num_n neighbors. Then the neighbors of site i can be stored in next[i*num_n], next[i*num_n+1], ..., next[i*num_n+num_n-1]. Please note that the sites are numbered beginning with 1. This means, a system with N spins needs an array next of size (N+1)*num_n. When using free boundary conditions, missing neighbors can be set to 0. The access to the array can be made easier using a macro NEXT:
#define NEXT(i,r) next[(i)*num_n + (r)]
NEXT(i,r) contains the neighbor of spin i in direction r. For e.g. a square lattice, r=0 is the +x-direction, r=1 the -x-direction, r=2 the +y-direction and r=3 the -y-direction. Which convention you use is up to you, but you should make sure you are consistent. For the case of a square lattice, num_n=4. Please note that whenever the macro NEXT is used, there must be a variable num_n defined, which stores the number of neighbors. You could include num_n as a third parameter of the macro, but in this case a call of the macro looks slightly more confusing. Nevertheless, the way you define such a macro depends on your personal preferences.
Please note that the NEXT macro cannot be realized by an inline function in case you want to set values directly, as in NEXT(i,0)=i+1. Also, when using an inline function, you would have to include all parameters explicitly, i.e. num_n in the example. The last requirement could be circumvented by using global variables, but this is bad programming style as well.
When the system is an Ising spin glass, the sign and magnitude of the interaction may be different for each pair of spins. The interaction strengths can be stored in a similar way to the neighbor relation, e.g. in an array j[]. The access can be simplified via the macro J:
#define J(i,r) j[(i)*num_n + (r)]
A subroutine for calculating the energy $H = \sum_{\langle i,j \rangle} J_{ij} \sigma_i \sigma_j$ may look as follows. Please note that the parameter N denotes the number of spins and the values of the spins are stored in the array sigma[]:
double spinglass_energy(int N, int num_n, int *next, int *j,
                        short int *sigma)
{
  double energy = 0.0;
  int i, r;                             /* counters */

  for(i=1; i<=N; i++)                   /* loop over all lattice sites */
    for(r=0; r<num_n; r++)              /* loop over all neighbors */
      energy += J(i,r)*sigma[i]*sigma[NEXT(i,r)];

  return(energy/2);                     /* each bond was counted twice */
}
For this piece of code, the comments explaining the parameters and the purpose of the code are omitted just for convenience. In an actual program they should be included.
The code for spinglass_energy() is very short and clear. It works for all kinds of lattices. Only the subroutine where the array next[] is set up has to be rewritten when implementing a different type of lattice. This is true for all kinds of code realizing e.g. a Monte Carlo scheme or the calculation of a physical quantity. For free boundary conditions, additionally sigma[0]=0 must be assigned to be consistent with the convention that missing neighbors have the id 0. This is the reason why the spin site numbering starts with index 1 while C arrays start with index 0.
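Since the initialization subroutine is not discussed in the text, the following sketch shows what it might look like for a square lattice with periodic boundary conditions. The name init_square_pbc and the direction order (0: +x, 1: -x, 2: +y, 3: -y) are assumptions made for this example only; the energy routine is the one from the text, repeated so that the fragment is complete.

```c
#include <assert.h>
#include <stdlib.h>

#define NEXT(i,r) next[(i)*num_n + (r)]
#define J(i,r)    j[(i)*num_n + (r)]

/* Hypothetical setup routine: fills next[] for an L x L square  */
/* lattice with periodic boundary conditions. Sites are numbered */
/* 1..L*L row by row, next[] must hold (L*L+1)*4 entries.        */
void init_square_pbc(int L, int *next)
{
  int num_n = 4;                          /* used by macro NEXT */
  int i, x, y;

  assert(L > 0 && next != NULL);
  for(i=1; i<=L*L; i++)
  {
    x = (i-1) % L;                        /* column of site i */
    y = (i-1) / L;                        /* row of site i    */
    NEXT(i,0) = y*L + (x+1)%L + 1;        /* +x neighbor */
    NEXT(i,1) = y*L + (x+L-1)%L + 1;      /* -x neighbor */
    NEXT(i,2) = ((y+1)%L)*L + x + 1;      /* +y neighbor */
    NEXT(i,3) = ((y+L-1)%L)*L + x + 1;    /* -y neighbor */
  }
}

/* energy routine as given in the text */
double spinglass_energy(int N, int num_n, int *next, int *j,
                        short int *sigma)
{
  double energy = 0.0;
  int i, r;                               /* counters */

  for(i=1; i<=N; i++)                     /* loop over all sites */
    for(r=0; r<num_n; r++)                /* loop over neighbors */
      energy += J(i,r)*sigma[i]*sigma[NEXT(i,r)];

  return(energy/2);                       /* bonds counted twice */
}
```

As a simple plausibility check, if all interactions are set to 1 and all spins to +1 on a 10 × 10 lattice, each of the 200 bonds contributes 1, so the routine returns 200. Switching to another lattice type would only require a different setup routine.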
4.2 Make Files
If your software project grows larger, it will consist of several source-code files. Usually, there are many dependencies between the different files, e.g. a data
target : sources
<tab> command(s)
The first line contains the dependencies, the second one the commands. The command line must begin with a tabulator symbol <tab>. It is allowed to have several targets depending on the same sources. You can extend the lines with the backslash "\" at the end of each line. The command line is allowed to be left empty. An example of a dependency/command pair is
simulation.o: simulation.c simulation.h
<tab> cc -c simulation.c
The order of the rules is not important, except that make always starts with the first target. Please note that the make tool is not just intended to manage the software development process and trigger compile commands. Any project where some output files depend on some input files in an arbitrary way can be controlled. For example you could control the typesetting of a book, where you have text files, figures, a bibliography and an index as input files. The different chapters and finally the whole book are the target files.
Furthermore, it is possible to define variables, sometimes also called macros. They have the format
variable=definition
Also variables belonging to your environment like $HOME can be referenced in the makefile. The value of a variable can be used, similar to shell variables, by placing a $ sign in front of the name of the variable, but you have to enclose the name in (...) or {...}. There are some special variables, e.g. $@ holds the name of the target in each corresponding command line; here no braces are necessary. The variable CC is predefined to hold the compiling command, you can change it by including for example
CC=gcc
in the makefile. In the command part of a rule the compiler is called via $(CC). Thus, you can change your compiler for the whole project very quickly by altering just one line of the makefile.
Finally, it will be shown what a typical makefile for a small software project might look like. The resulting program is called simulation. There are two additional modules init.c, run.c and the corresponding header .h files. In datatypes.h types are defined which are used in all modules. Additionally, an external precompiled object file analysis.o in the directory $HOME/lib is to be linked; the corresponding header file is assumed to be stored in $HOME/include. For init.o and run.o no commands are given. In this case make applies the predefined standard command for files having .o as suffix, which reads essentially like
<tab> $(CC) $(CFLAGS) -c $<
where the special variable $< holds the name of the source file and the variable CFLAGS may contain options passed to the compiler and is initially empty. The makefile looks like this; please note that lines beginning with "#" are comments.
#
# sample make file
#
OBJECTS=simulation.o init.o run.o
OBJECTSEXT=$(HOME)/lib/analysis.o
CC=gcc
CFLAGS=-g -Wall -I$(HOME)/include
LIBS=-lm

simulation: $(OBJECTS) $(OBJECTSEXT)
<tab> $(CC) $(CFLAGS) -o $@ $(OBJECTS) $(OBJECTSEXT) $(LIBS)

$(OBJECTS): datatypes.h

clean:
<tab> rm -f *.o
The first three lines are comments, then five variables OBJECTS, OBJECTSEXT, CC, CFLAGS and LIBS are assigned. The final part of the makefile contains the rules.
Please note that sometimes bugs are introduced if the makefile is incomplete. For example, consider a header file which is included in several code files, but this is not mentioned in the makefile. Then, if you change e.g. a data type in the header file, some of the code files might not be compiled again, especially those you did not change. Thus the same data can be treated with different formats in different object files of your program, yielding bugs which seem hard to explain. Hence, in case you encounter mysterious bugs, a make clean might help. But most of the time, bugs which are hard to explain are due to errors in your memory management. How to track down those bugs is explained in Sec. 7.
The make tool exhibits many other features. For additional details, please consult the references given above.
4.3 Scripts
Scripts are even more general tools than makefiles. They are in fact small programs, but they are usually not compiled, i.e. they are quickly written but they run slowly. Scripts can be used to perform many administration tasks like backing up data, installing software or running simulation programs for many different parameters. Here only an example concerning the last task is presented. For a general introduction to scripts, please refer to a book on UNIX/Linux.
Assume that you have a simulation program called coversim21 which calculates vertex covers of graphs. In case you do not know what a vertex cover is, it does not matter, just regard it as one optimization problem characterized by some parameters. You want to run the program for a fixed graph size L, for a fixed concentration c of the edges, average over num realizations and write the results to a file, which contains a string appendix in its name to distinguish it from other output files. Furthermore, you want to iterate over different relative sizes x. Then you can use the following script run.scr:
#!/bin/bash
L=$1
c=$2
num=$3
appendix=$4
shift
shift
shift
shift
for x
do
  ${HOME}/cover/coversim21 -mag $L $c $x $num > \
  mag_${c}_${x}${appendix}.out
done
The first line, starting with "#", is a comment line, but it has a special meaning. It tells the operating system the language in which the script is written. In this case it is the bash shell; the absolute pathname of the shell is given. Each UNIX shell has its own script language, and you can use all commands which are allowed in the shell. There are also more elaborate script languages like perl or python, but they are not covered here.
Scripts can have command line arguments, which are referred to via $1, $2, $3 etc.; the name of the script itself is stored in $0. Thus, in lines 2 to 5, four variables are assigned. In general, you can use the arguments everywhere in the script directly, i.e. it is not necessary to store them in other variables. It is done here because in the next four lines the arguments $1 to $4 are thrown away by four shift commands. Then, the argument which was at position five at the beginning is stored in the first argument. Argument zero, containing the script name, is not affected by the shift.
Next, the script enters a loop, given by "for x; do ... done". This construction means that iteratively all remaining arguments are assigned to the variable "x" and each time the body of the loop is executed. In this case, the simulation is started with some parameters and the output is directed to a file. Please note that you can also state the loop parameters explicitly, as in "for size in 10 20 40 80 160; do ... done".
The above script can be called for example by
run.scr 100 0.5 1000 testA 0.20 0.22 0.24 0.26 0.28 0.30
which means that the graph size is 100, the fraction of edges is 0.5, the number of realizations per run is 1000, the string testA appears in the output file name and the simulation is performed for the relative sizes 0.20, 0.22, 0.24, 0.26, 0.28, 0.30.
5 Libraries
Libraries are collections of subroutines and data types which can be used in other programs. There are libraries for numerical methods such as integration or solving differential equations, for storing, sorting and accessing data, for fancy data types like lists or trees, for generating colorful graphics and for thousands of other applications. Some can be obtained for free, while other, usually specialized libraries have to be purchased. The use of libraries speeds up the software development process enormously, because you do not have to implement every standard method by yourself. Hence, you should always check whether someone has done the job for you already, before starting to write a program. Here, two standard libraries are briefly presented, providing routines which are needed for most computer simulations.
Nevertheless, sometimes it is inevitable to implement some methods by yourself. In this case, after the code has been proven to be reliable and useful for some time, you can put it in a self-created library. How to create libraries is explained in the last part of this section.
5.1 Numerical Recipes
The algorithms included are all state of the art. There are several libraries dedicated to similar problems, e.g. the library of the Numerical Algorithms Group [14] or the subroutines which are included with the Maple software package [15].
To give you an impression how the subroutines can be used, just a short example is presented. Consider the case that a symmetric matrix is given and that all eigenvalues are to be determined. For more information on the library the reader should consult Ref. [3]. There it is not only shown how the library can be applied, but also all algorithms are explained.
The program to calculate the eigenvalues reads as follows.
#include <stdio.h>
#include <stdlib.h>
#include "nrutil.h"
#include "nr.h"

/* give memory back */
In the first part of the program, an n × n matrix is allocated via the subroutine matrix() which is provided by Numerical Recipes. It is standard there to let a vector start with index 1, while in C usually a vector starts with index 0.
In the second part a matrix is initialized randomly. Since the following subroutines work only for symmetric real matrices, the matrix is initialized symmetrically. The Numerical Recipes also provide methods to diagonalize arbitrary matrices; for simplicity this special case is chosen here.
In the third part the main work is done by the Numerical Recipes subroutines tred2() and tqli(). First, the matrix is written in tridiagonal form by a Householder transformation (tred2()) and then the actual eigenvalues are calculated by calling tqli(d, e, n, m). The eigenvalues are returned in the vector d[] and the eigenvectors in the matrix m[][] (not used here), which is overwritten. Finally the memory allocated for the matrix and the vectors is freed again.
This small example should be sufficient to show how simply the subroutines from the Numerical Recipes can be incorporated into a program. When you have a problem of this kind you should always consult the NR library first, before starting to write code by yourself.
5.2 LEDA
While the Numerical Recipes are dedicated to numerical problems, the Library of Efficient Data types and Algorithms (LEDA) [4] can help a great deal in writing efficient programs in general. It is written in C++, but it can be used by C style programmers as well via mixing C++ calls to LEDA subroutines within C code. LEDA contains many basic and advanced data types such as:
• strings
• numbers of arbitrary precision
• one- and two-dimensional arrays
• lists and similar objects like stacks or queues
• sets
• trees
• graphs (directed and undirected, also labeled)
• dictionaries, where you can store objects with arbitrary key words as indices
• data types for two- and three-dimensional geometries, like points, segments or spheres
For most data types, it is possible to create arbitrarily complex structures by using templates. For example you can make lists of self-defined structures or stacks of trees. The most efficient implementations known in the literature so far are taken for all data structures. Usually, you can choose between different implementations to match special requirements. For every data type, all necessary operations are included; e.g. for lists: creating, appending, splitting, printing and deleting lists as well as inserting, searching, sorting and deleting elements in a list, and also iterating over all elements of a list. The major part of the library is dedicated to graphs and related algorithms. You will find for example subroutines to calculate strongly connected components, shortest paths, maximum flows, minimum cost flows and (minimum) matchings.
Here again, just a short example is given to illustrate how the library can be utilized and to show how easily LEDA can be used. A list of a self-defined class Mydatatype is considered. Each element contains the data entries info and flag. In the first part of the program below, the class Mydatatype is partly defined. Please note that the input and output stream operators <</>> must be provided to be able to create a list of Mydatatype elements, otherwise the program will not compile. In the main part of the program a list is defined via the LEDA data type list. Elements are inserted into the list with append(). Finally an iteration over all list elements is performed using the LEDA macro forall. The program leda_test.cc reads as follows:
#include <iostream.h>
#include <LEDA/list.h>

class Mydatatype                       // self defined example class
{
 public:
  int info;                            // user data 1
  short int flag;                      // user data 2
  Mydatatype() {info=0; flag=0;};      // constructor
  ~Mydatatype() {};                    // destructor
  friend ostream& operator<<(ostream& O, const Mydatatype& dt)
    { O << "info: " << dt.info << " flag: " << dt.flag << "\n";
      return(O);};                     // output operator
  friend istream& operator>>(istream &I, Mydatatype& dt)
    {return(I);};                      // dummy
};

// create list
The -I flag specifies where the compiler searches for header files like LEDA/list.h, the -L flag tells where the libraries (-lG -lL) are located. The environment variable LEDAROOT must point to the directory where LEDA is stored in your system.
Please note that using Numerical Recipes and LEDA together results in conflicts, since the objects vector and matrix are defined in both libraries. You can circumvent this problem by taking the source code of Numerical Recipes (here: nrutil.c, nrutil.h), renaming the subroutines matrix() and vector(), compiling again and including nrutil.o directly in your program.
Here, it should be stressed: Before trying to write everything by yourself, you should check whether someone else has done it for you already. LEDA is a highly effective and very convenient tool. It will save you a lot of time and effort when you use it for your program development.
5.3
Although many useful libraries are available, sometimes you have to write some code by yourself. Over the years you will collect many subroutines which, if properly designed, can be included in other programs, in which case it is convenient to put these subroutines in a library. Then you do not have to include the object file every time you compile one of your programs. If your self-created library is put in a standard search path, you can access it like a system library; you do not even have to remember where the object file is stored.
To create a library you must have an object file, e.g. tasks.o, and a header file tasks.h where all data types and function prototypes are defined. Furthermore, to facilitate the use of the library, you should write a man page, which is not necessary for technical reasons but results in a more convenient usage of your library, particularly should other people want to benefit from it. To learn how to write a man page you should consult man man and have a look at the source code of some man pages; they are stored e.g. in /usr/man.
A library is created with the UNIX command ar. To include tasks.o in your library libmy.a you have to enter
ar r libmy.a tasks.o
In a library several object files may be collected. The option "r" replaces the given object files if they already belong to the library, otherwise they are added. If the library does not exist yet, it is created. For more options, please refer to the man page of ar.
After including an object file, you have to update an internal object table of the library. This is done by
ar s libmy.a
In case libmy.a contains several object files, naming just libmy.a on the compile line saves some typing; furthermore you do not have to remember the names of all your object files.
To make the handling of the library more comfortable, you can create a directory, e.g. ~/lib, and put your libraries there. Additionally, you should create the directory ~/include where all personal header files can be collected. Then your compile command may look like this:
cc -o prog prog.c -I$HOME/include -L$HOME/lib -lmy
The option -I states the search path for additional header files, the -L option tells the linker where your libraries are stored and via -lmy the library libmy.a is actually included. Please note that the prefix lib and the postfix .a are omitted with the -l option. Finally, it should be pointed out that the compiler command given above works in all directories, once you have set up the structure as explained. Hence, you do not have to remember directories or names of object files.
6 Random Numbers
For many simulations in physics, random numbers are necessary. Quite often the model itself exhibits random parameters which remain fixed throughout the simulation; one speaks of quenched disorder. A famous example are spin glasses. In this case one has to perform an average over different realizations of the disorder to obtain physical quantities.
But even when the system which is treated is not random, very often random numbers are required by the algorithms, e.g. to realize a finite-temperature ensemble or when using randomized algorithms. In this section an introduction to the generation of random numbers is given. First it is explained how they can be generated at all on a computer. Then, different methods for obtaining numbers which obey a given distribution are explained: the inversion method, the Box-Muller method and the rejection method. More comprehensive information about these and similar techniques can be found in Refs. [3, 16].
In this section it is assumed that you are familiar with the basic concepts of probability theory and statistics.
6.1
First, it should be pointed out that standard computers are deterministic machines. Thus, it is completely impossible to generate true random numbers, at least not without the help of the user. It is for example possible to measure the time interval between successive keystrokes, which is randomly distributed by nature. But these intervals depend heavily on the current user and it is not possible to reproduce an experiment in exactly the same way. This is the reason why pseudo random numbers are usually taken. They are generated by deterministic rules, but they look like and have many of the properties of true random numbers. One would like to have a random number generator rand(), such that each possible number has the same probability of occurrence. Each time rand() is called, a new random number is returned. Additionally, if two numbers r_i, r_k differ only slightly, the random numbers r_{i+1}, r_{k+1} returned by the respective subsequent calls should have a low correlation.
The simplest methods to generate pseudo random numbers are linear congruential generators. They generate a sequence I_1, I_2, ... of integer numbers between 0 and m - 1 by a recursive recipe:
\[ I_{n+1} = (a I_n + c) \bmod m \tag{1} \]
To generate random numbers r distributed in the interval [0, 1) one has to divide the current random number by m. It is desirable to obtain equally distributed values in the interval, i.e. a uniform distribution. Below, you will see how random numbers obeying other distributions can be generated from uniformly distributed numbers.
The real art is to choose the parameters a, c, m in a way that "good" random numbers are obtained, where "good" means "with few correlations". In the past several results from simulations have turned out to be wrong because of the application of bad random number generators [17].
Figure 3: Two-point correlations x_{i+1}(x_i) between successive random numbers x_i, x_{i+1}. The top case is generated using a linear congruential generator with the parameters a = 12351, c = 1, m = 2^15, the bottom case has instead a = 12349.
much more irregular, but poor correlations may become visible for higher k-tuples.
6.2 Inversion Method
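\[ P(z) \equiv \mathrm{Prob}(Z \le z) = \int_{-\infty}^{z} dz'\, p(z') \tag{2} \]
\[ P(z) = \mathrm{Prob}(Z \le z) = \mathrm{Prob}(g(U) \le z) = \mathrm{Prob}(U \le g^{-1}(z)) \tag{3} \]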
Since the distribution function F(u) = Prob(U ≤ u) for a uniformly distributed variable is just F(u) = u (u ∈ [0, 1]), one obtains P(z) = g^{-1}(z). Thus, one just has to choose g(z) = P^{-1}(z) for the transformation function in order to obtain random numbers which are distributed according to the probability distribution P(z). Of course, this only works if P can be inverted.
6.3
As mentioned above, the inversion method works only when the distribution function P can be inverted. For distributions not fulfilling this condition, sometimes this problem can be overcome by drawing several random numbers and combining them in a clever way, see e.g. the next subsection.
Figure 5: The rejection method: points (x, y) are scattered uniformly over a bounded rectangle. The probability that y ≤ p(x) is proportional to p(x).
The rejection method, which is presented in this section, works for random variables where the probability distribution p(z) fits into a box [x_0, x_1) × [0, z_max), i.e. p(z) = 0 for z ∉ [x_0, x_1] and p(z) ≤ z_max. The basic idea of generating a random number distributed according to p(z) is to generate random pairs (x, y), which are distributed uniformly in [x_0, x_1] × [0, z_max], and accept only those values x where y ≤ p(x) holds, i.e. the pairs which are located below p(x), see Fig. 5. Therefore, the probability that x is drawn is proportional to p(x), as desired. The algorithm for the rejection method is:
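A minimal sketch of the rejection method in C, following the description above, might look like this. It is not the original listing: the helper uniform01(), which is built on the C library function rand(), the function name rejection() and the example density p(x) = 2x on [0, 1) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* uniformly distributed number in [0,1), based on rand() */
static double uniform01(void)
{
  return(rand()/(RAND_MAX+1.0));
}

/* hypothetical example density: p(x)=2x in [0,1), 0 elsewhere */
double example_density(double x)
{
  return((x >= 0.0 && x < 1.0) ? 2.0*x : 0.0);
}

/* Rejection method: returns a random number distributed      */
/* according to the density p, which must vanish outside      */
/* [x0,x1) and must not exceed zmax                            */
double rejection(double (*p)(double), double x0, double x1,
                 double zmax)
{
  double x, y;

  assert(p != NULL && x0 < x1 && zmax > 0.0);
  do
  {
    x = x0 + (x1-x0)*uniform01();    /* uniform in [x0,x1)  */
    y = zmax*uniform01();            /* uniform in [0,zmax) */
  }
  while(y > p(x));                   /* reject points above p(x) */
  return(x);
}
```

A call like rejection(example_density, 0.0, 1.0, 2.0) then returns one random number. The method becomes inefficient if the area below p(x) is small compared to the area of the box, because most pairs are rejected in that case.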
The probability density for the Gaussian distribution with mean m and width σ is
\[ p_G(z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(z-m)^2}{2\sigma^2} \right) \tag{5} \]
It is, apart from uniform distributions, the most common distribution being applied in simulations.
Here, the case of a normal distribution (m = 0, σ = 1) is considered. If you want to realize the general case, you have to draw a normally distributed number z and then use σz + m, which is distributed as desired.
Since the normal distribution extends over an infinite interval and its distribution function cannot be inverted analytically, the methods from above are not applicable. The simplest technique to generate random numbers distributed according to a normal distribution makes use of the central limit theorem. It tells us that any sum of N independently distributed random variables u_i (with mean m and variance v) will converge to a Gaussian distribution with mean Nm and variance Nv. If again u_i is taken to be uniformly distributed in [0, 1) (which has mean m = 0.5 and variance v = 1/12), one can choose N = 12 and $Z = \sum_{i=1}^{12} u_i - 6$ will be distributed approximately normally. The drawback of this method is that 12 random numbers are needed to generate one final random number and that values larger than 6 never appear.
Figure 6: Gaussian distribution with zero mean and unit width. The circles represent a histogram obtained from 10^4 values drawn with the Box-Muller method.
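The recipe based on the central limit theorem, i.e. summing 12 uniformly distributed numbers and subtracting 6, can be sketched as follows. The helper uniform01(), built on the C library function rand(), and the function name gauss_sum12() are assumptions of this example.

```c
#include <assert.h>
#include <stdlib.h>

/* uniformly distributed number in [0,1), based on rand() */
static double uniform01(void)
{
  return(rand()/(RAND_MAX+1.0));
}

/* Approximately normally distributed number (mean 0,       */
/* variance 1), obtained as sum of 12 uniform numbers - 6   */
double gauss_sum12(void)
{
  double z = 0.0;
  int i;

  for(i=0; i<12; i++)              /* sum has mean 6, variance 1 */
    z += uniform01();
  z -= 6.0;                        /* shift the mean to zero */
  assert(z >= -6.0 && z <= 6.0);   /* tails beyond 6 cannot occur */
  return(z);
}
```

The assertion inside the routine documents the drawback mentioned above: the result is always confined to the interval [-6, 6].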
In contrast to this technique the Box-Muller method is exact. You need two random variables U_1, U_2 uniformly distributed in [0, 1) to generate two independent normal variables N_1, N_2. This can be achieved by setting
\[ N_1 = \sqrt{-2 \log(1-u_1)}\, \cos(2\pi u_2) \]
\[ N_2 = \sqrt{-2 \log(1-u_1)}\, \sin(2\pi u_2) \]
A proof that N_1 and N_2 are indeed distributed according to (5) can be found in Refs. [3, 16], where also other methods for generating Gaussian random numbers, some even more efficient, are explained. A method which is based on the simulation of particles in a box is explained in Ref. [18]. In Fig. 6 a histogram of 10^4 random numbers drawn with the Box-Muller method is shown.
7
front-end to gdb, and checkergcc, which finds bugs resulting from bad memory management.
7.1 gdb
The gdb (GNU debugger) tool is a source code debugger. Its main purpose is that you can watch the execution of your code. You can stop the program at arbitrarily chosen points by setting breakpoints at lines or subroutines in the source code, inspect variables/data structures, change them and let the program continue (e.g. line by line). Here some examples for the most basic operations are given; detailed instructions can be obtained within the program via the help command.
As an example of how to debug, please consider the following little program gdbtest.c:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  int t, *array, sum = 0;
When compiling the code you have to include the option -g to allow debugging:
cc -o gdbtest -g gdbtest.c
Now you can enter commands, e.g. list the source code of the program via the list command; it is sufficient to enter just l. By default always ten lines at the current position are printed. Therefore, at the beginning the first ten lines are shown (the first line shows the input, the other lines state the answer of the debugger):
(gdb) l
1       #include <stdio.h>
2       #include <stdlib.h>
3
4       int main(int argc, char *argv[])
5       {
6         int t, *array, sum = 0;
7
8         array = (int *) malloc (100*sizeof(int));
9         for(t=0; t<100; t++)
10          array[t] = t;
When entering the command again the next ten lines are listed. Furthermore, you can refer to program lines of the code in the form list <from>,<to> or to subroutines by typing list <name of subroutine>. More information can be obtained by typing help list.
To let the execution stop at a specific line one can use the break command (abbreviation b). To stop the program before line 11 is executed, one enters
(gdb) b 11
Breakpoint 1 at 0x80484b0: file gdbtest.c, line 11.
Now you can inspect for example the content of variables via the print command:
(gdb) p array
$1 = (int *) 0x8049680
(gdb) p array[99]
$2 = 99
You can continue the program at each stage by typing next, then just the next source-code line is executed:
(gdb) n
12          sum += array[t];
Subroutines are regarded as one source-code line as well. If you want to debug the subroutine in a step-wise manner as well you have to enter the step command. By entering continue, the execution is continued until the next breakpoint, a severe error, or the end of the program is reached. Please note that the output of the program appears in the gdb window as well:
(gdb) c
Continuing.
sum= 4949
Program exited normally.
As you can see, the final value (4949) the program prints is affected by the change of the variable array[99].
The commands given above are sufficient for most of the standard debugging tasks. For more specialized cases gdb offers many other commands, please have a look at the documentation [5].
7.2 ddd
Some users may find graphical user interfaces more convenient. For this reason there exists a graphical front-end to gdb, the data display debugger (ddd). On UNIX operating systems it is just invoked by typing ddd (see also the man page for options). Then a nice window pops up, see Fig. 7. The lower part of the window is an ordinary gdb interface, several other windows are available. By typing file <program> you can load a program into the debugger. Then the source code is shown in the main window of the debugger. All gdb commands are available, the most important ones can be entered via menus or buttons using the mouse. For example, to set a breakpoint it is sufficient to place the cursor in a source-code line in the main ddd window and click on the break button. A good feature is that the content of a variable is shown when moving the mouse onto it. For more details, please consult the online help of ddd.
7.3 checkergcc
Most program bugs are revealed by systematically running the program and cross-checking with the expected results. But other errors seem to appear in a rather irregular and unpredictable fashion. Sometimes a program runs without a problem, in other cases it crashes with a Segmentation fault at rather puzzling locations in the code. Very often bad memory management is the cause of such behavior. Writing beyond the boundaries of an array, reading uninitialized memory locations or addressing data which has been freed already are the most common bugs of this class. Since the operating system organizes the memory in a different way each time a program is run, it is rather unpredictable whether these errors become apparent or not. Furthermore it is very hard to track them down, because the effect of such errors most of the time becomes visible at positions different from where the error has occurred.
Figure 7: The data display debugger (ddd). In the main window the source code is shown. Commands can be invoked via a mouse or by entering them into the lower part of the window.
As an example, the case where data is written beyond the boundary of an array is considered. If in the heap, where all dynamically allocated memory is taken from, another variable is stored at the location behind the array, it will be overwritten in this case. Hence, the error becomes visible the next time the other variable is read. On the other hand, if the memory block behind the array is not used, the program may run that time without any problems. Unfortunately, the programmer is not able to influence the memory management directly.
To detect such types of nasty bugs, one can take advantage of several tools. A list of free and commercial tools can be found in Ref. [19]. Here checkergcc is considered, which is a very convenient tool and freely available. It works under
Starting the program produces the following output, the program terminates normally:
Sisko:seminar>gdbtest
Checker 0.9.9.1 (i686-pc-linux-gnu) Copyright (C) 1998 Tristan Gingold.
This program has been compiled with 'checkergcc' or 'checkerg++'.
Checker is a memory access detector.
Checker is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
For more information, set CHECKEROPTS to '--help'
From Checker (pid:30448): `gdbtest' is running
From Checker (pid:30448): (bvh) block bounds violation in the heap.
When Writing 4 byte(s) at address 0x0805fadc, inside the heap (sbrk).
0 byte(s) after a block (start: 0x805f950, length: 396, mdesc: 0x0).
Two errors are reported; each message starts with "From Checker". Both errors
consist of accesses to an array beyond the border (block bound violation).
For each error both the location in the source code where the memory has
been allocated and the location where the error occurred (Stack frames) are
given. In both cases the error is concerned with what was allocated at line
8 (pc=0x08048863 in main at gdbtest.c:8). The bug appeared during the
loops over the array, when the array is initialized (line 10) and read out
(line 12).
Other common types of errors are memory leaks. They appear when a previously
used block of memory has not been freed again. Assume that this happens in a
subroutine which is called frequently in a program. You can imagine that you
will quickly run out of memory. Memory leaks are not detected using
checkergcc by default. This kind of test can be turned on by setting a
special environment variable CHECKEROPTS, which controls the behavior of
checkergcc. To enable checking for memory leaks at the end of the execution,
one has to set
export CHECKEROPTS="-D=end"
Let us assume that the bug from above is removed and instead the free(array);
command at the end of the program is omitted. After compiling with
checkergcc, running the program results in:
Obviously, the memory leak has been found. Further information on the various
features of checkergcc can be found in Ref. [20]. A last hint: you should
always test a program with a memory checker, even if everything seems to be
fine.
8 Evaluating Data
To analyze and plot data, several commercial and non-commercial programs are
available. Here three free programs are discussed: gnuplot, xmgr and fsscale.
Gnuplot is small and fast, and allows two- and three-dimensional curves to be
generated and arbitrary functions to be fitted to the data. On the other hand
xmgr is more flexible and produces better output. It is recommended that
gnuplot is used for viewing and fitting data online, while xmgr is to be
preferred for producing figures to be shown in talks or publications. The
program fsscale has a special purpose. It is very convenient for performing
finite-size scaling plots.

First, gnuplot and xmgr are introduced with respect to drawing figures. In
the next subsection, data fitting is covered. Finally, it is shown how
finite-size scaling plots can be created. In all three cases only very small
examples can be presented. They should serve just as a motivation to study
the documentation; then you will learn about the manifold potential the
programs offer.
8.1 Data Plotting
giving an example, it should be pointed out that gnuplot scripts can be
generated by simply writing the commands into a file, e.g. command.gp, and
calling gnuplot command.gp.

The typical case is that you have a data file of x-y data and you want to
plot the figure. Your file might look like the one below; it contains the
ground-state energy of a three-dimensional ±J spin glass as a function of the
linear system size L. The filename is sg_e0_L.dat. The first column contains
the L values, the second the energy values and the third the standard error
of the energy. Please note that lines starting with "#" are comment lines
which are ignored on reading:
# ground state energy of +-J spin glasses
# L   e_0      error
3    -1.6710   0.0037
4    -1.7341   0.0019
5    -1.7603   0.0008
6    -1.7726   0.0009
8    -1.7809   0.0008
10   -1.7823   0.0015
12   -1.7852   0.0004
14   -1.7866   0.0007
displays the fourth column as a function of the first, with error bars given
by the 5th column. Among other options, it is possible to redirect the
output, for example to an encapsulated postscript file (by setting
set terminal postscript and redirecting the output with
set output "test.eps"). Also several files can be combined into one figure.
You can set axis labels of the figure by typing e.g. set xlabel "L", which
becomes active when the next plot command is executed. Online help on the
plot command and its manifold options is available via entering help plot.
Also three-dimensional plotting is possible using the splot command (enter
help splot to obtain more information). For a general introduction you can
type just help. Since gnuplot commands can be entered very quickly, you
should use it for online viewing of data and fitting (see Sec. 8.2).
The xmgr (x motif graphic) program is much more powerful than gnuplot and
produces nicer output. Commands are issued by clicking on menus and buttons.
Figure 9: The xmgr program, just after a data file has been loaded, and the AS
button has been pressed to adjust the figure range automatically.

8.2 Curve Fitting
mations for the unknown parameters. Please note that the exponentiation
operator is denoted by ** and the standard argument for a function definition
is x, but this depends only on your choice:
gnuplot> f(x)=e+a*x**b
gnuplot> e=-1.8
gnuplot> a=1
gnuplot> b=-1
The actual fit is performed via the fit command. The program uses the
nonlinear least-squares Marquardt-Levenberg algorithm [3], which allows a fit
according to almost all arbitrary functions. To issue the command, you have
to state the fit function, the data set and the parameters which are to be
adjusted. For our example you enter:
gnuplot> fit f(x) "sg_e0_L.dat" via e,a,b
Then gnuplot writes log information to the output describing the fitting
process. After the fit has converged it prints for the given example:
After 17 iterations the fit converged.
final sum of squares of residuals : 7.55104e-06
rel. change during last iteration : -2.42172e-09
degrees of freedom (ndf) : 5
rms of residuals      (stdfit) = sqrt(WSSR/ndf)      : 0.00122891
variance of residuals (reduced chisquare) = WSSR/ndf : 1.51021e-06
Final set of parameters
=======================
e   = -1.78786    +/- 0.0008548   (0.04781%)
a   = 2.5425      +/- 0.2282      (8.976%)
b   = -2.80103    +/- 0.08265     (2.951%)

e    1.000
a    0.708  1.000
b   -0.766 -0.991  1.000
The most interesting lines are those where the results for your parameters
along with the standard error are printed. Additionally, the quality of the
fit can be estimated by the information provided in the three lines beginning
with "degrees of freedom". The first of these lines states the number of
degrees of freedom, which is just the number of data points minus the number
of parameters in the fit. The deviation of the fit function $f(x)$ from the
data points $(x_i, y_i \pm \sigma_i)$ $(i = 1, \ldots, N)$ is given by

$\chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - f(x_i)}{\sigma_i} \right]^2$,

which is denoted by WSSR in the gnuplot output. A measure of the quality of
the fit is the probability Q that the value of $\chi^2$ is worse than in the
current fit, given the assumption that the data points $y_i$ are Gaussian
distributed with mean $f(x_i)$ and variance one [3]. The larger the value of
Q, the better is the quality of the fit. To calculate Q you can use the
little program Q, which uses the gammaq function from Numerical Recipes [3].
The program is called in the form Q <ndf> <WSSR/ndf>; both values can be
taken from the gnuplot output.
To view the result of the fit along with the original data, just enter
gnuplot> plot "sg_e0_L.dat" w e, f(x)
Figure 10: Gnuplot window showing the result of a fit command along with the
input data.
Please note that the convergence depends on the initial choice of the
parameters. The algorithm may be trapped in a local minimum in case the
parameters are too far away from the best values. Try the initial values
e=1, a=-3 and b=1! Furthermore, not all function parameters have to be
subjected to the fitting. Alternatively, you can set some parameters to fixed
values and omit them from the list at the end of the fit command. You should
also know that in the example given above all data points enter into the
result with the same weight. You can tell the algorithm to consider the error
bars by typing fit f(x) "sg_e0_L.dat" using 1:2:3 via e,a,b. Then, data
points with larger error bars have less influence on the results. More on how
to use the fit command can be found by entering help fit.
8.3 Finite-size Scaling
Statistical physics describes the behavior of systems with many particles.
Usually, realistic system sizes cannot be simulated on current computers. To
circumvent this problem, the technique of finite-size scaling has been
invented; for an introduction see e.g. Ref. [21]. The basic idea is to
simulate systems of different sizes and extrapolate to the large volume
limit. Here it is shown how finite-size scaling can be performed with the
help of gnuplot [13] or with the special-purpose program fsscale [22].
[Figure 11: magnetization m(p,L) for the system sizes L=3, L=5 and L=14;
vertical axis from 0 to 1, horizontal axis from 0.1 to 0.3]
ferromagnetically ordered state. This can be observed in Fig. 11, where the
results [23] for different system sizes L = 3, 5, 14 are shown.
The critical concentration $p_c$, where the magnetization $m$ vanishes, and
the critical behavior of $m$ near the transition are to be obtained. From the
theory of finite-size scaling, it is known that the average magnetization
$m \equiv \langle M \rangle$ obeys the finite-size scaling form [24]

$m(p, L) = L^{-\beta/\nu} \tilde{m}\left(L^{1/\nu}(p - p_c)\right)$    (6)

where $\tilde{m}$ is a universal, i.e. non size-dependent, function. The
exponent $\beta$ characterizes the algebraic behavior of the magnetization
near $p_c$, while the exponent $\nu$ describes the divergence of the
correlation length when $p_c$ is approached. From Eq. (6) you can see that
when plotting $L^{\beta/\nu} m(p, L)$ against $L^{1/\nu}(p - p_c)$ with
correct parameters $p_c$, $\beta$, $\nu$ the data points for different system
sizes should collapse onto a single curve. A good collapse can be obtained by
using the values $p_c = 0.222$, $\nu = 1.1$ and $\beta = 0.27$. The
determination of $p_c$ and the exponents can be performed via gnuplot. For
that purpose you need a file m_scale.dat with three columns, where the first
column contains the system sizes L, the second the values of p and the third
the magnetization m(p, L) for each data point. First, assume that you know
the values for $p_c$, $\beta$ and $\nu$. In this case, the actual plot is
done by entering:
gnuplot> b=0.27
gnuplot> n=1.1
gnuplot> pc=0.222
gnuplot> plot [-1:1] "m_scale.dat" u (($2-pc)*$1**(1/n)):($3*$1**(b/n))
The plot command makes use of the feature that with the u(sing) option you
can transform the data of the input in an arbitrary way. For each data set,
the variables $1, $2 and $3 refer to the first, second and third columns,
e.g. $1**(1/n) raises the system size to the power 1/ν. The resulting plot is
shown in Fig. 12. Near the transition, p - p_c ≈ 0, a good collapse of the
data points can be observed.
In case you do not know the values of $p_c$, $\beta$ and $\nu$, you can start
with some estimated values and perform the plot, which will probably result
in a bad collapse. Then you may alter the parameters iteratively and watch
the resulting changes by plotting again. In this way you can converge to a
set of parameters where all data points show a satisfying collapse.

The process of determining the finite-size scaling parameters can be
performed more conveniently by using the special purpose program fsscale. It
can be obtained free of charge from [22]. This tool allows the scaling
parameters to be changed interactively by pressing buttons on the keyboard,
making a finite-size scaling fit very convenient to perform. Several
different scaling forms are available. To obtain more information, start the
program with fsscale -help. A sample screen-shot is shown in Fig. 13.
Please note that the data have to be presented to fsscale in a file
containing three columns, where the first column contains the system size,
the second the x-value and the third the y-value. If you have only data files
with more columns, you can use the standard UNIX tool awk to project out the
relevant columns. For example, assume that your data file results.dat has 10
columns, and you are interested in columns 3, 8 and 9. Then you have to
enter:
awk '{print $3,$8,$9}' results.dat > projected.dat
You can also use awk to perform calculations with the values in the columns,
similar to gnuplot, as in
awk '{print $1+$2, 2.0*$7, $8*$1}' results.dat

Figure 12: Gnuplot output of a finite-size scaling plot. The ground-state
magnetization of a three-dimensional ±J spin glass as a function of the
concentration p of the antiferromagnetic bonds is shown. For the fit, the
parameters $p_c = 0.222$, $\beta = 0.27$ and $\nu = 1.1$ have been used.
Figure 13: Screen-shot from a window running the fsscale tool.

In this section some basic information regarding searching for literature and
preparing your own presentations and publications is given.
9.1
Before contributing to the physical community and even publishing your
results, you should be aware of what exists already. This prevents you from
redoing something which has been done before by someone else. Furthermore,
knowing previous results and many simulation techniques allows you to conduct
your own research projects better. Unfortunately, much information cannot be
found in textbooks. Thus, you must start to look at the literature. With
modern techniques like CD-ROMs and the Internet this can be achieved very
quickly. Within this section, it is assumed that you are familiar with the
Internet and are able to use a browser. The following list contains several
sources of information.
Literature databases
In case you want to obtain all articles from a specific author or all
articles on a certain subject, you should consult a literature database. In
physics the INSPEC [25] database is the appropriate source of information.
Unfortunately, the access is not free of charge. But usually your library
should allow access to INSPEC, either via CD-ROMs or via the Internet. If
your library/university does not offer access, you should complain.
INSPEC frequently surveys almost all scientific journals in the areas of
physics, electronics and computers. For each paper that appears, all
bibliographic information along with the abstract is stored. You can search
the database for example for author names, keywords (in the abstract or
title), publication years or journals. Via INSPEC it is possible to keep
track of recent developments happening in a certain field.
There are many other specialized databases. You should consult the web page
of your library to find out which of them you can access. Modern scientific
work is not possible without regularly checking literature databases.
Preprint server
Scientific journals
Citation databases
In every scientific paper some other articles are cited. Sometimes it is
interesting to get the reverse information, i.e. to obtain all papers which
are citing a given article A. This can be useful if one wants to learn about
the most recent developments which are triggered by article A. In that case
you have to access a citation index. For physics, probably the most important
is the Science Citation Index (SCI) which can be accessed via the Web of
Science [35]. You have to ask your system administrator or your librarian
whether and how you can access it from your site.
The American Physical Society (APS) [28] also includes links to citing
articles with the online versions of recent papers. If the citing article is
available via the APS as well, you can immediately access the article from
the Internet. This works not only for citing papers, but also for cited
articles.
Phys Net
If you want to have access to the web pages of a certain physics department,
you should go via your web browser to the Phys Net pages [36]. They offer a
list of all physics departments in the world. Additionally, you will find
lists of forthcoming conferences, job offers and many other useful links.
Also, the home page of your department probably offers many interesting links
to other web pages related to physics.
Web browsing
you should ask a search engine. There are some very popular all purpose
engines like Yahoo [37] or Alta Vista [38]. A very convenient way to start a
query on several search engines in parallel is a meta search engine, e.g.
Metacrawler [39]. To find out more, please contact a search engine.
9.2
In this section tools for two types of presenting your results are covered:
via an article/report or in a talk. For writing papers, it is recommended
that you use TeX/LaTeX. Data plots can be produced using the programs
explained in the last section. For drawing figures and making transparencies,
the program xfig offers a large functionality. To create three-dimensional
perspective images, the program Povray can be used. LaTeX, xfig and Povray
are introduced in this section.
First, TeX/LaTeX is explained. It is a typesetting system rather than a word
processor. The basic program is TeX; LaTeX is an extension to facilitate the
application. In the area of theoretical computer science, the combination of
TeX and LaTeX is a widespread standard. When submitting an article
electronically to a scientific journal, usually LaTeX has to be used. Unlike
the conventional office packages, with LaTeX you do not see the text in the
form it will be printed, i.e. LaTeX is not a WYSIWYG ("What you see is what
you get") program. The text is entered in a conventional text editor (like
Emacs) and all formatting is done via special commands. An introduction to
the LaTeX language can be found e.g. in Refs. [40, 41]. Although you have to
learn several commands, the use of LaTeX has several advantages:
The quality of the typesetting is excellent. It is much better than self-made
formats. You do not have to care about the layout. But still, you are free to
change everything according to your requirements.

Typesetting of formulae is very convenient and fast. You do not have to care
about sizes of indices of indices etc. Furthermore, in case you want for
example to replace all α in your formulae with β, this can be done with a
conventional replace, by replacing all \alpha strings by \beta strings. For
the case of an office system, please do not ask how to do this conveniently.

Since you can use a conventional editor, the writing process is very fast.
You do not have to wait for a huge package to come up.

On the other hand, if you still prefer a WYSIWYG ("what you see is what you
get") system, there is a program called lyx [42] which operates like a
conventional word processor but creates LaTeX files as output.

Nevertheless, once you get used to LaTeX, you will never want to lose it.
Please note that this text was written entirely with LaTeX. Since LaTeX is a
typesetting language, you have to compile your text to create the actual
output. Now, an example is given of what a LaTeX text looks like and how it
can be compiled. This example will just give you an impression of how the
system operates. For a complete reference, please consult the literature
mentioned above.
The following file example.tex produces a text with different fonts and a
formula:
\documentclass[12pt]{article}
\begin{document}
This is just a small sample text. You can write some words {\em
emphasized}\/, or in {\bf bold face}. Also different {\small sizes}
are possible.

An empty line generates a new paragraph. \LaTeX\ is very convenient
for writing formulae, e.g.
\begin{equation}
M_i(t) = \frac{1}{L^3} \int_V x_i \rho(\vec{x},t) d^3\vec{x}
\end{equation}
\end{document}
The first line introduces the type of the text (article, which is the
standard) and the font size. You should note that all TeX commands begin with
a backslash (\); in case you want to write a backslash in your text, you have
to enter $\backslash$. The actual text is written between the lines starting
with \begin{document} and ending with \end{document}. You can observe some
commands such as \em, \bf or \small. The { } braces are used to mark blocks
of text. Mathematical formulae can be written e.g. with \begin{equation} and
\end{equation}. For the mathematical mode a huge number of commands exists.
Here only examples for Greek letters (\alpha), subscripts (x_i), fractions
(\frac), integrals (\int) and vectors (\vec) are given.
The text can be compiled by entering latex example.tex. This is the command
for UNIX, but LaTeX exists for all operating systems. Please consult the
documentation of your local installation.

The output of the compiling process is the file example.dvi, where "dvi"
means "device independent". The .dvi file can be inspected on screen by a
viewer via entering xdvi example.dvi or converted into a postscript file via
typing dvips -o example.ps example.dvi and then transferred to a printer. On
many systems it can be printed directly as well. The result will look like
this:
This is just a small sample text. You can write some words emphasized, or in
bold face. Also different sizes are possible.
This example should be sufficient to give you an impression of what the
philosophy of LaTeX is. Comprehensive instructions are beyond the scope of
this section; please consult the literature [40, 41].
Under UNIX/Linux, the spell checker ispell is available. It allows a simple
spell check to be performed. The tool is built on a dictionary, i.e. a huge
list of known words. The program scans any given text; a special LaTeX mode
is also available. Every time a word occurs which is not contained in the
list, ispell stops. Should similar words exist in the list, they are
suggested. Now the user has to decide whether the word should be replaced,
changed, accepted or even added to the dictionary. The whole text is treated
in this way. Please note that many mistakes cannot be found in this way,
especially when the misspelled word is equal to another word in the
dictionary. However, at least ispell finds many spelling mistakes quickly and
conveniently, so you should use the tool.
Most scientific texts do not only contain text, formulae and curves, but also
schematic figures showing the models, algorithms or devices covered in the
publication. A very convenient but also simple tool to create such figures is
xfig. It is a window based vector-oriented drawing program. Among its
features are the creation of simple objects like lines, arrows, polylines,
splines, arcs as well as rectangles, circles and other closed, possibly
filled, areas. Furthermore you can create text or include arbitrary (eps,
jpg) picture files. You may place the objects on different layers, which
allows complex sceneries to be created. Different simple objects can be
combined into more complex objects. For editing you can move, copy, delete,
rotate or scale objects. To give you an impression of what xfig looks like, in
Fig. 14 a screen-shot is shown, displaying xfig with the picture that is
shown in Fig. 1. Again, for further help, please consult the online help
function or the man pages.
The figures can be saved in the internal fig format, and exported in several
file formats such as (encapsulated) postscript, LaTeX, Jpeg, Tiff or bitmap.
The xfig program can be called in a way that it produces just an output file
with a given fig input file. This is very convenient when you have larger
projects where some small picture objects are contained in other pictures and
you want to change the appearance of the small objects in all other files.
With the help of the make program pretty large projects can be realized.
Also, xfig is very convenient when creating transparencies for talks, which is
the standard method of presenting results in physics. With different colors,
text sizes and all the objects mentioned before, very clear transparencies
can be created quickly. The possibility of including picture files, like
postscript files which were created by a data plotting program such as xmgr,
is very helpful. In the beginning it may seem that more effort is necessary
than when creating the transparencies by hand. However, once you have a solid
base of transparencies you can reuse many parts and preparing a talk may
become a question of
pigment { Blue } }
plane { <0, 1, 0>, -5
  pigment { checker color White, color Black}}
light_source { <10, 30, -3> color White}
camera {location <0, 8, -20>
  look_at <0, 2, 10>
  aperture 0.4}
The creation of the picture is started by calling (here on a Linux system via
command line) x-povray +I test1.pov. The resulting picture is shown in Fig.
15; please note the shadows on the plane.
Povray is really powerful. You can create almost arbitrarily shaped objects,
combine them into complex objects and impose many transformations. Also
special effects like blurring or fog are available. All features of Povray
are described in a 400 page manual. The use of Povray is widespread in the
artists' community. For scientists it is very convenient as well, because you
can easily convert e.g. configuration files of molecules or three-dimensional
domains of magnetic systems into nice looking perspective pictures. This can
be accomplished by writing a small program which reads e.g. your
configuration file containing a list of positions of atoms and a list of
links, and puts for every atom a sphere and for every link a cylinder into a
Povray scene file. Finally the program must add suitably chosen light sources
and a camera. Then, a three-dimensional picture is created by calling Povray.
The tools described in this section should allow all technical problems
occurring
[28] http://publish.aps.org/
[29] http://www.elsevier.nl
[30] http://www.eps.org/publications.html
[31] http://www.iop.org/Journals/
[32] http://www.springer.de/
[33] http://www.wiley-vch.de/journals/index.html
[34] http://ejournals.wspc.com.sg/journals.html
[35] http://wos.isiglobalnet.com/
[36] http://physnet.uni-oldenburg.de/PhysNet/physnet.html
[37] http://www.yahoo.com/
[38] http://www.altavista.com/
[39] http://www.metacrawler.com/index.html
[40] L. Lamport and D. Bibby, LaTeX: A Documentation Preparation System User's
     Guide and Reference Manual, (Addison Wesley, Reading (MA) 1994)
[41] http://www.tug.org/
[42] http://www.lyx.org/
[43] http://www.povray.org/