
Probability and Statistics

(ENM 503)
Michael A. Carchidi
June 30, 2015
Chapter 10 - Simulation and Monte-Carlo Methods
The following notes are based on the textbook entitled A First Course in
Probability by Sheldon Ross (9th edition), and these notes can be viewed at

https://canvas.upenn.edu/

after you log in using your PennKey username and password.
1. Introduction and Motivation for Simulation
In this chapter, we want to discuss simulation, and specifically Monte-Carlo
methods, for computing probabilities. A more detailed discussion of simulation
methods, of which Monte-Carlo methods are just one part, is given in the
ESE 603 course that is offered during the spring semester; that course serves
as a good continuation of the ENM 503 course.
Let us motivate the ideas behind simulation and Monte-Carlo methods in
probability by considering the following geometric probability problem. Suppose
that two coins of radii R1 and R2 are thrown on a rectangular sheet of paper
having length L > 0 and width W > 0, so that the position of each coin's center
lands uniformly somewhere on the sheet of paper. Note that this does not require
that the entire coin land on the paper, only its center. Given these conditions, we
would like to compute (in terms of the inputs L, W, R1 and R2) the probability
that the two coins overlap. Such a problem is known as a geometric probability
problem.

Without any loss of generality, we may assume that the rectangle is fixed on
an xy plane as the region

R = {(x, y) | 0 ≤ x ≤ L, 0 ≤ y ≤ W},   (1)

which is shown in the following figure.

[Figure: the rectangular region R = {(x, y) | 0 ≤ x ≤ L, 0 ≤ y ≤ W}]
If (X1, Y1) gives the coordinates of the center of coin 1 and (X2, Y2) gives the
coordinates of the center of coin 2, then, under the conditions of the problem,
X1 and X2 are independent uniform random variables on the continuous interval
[0, L), and Y1 and Y2 are independent uniform random variables on the continuous
interval [0, W), i.e.,

X1 ∼ U[0, L)   and   Y1 ∼ U[0, W),   (2a)

and

X2 ∼ U[0, L)   and   Y2 ∼ U[0, W).   (2b)
From the geometry of the problem, we then see that the two coins will overlap
when the distance between their centers,

D = √((X2 − X1)² + (Y2 − Y1)²),   (3)

is less than or equal to the sum of their radii, i.e., when

D ≤ R1 + R2,   (4)

as illustrated in the following two figures.

[Figure: here D ≤ R1 + R2 and the two coins overlap]

[Figure: here D > R1 + R2 and the two coins do not overlap]

As mentioned earlier, only the centers of the coins are required to lie in the
rectangular region, not the entire coins themselves, as illustrated in the next two
figures.

[Figure: here D ≤ R1 + R2 and the two coins overlap]

[Figure: here D > R1 + R2 and the two coins do not overlap]
To compute the probability that the two coins overlap requires that we compute

P = Pr(D ≤ R1 + R2),   (5)

where D is a random variable that could be as small as zero, when the two
centers coincide, or as large as (L² + W²)^(1/2), when the two centers are on opposite
corners of the rectangle. This is somewhat difficult to compute analytically since
the random variables X1 and X2 are from U[0, L) and the random variables Y1
and Y2 are from U[0, W), making it difficult to determine the random nature of
the random variable D as defined in Equation (3), even though stating the range
space of D as

0 ≤ D ≤ √(L² + W²)

is somewhat obvious.
We shall see that simulation offers a way to estimate the probability in Equation (5)
using the computer, without requiring much more work than we have already done.
Such an estimate will be provided in the last section of this chapter. Before we see
how this is accomplished, it should first be noted that since

X1 ∼ U[0, L),   Y1 ∼ U[0, W),   X2 ∼ U[0, L)   and   Y2 ∼ U[0, W),

we have

X1 = L Z11,   Y1 = W Z12,   X2 = L Z21,   Y2 = W Z22,

where Z11, Z12, Z21 and Z22 are all independent standard uniform random
variables, U[0, 1). Then

P = Pr(D ≤ R1 + R2) = Pr(√((X2 − X1)² + (Y2 − Y1)²) ≤ R1 + R2)

becomes

P = Pr(√((L Z21 − L Z11)² + (W Z22 − W Z12)²) ≤ R1 + R2)
  = Pr(√((Z21 − Z11)² + (W/L)²(Z22 − Z12)²) ≤ (R1 + R2)/L),

or

P = Pr(√((Z21 − Z11)² + β²(Z22 − Z12)²) ≤ α),   (6a)

where

α = (R1 + R2)/L   and   β = W/L,   (6b)

thereby showing that P is not a function of the four parameters L, W, R1 and R2
separately, but is rather a function of only the two parameters α and β; in the
special case when W = L (i.e., when β = 1), P depends only on the single value
of α. These results will serve as a way of checking the simulation for accuracy, by
seeing whether P stays fixed when one changes the values of L, W, R1 and R2 in
a way that keeps the values of α and β fixed.
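As a preview of that estimate, the following minimal Python sketch (an illustration added here, not the Excel worksheet the notes refer to) implements the Monte-Carlo idea directly from Equation (6a): draw the four standard uniforms, test the overlap condition, and average over many trials.

    import random

    def overlap_probability(L, W, R1, R2, trials=100_000):
        """Monte-Carlo estimate of P = Pr(D <= R1 + R2) using Equation (6a)."""
        alpha = (R1 + R2) / L   # Equation (6b)
        beta = W / L            # Equation (6b)
        hits = 0
        for _ in range(trials):
            z11, z12, z21, z22 = (random.random() for _ in range(4))
            # overlap when (z21 - z11)^2 + beta^2 (z22 - z12)^2 <= alpha^2
            if (z21 - z11) ** 2 + beta ** 2 * (z22 - z12) ** 2 <= alpha ** 2:
                hits += 1
        return hits / trials

    # P should stay (approximately) fixed when L, W, R1 and R2 change
    # in a way that keeps alpha and beta fixed:
    print(overlap_probability(L=10, W=10, R1=1, R2=1))
    print(overlap_probability(L=20, W=20, R1=2, R2=2))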
At the heart of all simulations are random numbers, so let us now discuss these.
A more detailed discussion is found in the ESE 603 course.
2. The Definition of Random Numbers
A random number (denoted by R) is simply a sample from the standard uniform
distribution U[0, 1), whose pdf and cdf are given by

f(x) = { 1, for 0 ≤ x < 1
       { 0, otherwise

and

F(x) = { 0, for x < 0
       { x, for 0 ≤ x < 1
       { 1, for 1 ≤ x,

respectively. It is easily seen that the mean and variance of the standard uniform
distribution are given by

E(X) = ∫₀¹ x dx = 1/2   and   V(X) = ∫₀¹ (x − 1/2)² dx = 1/12,

respectively, and these will help when it comes to checking a random sequence for
accuracy.
Random numbers are a necessary basic ingredient in simulation because, from
a sample R ∼ U[0, 1), we shall see that in theory it is possible to generate a
sample from any other random variable X. The reader may never have to write
a computer program to generate random numbers because all well-written
simulation software has built-in subroutines, objects, or functions that generate
random numbers. For example, Microsoft Excel, which we shall use later to solve
the problem proposed in the introduction, has a routine called RAND() which
generates a random number. However, it is still important to understand the
basic ideas behind the generation and testing of random numbers.

3. Some Basic Properties of Random Numbers


Before we look at a common method for the generation of random numbers,
let us first discuss some basic properties of random numbers that can be used in
testing them. Because a sequence of random numbers

{R1, R2, R3, . . . , RN}

is a sample from the standard uniform distribution U[0, 1), it must satisfy the
following three important properties.

• Uniformity, which says that if the interval [0, 1) is partitioned into n classes,
or subintervals of equal length, then the expected number of observations
in each class must be N/n, where N is the total number of observations.

• Independence, which says that the probability of observing a value Rk in a
particular interval must be independent of any of the earlier values R1, R2,
. . . , Rk−1.

• Sample mean and variance, which says that the sample mean

(R1 + R2 + R3 + · · · + RN)/N

must approach 1/2 in the limit of large N, and the sample variance,

(R1² + R2² + R3² + · · · + RN²)/N − ((R1 + R2 + R3 + · · · + RN)/N)²,

must approach 1/12 in the limit of large N. In addition, these limits should
be approached in an oscillatory (or non-monotonic) manner, so that the
running estimates are sometimes too high and sometimes too low, and so
on. A quick numerical check of these two limits is sketched just after this list.
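The following minimal Python sketch (an illustration, not part of the original notes) checks the sample mean, sample variance, and class counts of a generated sequence against the theoretical values 1/2, 1/12 and N/n:

    import random

    N = 100_000
    seq = [random.random() for _ in range(N)]

    mean = sum(seq) / N
    var = sum(r * r for r in seq) / N - mean ** 2   # E[R^2] - (E[R])^2

    print(f"sample mean     = {mean:.5f}   (theory: 0.50000)")
    print(f"sample variance = {var:.5f}   (theory: {1/12:.5f})")

    # Uniformity: with n = 10 equal classes, each should hold about N/n values.
    counts = [0] * 10
    for r in seq:
        counts[int(10 * r)] += 1
    print("class counts:", counts, " expected per class:", N // 10)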
4. Generation of Pseudo-Random Numbers
This section describes the common method for the generation of random numbers
and some methods for testing them for randomness. Since a computer algorithm
must be used to generate random numbers, they are technically not really random.
For this reason, they are called pseudo-random, since the word pseudo implies
that the very act of generating random numbers by any known method removes
the potential for true randomness: if the method is known, the set of random
numbers can be replicated over and over again. Therefore a philosophical
argument could be made that it is impossible to construct a computer algorithm
that generates truly random numbers.
Therefore, the real goal of any random-number generation scheme is to
produce a sequence of numbers between zero and one which simulates (or mimics)
the necessary properties of uniformity and independence as closely as possible, so
that if just the sequence of numbers
{R1 , R2 , R3 , . . . , RN }
is provided to a user, it should be virtually impossible for the user to reconstruct
the computer algorithm that produced this sequence of numbers.
When generating pseudo-random numbers, certain problems or errors can occur
which should be avoided by a good algorithm. Some of these errors (but
certainly not all) include the following:

• the generated numbers may not really be uniformly distributed,

• the generated numbers may really be discrete-valued instead of
continuous-valued,

• the sample mean of the generated numbers may be consistently too high
above 1/2 or too low below 1/2,

• the sample variance of the generated numbers may be consistently too high
above 1/12 or too low below 1/12, and

• the numbers may not really be independent, in that there may be dependence
in any of the following ways:

  - autocorrelation between numbers, e.g., every fifth number is larger
    than the mean of 1/2, and so on,

  - numbers successively higher or lower than adjacent numbers,

  - several numbers found above the mean followed by several numbers
    below the mean, and so on.
Any departures from uniformity and independence for a particular generation
scheme may be detected by tests such as those we shall describe later. Generators,
such as RAND() in Microsoft Excel, have passed many of these tests, as well as
more stringent ones, and so there is really no excuse for using a generator that
has later been found to be deficient.

In most cases, random numbers are generated as part of a subroutine (or
function) for a given simulation, and most generators of random numbers should
satisfy the following practical conditions:

• the generator routine should be fast, since good statistics require a large
sample of random numbers,

• the generator routine should be portable to different computers, and ideally
to different programming languages,

• the generator routine should have a long cycle length, or period, which is
the length of the random-number sequence before previous numbers begin
to repeat themselves (what this means will be discussed in more detail
later),

• the random numbers generated should be replicable, so that it is possible
to generate the same sequence of random numbers given the same starting
point in the sequence, and

• the generated random numbers should closely approximate the ideal
statistical properties of uniformity and independence.

Note that constructing algorithms that seem to generate random numbers is
easy, but constructing algorithms that really do produce sequences of random
numbers that are independent and uniformly distributed in the interval between
0 and 1 is much more difficult.
One purpose of this section is to discuss the central issues in random-number
generation in order to enhance one's understanding of the generation of random
numbers, and to show some of the techniques that are used to test a sequence of
numbers for independence and uniformity. First we discuss techniques for
generating random numbers, and then we shall discuss some tests used to see if
the resulting sequences are random.

A seemingly simple way to generate a sequence of N random numbers

{R0, R1, R2, R3, . . . , RN}

is to start with a continuous function f that maps the interval [0, 1) onto the
interval [0, 1), i.e.,

f : [0, 1) → [0, 1).

Then an initial value (called the seed) R0 in the interval [0, 1) is chosen, and the
iteration scheme Rn+1 = f(Rn), for n = 0, 1, 2, . . . , N − 1, is used to generate
R1, R2, R3, . . . , RN. This is known as an iteration (or recursive) method. Let's
illustrate the idea with two examples.
Example #1: f(R) = 4R(1 − R)

The function f(R) = 4R(1 − R), which is plotted below,

[Plot of f(R) = 4R(1 − R) for 0 ≤ R ≤ 1]

does map the unit interval [0, 1) into itself, and if we let R0 = 0.6, then

Rn+1 = f(Rn) = 4Rn(1 − Rn)

produces the sequence

{R0, R1, R2, R3, . . .} = {0.6, 0.96, 0.1536, 0.5200, . . .},

which may, on the surface, look random, but tests for uniformity will reveal that
such a sequence is not very random at all. Choosing the seed R0 = 0.75 instead
produces the sequence

{R0, R1, R2, R3, . . .} = {0.75, 0.75, 0.75, 0.75, . . .},

which is certainly not random, and choosing the seed

R0 = (5 − √5)/8 ≈ 0.34549

produces the sequence

{R0, R1, R2, R3, . . .} = {(5 − √5)/8, (5 + √5)/8, (5 − √5)/8, (5 + √5)/8, . . .},

or

{R0, R1, R2, R3, . . .} = {0.34549, 0.90451, 0.34549, 0.90451, . . .},

which is also certainly not random, showing that such a recursive method is very
much dependent on the value of the seed R0. In addition, we should note that
the sequence generated using R0 = 0.6 may never contain the numbers 0.75 or
(5 − √5)/8. □
Example #2: f(R) = R²

It should be noted that some choices of the mapping function f : [0, 1) → [0, 1)
will never produce a sequence that looks random, for any choice of R0. For
example, the mapping function f(R) = R², which is plotted below,

[Plot of f(R) = R² for 0 ≤ R ≤ 1]

yields (using any value of 0 < R0 < 1) the monotonically decreasing sequence

{R0, R1, R2, R3, . . .} = {R0, R0², R0⁴, R0⁸, . . .},

which tends to zero and is definitely not random. □
Another method for possibly generating random numbers, which is a little
more direct, is to let Rn equal some function f(n) which maps the positive
integers (ℕ) to the interval [0, 1). Let us illustrate this with an example.

Example #3: Using a Function f : ℕ → [0, 1)

Consider the function f(n) = |sin(n)|, which maps the positive integers into
the unit interval [0, 1), and let

Rn = f(n) = |sin(n)|

for n = 1, 2, 3, . . .. This is plotted in the following figure,

[Plot of Rn = |sin(n)| versus n]

and it produces the sequence

{R1, R2, R3, R4, R5, . . .} = {0.8415, 0.9093, 0.1411, 0.7568, 0.9589, . . .},

which may, on the surface, look random, but tests for uniformity will reveal that
such a sequence is not very random at all. In addition, there is no seed R0 that
can be adjusted so that the same sequence will result every time the algorithm is
used. Of course, this problem can be removed by setting

Rn = |sin(nR0)|

for n = 1, 2, 3, . . ., where R0 is an adjustable parameter which acts like a seed. □

The biggest problem with both of the approaches,

f : [0, 1) → [0, 1)   and   f : ℕ → [0, 1),

is that they rely on real-number arithmetic, which can sometimes be
unpredictable when performed by a computer. To illustrate this statement, the
reader is directed to the 4R(1 − R) worksheet in the Microsoft Excel file that
accompanies this chapter. This worksheet illustrates what is commonly known as
the butterfly effect, which says that a small change at the beginning of an
iteration scheme can very quickly propagate into a very large effect later on. It
is sometimes dramatically worded by saying that a single butterfly flapping its
wings in South America could result in a tornado being formed in Texas.
Specifically, this worksheet shows that the sequence generated using

R0 = 0.60000001   and   Rn+1 = 4Rn(1 − Rn)

is very different from the sequence generated using

R0 = 0.60000002   and   Rn+1 = 4Rn(1 − Rn),

even as soon as the values of R18. This is mainly due to the limited storage
capability of a computer; these effects sometimes cannot be avoided and are the
subject of a branch of mathematics known as Chaos Theory. A better scheme,
which uses mostly integer arithmetic (and hence avoids this type of chaotic
behavior), is now described.
Linear Congruential Method
The linear congruential method is the most widely used method for generating
random numbers. The major advantage of this method is that it uses mostly
integer arithmetic and hence can be implemented easily on a computer with very
dependable outcomes. The linear congruential method first produces a sequence
of integers

{X0, X1, X2, X3, . . . , Xn, . . .}

between 0 and m − 1 according to the following linear recursive relationship:

Xn+1 = (aXn + c) mod(m)   (7)

for n = 0, 1, 2, 3, . . .. The initial integer value X0 is called the seed, the integer a
is called the constant multiplier, the integer c is the increment, and the integer
m is the modulus, with m > 1. From this sequence of integers, the sequence of
random numbers in the interval [0, 1),

{R0, R1, R2, R3, . . . , Rn, . . .},

is then computed using Rn = Xn/m for n = 0, 1, 2, 3, . . ., and hence involves
a single division per draw. This is the only real-number arithmetic needed; all
other arithmetic is integer.
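Equation (7) is only a few lines of code; here is a minimal Python sketch (an illustration, with toy constants taken from Example #4 below, not recommended production values):

    def lcg(m, a, c, seed):
        """Generate random numbers R_n = X_n/m via X_{n+1} = (a X_n + c) mod m."""
        x = seed
        while True:
            x = (a * x + c) % m   # integer arithmetic only
            yield x / m           # the single real-number division

    # Toy constants from Example #4: a = 17, c = 43, m = 100, X0 = 27.
    gen = lcg(m=100, a=17, c=43, seed=27)
    print([next(gen) for _ in range(8)])   # 0.02, 0.77, 0.52, 0.27, then repeats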
Modular Arithmetic

By definition, we say that a = b mod(m) when the integer a − b is evenly
divisible by m. In fact, the notation b mod(m) is used to represent the remainder
one gets when one divides b by m. For example, 7 mod(3) = 1, since 3 divided
into 7 gives 2 with a remainder of 1. Also, −10 mod(3) = 2, since 3 divided into
−10 gives −4 with a remainder of 2, and −15 mod(3) = 0, since 3 divided into −15
gives −5 with a remainder of 0. It should be clear that b mod(3) can equal any one
of three possible values: 0, 1 or 2.
It should be clear from the definition of mod(m) that each Xn computed in
Equation (7) must equal an integer from 0 to m − 1, inclusive, and hence the
sequence of integers

{X0, X1, X2, X3, . . . , Xn, . . .}

must eventually become repetitive, which then says that the resulting sequence
of random numbers

{R0, R1, R2, R3, . . . , Rn, . . .},

with Rn = Xn/m for n = 0, 1, 2, 3, . . ., must also eventually become repetitive.
The cycle length of the sequence of Xn's, and hence of the Rn's, equals the
number of entries in the repetitive part of the sequence. This suggests that the
sequence of Rn's is not really random unless the cycle length is large enough that
the repetitive nature of the sequence is very well hidden. We shall see that the
selection of the values for a, c, m, and X0 can drastically affect the cycle length,
but first let's look at an example.
Example #4: A Linear Congruence

Let us use the linear congruential method to generate a sequence of random
numbers using a = 17, c = 43, m = 100, and X0 = 27 (along with X0 = 20) in
the equation

Xi+1 = (aXi + c) mod(m) = (17Xi + 43) mod(100)

for i = 0, 1, 2, 3, . . .. Here the Xi's will be integers from 0 to 99, inclusive, and
so the Ri's will be two-decimal-place random numbers between 0.00 and 0.99,
inclusive. The following two tables of results (one using X0 = 27 and one using
X0 = 20) are obtained.

     i    Xi    Ri   |    i    Xi    Ri   |    i    Xi    Ri
     0    27   0.27  |    0    20   0.20  |   11    21   0.21
     1     2   0.02  |    1    83   0.83  |   12     0   0.00
     2    77   0.77  |    2    54   0.54  |   13    43   0.43
     3    52   0.52  |    3    61   0.61  |   14    74   0.74
     4   (27)  0.27  |    4    80   0.80  |   15     1   0.01
     5     2   0.02  |    5     3   0.03  |   16    60   0.60
     6    77   0.77  |    6    94   0.94  |   17    63   0.63
     7    52   0.52  |    7    41   0.41  |   18    14   0.14
     8    27   0.27  |    8    40   0.40  |   19    81   0.81
     9     2   0.02  |    9    23   0.23  |   20   (20)  0.20
    10    77   0.77  |   10    34   0.34  |   21    83   0.83

The numbers in parentheses show where each sequence starts to repeat. Note that
using the seed X0 = 27 gives the numbers 0.27, 0.02, 0.77, and 0.52, which
continually repeat, resulting in a cycle length of 4; using the seed X0 = 20
does a little better, resulting in a cycle length of 20, but note that the numbers
0.27, 0.02, 0.77 and 0.52 can never appear in this sequence of 20 numbers. □

One should note that the resulting sequence generated by

Xi+1 = (aXi + c) mod(m)

does depend on the seed X0, and the repetitive parts of two different sequences
can have no elements in common.

The ultimate test of the linear congruential method, as of any generation
scheme, is how closely the generated numbers approximate uniformity and
independence. Other important properties include maximum density and maximum
period. By maximum density, it is meant that the values assumed by the Ri's
leave no large gaps on [0, 1).
Gaps

With regard to these gaps, note that the sequence of random numbers generated
by the linear congruential method can only come from the set

{0, 1/m, 2/m, 3/m, . . . , (m − 1)/m},

which means that the Ri's are discrete (not continuous) on the interval [0, 1),
with gaps no smaller than 1/m. However, all of this is of little consequence if the
modulus m is very large. Values of m as large as

m = 2⁴⁸ = 281,474,976,710,656

are in common use these days, making

1/m ≈ 3.5527 × 10⁻¹⁵,

so that the discreteness of such a sequence, and the resulting gap produced, is
well hidden.
Periods

With regard to the period, we again note that the sequence of random numbers
generated by the linear congruential method can only come from the set

{0, 1/m, 2/m, 3/m, . . . , (m − 1)/m},

which means that the maximum period of the sequence of Ri's can be no larger
than m, and a maximum period equal to m can be achieved by proper choices of
a, c and X0 (for a given value of m). Specifically, the following general results from
number theory can be utilized to ensure maximum periods when m is either a
power of 2 (which is good when it comes to computers) or when m is a prime
number.

• For m a power of 2, and c ≠ 0, the longest possible period that can be
achieved is m, and this is accomplished whenever c is odd and a = 1 mod(4).
Furthermore, it should be obvious that this does not depend on the choice
of seed X0, since every integer from 0 to m − 1, inclusive, will be represented
somewhere in the sequence of Xi's.

• A decrease in the number of operations in Xi+1 = (aXi + c) mod(m) can be
accomplished by making c = 0, so that Xi+1 = (aXi) mod(m). For m
a power of 2, and c = 0, the longest possible period that can be achieved
is m/4, and this is accomplished whenever the seed X0 is odd and the
multiplier a satisfies a = 3 mod(8) or a = 5 mod(8).

• For m a prime number and c = 0, the longest possible period that can be
achieved is m − 1, which is accomplished whenever the multiplier a has the
property that the smallest integer k such that a^k = 1 mod(m) is k = m − 1.
In other words, we want a so that a^k − 1 is not divisible by m for all values
of k = 1, 2, 3, . . . , m − 2, yet a^k − 1 is divisible by m for k = m − 1. In
addition, we must have X0 ≠ 0, since X0 = 0 generates only the sequence
{0, 0, 0, . . . , 0, . . .}.
Example #5: A Linear Congruence in Which m is a Power of 2

Let's assume that a = 13, m = 2⁶ = 64 (a power of 2), c = 0, and X0 = 1, 2, 3,
and 4. Then

Xi+1 = (13Xi) mod(64)

and we generate the following table.

     i    Xi    Xi    Xi    Xi
     0     1     2     3     4
     1    13    26    39    52
     2    41    18    59    36
     3    21    42    63    20
     4    17    34    51    (4)
     5    29    58    23    52
     6    57    50    43    36
     7    37    10    47    20
     8    33    (2)   35     4
     9    45    26     7    52
    10     9    18    27    36
    11    53    42    31    20
    12    49    34    19     4
    13    61    58    55    52
    14    25    50    11    36
    15     5    10    15    20
    16    (1)    2    (3)    4
    17    13    26    39    52

The numbers in parentheses show where each sequence starts to repeat. The
maximum period of m/4 = 16 is achieved using X0 odd (1 or 3). Notice that
a = 13 = 5 mod(8), as required to achieve the maximum period. Note also that
when X0 = 1, the generated sequence assumes values (when ordered) in the set

{1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61},

and the gap in this sequence of random numbers is equal to

5/64 − 1/64 = 1/16 = 0.0625,

which is large, and this leads one to be concerned about the density of the random
numbers generated using the scheme in this example. Of course, this generator
has a period that is too short and a density that is too low for it to be used
to generate random numbers, but it does illustrate the importance of properly
choosing a, c, m and X0. □
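A small Python sketch (illustrative, not from the notes) that measures these cycle lengths directly:

    def cycle_length(a, c, m, seed):
        """Length of the cycle entered by X_{i+1} = (a X_i + c) mod m."""
        seen = {}
        x, i = seed, 0
        while x not in seen:      # iterate until some state repeats
            seen[x] = i
            x = (a * x + c) % m
            i += 1
        return i - seen[x]        # distance between the two visits of that state

    for seed in (1, 2, 3, 4):
        print(seed, cycle_length(a=13, c=0, m=64, seed=seed))   # 16, 8, 16, 4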
Example #6: A Linear Congruence in Which m is Prime

Let's assume that a = 5, m = 7 (a prime) and c = 0. Then the following
table shows that choosing a = 5 leads to a maximum period of m − 1 = 6 using
Xi+1 = (5Xi) mod(7).

     k    a^k − 1            Comments
     1    5¹ − 1 = 4         Not divisible by 7
     2    5² − 1 = 24        Not divisible by 7
     3    5³ − 1 = 124       Not divisible by 7
     4    5⁴ − 1 = 624       Not divisible by 7
     5    5⁵ − 1 = 3124      Not divisible by 7
     6    5⁶ − 1 = 15624     Divisible by 7

     i    Xi
     0     3
     1     1
     2     5
     3     4
     4     6
     5     2
     6    (3)

The number in parentheses shows where the sequence of Xi's starts to repeat. The
maximum period of m − 1 = 6 is then achieved using any value of X0 not equal
to zero, and the resulting sequence of random numbers

{3/7, 1/7, 5/7, 4/7, 6/7, 2/7, 3/7, . . .}

assumes, when ordered, the values

{1/7, 2/7, 3/7, 4/7, 5/7, 6/7},

producing a gap of 1/7. □

Once again we point out that using a large value of m, such as m = 2⁴⁸, and
having a maximum period of m (when c ≠ 0) or m/4 (when c = 0), will result
in a small gap and a large period for appropriately chosen values of a and c,
and this will mask the discrete and repetitive nature of the numbers being
generated.
If the reader is using a reliable random-number generator that is provided with
a reliable software package, then the set of random numbers generated need not
be tested for randomness. In this case, the reader may choose to skip the next
section and proceed to Section 6.
5. Tests for Random Numbers - Optional

Even though a sequence of random numbers may look random, many statistical
tests must be performed to ensure their randomness. The desirable properties
of random numbers - uniformity and independence - were discussed earlier. To
ensure that these desirable properties are achieved, a number of statistical tests
can be performed on the generated numbers. Note that the appropriate statistical
tests have already been performed on the generators used by most commercial
simulation software because, without a dependable random-number generator,
any simulation that uses it would not yield valid results. For this reason, this
section is for informational purposes only. A more detailed discussion of these
tests is given in the ESE 603 course.

Some common tests for random numbers, which should be performed if an
in-house random-number generator is constructed, are as follows.
Tests for Uniformity

• Frequency Test #1: uses the Kolmogorov-Smirnov (KS) test to compare
the sample cumulative distribution of the set of generated numbers to the
theoretical standard uniform cumulative distribution for U[0, 1), given by

F(x) = { 0, when x < 0
       { x, when 0 ≤ x < 1
       { 1, when 1 ≤ x.

• Frequency Test #2: uses the chi-squared (χ²) statistic

χ² = Σ_{i=1}^{n} (Oi − Ei)²/Ei

to test whether the actual measured number of numbers in a particular
class (Oi) equals the expected number in that class (Ei), for all n
classes, as predicted by the standard uniform distribution U[0, 1).
Tests for Independence

• Runs Test: tests the runs up and down, or the runs above and below
the mean, by comparing the actual values to the expected values as
predicted by the standard uniform distribution U[0, 1). The statistics
used for comparison come from the standard normal distribution and
the chi-squared distribution.

• Autocorrelation Test: tests the correlation between the generated
numbers and compares the sample correlation to the expected correlation
of zero, as predicted by the standard uniform distribution U[0, 1). The
statistic used for comparison comes from the standard normal distribution.

• Gap Test: counts the number of digits that appear between repetitions
of a particular digit and then uses the Kolmogorov-Smirnov (KS) test to
compare this with the expected size of gaps, as predicted by a geometric
distribution.

• Poker Test: treats numbers grouped together as a poker hand. For
example, a five-digit number 0.11433 can be thought of as a five-card
poker hand having two pairs, and 0.22222 can be thought of as a five-card
poker hand having five of a kind, and so on. Then a chi-squared (χ²)
statistic is used to compare the frequency of these poker hands to
what is expected based on a deck of 50 cards having five 0's, five
1's, five 2's, and so on up to five 9's (in the case of five-digit random
numbers).
Hypothesis Testing for Uniformity

In testing for uniformity, the null hypothesis is

H0: {R1, R2, R3, . . . , RN} are uniform on the interval [0, 1)

(neglecting the seed R0), and failure to reject this null hypothesis means only that
no evidence of non-uniformity has been detected on the basis of this test. Note
that this does not imply that further testing of the generator for uniformity is
unnecessary, because no test can ever guarantee that the generated numbers are
distributed uniformly on the interval [0, 1).
Hypothesis Testing for Independence

In testing for independence, the null hypothesis is

H0: {R1, R2, R3, . . . , RN} are independent on the interval [0, 1)

(neglecting the seed R0), and failure to reject this null hypothesis means only that
no evidence of dependence has been detected on the basis of this test. Note that
this does not imply that further testing of the generator for independence is
unnecessary, because no test can ever guarantee that the generated numbers are
independent.
Level of Significance

For each of the above tests, a level of significance α must be stated. This level
of significance is the probability of rejecting the null hypothesis given that the
null hypothesis is true, and is known as a Type I (α) error, i.e.,

α = Pr(Reject H0 | H0 is true),   (8a)

which is often referred to as a false positive. The decision maker sets the value
of α, and usually α is set to a small value such as 0.01 or 0.05. This says that
the probability that you reject the null hypothesis, given that it is true (i.e.,
make a false positive), will be small. Note that a Type II (β) error involves the
probability of accepting the null hypothesis given that the null hypothesis is false,
and is defined as

β = Pr(Accept H0 | H0 is false),   (8b)

which is known as a false negative. Of course,

Pr(Accept H0 | H0 is true) = 1 − α   and   Pr(Reject H0 | H0 is false) = 1 − β

are not considered errors. Note that we can never choose to accept H0 with
certainty; we can only choose to reject H0 (or accept H0) up to a certain
significance level.

If several tests are made on a sequence of random numbers, the probability
of rejecting the sequence (making a Type I (α) error) on at least one test, by
chance alone, must increase. Similarly, if one test is conducted on many sets of
random numbers, the probability of rejecting at least one set (making a Type
I (α) error), by chance alone, must increase as well. For example, if 100 sets of
numbers were subjected to a particular test with α = 0.05, it would be expected
that (100)(0.05) = 5 of these sets would be rejected by chance alone, and if one set
of numbers is subjected to 100 tests (all with the same level α), then this set of
numbers is expected to fail (100)(0.05) = 5 of these tests by chance alone.
In general, if the number of rejections in N tests (all with the same level α) is
close to the expected number, Nα, then there is no compelling reason to discard
the generator being tested, since about Nα rejections would normally occur by
chance alone. In addition, if a set of random numbers passes all the tests, there is
still no guarantee that the set is truly random, because it is always possible that
some underlying pattern will go undetected.
Frequency Tests

Basic tests that should always be performed to validate a new generator of
random numbers are tests for uniformity. At least two different methods of
testing are readily available: the Kolmogorov-Smirnov (KS) test and the
chi-squared (χ²) test. Both tests measure the degree of agreement between the
distribution of a sample of generated random numbers and the theoretical
uniform distribution U[0, 1), and both assume the null hypothesis of no significant
difference between the sample distribution and the theoretical distribution.
The Kolmogorov-Smirnov (KS) Test

This test compares the empirical cdf S_N(x), constructed from a sample of N
random numbers, to the theoretical cdf F(x) of the standard uniform distribution.
For the standard uniform distribution U[0, 1), the theoretical cdf is given by

F(x) = { 0, for x < 0
       { x, for 0 ≤ x < 1
       { 1, for 1 ≤ x,

which is plotted below.

[Plot of the theoretical cdf for U[0, 1)]

If a sample from a given random-number generator is {R1, R2, R3, . . . , RN} (not
including the seed R0), then the empirical cdf of this sample is constructed using

S_N(x) = (number of elements in {R1, R2, R3, . . . , RN} that are ≤ x)/N.

One would expect that, if the null hypothesis

H0: {R1, R2, R3, . . . , RN} ∼ U[0, 1)

is true, then S_N(x) should be an approximation to F(x), especially as N becomes
larger. One measure of this approximation is the largest absolute deviation between
F(x) and S_N(x) over the range space of the random variable X ∼ U[0, 1), i.e.,

D_N = max_{0≤x<1} |F(x) − S_N(x)|.   (9a)

The Kolmogorov-Smirnov test uses the sampling distribution of D_N, which is
tabulated in many references for various values of α and of N. In fact, Kolmogorov
and Smirnov showed that

Pr(D_N ≤ x/√N) ≈ 1 − 2 Σ_{k=1}^{∞} (−1)^{k−1} e^{−2k²x²}   (9b)

for large values of N. For example, setting

Pr(D_N ≤ D_{α,N}) = Pr(D_N ≤ x/√N) = 1 − 2 Σ_{k=1}^{∞} (−1)^{k−1} e^{−2k²x²} = 1 − α

gives

Pr(D_N ≤ 1.22/√N) ≈ 1 − 2 Σ_{k=1}^{∞} (−1)^{k−1} e^{−2k²(1.22)²} ≈ 0.90 = 1 − 0.10,

Pr(D_N ≤ 1.36/√N) ≈ 1 − 2 Σ_{k=1}^{∞} (−1)^{k−1} e^{−2k²(1.36)²} ≈ 0.95 = 1 − 0.05,

and

Pr(D_N ≤ 1.63/√N) ≈ 1 − 2 Σ_{k=1}^{∞} (−1)^{k−1} e^{−2k²(1.63)²} ≈ 0.99 = 1 − 0.01.

When using the Kolmogorov-Smirnov test to compare a random sequence
{R1, R2, R3, . . . , RN} against the standard uniform cdf, the test procedure follows
the five steps below, which can be easily performed using Microsoft Excel:
1.) Rank the sequence {R1, R2, R3, . . . , RN} from smallest to largest. Specifically,
let R(i) denote the ith smallest observation, so that

R(1) ≤ R(2) ≤ R(3) ≤ · · · ≤ R(N).

2.) Compute

D_N⁺ = max_{1≤i≤N} (i/N − R(i)),

which is the largest deviation of S_N(x) above F(x), and

D_N⁻ = max_{1≤i≤N} (R(i) − (i − 1)/N),

which is the largest deviation of S_N(x) below F(x).

3.) Compute D_N = max{D_N⁺, D_N⁻}, which is the largest absolute deviation
between S_N(x) and F(x).

4.) Determine the critical value, D_{α,N}, from a KS table for the specified
significance level α and the given sample size N. Note that these can also
be computed using

D_{α,N} ≈ (1/√N) × { 1.22, when α = 0.10
                   { 1.36, when α = 0.05
                   { 1.63, when α = 0.01

when the sample size N is larger than 35, which is usually the case.

5.) If the sample statistic D_N is greater than the critical value, D_{α,N}, the null
hypothesis that the sample data are a sample from a standard uniform
distribution is rejected. If D_N ≤ D_{α,N}, we conclude that no difference has
been detected between the true distribution of {R1, R2, R3, . . . , RN} and the
standard uniform distribution U[0, 1).
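These five steps translate directly into code. Here is a minimal Python sketch (an illustration, not from the notes; it uses the large-sample critical values from step 4):

    import math

    def ks_uniform_test(sample, alpha=0.05):
        """KS test of a sample against U[0,1): returns (D_N, D_crit, reject?)."""
        n = len(sample)
        r = sorted(sample)                                    # step 1: rank
        d_plus = max((i + 1) / n - r[i] for i in range(n))    # step 2: D+
        d_minus = max(r[i] - i / n for i in range(n))         # step 2: D-
        d = max(d_plus, d_minus)                              # step 3: D_N
        coeff = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}[alpha]   # step 4 (N > 35)
        d_crit = coeff / math.sqrt(n)
        return d, d_crit, d > d_crit                          # step 5

    # Example #7's five numbers (too few for the N > 35 approximation, but it
    # shows the mechanics; the notes use the table value D_{0.05,5} = 0.565):
    print(ks_uniform_test([0.44, 0.81, 0.14, 0.05, 0.93]))    # D = 0.26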
Example #7: The Kolmogorov-Smirnov (KS) Test

Consider the small set of N = 5 random numbers

{0.44, 0.81, 0.14, 0.05, 0.93}.

From these we may generate the following table.

     i    Ri     R(i)   i/N    i/N − R(i)   (i − 1)/N   R(i) − (i − 1)/N
     1    0.44   0.05   0.20      0.15         0.00           0.05
     2    0.81   0.14   0.40      0.26         0.20          −0.06
     3    0.14   0.44   0.60      0.16         0.40           0.04
     4    0.05   0.81   0.80     −0.01         0.60           0.21
     5    0.93   0.93   1.00      0.07         0.80           0.13
                                D⁺ = 0.26                   D⁻ = 0.21

This leads to D = max{D⁺, D⁻} = 0.26. The sample cdf using these N = 5
random numbers is shown in the figure below, along with the expected cdf
F(x) = x for 0 ≤ x ≤ 1.

[Plot of the theoretical (thin curve) and sample (thick curve) cdfs]

The figure also shows that D = 0.26, and for α = 0.05 and N = 5 we find from the
KS tables that D_{0.05,5} = 0.565, showing that D < D_{0.05,5}. Therefore, the
hypothesis that the generated numbers come from a standard uniform distribution
should not be rejected on the basis of the KS test. □
The Chi-Squared (χ²) Test

The chi-squared test on N observations begins by partitioning the N observations
into n disjoint classes and then uses the sample statistic

χ² = Σ_{i=1}^{n} (Oi − Ei)²/Ei,   (10a)

where Oi is the observed number of observations in the ith class, Ei is the expected
number of observations in the ith class (based on the random variable one believes
the observations come from), and n is the number of classes chosen. Of course,
we must have

Σ_{i=1}^{n} Oi = N,   (10b)

and for the uniform distribution, the expected number of observations in each
class is Ei = N/n for equally-sized classes. We now present an intuitive argument
showing that the χ² sampling distribution (for large values of N) is approximately
the chi-squared distribution with n − 1 degrees of freedom. Tables of different
percentage points of the chi-squared distribution with ν degrees of freedom, for
different values of α, are also easily obtained.
An Intuitive Argument Showing that χ² ∼ χ²_{n−1} - Optional

The statistic

χ² = Σ_{i=1}^{n} (Oi − Ei)²/Ei,

along with the constraint

Σ_{i=1}^{n} Oi = N,

is known to have (approximately) a chi-squared distribution with n − 1 degrees
of freedom. In this section, we want to motivate why this is so. Toward this
end, we will use the fact from probability which says that if Z1, Z2, Z3, . . . , Zn
are independent random variables, each having a standard normal distribution
N(0, 1), then the random variable

X = Z1² + Z2² + Z3² + · · · + Zn²   (11)

is chi-squared with n degrees of freedom.

Suppose we define n disjoint classes C1, C2, C3, . . . , Cn, and suppose data
points are coming in every second so that each data point can be placed into one
and only one of these classes. For example, the classes might simply be defined
by weights of people (in pounds), with

C1 = {W | 0 ≤ W < 100 lbs},      C2 = {W | 100 lbs ≤ W < 150 lbs},
C3 = {W | 150 lbs ≤ W < 200 lbs},   C4 = {W | 200 lbs ≤ W < 250 lbs},
C5 = {W | 250 lbs ≤ W < 300 lbs}   and   C6 = {W | 300 lbs ≤ W},

and the data points coming in refer to customers entering a store. Then each
customer can be placed into one of these six classes depending on the weight of
the customer.

Now let Oi be the random variable on the number of data points coming in
and placed in class Ci, for i = 1, 2, 3, . . . , n. This is a random variable, just like
the number of customers coming into a store with weights less than 100 lbs is a
random variable. Let Ei = E(Oi) be the expected value of Oi based on some
distribution. To determine the variance of Oi, we must make some assumption
about the nature of Oi. As data points come into class Ci, suppose we make the
reasonable assumption that they come in according to a Poisson process, which
leads to a Poisson distribution with parameter λᵢt (with t = 1 time unit). Before
we continue, let us be reminded of the assumptions behind a Poisson process.
A Poisson Process - A Reminder

Consider a sequence of random events such as the arrival of units at a shop or
the arrival of data coming in as measurements. These events may be described
by a counting function N(t) (defined for all t ≥ 0), which equals the number
of events that occur in the closed time interval [0, t]. We assume that t = 0 is
the point at which the observations begin, whether or not an arrival occurs at
that instant, and we note that N(t) is a random variable with possible values
equal to the non-negative integers: 0, 1, 2, 3, . . .. Such an arrival process is called a
Poisson process with mean rate λ (per unit time) if the following three reasonable
assumptions are fulfilled.

A1: Arrivals occur one at a time. This implies that the probability of 2 or more
arrivals in a very small (i.e., infinitesimal) time interval Δt is zero compared
to the probability of 1 or 0 arrivals occurring in the same time interval Δt.

A2: N(t) has stationary increments. The distribution of the number of arrivals
between t and t + Δt depends only on the length of the interval Δt and not
on the starting point t. Thus, arrivals are completely random without rush
or slack periods. In addition, the probability that a single arrival occurs in
a small time interval Δt is proportional to Δt and given by λΔt, where λ is
the mean arrival rate (per unit time).

A3: N(t) has independent increments. The numbers of arrivals during
non-overlapping time intervals are independent random variables. Thus, a large
or small number of arrivals in one time interval has no effect on the number
of arrivals in subsequent time intervals. Future arrivals occur completely at
random, independent of the number of arrivals in past time intervals.

Given that arrivals occur according to a Poisson process (i.e., meeting the three
assumptions A1, A2, and A3), it is possible to derive an expression for the
probability that n arrivals (n = 0, 1, 2, 3, . . .) occur in the time interval [0, t]. We
shall denote this probability by Pn(t), and it can be shown that

Pn(t) = Pr(N(t) = n) = e^{−λt}(λt)ⁿ/n!

for n = 0, 1, 2, 3, . . . and for all time t > 0. This is known as a Poisson distribution
with parameter λt, and the mean and variance of such a distribution are

E(N(t)) = λt   and   V(N(t)) = λt = E(N(t)),

respectively. In one time unit (in which t = 1), we then have

E(N(1)) = λ   and   V(N(1)) = λ = E(N(1))

as the mean and variance in the arrival of the data being studied.
Back to the Intuitive Argument that χ² ∼ χ²_{n−1}

Using this little reminder about the Poisson process, we see from A1, A2
and A3 above that assuming the data comes in as one piece of data every
time unit is reasonable, and under this assumption we find that

E(Oi) = Ei   and   V(Oi) = Ei,   so that   σ(Oi) = √Ei.

If we then define the random variable

Zi = (Oi − E(Oi))/√V(Oi) = (Oi − Ei)/√Ei,

we see that E(Zi) = 0 and V(Zi) = 1.

Plots of the pdfs for the Poisson distribution,

e^{−Ei}(Ei)ˣ/x!,

along with the normal distribution having the same mean and variance,

(1/√(2πEi)) e^{−(x−Ei)²/(2Ei)},

are shown in the figures below for means of 4, 5, 7, 9 and 11.


[Five figures: plots of the Poisson distribution (thin) along with the normal
distribution (thick) having the same mean and variance, for means of 4, 5, 7,
9 and 11]
From these plots, we start to see the central limit theorem from probability at
work: Oi is approximately normal with mean Ei and standard deviation √Ei,
i.e., N(Ei, √Ei), so that

Zi = (Oi − Ei)/√Ei   is approximately   N(0, 1),

provided that Ei ≥ 5. The reason for choosing Ei ≥ 5, besides the visual
indications in the above figures, is that if X ∼ N(μ, σ), then

Pr(X < 0) = Pr((X − μ)/σ < −μ/σ) = Φ(−μ/σ) = (1/√(2π)) ∫_{−∞}^{−μ/σ} e^{−x²/2} dx.

With σ = √μ, as in the Poisson case, this is Φ(−√μ). A plot of this probability,
Φ(−√μ) versus μ, is shown in the following figure.

[A plot of Φ(−√μ) versus μ]

It shows that Φ(−√μ) is very small; in fact,

Pr(X < 0) = Φ(−√μ) ≤ Φ(−√5) ≈ 0.013

when μ ≥ 5. This shows that when N(μ, √μ) is used to approximate the Poisson
distribution, only a tiny fraction of the probability is in the forbidden region to
the left of zero.
Now going back to Equation (11), we see that

Σ_{i=1}^{n} Zi² = Σ_{i=1}^{n} ((Oi − Ei)/√Ei)² = Σ_{i=1}^{n} (Oi − Ei)²/Ei = χ²

must be approximately chi-squared with n degrees of freedom. However, since we
are fixing the total number of data points coming in to be N, so that

Σ_{i=1}^{n} Oi = N,

this says (for example) that On is completely known once O1, O2, . . . , On−1
are known, which says that

χ² = Σ_{i=1}^{n−1} (Oi − Ei)²/Ei + (On − En)²/En = Σ_{i=1}^{n−1} Zi² + constant.

This makes χ² approximately chi-squared with one less degree of freedom,
since only the Zi's for i = 1, 2, 3, . . . , n − 1 are independent standard normal
random variables. This is why the statistic

χ² = Σ_{i=1}^{n} (Oi − Ei)²/Ei,

along with the constraint

Σ_{i=1}^{n} Oi = N,

is (approximately) a chi-squared distribution with n − 1 degrees of freedom. We
now remind the reader of the form of a chi-squared distribution.
The Chi-Squared Distribution With ν Degrees of Freedom (χ²_ν)

The chi-squared distribution with ν degrees of freedom (denoted by χ²_ν) has a
pdf given by

f(x) = (1/(2^{ν/2} Γ(ν/2))) x^{ν/2−1} e^{−x/2}   (12a)

for 0 ≤ x, and f(x) = 0 for x < 0, where

Γ(z) = ∫₀^∞ t^{z−1} e^{−t} dt   (12b)

is the gamma function evaluated at z, which is equal to (z − 1)! when z is a
positive integer. The cdf of the χ²_ν distribution is given by

F(x) = ∫₀ˣ f(z) dz = (1/(2^{ν/2} Γ(ν/2))) ∫₀ˣ z^{ν/2−1} e^{−z/2} dz.   (12c)

The mean and variance of the χ²_ν distribution are given by E(X) = ν and
V(X) = 2ν, respectively, and some typical plots of the pdf f(x) versus x are
shown in the figure below.

[Plots of f_ν(x) for ν = 1, 2 and 3, showing f₁(0) > f₂(0) > f₃(0)]

Note that the mode of the χ²_ν distribution is given by ν − 2 for ν ≥ 2. Note also
that the critical values in most percentage-point tables are computed using

Pr(X > χ²_{α,ν}) = α   or   F(χ²_{α,ν}) = Pr(X ≤ χ²_{α,ν}) = 1 − α.

For example, if ν = 7 and α = 0.1, then F(χ²_{0.1,7}) = 0.9 gives

(1/(2^{7/2} Γ(7/2))) ∫₀^{χ²_{0.1,7}} x^{7/2−1} e^{−x/2} dx = 0.9,

which has the solution χ²_{0.1,7} ≈ 12.


When using the chi-squared test statistic against a uniform cdf, the test
procedure follows these steps, given that the sequence {R1, R2, R3, . . . , RN} has
already been generated.

1.) Choose a value of n and partition the unit interval [0, 1) into the n disjoint
classes C1, C2, C3, . . . , Cn,

C1 = [0, x1),   C2 = [x1, x2),   C3 = [x2, x3), . . . , Cn = [xn−1, 1),

for chosen values of x0 ≡ 0 < x1 < x2 < x3 < · · · < xn−1 < 1 ≡ xn. It is
recommended (but not necessary) that all n classes have the same size, by
making xi = i/n for i = 1, 2, 3, . . . , n − 1, so that

C1 = [0, 1/n),   C2 = [1/n, 2/n),   C3 = [2/n, 3/n), . . . , Cn = [(n − 1)/n, 1).

2.) Compute Oi as

Oi = # of {R1, R2, R3, . . . , RN} in Ci

for each i = 1, 2, 3, . . . , n.

3.) Compute Ei using Ei = (xi − xi−1)N, as predicted by the standard uniform
distribution U[0, 1), for each i = 1, 2, 3, . . . , n; as demonstrated earlier, it is
recommended that the ith class be large enough so that Ei ≥ 5. When
using classes of equal size, we have Ei = N/n for each value of i, and then
Ei ≥ 5 says that we should choose n so that n ≤ N/5. In fact, it is usually
best to choose n so that

√N ≤ n ≤ N/5

when N ≥ 25.

4.) Compute the sample statistic

χ² = Σ_{i=1}^{n} (Oi − Ei)²/Ei ∼ χ²_{n−1}.

5.) Determine the critical value, χ²_{α,n−1}, from either a chi-squared table or
from the equation

Pr(X ≤ χ²_{α,n−1}) = (1/(2^{(n−1)/2} Γ((n − 1)/2))) ∫₀^{χ²_{α,n−1}} x^{(n−1)/2−1} e^{−x/2} dx = 1 − α

for the specified significance level α and the value of n.

6.) If the sample statistic χ² is greater than the critical value, χ²_{α,n−1}, the null
hypothesis that the sample data are a sample from a standard uniform
distribution is rejected. If χ² ≤ χ²_{α,n−1}, then we conclude that no difference
has been detected between the true distribution of {R1, R2, R3, . . . , RN} and
the standard uniform distribution.
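A minimal Python sketch of these steps for equal-sized classes (an illustration, not from the notes; the critical value is supplied from a table rather than computed from the integral in step 5):

    import random

    def chi_squared_uniform_test(sample, n_classes, chi2_crit):
        """Chi-squared test of a sample against U[0,1) using equal classes."""
        N = len(sample)
        E = N / n_classes                         # step 3: E_i = N/n (want E >= 5)
        O = [0] * n_classes
        for r in sample:                          # step 2: count per class
            O[int(n_classes * r)] += 1
        chi2 = sum((o - E) ** 2 / E for o in O)   # step 4: the sample statistic
        return chi2, chi2 > chi2_crit             # step 6: reject if above critical

    # With Example #8's class counts O = [7,9,8,9,14,7,10,15,9,12], chi^2 = 7.0.
    sample = [random.random() for _ in range(100)]
    print(chi_squared_uniform_test(sample, n_classes=10, chi2_crit=16.919))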

Example #8: The Chi-Squared (χ²) Test

Consider the set of N = 100 random numbers in the following table.

0.34  0.90  0.25  0.89  0.87  0.44  0.12  0.21  0.46  0.67
0.83  0.76  0.79  0.64  0.70  0.81  0.94  0.74  0.22  0.74
0.96  0.99  0.77  0.67  0.56  0.41  0.52  0.73  0.99  0.02
0.47  0.31  0.17  0.82  0.56  0.05  0.45  0.31  0.78  0.05
0.79  0.71  0.23  0.19  0.82  0.93  0.65  0.37  0.39  0.42
0.99  0.17  0.99  0.46  0.05  0.66  0.10  0.42  0.18  0.49
0.37  0.51  0.54  0.01  0.81  0.28  0.69  0.34  0.75  0.49
0.72  0.43  0.56  0.97  0.30  0.94  0.96  0.58  0.73  0.05
0.06  0.39  0.84  0.24  0.40  0.64  0.40  0.19  0.79  0.62
0.18  0.26  0.97  0.88  0.64  0.47  0.60  0.11  0.29  0.78

Using the n equal-length classes

C1 = [0, 1/n),   C2 = [1/n, 2/n),   C3 = [2/n, 3/n), . . . , Cn = [(n − 1)/n, 1),

so that

Ei = (i/n − (i − 1)/n) N = N/n

for each class, we have (for n = 10) the 10 classes

C1 = [0, 0.1),   C2 = [0.1, 0.2),   C3 = [0.2, 0.3), . . . , C10 = [0.9, 1),

and the expected value for each class is Ei = 100/10 = 10 ≥ 5. Using these 100
numbers we generate the next table.

    Class   Oi    Ei    Oi − Ei   (Oi − Ei)²   (Oi − Ei)²/Ei
    C1       7    10      −3          9            0.9
    C2       9    10      −1          1            0.1
    C3       8    10      −2          4            0.4
    C4       9    10      −1          1            0.1
    C5      14    10      +4         16            1.6
    C6       7    10      −3          9            0.9
    C7      10    10       0          0            0.0
    C8      15    10      +5         25            2.5
    C9       9    10      −1          1            0.1
    C10     12    10      +2          4            0.4
           100   100       0                   χ² = 7.0

The results of this table show that χ² = 7.0, and since

(1/(2^{9/2} Γ(9/2))) ∫₀^{χ²_{0.05,9}} x^{9/2−1} e^{−x/2} dx = 1 − 0.05

leads to χ²_{0.05,9} ≈ 16.919, we see that χ² < χ²_{0.05,9}, and so the null hypothesis
that the 100 numbers come from a standard uniform distribution should not be
rejected on the basis of this test and significance level. □
Both the Kolmogorov-Smirnov and the chi-squared tests are acceptable for
testing the uniformity of a sample of data. The Kolmogorov-Smirnov test is
the more powerful of the two, since it directly compares cdfs, and so it is the
more recommended of the two. Furthermore, the Kolmogorov-Smirnov test can
be applied to small sample sizes, whereas the chi-squared test is valid only for
large samples, in which each Ei ≥ 5.

Testing for uniformity is certainly important, but it does not tell the whole
story. It should be noted that the order in which the Ri's are computed has no
effect on the conclusions drawn from the Kolmogorov-Smirnov and chi-squared
tests for uniformity, but the order in which the Ri's are computed is certainly
important from the perspective of giving the appearance of independence, as the
next example shows.
Example #9: A Perfect Random-Number Sequence Or Not

Consider the sequence

{X1, X2, X3, . . . , XN}

generated using X0 = m − 1 and

Xi+1 = (Xi + 1) mod(m).

Such a sequence must always lead to

{X1, X2, X3, . . . , Xm, . . .} = {0, 1, 2, 3, . . . , m − 1, . . .},

which then repeats in a maximum cycle of length m. It is clear that the resulting
random numbers

{R1, R2, R3, . . . , Rm, . . .} = {0, 1/m, 2/m, 3/m, . . . , (m − 1)/m, . . .}

would easily pass any Kolmogorov-Smirnov and chi-squared tests, since we would
always find that Dm = 0 and χ² = 0 for any choice of classes C1, C2, C3, . . . ,
Cn. Yet such a sequence definitely does not look random. This set of numbers
would pass all possible frequency tests with ease, but the ordering of the numbers
produced by the generator would not be random, and so these numbers would not
pass any tests for independence. □

In fact, in general, one can take any sequence of random numbers that would
pass all possible frequency tests and simply rearrange them (e.g., in increasing
order), and these same numbers would easily fail any type of independence test.
There are many tests for independence, some of which include the following:
• Runs Test: tests the runs up and down, or the runs above and below
the mean, by comparing the actual values to the expected values as
predicted by the standard uniform distribution U[0, 1). The statistics
used for comparison come from the standard normal distribution and
the chi-squared distribution.

• Autocorrelation Test: tests the correlation between the generated
numbers and compares the sample correlation to the expected correlation
of zero, as predicted by the standard uniform distribution U[0, 1). The
statistic used for comparison comes from the standard normal distribution.

• Gap Test: counts the number of digits that appear between repetitions
of a particular digit and then uses the Kolmogorov-Smirnov (KS) test to
compare this with the expected size of gaps, as predicted by a geometric
distribution.

• Poker Test: treats numbers grouped together as a poker hand. For
example, a five-digit number 0.11433 can be thought of as a five-card
poker hand having two pairs, and 0.22222 can be thought of as a five-card
poker hand having five of a kind, and so on. Then a chi-squared (χ²)
statistic is used to compare the frequency of these poker hands to
what is expected based on a deck of 50 cards having five 0's, five
1's, five 2's, and so on up to five 9's (in the case of five-digit random
numbers).
These tests lie outside the scope of the ENM 503 course and are discussed in
detail in the ESE 603 course. Now that we know how to generate a set of random
numbers which is a sample from the standard uniform distribution U[0, 1), let us
see how these can be converted into a sample from any random variable X.
6. Using Random Numbers to Generate Random Samples of X
This section deals with one common procedure for converting a set of random
numbers

{R1, R2, R3, . . . , RN}

into a random sample

{X1, X2, X3, . . . , XN}

from a random variable X that has either a continuous or a discrete distribution.
Although many of the standard simulation programs generate these random
variates (for many of the standard random variables discussed in probability and
statistics) using subroutines and functions, it is still important to understand
how random-variate generation occurs, just in case you are faced with a random
variable X whose distribution is not covered by the standard simulation
programs.

The method we shall discuss is called the inverse transform method. Other
methods, such as the convolution method, the acceptance-rejection method, and
the composition method, are also important, but these lie outside the scope of the
ENM 503 course; they are covered in the ESE 603 course.

We assume from the start that a source of uniform random numbers

{R1, R2, R3, . . . , RN} ∼ U[0, 1)

is readily available, and throughout this section the symbols R and
{R1, R2, R3, . . . , RN} represent random numbers uniformly distributed on [0, 1).

The Inverse Transform Technique

The inverse transform method is based on the fact that if f(x) and F(x) are
the pdf and cdf, respectively, of some random variable X, then Z = F(X) has a
uniform distribution on the interval [0, 1), i.e.,

Z = F(X) ∼ U[0, 1).

This is an incredibly simple yet powerful result. To prove this, we simply note
that since 0 ≤ F(X) ≤ 1, then 0 ≤ Z ≤ 1, and so the possible values of Z are
between 0 and 1. If g(z) and G(z) are the pdf and cdf, respectively, for Z, then

G(z) = Pr(Z ≤ z) = Pr(F(X) ≤ F(x)).

But F(x) is a monotonically increasing function of x, and so F(X) ≤ F(x) implies
X ≤ x, and hence

G(z) = Pr(F(X) ≤ F(x)) = Pr(X ≤ x) = F(x) = z,

since z = F(x) from the definition of Z. Then

g(z) = dG(z)/dz = 1,

and so we see that the pdf of Z = F(X) is

g(z) = { 1, for 0 ≤ z < 1
       { 0, otherwise,

which is also the pdf of a uniform distribution on the interval [0, 1), and so we
have shown that if F(x) is the cdf of some random variable X, then
Z = F(X) ∼ U[0, 1).

This means that (in theory) if F(x) is the cdf of some random variable X,
then R = F(X) has the continuous standard uniform distribution on the interval
[0, 1), and then X = F⁻¹(R), where F⁻¹ is the inverse function of F (which always
exists, since F(x) is a monotonically increasing function of x), has the distribution
with pdf f(x) = F′(x). Therefore, if
{R1, R2, R3, . . . , RN}

is a random sample from U[0, 1), then

{F⁻¹(R1), F⁻¹(R2), F⁻¹(R3), . . . , F⁻¹(RN)}

becomes a random sample from the random variable X having cdf F(x). Note that
in practice it may be very difficult (if not impossible) to get a simple algebraic
form for F(X), and even if it is possible to get a simple algebraic form for F(X), it
may be very difficult (if not impossible) to get a simple algebraic form for F⁻¹(R).
For this reason, other methods such as the acceptance-rejection method have been
developed; that method is discussed in detail in the ESE 603 course. Let us
now look at a few examples.
Example #10: Exponential Distribution with Parameter λ

The exponential distribution with parameter λ > 0 has pdf

f(x) = { λe^{−λx}, for 0 ≤ x
       { 0,        for x < 0

and cdf given by

F(x) = ∫_{−∞}^{x} f(z) dz = { 1 − e^{−λx}, for 0 ≤ x
                            { 0,           for x < 0.

Then R = F(X) yields R = 1 − e^{−λX}. Solving for X, we get

X = F⁻¹(R) = −(1/λ) ln(1 − R),

so that if {R1, R2, R3, . . . , RN} is a random sample from U[0, 1), then

{X1, X2, X3, . . . , XN}   with   Xi = −(1/λ) ln(1 − Ri)

for i = 1, 2, 3, . . . , N becomes a random sample from an exponential distribution
with parameter λ > 0. Note that since R and 1 − R are both uniformly distributed
on the interval [0, 1), we may just use

X = −(1/λ) ln(R)   instead of   X = −(1/λ) ln(1 − R).

This removes the need for the operations that subtract each of the Ri's from 1,
which could result in considerable savings in computer time, especially if the value
of N (i.e., the size of the random sample) is very large. In Excel, this allows one
to use

−(1/λ)*LN(RAND())

to generate samples of X ∼ Exp(λ). □
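A minimal Python sketch of this inverse transform (an illustration, not from the notes):

    import math
    import random

    def exponential_sample(lam):
        """Inverse transform for Exp(lambda): X = -(1/lambda) ln(1 - R)."""
        # 1 - R lies in (0, 1], so the logarithm is always defined
        return -math.log(1.0 - random.random()) / lam

    samples = [exponential_sample(2.0) for _ in range(100_000)]
    print(sum(samples) / len(samples))   # should be close to 1/lambda = 0.5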


Example #11: Uniform Distribution on the Interval [a, b]
The uniform distribution on the interval [a, b] has pdf

f(x) =
  1/(b − a), for a ≤ x ≤ b
  0,         otherwise

and cdf given by

F(x) = ∫ f(z) dz =
  0,               for x ≤ a
  (x − a)/(b − a), for a ≤ x ≤ b
  1,               for b ≤ x.

Then R = F(X) gives R = (X − a)/(b − a). Solving for X yields

X = F⁻¹(R) = a + (b − a)R

so that if {R1, R2, R3, . . . , RN} is a random sample from U[0, 1), then

{X1, X2, X3, . . . , XN}

with

Xi = a + (b − a)Ri

for i = 1, 2, 3, . . . , N, is a random sample from U[a, b). In Excel, this allows one to use

a + (b − a) RAND()

to generate samples of X ~ U[a, b). ∎
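A one-line Python analogue of this generator (the function name is ours):

import random

def uniform_sample(a, b, N):
    # X = a + (b - a) R maps R ~ U[0, 1) onto U[a, b)
    return [a + (b - a) * random.random() for _ in range(N)]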


Example #12: Triangular Distribution with Parameters a, b and c

The triangular distribution with parameters a, b, c, and h = 2/(c − a), has pdf

f(x) =
  0,                for x ≤ a
  h(x − a)/(b − a), for a ≤ x ≤ b
  h(c − x)/(c − b), for b ≤ x ≤ c
  0,                elsewhere

and cdf given by

F(x) = ∫ f(z) dz =
  0,                          for x ≤ a
  (1/2)h(x − a)²/(b − a),     for a ≤ x ≤ b
  1 − (1/2)h(c − x)²/(c − b), for b ≤ x ≤ c
  1,                          for c ≤ x.

The shape of f, illustrated in the figure below, explains its name.

(Figure: plot of f(x) versus x using a = 5, b = 15 and c = 40.)

To determine a sample of X from a random number R, we may use the inverse transform method and set R = F(X), yielding
R = h(X − a)²/(2(b − a))   when   0 = F(a) ≤ R ≤ F(b) = (1/2)h(b − a)

and

R = 1 − h(c − X)²/(2(c − b))   when   (1/2)h(b − a) = F(b) ≤ R ≤ F(c) = 1.

Solving each of these for X leads to

X = F⁻¹(R) = a + √(2R(b − a)/h)   when   0 ≤ R ≤ (1/2)h(b − a)

and

X = F⁻¹(R) = c − √(2(1 − R)(c − b)/h)   when   (1/2)h(b − a) ≤ R ≤ 1.

Replacing h by 2/(c − a), we then have

X = a + √((b − a)(c − a)R)   when   0 ≤ R ≤ (b − a)/(c − a)

and

X = c − √((c − b)(c − a)(1 − R))   when   (b − a)/(c − a) ≤ R ≤ 1.

Therefore if {R1, R2, R3, . . . , RN} is a random sample from U[0, 1), then

{X1, X2, X3, . . . , XN}

with

Xi =
  a + √((b − a)(c − a)Ri),       for 0 ≤ Ri ≤ (b − a)/(c − a)
  c − √((c − b)(c − a)(1 − Ri)), for (b − a)/(c − a) ≤ Ri ≤ 1

for i = 1, 2, 3, . . . , N, is a random sample from a triangular distribution with parameters a, b and c. In Excel, this allows one to use either

a + √((b − a)(c − a) RAND())   or   c − √((c − b)(c − a)(1 − RAND()))

to generate samples of X ~ Tri(a, b, c), depending on where RAND() falls relative to the quantity (b − a)/(c − a). ∎
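A direct Python translation of this two-branch formula might look like the sketch below (the function name is ours; Python's standard library also provides random.triangular, which implements the same idea):

import math
import random

def triangular_sample(a, b, c, N):
    cut = (b - a) / (c - a)  # = F(b), where the two cdf branches meet
    out = []
    for _ in range(N):
        r = random.random()
        if r <= cut:
            out.append(a + math.sqrt((b - a) * (c - a) * r))
        else:
            out.append(c - math.sqrt((c - b) * (c - a) * (1.0 - r)))
    return out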
Newton Iteration Method - Optional
While it might be possible to obtain a simple algebraic form for F(x), it may not be possible to obtain a simple algebraic form for F⁻¹(R). Under these conditions, Newton's method of iteration may be used to solve R = F(X) for the value of X corresponding to a given value of R. For example, suppose that a continuous random variable X has a pdf given by

f(x) = (10/3)(x − x⁴)

for 0 ≤ x ≤ 1, and zero otherwise. A plot of this pdf is shown in the figure below.
(Figure: plot of f(x) = 10(x − x⁴)/3 versus x for 0 ≤ x ≤ 1.)
The cdf of X is given by

F(x) = (10/3) ∫_{0}^{x} (t − t⁴) dt = (5/3)x² − (2/3)x⁵

for 0 ≤ x ≤ 1, which is a simple algebraic form. However, solving for X given that

R = F(X) = (5/3)X² − (2/3)X⁵
is not very easy to do analytically. From Newton's iteration in calculus, we know that one method for solving an equation of the form g(x) = 0 is to generate a sequence {x0, x1, x2, x3, . . . , xn, . . .} using an initial guess x0 and the update

xn+1 = xn − g(xn)/g′(xn).

If the initial guess x0 is not too far away from a solution to g(x) = 0, then

lim_{n→∞} xn = a solution to g(x) = 0.

Therefore if we want to solve for X given that R = F(X), then we let g(X) = F(X) − R and get g′(X) = F′(X) = f(X), where f is the pdf of X, so that

Xn+1 = Xn − g(Xn)/g′(Xn) = Xn − (F(Xn) − R)/f(Xn)
(with an initial guess of 0 < X0 < 1) could generate a sequence of values that converges to the value X for which R = F(X). For example, earlier we had f(x) = 10(x − x⁴)/3 and F(x) = 5x²/3 − 2x⁵/3, and so solving

R = F(X) = (5/3)X² − (2/3)X⁵

leads to

Xn+1 = Xn − (F(Xn) − R)/f(Xn) = Xn − ((5/3)Xn² − (2/3)Xn⁵ − R)/((10/3)(Xn − Xn⁴)),

which reduces to

Xn+1 = Xn − (1/10)(2Xn⁵ − 5Xn² + 3R)/(Xn(Xn³ − 1)).

For R = 0.3 and a starting guess of X0 = 0.5 we generate the following table.

n    Xn
0    0.5000000000
1    0.4342857143
2    0.4312449349
3    0.4312370715
4    0.4312370716
5    0.4312370715
6    0.4312370716

which converges very quickly to X ≈ 0.43123707. Therefore the equation

R = F(X) = (5/3)X² − (2/3)X⁵ = 0.3

has X ≈ 0.43123707 as the unique solution between zero and one. A plot of F(x) versus x along with the horizontal line at R = 0.3 is shown in the figure below, and this shows the point of intersection at (X ≈ 0.4312, R = 0.3).

(Figure: plots of F(x) = 5x²/3 − 2x⁵/3 and R = 0.3, showing the point of intersection at x ≈ 0.4312.)

This means that the random number R = 0.3 leads to the random variate X ≈ 0.4312.
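A compact Python sketch of this Newton inversion for the pdf above (the tolerance, iteration cap, and starting guess are our own choices):

def newton_invert(R, x0=0.5, tol=1e-10, max_iter=50):
    # Solve R = F(X) by Newton's method for F(x) = 5x^2/3 - 2x^5/3
    F = lambda x: (5.0 * x**2 - 2.0 * x**5) / 3.0
    f = lambda x: 10.0 * (x - x**4) / 3.0  # F'(x) = f(x), the pdf
    x = x0
    for _ in range(max_iter):
        step = (F(x) - R) / f(x)
        x -= step
        if abs(step) < tol:
            break
    return x

print(newton_invert(0.3))  # approximately 0.43123707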
Example #13: An Empirical Continuous Distribution
Consider an empirical continuous distribution with points {x0, x1, x2, x3, . . . , xn} ordered so that

x0 ≤ x1 ≤ x2 ≤ x3 ≤ ··· ≤ xn

with respective probabilities

{p1, p2, p3, . . . , pn}   (with p1 + p2 + ··· + pn = 1),

where pi = Pr{xi−1 ≤ X ≤ xi} for i = 1, 2, 3, . . . , n. These can be organized in



the following tables:

Random Variable Intervals    Probabilities
x0 ≤ X ≤ x1                  p1
x1 ≤ X ≤ x2                  p2
x2 ≤ X ≤ x3                  p3
...                          ...
xn−1 ≤ X ≤ xn                pn

Random Variable Intervals    Cumulative Probabilities
X ≤ x1                       c1
X ≤ x2                       c2
X ≤ x3                       c3
...                          ...
X ≤ xn                       cn

where c1 = p1, and ck = ck−1 + pk for k = 2, 3, 4, . . . , n, are the cumulative probabilities. This says that the cumulative distribution function F(x) contains the points

{(x0, c0), (x1, c1), (x2, c2), (x3, c3), . . . , (xn, cn)}

where c0 ≡ 0, F(xk) = ck for k = 1, 2, 3, . . . , n, and cn = 1. Using linear interpolation, we may construct a continuous cumulative distribution function (cdf)
by

F(x) =
  0,                     for x ≤ x0
  c0 + m0(x − x0),       for x0 ≤ x ≤ x1
  c1 + m1(x − x1),       for x1 ≤ x ≤ x2
  c2 + m2(x − x2),       for x2 ≤ x ≤ x3
  ...
  cn−1 + mn−1(x − xn−1), for xn−1 ≤ x ≤ xn
  1,                     for xn ≤ x

where the slopes are given by

mi = (ci+1 − ci)/(xi+1 − xi)
for i = 0, 1, 2, . . . , n − 1. Then setting R = F(X) and solving for X yields

X = F⁻¹(R) =
  x0 + (R − c0)/m0,       for c0 ≤ R ≤ c1
  x1 + (R − c1)/m1,       for c1 ≤ R ≤ c2
  x2 + (R − c2)/m2,       for c2 ≤ R ≤ c3
  ...
  xn−1 + (R − cn−1)/mn−1, for cn−1 ≤ R ≤ cn

so that if {R1, R2, R3, . . . , RN} is a random sample from U[0, 1), then

{X1, X2, X3, . . . , XN}

with

Xi =
  x0 + (Ri − c0)/m0,       for c0 ≤ Ri ≤ c1
  x1 + (Ri − c1)/m1,       for c1 ≤ Ri ≤ c2
  x2 + (Ri − c2)/m2,       for c2 ≤ Ri ≤ c3
  ...
  xn−1 + (Ri − cn−1)/mn−1, for cn−1 ≤ Ri ≤ cn
for i = 1, 2, 3, . . . , N, is a random sample from the empirical distribution described by the above table. Note that a non-linear interpolation scheme (e.g., cubic) may be used if more smoothness is desired in the cdf. ∎
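In Python, this piecewise-linear inversion reduces to a short routine (the names are ours; numpy.interp would accomplish the same interpolation in one call):

import bisect

def empirical_inverse(xs, cs, R):
    # xs are the knots x0..xn, cs the cumulative probabilities c0 = 0, ..., cn = 1
    j = bisect.bisect_right(cs, R) - 1             # interval with c_j <= R <= c_{j+1}
    j = min(max(j, 0), len(xs) - 2)
    m = (cs[j + 1] - cs[j]) / (xs[j + 1] - xs[j])  # slope of F on that interval
    return xs[j] + (R - cs[j]) / m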
Example #14: Continuous Distributions Without a Closed-Form Cdf
A number of very useful distributions (such as the normal, gamma, and beta distributions) do not have simple closed forms for their cdf F(x), or its inverse, and hence using the inverse transform method to generate random variates may be very difficult. For example, if X ~ N(0, 1), then

F(x) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt ≡ Φ(x)
does not have a simple closed form. If we are willing to approximate the inverse of the cdf, then we may still be able to generate these random variates. For example, starting from F(x), we may choose a value of n and values of x1, x2, x3, . . . , xn, and construct the following table:

Random Variable Values    Cumulative
(Increasing Order)        Probabilities
x1                        c1 = Pr(X ≤ x1) = F(x1)
x2                        c2 = Pr(X ≤ x2) = F(x2)
x3                        c3 = Pr(X ≤ x3) = F(x3)
...                       ...
xn                        cn = Pr(X ≤ xn) = F(xn)
like that in the previous example, and then we may use the method described in the previous example to generate a random sample from X. This says that if {R1, R2, R3, . . . , RN} is a random sample from U[0, 1), then

{X1, X2, X3, . . . , XN}

with

Xi =
  x0 + (Ri − F(x0))/m0,       for F(x0) ≤ Ri ≤ F(x1)
  x1 + (Ri − F(x1))/m1,       for F(x1) ≤ Ri ≤ F(x2)
  x2 + (Ri − F(x2))/m2,       for F(x2) ≤ Ri ≤ F(x3)
  ...
  xn−1 + (Ri − F(xn−1))/mn−1, for F(xn−1) ≤ Ri ≤ F(xn)

where F(x0) ≡ 0, F(xn) ≡ 1, and

mi = (F(xi+1) − F(xi))/(xi+1 − xi),

for i = 1, 2, 3, . . . , N, is an approximation to a random sample from the distribution having cdf F(x). The larger we make the value of n, and the smaller we make the intervals [xi, xi+1], the better the approximation, but also the more computer work is involved, and so the slower the algorithm. ∎
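For the standard normal case, a Python sketch of this tabulate-and-interpolate scheme might look as follows (the grid on [−4, 4] and its spacing are illustrative choices; math.erf gives Φ accurately enough for the table):

import bisect
import math
import random

def normal_cdf(x):
    # Phi(x) written in terms of the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

xs = [-4.0 + 0.01 * k for k in range(801)]  # grid wide enough that F is ~0 and ~1 at the ends
cs = [normal_cdf(x) for x in xs]

def approx_normal_variate():
    r = random.random()
    j = min(max(bisect.bisect_right(cs, r) - 1, 0), len(xs) - 2)
    m = (cs[j + 1] - cs[j]) / (xs[j + 1] - xs[j])
    return xs[j] + (r - cs[j]) / m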
Discrete Distributions
Samples from discrete distributions can also be generated using the inverse
transform method, either numerically through a table look-up procedure, or in
some cases algebraically with the final generation scheme in terms of a formula
involving the ceiling and/or floor functions.
Note that for the sake of this discussion involving discrete distributions we shall assume that R ~ U(0, 1], which includes 1 but not 0. This is done simply out of convenience, and since the endpoints of the interval carry zero probability, so that

U(0, 1) = U(0, 1] = U[0, 1) = U[0, 1]

in distribution, it really does not matter as long as N (the sample size) is large. After all, getting exactly R = 0 or R = 1 should be a very rare event. Let us illustrate the ideas with some examples.
Example #15: An Empirical Discrete Distribution
Suppose we have a random variable X with a discrete range space

RX = {x1, x2, x3, . . . , xn}

and corresponding probabilities {p1, p2, p3, . . . , pn}. Then we may construct the following table of cumulative probabilities.

Random Variable Values    Probabilities    Cumulative
(Increasing Order)                         Probabilities
x1                        p1               c1 = p1
x2                        p2               c2 = c1 + p2
x3                        p3               c3 = c2 + p3
...                       ...              ...
xn                        pn               cn = cn−1 + pn = 1
This says that the cdf of X is a step function defined by

F(x) =
  c0 ≡ 0, for x < x1
  c1,     for x1 ≤ x < x2
  c2,     for x2 ≤ x < x3
  c3,     for x3 ≤ x < x4
  ...
  cn−1,   for xn−1 ≤ x < xn
  cn ≡ 1, for xn ≤ x

Note that F is not continuous, and it should not be made continuous using some interpolation scheme. When applying the inverse transform method in this case, we note that if

{R1, R2, R3, . . . , RN}

is a random sample from U(0, 1], then

{X1, X2, X3, . . . , XN}

with

Xi = F⁻¹(Ri) =
  x1, for 0 = c0 < Ri ≤ c1
  x2, for c1 < Ri ≤ c2
  x3, for c2 < Ri ≤ c3
  ...
  xn, for cn−1 < Ri ≤ cn = 1

is a random sample from the empirical distribution described by the above table. ∎
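A Python sketch of this look-up (the values and probabilities in the usage line are made up for illustration):

import bisect
import random

def discrete_sample(values, cum_probs, N):
    out = []
    for _ in range(N):
        r = 1.0 - random.random()             # R ~ U(0, 1], avoiding R = 0
        j = bisect.bisect_left(cum_probs, r)  # smallest j with c_j >= R
        out.append(values[j])
    return out

print(discrete_sample([1, 2, 5], [0.2, 0.7, 1.0], N=10))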
Note that for generating discrete random variables, the inverse transform technique becomes a table look-up procedure, and unlike the case of a continuous variable, interpolation should not be done. However, if the values of xi in the above table are such that xi+1 − xi is a constant (independent of i), then the ceiling and/or floor functions may be used along with the expression for F⁻¹(R) to generate a sample from a discrete distribution X, as we shall now demonstrate. But first, let us be reminded of the ceiling and floor functions.
The Ceiling (Round Up) and Floor (Round Down) Functions

If x is a real number, the ceiling or round up of x is denoted by ⌈x⌉ and defined by

⌈x⌉ = the smallest integer greater than or equal to x   (13a)

and the floor or round down of x is denoted by ⌊x⌋ and defined by

⌊x⌋ = the largest integer less than or equal to x.   (13b)

Plots of these are shown below.

(Figure: plot of ⌈x⌉ versus x.)

(Figure: plot of ⌊x⌋ versus x.)
Note that in general

⌊x⌋ ≤ x ≤ ⌈x⌉   (13c)

for any real number x, and

⌈x⌉ = 1 + ⌊x⌋   (13d)

for any non-integer real number x. Of course, if x is an integer, Equation (13d) is not valid and the inequalities in Equation (13c) become equalities. In addition, if a is an integer satisfying a ≤ x < a + 1, then we must have ⌊x⌋ = a, and if a is an integer satisfying a < x ≤ a + 1, we must have ⌈x⌉ = a + 1. As mentioned, the ceiling (round up) and floor (round down) functions can often be used along with F⁻¹(R) to construct a sample from a discrete random variable X when xi+1 − xi is a constant (independent of i). Let us now look at some examples.
Example #16a: A Discrete Uniform Distribution
Consider the discrete uniform distribution on the set

S = {a + b, a + 2b, a + 3b, . . . , a + kb}

(denoted by DU[a, b]) for fixed values of a and b > 0, with pmf p(x) = 1/k for all x in S, and cdf given by

F(x) =
  0/k,       for x < a + b
  1/k,       for a + b ≤ x < a + 2b
  2/k,       for a + 2b ≤ x < a + 3b
  ...
  (k − 1)/k, for a + (k − 1)b ≤ x < a + kb
  1,         for a + kb ≤ x.
Then we note that xi+1 − xi = b (independent of i) and

Xi = F⁻¹(Ri) =
  a + b,  for 0 < Ri ≤ 1/k
  a + 2b, for 1/k < Ri ≤ 2/k
  a + 3b, for 2/k < Ri ≤ 3/k
  ...
  a + kb, for (k − 1)/k < Ri ≤ k/k
which we may write as

Xi = F⁻¹(Ri) =
  a + b,  for 0 < kRi ≤ 1
  a + 2b, for 1 < kRi ≤ 2
  a + 3b, for 2 < kRi ≤ 3
  ...
  a + kb, for k − 1 < kRi ≤ k.
Using the ceiling function, we may write all of this as simply

Xi = F⁻¹(Ri) = a + ⌈kRi⌉b.

Note that this simple form for Xi using the ceiling function is made possible because we are assuming in this section that R ~ U(0, 1]. Thus we find that if

{R1, R2, R3, . . . , RN}

is a random sample from U(0, 1], then

{X1, X2, X3, . . . , XN}

with

Xi = a + ⌈kRi⌉b   (14)

is a random sample of the discrete uniform distribution on the set

{a + b, a + 2b, a + 3b, . . . , a + kb}.

In Excel, this allows one to use

a + b · ceiling(k · RAND())

to generate samples of X ~ DU[a, b]. ∎
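A Python sketch of Equation (14) (we draw 1 − random.random(), which lies in (0, 1], so that ⌈kR⌉ never evaluates to 0):

import math
import random

def discrete_uniform_sample(a, b, k, N):
    # X = a + ceil(k R) b with R ~ U(0, 1]
    return [a + math.ceil(k * (1.0 - random.random())) * b for _ in range(N)]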
Example #16b: A Discrete Uniform Distribution - Acceptance/Rejection
Consider again the discrete uniform random variable X defined on the set of k equally spaced values

S = {a + b, a + 2b, a + 3b, . . . , a + kb}

(denoted by DU[a, b]) for fixed values of a and b > 0, with pmf p(x) = 1/k for all x in S. Consider also the iterative scheme in which

Zi+1 = (αZi + γ) mod(m),

where the multiplier α and increment γ are integers (not to be confused with the parameters a and b above), m ≥ k, i = 0, 1, 2, . . . , M, and Z0 is some seed. Excluding the seed Z0, this will generate a sequence of M integers

{Z1, Z2, Z3, . . . , ZM}

that have values between 0 and m − 1, inclusive. Setting r = ⌊m/k⌋, which is the greatest integer less than or equal to m/k (that is, the floor of m/k), let

us agree that all those values of Zi satisfying 0 ≤ Zi < r will be assigned the value Xi = a + b, all those values of Zi satisfying r ≤ Zi < 2r will be assigned the value Xi = a + 2b, all those values of Zi satisfying 2r ≤ Zi < 3r will be assigned the value Xi = a + 3b, and so on, up to all those values of Zi satisfying (k − 1)r ≤ Zi < kr, which will be assigned the value Xi = a + kb. In other words, all values of Zi satisfying

(j − 1)r ≤ Zi < jr

for j = 1, 2, 3, . . . , k are accepted and assigned the value Xi = a + jb. Since

(j − 1)r ≤ Zi < jr   implies   j − 1 ≤ Zi/r < j,

we see that

j = 1 + ⌊Zi/r⌋

and so

Xi = a + (1 + ⌊Zi/r⌋)b

can be used to compute the value of Xi from the value of Zi whenever ⌊Zi/r⌋ < k, and any value of Zi ≥ kr is rejected. The number of accepted values of Zi is then equal to kr = k⌊m/k⌋, and the number of rejected values is m − kr = m − k⌊m/k⌋.
It should be noted that

jr − (j − 1)r = r   (a result not dependent on j)

of the Zi's will be assigned the value Xi = a + jb, and since kr numbers are accepted, the fraction r/(kr) = 1/k of the accepted numbers are assigned the value Xi = a + jb for each j, showing that each value of Xi has the same probability 1/k of occurring, so the method is doing the right thing. It has the advantage of using only integer arithmetic, but it does require that some numbers be discarded. The efficiency of this acceptance/rejection method can be measured by the quantity

c = kr/m = k⌊m/k⌋/m = ⌊m/k⌋/(m/k)
so that the closer c is to 1, the better. A plot of ⌊x⌋/x is shown in the figure below,

(Figure: plot of ⌊x⌋/x versus x for x ≥ 1; the dotted curves are 1 and 1 − 1/x.)

and since

1 − 1/x ≤ ⌊x⌋/x ≤ 1

for all x > 1, we see that the efficiency of the method is better than 0.9 (90%) when m/k ≥ 10. ∎
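A Python sketch of this integer acceptance/rejection generator (the multiplier, increment, modulus, and seed below are illustrative LCG constants, not values prescribed in the text):

def lcg_discrete_uniform(a, b, k, m, mult, incr, seed, M):
    r = m // k                    # r = floor(m/k)
    out, z = [], seed
    while len(out) < M:           # loop until M values are accepted
        z = (mult * z + incr) % m
        if z < k * r:             # accept; any z >= k r is rejected
            j = z // r + 1        # (j - 1) r <= z < j r
            out.append(a + j * b)
    return out

print(lcg_discrete_uniform(a=0, b=1, k=6, m=2**31 - 1, mult=48271, incr=0, seed=1, M=10))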
Example #17: The Geometric Distribution

Consider the geometric distribution with pmf

p(x) = p(1 − p)^(x−1)

for x = 1, 2, 3, . . . . Its cdf is given by

F(x) = Σ_{k=1}^{x} p(1 − p)^(k−1) = p · (1 − (1 − p)^x)/(1 − (1 − p)) = 1 − (1 − p)^x

for x = 1, 2, 3, . . . , and we note that xi+1 − xi = 1 (independent of i). Using the inverse transform method, we see that if R is a random number satisfying

F(x − 1) < R ≤ F(x),

then X = x. Since F is a non-decreasing function, we may rewrite this set of inequalities as

x − 1 < F⁻¹(R) ≤ x

or simply

F⁻¹(R) ≤ x < 1 + F⁻¹(R).

Using the ceiling function, this says that

X = ⌈F⁻¹(R)⌉.

In the case of the geometric distribution, we have F(X) = 1 − (1 − p)^X = R, so that

(1 − p)^X = 1 − R   or   X = F⁻¹(R) = ln(1 − R)/ln(1 − p),

and hence

X = ⌈F⁻¹(R)⌉ = ⌈ln(1 − R)/ln(1 − p)⌉.
Therefore, if {R1, R2, R3, . . . , RN} is a random sample from U(0, 1], then

{X1, X2, X3, . . . , XN}

with

Xi = ⌈ln(1 − Ri)/ln(1 − p)⌉   (15)

is a random sample from a geometric distribution having parameter p and range space S = {1, 2, 3, . . .}. In Excel, this allows one to use

ceiling(ln(1 − RAND())/ln(1 − p))

to generate samples of X ~ geometric(p). ∎
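A Python rendering of Equation (15) (the max(..., 1) guard is our own addition, handling the measure-zero draw u = 1, which would otherwise give X = 0):

import math
import random

def geometric_sample(p, N):
    out = []
    for _ in range(N):
        u = 1.0 - random.random()  # u ~ U(0, 1], playing the role of 1 - R
        x = math.ceil(math.log(u) / math.log(1.0 - p))
        out.append(max(x, 1))      # guard: u = 1 gives ceil(0) = 0
    return out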
In a more general setting, suppose that X is a discrete distribution with range space

RX = {a1, a2, a3, . . . , an, . . .}

where

a1 < a2 < a3 < ··· < an < ···

and suppose that F(x) is the cdf of X. Then for a given random number R, the value of X is chosen by the condition X = ak when F(ak−1) < R ≤ F(ak).
7. Using Simulation for Parameter Estimation
The heart of using simulation for parameter estimation lies in the Strong Law of Large Numbers, which is probably one of the best-known results in probability theory. It simply states that the sample average of a sequence of independent random variables X1, X2, X3, . . . , Xn having a common distribution will, with probability 1, converge to the mean μ = E(X) of that common distribution. In other words,

Pr( lim_{n→∞} (1/n) Σ_{i=1}^{n} Xi = μ ) = 1.

Therefore, simulation is ideal for approximating the average (or expected value) of a random variable: we simply compute

(1/N) Σ_{i=1}^{N} Xi

for a large number of samples N.
Example #18: Computing Averages Using Simulation

Suppose that a triangle is to be constructed from the three points (0, 0), (X, 0) and (Y, Z), as shown in the following figure,

(Figure: the triangle having points (0, 0) at lower left, (X, 0) at lower right, and (Y, Z) at the top.)

where X, Y and Z are all independent, each coming from the standard uniform distribution U[0, 1). First we would like to analytically compute the average area of such a triangle. To solve this exactly, we first compute the area of one such triangle as

A = (1/2)XZ

and then, since each of X, Y and Z is from U[0, 1), we have

Ā = ∫₀¹∫₀¹∫₀¹ (1/2)xz dx dy dz = 1/8.

The area worksheet that accompanies this chapter shows the result obtained using a simulation for N = 5000 samples, and it agrees rather nicely with the exact result of 1/8.
A more difficult analytical calculation is to compute the average perimeter of such a triangle, since the perimeter of one such triangle is

P = X + √(Y² + Z²) + √((Y − X)² + Z²),

resulting in

P̄ = ∫₀¹∫₀¹∫₀¹ ( x + √(y² + z²) + √((y − x)² + z²) ) dx dy dz,

which we may write as

P̄ = 1/2 + ∫₀¹∫₀¹ √(y² + z²) dy dz + ∫₀¹∫₀¹∫₀¹ √((y − x)² + z²) dx dy dz.
Using numerical integration, we find that

∫₀¹∫₀¹ √(y² + z²) dy dz ≈ 0.7652

and

∫₀¹∫₀¹∫₀¹ √((y − x)² + z²) dx dy dz ≈ 0.65176,

and so P̄ ≈ 0.5 + 0.7652 + 0.65176 = 1.917. The perimeter worksheet that accompanies this chapter shows the result obtained using a simulation for N = 5000 samples, and one should note that this calculation using simulation is really no more difficult than the area calculation using simulation. This is one big advantage of Monte-Carlo simulation. ∎
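For readers without the worksheet, a Python sketch of the same experiment (the sample size is illustrative):

import math
import random

def triangle_stats(N):
    # Monte-Carlo estimates of the average area and perimeter
    area_sum = perim_sum = 0.0
    for _ in range(N):
        x, y, z = random.random(), random.random(), random.random()
        area_sum += 0.5 * x * z
        perim_sum += x + math.hypot(y, z) + math.hypot(y - x, z)
    return area_sum / N, perim_sum / N

print(triangle_stats(100_000))  # approximately (0.125, 1.917)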
We have seen how simulation can be used to estimate averages by using the sample mean

(1/N) Σ_{i=1}^{N} Xi

for a large number of samples N. We may also estimate k-th moments using

(1/N) Σ_{i=1}^{N} Xi^k

for k = 1, 2, 3, . . . , and variances using the statistic

(1/N) Σ_{i=1}^{N} Xi² − ( (1/N) Σ_{i=1}^{N} Xi )²

and coefficients of variation using the statistic

√( [ (1/N) Σ_{i=1}^{N} Xi² ] / [ (1/N) Σ_{i=1}^{N} Xi ]² − 1 )

for a large number of samples N. Let us now see how we may use simulation to estimate probabilities.
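These estimators translate directly into a few lines of Python (the names are ours):

import math

def sample_stats(xs):
    n = len(xs)
    m1 = sum(xs) / n                     # sample mean (first moment)
    m2 = sum(x * x for x in xs) / n      # second moment
    var = m2 - m1 * m1                   # variance estimate
    cv = math.sqrt(m2 / (m1 * m1) - 1.0) # coefficient of variation, sigma/mu
    return m1, var, cv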
Computing Probabilities Using The Strong Law of Large Numbers
Using simulation to compute probabilities is an important application of the strong law of large numbers. It works by expressing probabilities as expected values. Toward this end, suppose that a sequence of independent trials of some experiment is performed, suppose that E is some fixed event of the experiment, and suppose that E occurs with probability Pr(E) on any particular trial. Defining the random variable X by

X =
  1, if E does occur on a trial
  0, if E does not occur on a trial,

we note that

E(X) = (1) Pr(E) + (0) Pr(Ē) = Pr(E),
showing that the expected value of X is the same as the probability of the occurrence of E. Therefore, letting

Xi =
  1, if E does occur on the ith trial
  0, if E does not occur on the ith trial,

for i = 1, 2, 3, . . . , then by the strong law of large numbers

lim_{n→∞} (1/n) Σ_{i=1}^{n} Xi = E(X) = Pr(E)

or

Pr(E) = lim_{n→∞} (1/n) Σ_{i=1}^{n} Xi,

which we may also write as

Pr(E) ≈ (1/N) Σ_{i=1}^{N} Xi
for large N. This result is very important in simulation because it allows us to compute probabilities using only expected values. In fact, we shall now end this chapter by using this idea to answer the problem that was posed at the beginning of the chapter.
8. A Solution to the Problem in Section 1
In Section 1 at the beginning of this chapter, the following problem was posed. Suppose that two coins of radii R1 and R2 are thrown on a rectangular sheet of paper having length L and width W so that the position of each coin's center uniformly lands somewhere on the sheet of paper. Given these conditions, we were asked to compute (in terms of the inputs: L, W, R1 and R2) the probability that the two coins overlap. By placing the rectangle on the xy plane as the region

R = {(x, y) | 0 ≤ x ≤ L, 0 ≤ y ≤ W}

and letting (X1, Y1) give the coordinates of the center of coin 1 and (X2, Y2) give the coordinates of the center of coin 2, then, under the conditions of the problem, we have

X1 ~ U[0, L),   Y1 ~ U[0, W)

and

X2 ~ U[0, L),   Y2 ~ U[0, W),

and the two coins will overlap when the distance between their centers,

D = √((X2 − X1)² + (Y2 − Y1)²),

is less than or equal to the sum of their radii, i.e., when D ≤ R1 + R2. To compute the probability that the two coins overlap requires that we compute

P = Pr(D ≤ R1 + R2).
Using simulation to solve this problem, we first use our random number generator (RAND() in Microsoft Excel) to generate four independent random numbers R11, R12, R13 and R14, and we use a + (b − a)R to generate a sample from U[a, b). Thus we set

X11 = 0 + (L − 0)R11 = L R11,   Y11 = 0 + (W − 0)R12 = W R12

along with

X12 = 0 + (L − 0)R13 = L R13,   Y12 = 0 + (W − 0)R14 = W R14

to generate X11 ~ U[0, L), X12 ~ U[0, L), Y11 ~ U[0, W) and Y12 ~ U[0, W). Then we compute

D1 = √((X12 − X11)² + (Y12 − Y11)²)

and we set Z as the random variable defined by

Z =
  1, when D ≤ R1 + R2
  0, when D > R1 + R2,

so that

Z1 =
  1, when D1 ≤ R1 + R2
  0, when D1 > R1 + R2.
We then use our random number generator (RAND() in Microsoft Excel) to generate another four independent random numbers R21, R22, R23 and R24, and we set

X21 = 0 + (L − 0)R21 = L R21,   Y21 = 0 + (W − 0)R22 = W R22

along with

X22 = 0 + (L − 0)R23 = L R23,   Y22 = 0 + (W − 0)R24 = W R24.

Then we compute

D2 = √((X22 − X21)² + (Y22 − Y21)²)

and we set

Z2 =
  1, when D2 ≤ R1 + R2
  0, when D2 > R1 + R2.
Continuing this process and constructing Z3, Z4, . . . , we then use the fact that

Pr(D ≤ R1 + R2) = E(Z)

to get

Pr(D ≤ R1 + R2) = E(Z) = lim_{n→∞} (1/n) Σ_{i=1}^{n} Zi,

resulting in the estimate

P = Pr(D ≤ R1 + R2) ≈ (1/N) Σ_{i=1}^{N} Zi

for large values of N, which is known as the number of Monte-Carlo simulations. A simulation using N = 5000 is presented in the coin problem worksheet that accompanies this chapter. Specifically, using the inputs L = 6, W = 3, R1 = 2 and R2 = 1, we find that P ≈ 0.7. The user is encouraged to change the inputs that are in the yellow cells and to show that the end result depends only on the ratios W/L and (R1 + R2)/L, and that in the special case when L = W, the value of P depends only on the ratio (R1 + R2)/L.
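A self-contained Python version of this Monte-Carlo scheme (the inputs mirror the worksheet example; the function name is ours):

import math
import random

def coin_overlap_probability(L, W, r1, r2, N):
    hits = 0
    for _ in range(N):
        x1, y1 = L * random.random(), W * random.random()
        x2, y2 = L * random.random(), W * random.random()
        if math.hypot(x2 - x1, y2 - y1) <= r1 + r2:
            hits += 1
    return hits / N

print(coin_overlap_probability(L=6, W=3, r1=2, r2=1, N=100_000))  # approximately 0.7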
