FACULTY OF SCIENCES
Course Title:
Simulation and Modeling
Page 1 of 90
CISY 403 SIMULATION AND MODELLING
Course Objectives:
At the end of the course students will be able:
• To analyze computer and communication systems through case studies.
• To demonstrate understanding of system modeling through the competent use of computer
simulation methods in mathematical modeling techniques.
Course Content:
Introduction and basic simulation procedures. Model classification: Monte Carlo simulation,
discrete-event simulation, continuous system simulation, mixed continuous/discrete-event
simulation. Queuing networks: analytical and simulation modeling of queuing systems. Input
and Output Analysis: random numbers, generating and analyzing random numbers, sample
generation, trace-driven and execution-driven simulation, point and interval estimation.
Process-oriented and parallel simulation.
Teaching Methods:
• Class lectures that involve proper explanation of various simulation models.
• Tutorials that entail solving of problems by both students and lecturer.
• Practical sessions in the lab and practical demonstrations.
• Group discussions among the students and active participation in class.
• Regular CATs and assignments that are discussed after grading.
Reference Textbooks:
• Raczynski, S.: Modelling and Simulation: The Computer Science of Illusion; John Wiley, 2006.
• Birta, Louis G., Arbez, Gilbert: Modelling and Simulation: Exploring Dynamic System Behaviour; Springer, 2007.
• Law and Kelton: Simulation Modeling and Analysis; 3rd Edition, McGraw Hill, 1991.
Teaching Tools:
• Computer installed with a simulator program.
List of Contents
Course Outline
Module 1: Introduction
Special Instructions:
These notes are brief discussions of the contents of the subjects. You shall be receiving a
Basic Concept
[Figure: basic concept of the queueing model; labels: Time, Processing.]
[Figure: time axis marked 0, ∆t, 2∆t, 3∆t, 4∆t with events e1, e2, e3, e4.]
[Figure: time axis marked 0, t1, t2, c1, t3, c2 with intervals A1, A2, A3, S1, S2 and delay D1.]
• ti = Arrival
• ci = Completion
• Di = Delay
Variability
• Each defined quantity will generally be a
random variable
Components of a DE Model
• System State – Collection of state
variables
• Simulation Clock – Current value of
simulated time, represents ‘wall’ clock.
• Event list – list of future time event values
• Statistical Counters – accumulators of
simulation results
• Initialization Routine – start up state
definition for time zero (t0)
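The components listed above can be seen together in a minimal sketch of a next-event simulation of a single-server FIFO queue. This is an illustration only, not the course's simulator: the function name, the exponential interarrival and service times, and the stopping rule (stop after a fixed number of delays) are assumptions made for the example.

```python
import heapq
import random
from collections import deque


def simulate_single_server(mean_interarrival, mean_service, num_delays, seed=1):
    """Next-event simulation of a single-server FIFO queue (illustrative sketch)."""
    rng = random.Random(seed)

    # Initialization routine: state of the system at time zero.
    clock = 0.0                 # simulation clock
    server_busy = False         # system state
    queue = deque()             # system state: arrival times of waiting customers
    event_list = []             # event list: (event time, event kind)
    heapq.heappush(event_list, (rng.expovariate(1.0 / mean_interarrival), "arrival"))

    # Statistical counters.
    delays_completed = 0
    total_delay = 0.0
    area_queue = 0.0            # integral of Q(t)
    area_busy = 0.0             # integral of the busy indicator

    while delays_completed < num_delays:
        event_time, kind = heapq.heappop(event_list)

        # Update the time-integrated counters before changing the state.
        elapsed = event_time - clock
        area_queue += len(queue) * elapsed
        area_busy += elapsed if server_busy else 0.0
        clock = event_time

        if kind == "arrival":
            next_arrival = clock + rng.expovariate(1.0 / mean_interarrival)
            heapq.heappush(event_list, (next_arrival, "arrival"))
            if server_busy:
                queue.append(clock)
            else:
                server_busy = True
                delays_completed += 1          # this customer's delay is zero
                departure = clock + rng.expovariate(1.0 / mean_service)
                heapq.heappush(event_list, (departure, "departure"))
        else:  # departure
            if queue:
                total_delay += clock - queue.popleft()
                delays_completed += 1
                departure = clock + rng.expovariate(1.0 / mean_service)
                heapq.heappush(event_list, (departure, "departure"))
            else:
                server_busy = False

    return {
        "avg_delay": total_delay / num_delays,
        "avg_queue_length": area_queue / clock,
        "utilization": area_busy / clock,
    }


print(simulate_single_server(mean_interarrival=1.0, mean_service=0.5, num_delays=1000))
```

The event list is kept as a heap so that the next event is always the one with the smallest time, which is what "processed in time sequence" means in practice.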
Performance Measures
1. Average delay
2. Expected average queue length
3. Expected utilization
Performance Measures
1. Average customer delay (customer concern)
$$\hat{d}(n) = \frac{\sum_{i=1}^{n} D_i}{n} = \bar{D}(n)$$
Performance Measures
2. Expected Average Queue Length
• Observed average queue length:
$$q(n) = \sum_{i=0}^{\infty} i\,p_i$$
• Expected (estimated) average queue length:
$$\hat{q}(n) = \sum_{i=0}^{\infty} i\,\hat{p}_i$$
[Figure: Time Dependent Queue. Q(t) = i, the number of customers in queue at time t, plotted for t from 0 to 10. Arrivals occur at t = 0.4, 1.6, 2.1, 3.8, 4.0, 5.6, 5.8 and 7.2; departures occur at t = 2.4, 3.1, 3.3, 4.9 and 8.6, where 8.6 = T(6) (6 delays).]
Performance Measures
• Recast average queue length
• Let Ti = total time that the queue length is i
• Then: T(n) = T0 + T1 + T2 + ...
• And: $\hat{p}_i = T_i / T(n)$
$$\hat{q}(n) = \frac{\sum_{i=0}^{\infty} i\,T_i}{T(n)}$$
Computing qˆ(n)
• From the example time graph:
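– First compute the Ti’s (pick from graph):
T0 = (1.6 − 0.0) + (4.0 − 3.1) + (5.6 − 4.9) = 3.2
T1 = (2.1 − 1.6) + (3.1 − 2.4) + (4.9 − 4.0) + (5.8 − 5.6) = 2.3
T2 = (2.4 − 2.1) + (7.2 − 5.8) = 1.7
T3 = (8.6 − 7.2) = 1.4
– Then the sum:
$$\sum_{i=0}^{\infty} i\,T_i = (0)(3.2) + (1)(2.3) + (2)(1.7) + (3)(1.4) = 9.9$$
– So, with T(6) = 8.6: $\hat{q}(6) = 9.9 / 8.6 \approx 1.15$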
Representing $\hat{q}(n)$ Continuously
• Note that from the graph we can consider our summation as an area under the curve Q(t). We can then write:
$$\sum_{i=0}^{\infty} i\,T_i = \int_0^{T(n)} Q(t)\,dt$$
• And our estimator becomes:
$$\hat{q}(n) = \frac{\int_0^{T(n)} Q(t)\,dt}{T(n)}$$
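As a check on the hand computation, the following sketch accumulates the Ti and the area under Q(t) directly from the arrival and departure times of the example graph. The function name and the bookkeeping (queue length taken as the number in the system minus the one in service) are choices made for this illustration.

```python
def queue_statistics(arrivals, departures, end_time):
    """Return ({i: Ti}, q_hat) for a single-server queue observed on [0, end_time]."""
    events = sorted([(t, "A") for t in arrivals] + [(t, "D") for t in departures])
    clock = 0.0
    in_system = 0            # customers in queue plus the one in service
    time_at_length = {}      # Ti: total time the queue length is i
    area = 0.0               # area under Q(t)

    def accumulate(until):
        nonlocal area
        q = max(in_system - 1, 0)          # one customer (if any) is in service
        time_at_length[q] = time_at_length.get(q, 0.0) + (until - clock)
        area += q * (until - clock)

    for t, kind in events:
        if t > end_time:
            break
        accumulate(t)
        clock = t
        in_system += 1 if kind == "A" else -1
    accumulate(end_time)                   # tail from the last event to end_time
    return time_at_length, area / end_time


arrivals = [0.4, 1.6, 2.1, 3.8, 4.0, 5.6, 5.8, 7.2]
departures = [2.4, 3.1, 3.3, 4.9, 8.6]
times, q_hat = queue_statistics(arrivals, departures, end_time=8.6)
print({i: round(t, 1) for i, t in sorted(times.items())})
print(round(q_hat, 2))
```

Because Q(t) is piecewise constant, the integral reduces to a sum of rectangle areas, one per interval between consecutive events.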
Performance Measures
3. Expected System Utilization
– The expected proportion of the time T(n) during which the server is busy.
– The estimate is $\hat{u}(n)$, with $0 \le \hat{u}(n) \le 1$
Distributing Data
• Any continuous function can be used
– Needs to be contained within [0,1]
– Should resemble real world observations
• Example: Exponential
$$f(x) = \frac{1}{\beta} e^{-x/\beta}$$
– β = mean
– x = random value [0,1]
[Figure: f(x) plotted against the interval x.]
Use of Exponential
• Ensure that output is in [0,1]
f (x ) = 1 if the server is busy at time t
0 if the server is idle at time t
Module 3
Simple Queueing II
Basic Concept
[Figure: basic concept of the queueing model; labels: Time, Processing, Status, Items, Customers.]
Initialization
Arrival 0.4
Arrival 1.6
Arrival 2.1
Departure 2.4
Departure 8.6
Modeling
• Single server example has one input stream
• Interarrival times are randomly distributed
– Can be depicted by exponential with a mean β
– Applicable to both arrivals and service
Use of Exponential
• Ensure that the output is in [0,1]:
$$f(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
• Generate random variate U(0,1)
• Invert the exponential distribution function $F(x) = 1 - e^{-x/\beta}$ to get $x = -\beta \ln(1 - U)$
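The two steps above (draw a U(0,1) variate, then invert the distribution function) can be sketched as follows. The function name and the sample size are illustrative assumptions; the check at the end only confirms that the sample mean lands near β.

```python
import math
import random


def exponential_variate(beta, u):
    """Inverse transform: solve u = 1 - exp(-x / beta) for x, with u in [0, 1)."""
    return -beta * math.log(1.0 - u)


rng = random.Random(42)
beta = 2.0                                   # mean of the exponential distribution
sample = [exponential_variate(beta, rng.random()) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)                           # should be close to beta
```

The same function serves for both interarrival times and service times; only the mean β changes.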
[Figure: event graph with Arrival, Departure, and End Simulation events.]
An Inventory System
• Variables of concern
– Amount on hand (inventory)
– Cost of inventory (holding cost)
• Insurance, storage rental, maintenance, taxes
– Cost of ordering (process, shipping, etc.)
– Shortage costs (loss of sales)
• Time Cycles
– Weekly, monthly, annually?
An Inventory System
• I(t) = Inventory level at time t
• I+(t) = Inventory on hand == max{I(t),0}
• I-(t) = Backlog (unfilled demand) == max{-I(t),0}
• s = minimum threshold of inventory for ordering
• Time-average inventory on hand over n time units:
$$\bar{I}^{+} = \frac{\int_0^n I^{+}(t)\,dt}{n}$$
[Figure: Inventory Model. I(t), I+(t) and I-(t) plotted against time (0 to 30), with the levels s and S marked and the points where orders are placed and received.]
[Figure: event graph with Order Arrival, Demand, Evaluate, and End Simulation events.]
Inventory Variables
• Inter-demand time – random exponential
• Demand size – discrete random
– Select weighted value (range) using a U(0,1) variate
• Delivery lags – Uniform on (a, b), generated as a + (b − a)U with U from U(0,1)
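A minimal sketch of how the three inventory variates might be generated from U(0,1) draws. All numeric values (demand sizes 1 to 4 with probabilities 1/6, 1/3, 1/3, 1/6, a lag range of 0.5 to 1.0, a mean inter-demand time of 0.1) are hypothetical and chosen only to make the example concrete; the notes do not specify them.

```python
import random


def demand_size(u, sizes, cumulative_probs):
    """Pick the first size whose cumulative probability exceeds u (u in [0, 1))."""
    for size, cumulative in zip(sizes, cumulative_probs):
        if u < cumulative:
            return size
    return sizes[-1]


def delivery_lag(u, a, b):
    """Uniform variate on (a, b): a + (b - a) * U."""
    return a + (b - a) * u


# Hypothetical parameters for illustration only.
SIZES = [1, 2, 3, 4]
CUMULATIVE = [1 / 6, 1 / 2, 5 / 6, 1.0]
MEAN_INTERDEMAND = 0.1

rng = random.Random(11)
for _ in range(3):
    print(
        rng.expovariate(1.0 / MEAN_INTERDEMAND),     # inter-demand time
        demand_size(rng.random(), SIZES, CUMULATIVE),
        delivery_lag(rng.random(), 0.5, 1.0),
    )
```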
Module 4
Analysis of a
Single Server Queue
Some Definitions
New and Review
• IID - Independent and Identically Distributed
– Exponential random distribution – arrivals, departures
• GI – General Independent – arrivals
• G - General – service
• M - Markovian, memoryless of previous events,
exponential distribution
• Ek – k-Erlang – summation of exponential
distributions
• D – Deterministic – fixed times
System Notation
• General model notation
• <Arrival type>/<Service type>/<#servers>
– GI/G/s – general queue
– M/M/1 – single server queue with exponential
arrival and service times.
An Exception Interlude
• Assume:
• 1: arrivals and services are performed at
a rate per unit time of a and s
– Means are 1/a and 1/s
– Ratio of rates is u = a/s or utilization factor
• 2: we have a steady state of probability
P(n) in which there are n entities in the
system (queues and servers)
– This is actually the limit of Pn(t) as t→∞
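Graphically
[Figure: state-transition diagram with states Sk-1, Sk, Sk+1. Arrivals move the system to the next higher state, with flows aPk-1 and aPk; service completions move it to the next lower state.]
• S = State
• s = service rate
• a = arrival rate
• P = probability
• k = sequence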
Computing probabilities
• We still don’t know what the Pk are
– But we know they have to sum to 1!
– Assume |u| < 1
• Then 1 = P0 + P1 + …
= P0 + uP0 + u^2 P0 + …
= P0 (1 + u + u^2 + … )
= P0 /(1 – u)
• This means that P0 = 1 – u
Little’s Law
• To get other quantities, use a very important
property of many queuing systems: L = aW
– L = expected number in system
– W = expected waiting time
– a = arrival rate (usually λ)
• This works for system, queue, etc.
An important lesson
• Utilization is very important
– You might think everything is OK as long as
it’s < 1
• Not so: if it’s near 1 then things blow up
– Important to keep it not only < 1 but
comfortably below 1
• This kind of analysis applies also to more
general queues
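To see how sharply things blow up near u = 1, the sketch below evaluates the M/M/1 quantities for a few utilizations. It assumes the standard M/M/1 results Pk = (1 − u)u^k and L = u/(1 − u), which follow from P0 = 1 − u and Pk = u^k P0, and then applies Little's law W = L/a. The function name and the chosen rates are illustrative.

```python
def mm1_measures(a, s):
    """Steady-state M/M/1 measures for arrival rate a and service rate s (a < s)."""
    if a >= s:
        raise ValueError("steady state requires utilization a/s < 1")
    u = a / s                 # utilization factor
    p0 = 1.0 - u              # probability the system is empty
    L = u / (1.0 - u)         # expected number in system
    W = L / a                 # expected time in system, from Little's law L = aW
    return u, p0, L, W


service_rate = 1.0
for arrival_rate in (0.5, 0.8, 0.9, 0.95, 0.99):
    u, p0, L, W = mm1_measures(arrival_rate, service_rate)
    print(f"u={u:.2f}  P0={p0:.2f}  L={L:7.2f}  W={W:7.2f}")
```

Going from u = 0.9 to u = 0.99 multiplies the expected number in the system by about eleven, which is why utilization should stay comfortably below 1.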
Module 5
Timeshared Computer Model
in simlib
Overview of system
• Terminal operator ‘thinks’
– Time exp, mean 25 (seconds)
– Then sends job w/svc time exp, mean 0.8
• Mainframe has fixed svc quantum q=.1
– If remaining svc time s≤q, process for s + (τ =
.015) (swap), send job to terminal
– If s > q, process for q+τ and send to end of
the queue
Response time R
• This is (finish of job at mainframe) – (time
job left terminal)
– Set number of terminals = 10, 20, …, 80
– (Each case) Observe 1,000 response times
– Get average response time, server utilization,
number of jobs in queue
• Initially all terminals are in think state
• # of terminals for average R < 30?
[Figure: event graph with Arrival (n of these), End processing period at CPU, and End simulation events.]
Definitions
void arrive(void);
void start_CPU_run(void);
void end_CPU_run(void);
void report(void);
These are functions that you will have to write to run the model (also main).
Main function
• First some housekeeping
– Open input/output files
– Read parameters for the simulation
– Print header for the output
fclose(infile);
fclose(outfile);
return 0;
This is a (rather unusual) use of do/while instead of while/break. Also, note that at this point we have no EVENT_END_SIMULATION scheduled.
Arrival
void arrive(void) /* Event function for arrival of job at CPU after think time. */
{
/* Place the arriving job at the end of the CPU queue.
Note that the following attributes are stored for each job record:
1. Time of arrival to the computer.
2. The (remaining) CPU service time required (here equal to the
total service time since the job is just arriving). */
transfer[1] = sim_time;
transfer[2] = expon(mean_service, STREAM_SERVICE);
list_file(LAST, LIST_QUEUE);
/* This job requires more CPU time, so place it at the end of the queue
and start the first job in the queue. */
list_file(LAST, LIST_QUEUE);
start_CPU_run();
}
else {
/* This job is finished, so collect response-time statistics and send it back
to its terminal, i.e., schedule another arrival from the same terminal. */
sampst(sim_time - transfer[1], SAMPST_RESPONSE_TIMES);
event_schedule(sim_time + expon(mean_think, STREAM_THINK),
EVENT_ARRIVAL);
Output
void report(void) /* Report generator function. */
{
/* Get and write out estimates of desired measures of
performance. */
fprintf(outfile, "\n\n%5d%16.3f%16.3f%16.3f",
num_terms,
sampst(0.0, -SAMPST_RESPONSE_TIMES),
filest(LIST_QUEUE), filest(LIST_CPU));
}
Comments
• We never identify the terminals
– When a job ends its run, it just vanishes
– It also schedules another arrival
• The point here is that the mainframe
doesn’t care which terminal a job comes
from
– If it did, we’d have to identify the terminal with
the job, e.g. with an attribute
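The logic of the time-shared model can also be sketched without simlib. The Python version below is an independent illustration, not a translation of the course code: the function and variable names are invented for the example, and it only estimates the average response time. As in the comments above, jobs carry their arrival time and remaining service time but no terminal identity.

```python
import heapq
import random
from collections import deque


def simulate_timeshared(num_terms, mean_think=25.0, mean_service=0.8,
                        quantum=0.1, swap=0.015, num_responses=1000, seed=1):
    """Average response time of a round-robin CPU shared by num_terms terminals."""
    rng = random.Random(seed)
    clock = 0.0
    events = []            # event list: (time, sequence number, kind)
    sequence = 0           # tie-breaker so equal times never compare kinds
    queue = deque()        # waiting jobs: [arrival time, remaining service time]
    cpu_job = None         # job currently on the CPU, if any
    responses = []

    def schedule(time, kind):
        nonlocal sequence
        heapq.heappush(events, (time, sequence, kind))
        sequence += 1

    def start_cpu_run():
        nonlocal cpu_job
        cpu_job = queue.popleft()
        run_time = min(cpu_job[1], quantum) + swap   # s + tau if s <= q, else q + tau
        cpu_job[1] -= quantum
        schedule(clock + run_time, "end_cpu_run")

    # Initially all terminals are in the think state.
    for _ in range(num_terms):
        schedule(rng.expovariate(1.0 / mean_think), "arrival")

    while len(responses) < num_responses:
        clock, _, kind = heapq.heappop(events)
        if kind == "arrival":
            queue.append([clock, rng.expovariate(1.0 / mean_service)])
            if cpu_job is None:
                start_cpu_run()
        else:  # end of a CPU run
            job, cpu_job = cpu_job, None
            if job[1] > 0.0:
                queue.append(job)                    # needs more CPU time
            else:
                responses.append(clock - job[0])     # job is finished
                schedule(clock + rng.expovariate(1.0 / mean_think), "arrival")
            if queue:
                start_cpu_run()
    return sum(responses) / len(responses)


for n in (10, 20, 40, 80):
    print(n, round(simulate_timeshared(n), 3))
```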
Coming up next
• Multiteller bank with jockeying
– Customers at ends of queues can ‘jockey’ to
ends of other queues if shorter
• Illustrates a new use of lists
– Moving customers from one list to another
• Also illustrates slightly more complex use
of event list
Module 6
Recall conditions
Fad will last 2-5 years. Average = 3.5
Interest rate on loan is 15% (0.15)
Loan is 1 million
Return is 0.4 million per year
NPV = Net Present Value (what is process
worth at a given time)
Let’s simulate
No time sequence: not a DEDS
This is just a Monte Carlo simulation
Excel is very good for this application
See “Uniform” tab in spreadsheet
Start with uniform duration between 2 and 5 years
Duration = U(2,5) = 2 + 3*RAND()
Calculate NPV as shown on previous slide
For this application, continuous cash flow works best
What do we find?
35-40% chance of losing money!
This is fairly risky; maybe think again
Sample mean NPV is consistently below
NPV of the mean duration. Why?
The NPV function is concave
Using f[average argument] consistently
overestimates average f[argument]
See graph on next slide
[Figure: NPV in millions plotted against duration in years.]
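The Monte Carlo experiment can be sketched outside a spreadsheet as well. The NPV formula from the earlier slide is not reproduced in these notes, so the function below is an assumption: it treats the return of 0.4 million per year as a continuous cash flow discounted continuously at rate 0.15 against the loan of 1 million. Under that assumption the results agree with the 35-40% loss probability quoted above.

```python
import math
import random

LOAN = 1.0             # millions
RETURN_RATE = 0.4      # millions per year
INTEREST = 0.15        # per year


def npv(duration):
    """NPV in millions of a continuous cash flow lasting `duration` years.

    Assumed form: -LOAN + integral from 0 to duration of RETURN_RATE * exp(-INTEREST * t) dt.
    """
    return -LOAN + (RETURN_RATE / INTEREST) * (1.0 - math.exp(-INTEREST * duration))


rng = random.Random(2024)
durations = [2.0 + 3.0 * rng.random() for _ in range(100_000)]   # U(2,5)
values = [npv(t) for t in durations]

mean_npv = sum(values) / len(values)
prob_loss = sum(v < 0.0 for v in values) / len(values)
npv_at_mean = npv(3.5)

print("P(loss)          :", round(prob_loss, 3))
print("mean of NPV      :", round(mean_npv, 4))
print("NPV at mean 3.5  :", round(npv_at_mean, 4))
```

Because this NPV function is concave in the duration, the NPV at the mean duration exceeds the mean NPV, which is the effect described above.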
Triangular distribution
Support = [a,b]
Mode (value with largest density) = m
Height at m must then be h = 2/(b-a)
This makes area under density = 1
[Figure: triangular density function f with support from a to b and peak at the mode m.]
Distribution function
For triangular distribution T(a, m, b):
Let k = (m-a)/(b-a)
If x∈[a, m] then F(x) = k[(x-a)/(m-a)]^2
If x∈[m, b] then
F(x) = 1 - (1-k)[(b-x)/(b-m)]^2
Why? Calculus (integrate the density)
To generate T(a, m, b), invert this
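Inverting the two branches of F gives a direct generator. In the sketch below the function name and the parameter values (a = 2, m = 3, b = 5) are illustrative assumptions; solving u = F(x) on each branch gives the two square-root expressions.

```python
import math
import random


def triangular_variate(a, m, b, u):
    """Inverse transform for T(a, m, b), with u in [0, 1]."""
    k = (m - a) / (b - a)                 # F(m): probability mass to the left of the mode
    if u <= k:
        # Solve u = k * ((x - a) / (m - a))**2 for x.
        return a + math.sqrt(u * (m - a) * (b - a))
    # Solve u = 1 - (1 - k) * ((b - x) / (b - m))**2 for x.
    return b - math.sqrt((1.0 - u) * (b - m) * (b - a))


rng = random.Random(5)
sample = [triangular_variate(2.0, 3.0, 5.0, rng.random()) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)                        # the mean of T(a, m, b) is (a + m + b) / 3
```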
Next time
Back to DEDS
Walk-through of single-server queue
simulation
Contents of registers
How and when things get updated
Event graphs
Some time for questions, if you want it
Module 7
Simulation Software
Languages –
General vs. Simulation
• General Purpose – Common coding languages such as
Pascal, Fortran, C++, etc. (The basic algorithmic
languages)
– Require design from ground up
– Require software engineering concepts
– Much more flexible to fit needs
– Cheaper tools, but likely larger project cost
– Efficient
• Simulation Languages provide
– Natural Framework
– Prepared basic features
– Better error detection
– Quicker setup and revision
Simulation Questions
• What are the events?
• Where’s the randomness?
• What are the lists and attributes?
• What statistics do we need?
• What variables do we need to use?
• Are there any special features that we
need to take account of?
Key transformation
• Both approaches end in the same place
– Event list with entries processed in time
sequence; other lists such as queues
• The main difference
– Event scheduling: for each event type, ask
what happens to all kinds of entities
– Process description: for each entity, just
describe what happens to it
• Software transforms descriptions to interactions
Modeling elements
• Entities
– Job (shop), customer (bank), …
• Attributes
– Type (shop), priority class (fin svcs), …
• Resources
– Broker (fin svcs), machine (shop), …
• Queues
• Sources of randomness (usually)
Simulation software
• Most popular simulation packages now have
graphical user interfaces (GUI)
– Many ‘housekeeping’ operations hidden from user
• Great advantage in hiding complexity
– Classic example: Excel < MATLAB < C
• But you lose a lot, too
– Simple logic is tough in Excel (VBA!)
– MATLAB is much slower than C
A compromise
• Good simulation packages tend to be
compromises
– Easy enough to use so that people can do
modeling without technical simulation
knowledge
– But with enough flexibility to handle a range of
models
• Danger in modeling without knowledge!
Data handling
• For some applications we need quite a lot
of data
– Example: tables of attribute values for entities
in simulation
• Very desirable to have direct input/output
capability for data
– Not just ease, but avoidance of error
• Also needed for interface w/other apps
Object-oriented simulation
• Idea: write simulation in ‘blocks’ or ‘pieces’
that fit together in structured ways
– Then you can re-use these blocks in other
simulations without rewriting them
– You can also modify the same simulation
more easily, with less chance of error
• Object-oriented idea originated in SIMULA
– Tutorial at
http://www.informs-cs.org/wsc98papers/018.PDF
Temporal dimension
• This is even more difficult
– Code maintained over a long period of time by
many different modelers
– Often spatial dimension also large (DOD)
– People quit, die, retire
• How do you keep institutional memory?
– What does this code do? With what data?
• Example of combat simulations
Exhibit: Agency X
• Military modeling
• Has existed for > 30 years
• Maintains some key models used in wide
range of studies
• Key elements
– Written in FORTRAN in 60s and 70s
– For practical purposes, undocumented
• To stop this: JWARS, JSIMS
– Note: language chaos also stimulated ADA
Session Summary
• Brief review of simlib models
• Start looking at simulation software
– General classification
– Comparison of types of products
– Some things to look for
• How this fits together with organizational
problems
Module 8
The Central Limit Theorem
Confidence Intervals
Distribution information
• What we’re going to show is that if we normalize Yn in a certain way, then it has (approximately) a very well known distribution
– We start with how to normalize it
– Then introduce the normal distribution
– The central limit theorem then ties these two together
Normalizing RV
• Suppose X is an RV that has a mean and
variance
– Set up a new RV: Z = [X-E(X)]/σ(X)
– Then Z has mean 0 and variance 1
• Chebyshev’s inequality says that for a > 0, the
probability that |Z| ≥ a is ≤ 1/a^2
– So the density of Z should drop off sharply as we get
away from 0
– If Z is an average of IID RV, we can say more
Normalizing Yn
• Let’s normalize Yn = Sn/n in the same way
$$Z_n := \frac{Y_n - E(Y_n)}{\sigma(Y_n)} = \frac{(S_n/n) - m}{\sigma(X_1)/n^{1/2}}$$
An Example:
• Count the number of people on the KEMU
commons at 3 p.m. each day for 30
consecutive days. Do not keep track of the
day of the week. Compute the average value.
– Is the average of these data a correct
representation of true annual average?
– Is the average a true representation of actual
weekday population?
– What if a bus loaded with people shows up at
count time?
Confidence intervals
• Suppose we do our experiment (observing
Xi) n times, and compute the sample mean
Yn = Sn/n
– Theory so far says if n is large then Yn is very
likely close to the true population mean m
– But how close?
• We’ll see how to estimate this
How to find w
• People have tabulated the distribution
function F of the unit normal RV Q
– So we find w such that F(w) = (1 + p)/2
• Thus, P{Q ≤ w} = (1 + p)/2, so P{Q > w} = (1-p)/2
– The normal density is symmetric
• So, P{Q < - w} = (1-p)/2 also
– Then P{Q ∈ [-w, w] } = 1 - (1-p)/2 - (1-p)/2 = p
• Why? 3 disjoint events, probabilities add to 1
• See next slide for a diagram
[Figure: unit normal density with probability mass p between -w and w.]
An example
• We’re observing a process and want to
estimate its expected value (mean)
– We’ve taken 40 observations
– Previous experience indicates this process
has a variance of 9
– Our sample mean is 18.6
• Can we get a 90% confidence interval?
$$Z_n = \frac{Y_n - m}{\sigma(X_1)/\sqrt{n}}; \qquad \sigma = \sqrt{\mathrm{Var}(X_1)}$$
Now go to tables
• We have p = .90, (1+p)/2 = .95
• Tables of the normal show that
F(1.645) = .95, so w = 1.645
• Then P{Zn ∈ [-1.645, 1.645]} = .90
• We can write this as
-1.645 ≤ 2.11(18.6-m) ≤ 1.645,
• i.e., 17.82 ≤ m ≤ 19.38 (sometimes written
as 18.6 ± 0.78)
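The table lookup and the arithmetic can be reproduced with the standard library. This is a sketch of the same calculation with the variance assumed known, as in the example; the function name is illustrative.

```python
import math
from statistics import NormalDist


def normal_confidence_interval(sample_mean, variance, n, p):
    """Interval for the mean with confidence p, assuming a known variance."""
    w = NormalDist().inv_cdf((1.0 + p) / 2.0)      # F(w) = (1 + p) / 2
    half_width = w * math.sqrt(variance / n)
    return sample_mean - half_width, sample_mean + half_width


low, high = normal_confidence_interval(sample_mean=18.6, variance=9.0, n=40, p=0.90)
print(round(low, 2), round(high, 2))
```

Note that sqrt(n)/σ = sqrt(40)/3 ≈ 2.11, which is the factor that appears in the inequality above.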
More on interpretation
• So what this methodology does is not to
give us a guarantee
– Rather, it lets us say, “Unless I’ve observed a
rare event (one with probability 10% or less),
I’ve found a sample mean that’s within 0.78 of
the real mean”
• But maybe you saw a rare event?
– Maybe you did … tough luck
– Methodology can’t protect you against that
Unanswered questions
• How large does n have to be?
– There are some answers for common
situations
• How did we get the variance of 9?
– We normally wouldn’t know that
– So we have to use the sample variance
• More on these questions next time
Module 9
Conditional Probability
Conditional Expectation
Conditional probability
• This is a way of taking ‘slices’ of a joint
distribution
– Rough idea: Suppose we have a joint density
f(x,y) for X and Y
– Also suppose we’re only interested in a
certain value y0 for Y
• Try to make the function (of x) f(x, y0) into
a density for x
$$\int_{R} f(x, y_0)\,dx = f_Y(y_0) \quad \text{(from last time)}$$
So, the "conditional density" we want is
$$f_{X|Y}(x \mid y_0) = f(x, y_0) / f_Y(y_0)$$
[Figure: diagram of the (x, y) plane with the slice at y = y0.]
For discrete RV, the conditional probability mass function is
$$p_{X|Y}(x \mid y_0) = p(x, y_0) / p_Y(y_0)$$
If X and Y are independent, then
$$f_{X|Y}(x \mid y) = f(x, y) / f_Y(y) = f_X(x) f_Y(y) / f_Y(y) = f_X(x)$$
So (for any y) the conditional distribution of X
given y is just the marginal, and Y is irrelevant.
Similarly for discrete RV. This is a useful way to
think about independence.
Conditional expectation
• The conditional expectation of X given Y is
just the expectation taken with respect to
the conditional distribution:
$$E(X \mid Y = y_0) = \int_{R} x\, f_{X|Y}(x \mid y_0)\,dx$$
Similarly for discrete RV, except that the
sum replaces the integral and we use the
probability mass functions
Expectations by conditioning
• E(X|Y=y) is a function (of the variable y)
– Let’s try taking its expectation, recalling that y is a
value of the random variable Y:
$$\begin{aligned}
E[E(X \mid Y = y)] &= \int_{R} E(X \mid Y = y)\, f_Y(y)\,dy \\
&= \int_{R} \left( \int_{R} x\,[f(x, y) / f_Y(y)]\,dx \right) f_Y(y)\,dy \\
&= \int_{R} x \left( \int_{R} f(x, y)\,dy \right) dx = \int_{R} x f_X(x)\,dx = E(X)
\end{aligned}$$
Expectations by conditioning
• This means that we could compute E(X) if
we knew the conditional expectation
E(X|Y=y) for all y, and the marginal
distribution of Y
• Similar (easier) proof for discrete RV
• Is this a way to make easy things hard?
– No! it’s extremely useful to know …
– Example follows (how to avoid simulation)
What do we want?
• We want to know the net present value
(NPV) of the total expected net income
from now on (forever)
– Call it VE for expansion, VN for no expansion
– ‘Forever’ might seem unrealistic, but at 12%
discount, NPV is under 10% after 20 years
• Difficulty: infinite number of different paths
into the future; evaluate each!
One possibility
• Use simulation
– Simulate a large number of possible paths
• You could use Excel for this
• Maybe go 10 years out into future
• For each of 10 years, use U(0,1) random variate to
draw from the finite distribution of economic states,
then calculate NPV
– Average the NPV over these paths
• Do calculation with/without expansion
Use conditioning …
• That means that the total expected NPV of
your future income, assuming
– You expand, and
– There will be a boom next year,
is $4.71 + e^{-0.12} V_E$
• You can do this for each of the other
possibilities too
– See table on next slide
Expansion option
Module 10
Probability Inequalities
Laws of Large Numbers
Markov’s inequality
• Start with a RV X that’s always ≥ 0
• Let t > 0 and use definition of expectation:
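One standard way to carry out this step, sketched here under the assumption that X has a density f (for a discrete RV the integrals become sums), is to drop the part of the expectation below t and then bound x from below by t on what remains:

```latex
\begin{aligned}
E(X) &= \int_0^\infty x f(x)\,dx
      \;\ge\; \int_t^\infty x f(x)\,dx
      \;\ge\; t \int_t^\infty f(x)\,dx
      \;=\; t\,P\{X \ge t\}, \\
\text{so}\quad P\{X \ge t\} &\le \frac{E(X)}{t}.
\end{aligned}
```

Only X ≥ 0 and the existence of E(X) are used, which is why the bound is so general and also why it is often loose.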
Chebyshev’s inequality
• Here we apply Markov’s inequality to the RV
[X-E(X)]^2, which is always ≥ 0, to get
P{[X-E(X)]^2 ≥ ε^2} ≤ Var(X)/ε^2
– Remember, Var(X) = E([X-E(X)]^2)
• We can write this as
– P{ |X-E(X)| ≥ ε } ≤ Var(X)/ε^2 (usual form)
– Or as P{ |X-E(X)|/σ(X) ≥ k } ≤ 1/k^2 (scaled form)
• To see this, take ε = k σ(X)
• This “z-score form” gives a crude confidence interval
• These just require mean and variance, with no
other assumptions
Variance of a combination
• Consider RV Z1, Z2, …, Zn and real numbers a1,
a2, …, an
– Let Z = a1Z1 + a2Z2 + … + anZn
• We want to find Var(Z)
– We know Var(Z) = E{[Z-E(Z)]^2}
– Also, Z – E(Z) = a1[Z1 – E(Z1)] + … + an[Zn – E(Zn)]
• Now let’s compute the expectation of [Z-E(Z)]^2
$$\begin{aligned}
E\{[Z - E(Z)]^2\} &= E\Big\{\sum_{i=1}^{n}\sum_{j=1}^{n} a_i [Z_i - E(Z_i)][Z_j - E(Z_j)]\,a_j\Big\} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} a_i\, E\{[Z_i - E(Z_i)][Z_j - E(Z_j)]\}\,a_j \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} a_i\, \mathrm{Cov}(Z_i, Z_j)\,a_j
\end{aligned}$$
Module 11
The t Distribution
Confidence Intervals for the Mean
Estimating σ^2
• We want to compute an estimator S^2(n)
that we can use to replace σ^2
• An idea: try to mimic the formula for
variance, involving Σ(Xi - Yn)^2 divided by
something (what?)
– Key: try to make the expectation = σ^2
– Therefore we’ll compute the expectation of
Σ(Xi - Yn)^2 to find out what to divide by
More on expectation
So
$$E[(X_i - Y_n)^2] = (\mu^2 + \sigma^2) - 2(\mu^2 + \sigma^2/n) + (\mu^2 + \sigma^2/n) = \sigma^2 (1 - 1/n).$$
Hence
$$E\Big[\sum_{i=1}^{n} (X_i - Y_n)^2\Big] = n\,E[(X_i - Y_n)^2] = (n - 1)\sigma^2.$$
Therefore
$$E\Big[\sum_{i=1}^{n} (X_i - Y_n)^2 / (n - 1)\Big] = \sigma^2.$$
$$S^2 = \sum_{i=1}^{n} (X_i - Y_n)^2 / (n - 1)$$
This is an unbiased estimator (its
expectation is the quantity σ^2 that is
being estimated). The n-1 is surprising,
but it comes from dependence: every term Xi - Yn
involves the same sample mean Yn.
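A small experiment makes the role of n − 1 visible. The sketch below draws many samples of size n = 5 from a normal distribution with variance 4 (both values are arbitrary choices for the illustration) and averages the two candidate estimators: dividing by n − 1 and dividing by n.

```python
import random
from statistics import mean


def sample_variance(xs):
    """S^2: sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    y = sum(xs) / n
    return sum((x - y) ** 2 for x in xs) / (n - 1)


rng = random.Random(7)
n = 5
divide_by_n_minus_1 = []
divide_by_n = []
for _ in range(50_000):
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]     # true variance is 4
    s2 = sample_variance(xs)
    divide_by_n_minus_1.append(s2)
    divide_by_n.append(s2 * (n - 1) / n)

print("average of S^2 (n - 1):", round(mean(divide_by_n_minus_1), 3))
print("average with divisor n:", round(mean(divide_by_n), 3))
```

The first average settles near σ^2 = 4, while the second settles near σ^2(1 − 1/n) = 3.2, matching the expectation computed above.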
Back to CLT
• If we now replace σ2 in the normalized
variable from the CLT by the sample
variance S2, then we get
$$\frac{Y_n - \mu}{(S^2/n)^{1/2}} = \frac{Y_n - \mu}{(\sigma^2/n)^{1/2}} \left[ \frac{1}{n - 1} \sum_{i=1}^{n} \frac{(X_i - Y_n)^2}{\sigma^2} \right]^{-1/2}$$
The procedure
• What we do is to replace the use of the
normal distribution with the normalized
variable (Yn - µ)/(σ^2/n)^(1/2) by the use of
Student’s t distribution with the normalized
variable (Yn - µ)/(S^2/n)^(1/2)
– This solves our problem of not knowing σ^2
• The t and normal aren’t very different
– Even less as degrees of freedom increase:
[Figure: densities of the Normal, the t Distribution (4 df) and the t Distribution (20 df), plotted against units of SD from -3 to 3.]
Module 11
Random Number Tests
Issues
• Theory requires (empirical tests):
1. Uniform distribution U(0,1)
• Common flatness of the distribution frequency
2. No serial or dimensional dependency
0.02 0.70 0.23 0.61 0.69 0.13 0.65 0.37 0.85 0.85
0.04 0.30 0.70 0.67 0.04 0.74 0.25 0.73 0.33 0.21
0.66 0.84 0.19 0.67 0.38 0.52 0.93 0.93 0.60 0.76
. . .
Test 1
• Basic chi squared (χ2)
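– Divide (0,1) into k equal subintervals (k ≥ 100)
– Generate U1, U2, …, Un (assumed IID), with n/k ≥ 5
– Compute, where fj is the number of Ui in subinterval j:
$$\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( f_j - \frac{n}{k} \right)^2$$
– Reject uniformity at level α if $\chi^2 > \chi^2_{k-1,\,1-\alpha}$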
How to Generate fj
• Set fj = 0 for j = 1, 2, …, k
• For i = 1, …., n do
– Generate Ui ;random number
– Set J = Ceiling (kUi) ;scaled to bins
– Replace fJ by fJ+1 ;counter in bin
• End do
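The counting loop and the test statistic can be sketched as follows. Two choices here are assumptions of the example rather than part of the notes: bins are numbered from 0 using floor(kU), which matches Ceiling(kU) − 1 for U strictly inside a bin and avoids an invalid bin when U = 0; and the critical value is obtained from the Wilson-Hilferty approximation instead of a chi-squared table.

```python
import math
import random
from statistics import NormalDist


def chi_square_uniformity(us, k):
    """Chi-squared statistic (k/n) * sum((f_j - n/k)^2) for values in [0, 1)."""
    n = len(us)
    f = [0] * k                                  # set f_j = 0 for all bins
    for u in us:
        j = min(int(k * u), k - 1)               # zero-based bin index
        f[j] += 1                                # replace f_J by f_J + 1
    expected = n / k
    return (k / n) * sum((fj - expected) ** 2 for fj in f)


def chi_square_critical(df, alpha):
    """Approximate upper 1 - alpha point of chi-squared (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


rng = random.Random(123)
k, n = 100, 10_000                               # n/k = 100 >= 5
statistic = chi_square_uniformity([rng.random() for _ in range(n)], k)
critical = chi_square_critical(k - 1, alpha=0.05)
print(round(statistic, 2), round(critical, 2), "reject" if statistic > critical else "do not reject")
```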
Test 2
• Multi-dimensional chi squared (χ2)
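– Divide (0,1) into k subintervals (≥ 100)
– Generate non-overlapping d-tuples
– $\mathbf{U}_1 = (U_1, U_2, \ldots, U_d)$, $\mathbf{U}_2 = (U_{d+1}, U_{d+2}, \ldots, U_{2d})$, …, $\mathbf{U}_n$, for $n/k^d \ge 5$
• Compute and test
$$\chi^2(d) = \frac{k^d}{n} \sum_{j_1=1}^{k} \sum_{j_2=1}^{k} \cdots \sum_{j_d=1}^{k} \left( f_{j_1 j_2 \ldots j_d} - \frac{n}{k^d} \right)^2$$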
Test 2
Example:
d = 2; n = 32768, k = 64, df = 64^2 - 1 = 4095
Pairs of values for Ui, i = 1, n
(U1,U2), (U3, U4), … (U2n-1, U2n)
fj1j2 ==
Module 12
Density Distributions
Generating Distributions
• Uniform
• Exponential distribution
• Gamma
• Weibull
• Normal
• Lognormal
• Triangular
Uniform Distribution
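• Flat valued, a straight line, U(a,b), first approximation
$$\text{Density: } f(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Distribution: } F(x) = \begin{cases} 0 & \text{if } x < a \\ \dfrac{x - a}{b - a} & \text{if } a \le x \le b \\ 1 & \text{if } b < x \end{cases}$$
Range: [a, b]
Mean: (a + b)/2
[Figure: f(x) is constant at height 1/(b-a) between a and b.]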
Exponential distribution
• A common and easy model
• Special case of gamma or Weibull distributions
• Inter-arrival times of ‘customers’
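$$\text{Density: } f(x) = \begin{cases} \dfrac{1}{\beta} e^{-x/\beta} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Distribution: } F(x) = \begin{cases} 1 - e^{-x/\beta} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
Range: [0, ∞)
Mean: β
Mode: 0
Variance: β^2
[Figure: exponential density f(x) plotted for x from 0 to 6.]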
Gamma distribution
• Task completion distribution
• Acquires some similarity to small sample distributions
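$$\text{Density: } f(x) = \begin{cases} \dfrac{\beta^{-\alpha} x^{\alpha - 1} e^{-x/\beta}}{\Gamma(\alpha)} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Distribution (for positive integer } \alpha\text{): } F(x) = \begin{cases} 1 - e^{-x/\beta} \displaystyle\sum_{j=0}^{\alpha - 1} \dfrac{(x/\beta)^j}{j!} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
Range: [0, ∞)
Mean: αβ
Mode: β(α − 1) if α ≥ 1, 0 if α < 1
Variance: αβ^2
[Figure: Gamma(α,1) densities for α = 1/2, 1, 2, 3, plotted for x from 0 to 6.]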
Weibull distribution
• Task completion distribution
• Like gamma, but gives emphasis to a mode value
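$$\text{Density: } f(x) = \begin{cases} \alpha \beta^{-\alpha} x^{\alpha - 1} e^{-(x/\beta)^{\alpha}} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Distribution: } F(x) = \begin{cases} 1 - e^{-(x/\beta)^{\alpha}} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
Range: [0, ∞)
$$\text{Mean: } \frac{\beta}{\alpha}\,\Gamma\!\left(\frac{1}{\alpha}\right)$$
$$\text{Mode: } \begin{cases} \beta \left( \dfrac{\alpha - 1}{\alpha} \right)^{1/\alpha} & \text{if } \alpha \ge 1 \\ 0 & \text{if } \alpha < 1 \end{cases}$$
$$\text{Variance: } \frac{\beta^2}{\alpha} \left\{ 2\,\Gamma\!\left(\frac{2}{\alpha}\right) - \frac{1}{\alpha} \left[ \Gamma\!\left(\frac{1}{\alpha}\right) \right]^2 \right\}$$
[Figure: Weibull(α,1) densities for α = 1/2, 1, 3, plotted for x from 0 to 6.]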
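Since the Weibull distribution function has a closed form, variates can be generated by inversion in the same way as for the exponential. The sketch below is an illustration with arbitrarily chosen parameters (α = 2, β = 3); it compares the sample mean with the mean formula (β/α)Γ(1/α).

```python
import math
import random


def weibull_variate(alpha, beta, u):
    """Inverse transform: solve u = 1 - exp(-(x / beta)**alpha) for x, u in [0, 1)."""
    return beta * (-math.log1p(-u)) ** (1.0 / alpha)


alpha, beta = 2.0, 3.0                      # illustrative shape and scale
rng = random.Random(99)
sample = [weibull_variate(alpha, beta, rng.random()) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)
theoretical_mean = (beta / alpha) * math.gamma(1.0 / alpha)
print(round(sample_mean, 3), round(theoretical_mean, 3))
```

With α = 1 the generator reduces to the exponential case, consistent with the exponential being a special case of the Weibull.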
Normal distribution
• Error terms of various types (uncertainty about a mean)
$$\text{Density: } f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$$
Distribution: no closed form
Range: (−∞, ∞)
Mean: µ
Mode: µ
Variance: σ^2
[Figure: Normal(0,1) density f(x) plotted for x from -3 to 3.]
Lognormal distribution
• Found to represent many natural phenomena such as
infant mortality vs. age.
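$$\text{Density: } f(x) = \frac{1}{x\sqrt{2\pi\sigma^2}}\, e^{-(\ln(x) - \mu)^2 / (2\sigma^2)} \quad \text{for } x > 0$$
Distribution: no closed form
Range: [0, ∞)
$$\text{Mean: } e^{\mu + \sigma^2/2}$$
$$\text{Mode: } e^{\mu - \sigma^2}$$
$$\text{Variance: } e^{2\mu + \sigma^2} \left( e^{\sigma^2} - 1 \right)$$
[Figure: lognormal densities with µ = 0 and σ = 1/2, 1, 3/2, plotted for x from 0 to 5.]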
Triangular Distribution
• Rough model in absence of data
$$\text{Density: } f(x) = \begin{cases} \dfrac{2(x - a)}{(b - a)(c - a)} & \text{if } a \le x \le c \\ \dfrac{2(b - x)}{(b - a)(b - c)} & \text{if } c < x \le b \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Distribution: } F(x) = \begin{cases} 0 & \text{if } x < a \\ \dfrac{(x - a)^2}{(b - a)(c - a)} & \text{if } a \le x \le c \\ 1 - \dfrac{(b - x)^2}{(b - a)(b - c)} & \text{if } c < x \le b \\ 1 & \text{if } b < x \end{cases}$$
Range: [a, b]
Mean: (a + b + c)/3
Mode: c
Variance: (a^2 + b^2 + c^2 − ab − ac − bc)/18
[Figure: Triangular (3,15,6) density f(x) with peak height 2/(b-a) at the mode c, between a and b.]
Other Distributions
• Beta(α1,α2)
• Pearson type 5 – PT5(α,β)
• Pearson type 6 – PT6(α,β)
• Discrete types, various isolated values (0, 1, etc.)
– Bernoulli(p) - B(p)
– Discrete uniform – DU(i,j)
– Binomial – Binom(t,p)
– Geometric – geom(p)
– Negative binomial – negbin(s,p)
– Poisson - Poi(λ)
• Empirical
Module 14
Examples and Result Treatment
Bonferroni’s Inequality
• A realization of practical inference of means in sets
driven by random events.
• For simulation models, assume k results, where result s has a
confidence interval Is at level 100(1 − αs) percent
– Obtain a set of result estimators, µs for s = 1, 2, …, k
– Each µs has a confidence interval or uncertainty
• The overall probability that all of them are contained in their
intervals at once satisfies (whether or not the estimates are independent):
$$P(\mu_s \in I_s \text{ for all } s = 1, 2, \ldots, k) \ge 1 - \sum_{s=1}^{k} \alpha_s$$
Example: Terminating Function
Service Queue
• Two measurement parameters, A, B
• 20 test runs yielding (Ai, Bi) results
• Average values: A = 23 min, B = 34 min.
• Sample variance S2 = 5.1 and 7.5 resp.
• Determine a 95% CI for A combined with B
– Std. error of mean A = sqrt(5.1/20) = .505
– Std. error of mean B = sqrt(7.5/20) = .612
– For an overall 95% level, give each interval level 97.5%: use the Student t
quantile at .9875 with 19 degrees of freedom, 2.433
• (A) (2.433)(.505) = ±1.23
• (A) Range = (23-1.23, 23+1.23)
• Similarly for B
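The arithmetic for both measures can be sketched in a few lines. The t quantile 2.433 is the value used in the example above and is entered as a constant, since the Python standard library has no t distribution; the function name is illustrative.

```python
import math

T_QUANTILE = 2.433        # Student t, .9875 point, 19 degrees of freedom (from the example)


def bonferroni_intervals(means, sample_variances, n, t_quantile):
    """One interval per measure; jointly they hold with probability >= 1 - sum(alpha_s)."""
    intervals = []
    for m, s2 in zip(means, sample_variances):
        half_width = t_quantile * math.sqrt(s2 / n)
        intervals.append((m - half_width, m + half_width))
    return intervals


interval_a, interval_b = bonferroni_intervals([23.0, 34.0], [5.1, 7.5], 20, T_QUANTILE)
print([round(v, 2) for v in interval_a])
print([round(v, 2) for v in interval_b])
```

Each interval is individually at the 97.5% level, so by Bonferroni's inequality both hold together with probability at least 1 − (0.025 + 0.025) = 0.95.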