Simulation Programming
with Python
This chapter shows how simulations of some of the examples in Chap. 3 can be
programmed using Python and the SimPy simulation library[1]. The goals of
the chapter are to introduce SimPy, and to hint at the experiment design and
analysis issues that will be covered in later chapters. While this chapter will
generally follow the flow of Chap. 3, it will use the conventions and patterns
enabled by the SimPy library.
4.1
SimPy Overview
ries as necessary software libraries are being ported and tested. SimPy itself
supports the Python 3.x series as of version 2.3. In addition, SimPy is undergoing a major overhaul from SimPy 2.3 to version 3.0. This chapter and the code
on the website will assume use of Python 2.7.x and SimPy 2.3. In the future,
the book website will also include versions of the code based on SimPy 3.0 and
Python 3.2 as they and their supporting libraries are developed.
SimPy comes with data collection capabilities. But for other data analysis
tasks such as statistics and plotting it is intended to be used along with other
libraries that make up the Python scientific computing ecosystem centered on
Numpy and Scipy[3]. As such, it can easily be used with other Python packages
for plotting (Matplotlib), GUI (WxPython, TKInter, PyQT, etc.), statistics
(scipy.stats, statsmodels), and databases. This chapter will assume that you
have Numpy, Scipy, Matplotlib, and Statsmodels installed. In addition, the
network/graph library NetworkX will be used in a network model, but it can
be skipped with no loss of continuity. The easiest way to install Python along
with its scientific libraries (including SimPy) is to install a scientifically oriented
distribution, such as Enthought Canopy for Windows, Mac OS X, or Linux;
or Python(x,y) for Windows or Linux. If you are installing using a standard
Python distribution, you can install SimPy by using easy_install or pip. Note
the capitalization of SimPy throughout.
easy_install SimPy
or
pip install SimPy
The other required libraries can be installed in a similar manner. See the
specific library webpages for more information.
4.1.1
Random numbers
4.1.2
SimPy components
SimPy is built upon a special type of Python function called a generator [?].
When a generator function is called, its body does not execute; instead, the call
returns an iterator object that wraps the body, its local variables, and the current
point of execution (initially the start of the function). Each time the iterator is
advanced, it runs the function body up to the next yield statement, returns the
value of the expression that follows yield, and suspends until it is advanced again.
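The behavior described above can be seen in a few lines of plain Python. This is a minimal sketch that does not use SimPy; the name countdown is made up for the illustration.

```python
def countdown(n):
    """Generator function: calling it returns an iterator; the body has not run yet."""
    while n > 0:
        yield n      # suspend here and hand n to the caller
        n -= 1       # execution resumes here on the next request

it = countdown(3)    # nothing in the body has executed so far
first = next(it)     # runs the body up to the first yield
rest = list(it)      # keeps resuming the body until it finishes
print(first, rest)
```

Between requests the iterator keeps its local variable n and its position in the loop, which is the property SimPy relies on to suspend and resume a process.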
Within SimPy, yield statements are used to define the event scheduling.
A process uses them to do something to itself (e.g., wait until the next car
arrives); to request a resource, such as a server; to release a resource,
such as a server that is no longer needed; or to wait for another event [2].
yield hold: used to indicate the passage of a certain amount of time
within a process;
yield request: used to cause a process to join a queue for a given resource
(and start using it immediately if no other jobs are waiting for the
resource);
yield release: used to indicate that the process is done using the given
resource, thus enabling the next process in the queue, if any, to use the
resource;
yield passivate: used to have a process wait until awakened by some
other process.
A Process is based on a sequence of these yield statements along with
simulation logic.
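To make the idea of a process as a sequence of yields concrete, here is a minimal sketch of an event loop in plain Python. It is not SimPy's implementation: the functions run and car are hypothetical, and a bare yield of a delay stands in for yield hold.

```python
import heapq

def run(processes):
    """Tiny event loop: each process is a generator that yields a delay."""
    log, calendar = [], []
    for seq, (name, proc) in enumerate(processes):
        heapq.heappush(calendar, (0.0, seq, name, proc))
    while calendar:
        now, seq, name, proc = heapq.heappop(calendar)   # next event in time order
        try:
            delay = next(proc)          # run the process up to its next yield
        except StopIteration:
            log.append((now, name, "done"))
            continue
        heapq.heappush(calendar, (now + delay, seq, name, proc))
    return log

def car(parking_time):
    yield parking_time                  # plays the role of "yield hold"

log = run([("car A", car(2.0)), ("car B", car(0.5))])
print(log)
```

The loop always resumes the process with the earliest pending time, so car B finishes before car A even though it was created second.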
In a SimPy simulation, the simulation is initialized, then resources are defined. Next, processes are defined to generate entities, which in turn can generate their own processes over the course of the simulation. The processes are
then activated so that they generate other events and processes. Monitors collect
statistics on entities, resources, or any other event which occurs in the model.
In SimPy, resources such as parking spaces are represented by a Resource.
There are three types of resources.
A Resource is something whose units are required by a Process. When
a Process is done with a unit of the Resource it releases the unit for
another Process to use. A Process that requires a unit of Resource when
all units are busy with other Processes can join a queue and wait for the
next unit to be available.
A Level is homogeneous undifferentiated material that can be produced
or consumed by a Process. An example of a level is gasoline in a tank at
a gas station.
A Store models the production or consumption of specific items of any
type. An example would be a storeroom that holds a variety of surgical
supplies.
The simulation must also collect data for later use in calculating statistics
on the performance of the system. In SimPy, this is done by creating a
Monitor or a Tally. As the Monitor is the more general of the two, this chapter
will use it; see the SimPy website for more information. A Monitor makes
observations and records the value and the time of each observation, allowing
statistics to be saved for analysis after the end of the simulation. A Monitor
can also be included in the definition of any Resource (to be discussed
in the main simulation program) so that statistics on queues and other resources
can be collected as well. In the Car object, the Monitor parking tracks the
value of the variable self.sim.parkedcars at every point where that value
changes, as cars enter and leave the parking lot.
In the main part of the simulation program, the simulation object is defined,
then resources, processes, and monitors are defined and activated as needed to
start the simulation. In addition, the following function calls can be made:
import random, math
import numpy as np
import scipy as sp
import scipy.stats as stats
import matplotlib.pyplot as plt
import SimPy.Simulation as Sim
4.2
Here we consider the parking lot example of Sect. 3.1, a queueing system with
time-varying car arrival rate, exponentially distributed parking time, and an
infinite number of parking spaces.
A SimPy simulation generally consists of importing modules, defining processes
and resources, defining parameters, declaring a main program, and finally
reporting.
At the top of the program are imports of the modules used (Fig. 4.1). By
convention, modules that are part of the Python Standard Library such as
math and random are listed first. Next are libraries considered foundational
to numerical programming such as numpy, scipy, and matplotlib. Note that we
use the import libraryname as lib convention. This short form makes
our code more readable, as we can use lib to refer to the module in the code
instead of having to spell out libraryname. Last, we import other modules
and modules we may have written ourselves. While you could also import
directly into the global namespace using the construct from SimPy.Simulation
import *, instead of import SimPy.Simulation as Sim, this is discouraged
as there is a potential for naming conflicts if two modules happen to include
functions or classes with the same name.
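The naming-conflict risk is easy to reproduce with two standard-library modules that both define sqrt. This small sketch is an illustration chosen for this purpose, not an example from the chapter's code.

```python
import math
import cmath

from math import sqrt      # sqrt now refers to math.sqrt
from cmath import sqrt     # silently rebinds sqrt to cmath.sqrt

shadowed = sqrt(4)         # complex result, because the later import won
explicit = math.sqrt(4)    # the qualified name is unambiguous
print(shadowed, explicit)
```

Keeping the module prefix (or a short alias such as Sim) makes it clear at each call site which library a function comes from.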
Next are two components, the cars and the source of the cars. (Fig. 4.2)
Both of these are classes that subclass from Sim.Process. The class Arrival
generates the arriving cars using the generate method. The mean arrival rate
is a function based on the time during the day. Then the actual interarrival
time is drawn from an exponential distribution.
class Arrival(Sim.Process):
    """ Source generates cars at random
        Arrivals are at a time-dependent rate
    """
    def generate(self):
        i = 0
        while (self.sim.now() < G.maxTime):
            tnow = self.sim.now()
            # arrival rate (cars per hour) depends on the time of day
            arrivalrate = 100 + 10 * math.sin(math.pi * tnow/12.0)
            t = random.expovariate(arrivalrate)
            yield Sim.hold, self, t
            c = Car(name="Car%02d" % (i), sim=self.sim)
            timeParking = random.expovariate(1.0/G.parkingtime)
            self.sim.activate(c, c.visit(timeParking))
            i += 1

class Car(Sim.Process):
    """ Cars arrives, parks for a while, and leaves
        Maintain a count of the number of parked cars as cars arrive and leave
    """
    def visit(self, timeParking=0):
        self.sim.parkedcars += 1
        self.sim.parking.observe(self.sim.parkedcars)
        yield Sim.hold, self, timeParking
        self.sim.parkedcars -= 1
        self.sim.parking.observe(self.sim.parkedcars)
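The arrival mechanism in Arrival.generate can be tried on its own, without SimPy. The sketch below assumes the same rate function as the listing and, like the listing, evaluates the rate at the current time before drawing the next exponential gap; the names arrival_rate and arrival_times are introduced here for illustration only.

```python
import math
import random

def arrival_rate(t):
    """Cars per hour at time t (hours), the rate function used in Arrival.generate."""
    return 100 + 10 * math.sin(math.pi * t / 12.0)

def arrival_times(max_time, rng):
    """Arrival epochs in [0, max_time), with the rate evaluated at the last arrival."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(arrival_rate(t))   # mean gap is 1 / rate hours
        if t >= max_time:
            return times
        times.append(t)

rng = random.Random(1234)
times = arrival_times(24.0, rng)
print(len(times))
```

Since the rate averages 100 cars per hour over a 24-hour day, the number of arrivals should be in the neighborhood of 2400.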
Within the Car object, the yield Sim.hold, self, timeParking line is
used to represent the car remaining in the parking lot. To look at this line more
closely:
1. yield: a Python keyword that suspends the generator and hands the
expression that follows to its caller, here the SimPy scheduler, which
resumes the generator at this point once the requested event has occurred.
2. Sim.hold: The hold command within SimPy.Simulation. This causes a
process to wait.
3. self: A reference to the current object (Car). The first parameter passed
to the Sim.hold command. In this case it means the currently created Car
should wait for a period of time. If the yield hold line was preceded by
a yield request, the resource required by yield request would be included
in the yield hold, i.e., both the current process and any resources
the process is using would wait for the specified period of time.
4. timeParking: The time that the Car should wait. This is the second
parameter passed to the Sim.hold command.
Typically, yield request, yield hold, and yield release are used in sequence.
For example, if the number of parking spaces were limited, the act of a car taking a
parking space for a period of time would be represented as follows within the
Car.visit() function.
yield Sim.request, self, parkingspace
yield Sim.hold, self, timeParking
yield Sim.release, self, parkingspace
In this case, since the purpose of the simulation is to determine demand for
parking spaces, we assume there are unlimited spaces and we count the number
of cars in the parking lot at any point in time.
class G:
    maxTime = 24.0      # hours
    arrivalrate = 100   # per hour
    parkingtime = 2.0   # hours
    parkedcars = 0
    seedVal = 9999
Global declarations are in Fig. 4.3. While Python allows these to be in the
global namespace, it is preferred that these be put into a class created for the
purpose as shown here. In particular, this is generally used as a central location
for parameters of the model.
The main simulation class is shown in Fig. 4.4. The simulation class Parkingsim
is a subclass of Sim.Simulation. While it can have other functions, the central
function is the run(self) function. Note that the first argument of a member
function of a class is self, indicating that the function has access to the other
functions and variables of the class. Besides self, the run function takes one
argument, the random number seed. The call to initialize() resets all Monitors
and the simulation clock. Then processes and resources are created. In this
simulation there is one process, the generation of arriving cars in Arrival. Note
that at this point there are no Cars, as the Arrival process will create the cars.
This simulation does not have any resources; but if, for example, the parking lot
had a limited capacity of 20 spaces, it could have been created here by:
self.parkinglot = Sim.Resource(capacity=20, name="Parking lot",
                               unitName="Space", monitored=True, sim=self)
This creates the parking lot with 20 spaces. By default, resources are a
FIFO, non-priority queue with no preemption. In addition to the capacity of
the Resource (number of servers), there are two queues (lists) associated with
the resource. First is the activeQ, which is the list of process objects currently
using the resource. Second is the waitQ, which is the list of processes that
have requested but have not yet received a unit of the resource. If when creating
the resource the option monitored=True is set, then a Monitor is created for
each queue, and statistics can be collected on resource use and the wait queue for
further analysis. The last option is sim=self, which declares that the Process
or Resource is part of the simulation defined in the current class derived from
Sim.Simulation. If the simulation was defined in the global namespace (not
inside of a class), then this option would not be needed.
After Processes and Resources are created, any additional Monitors are created
here. Note that Processes, Resources, and Monitors are created using
self. constructs. This makes the object an attribute of the simulation class,
and not only local to the run() function. Other classes and methods that are part
of this simulation (declared using sim=self) can refer to any object created here
using self.sim.objectname. This includes Monitors as well as variables that
need to be updated from anywhere in the simulation. For example, the variable
self.parkedcars can be updated from the Car class in Fig. 4.2 using
self.sim.parkedcars.
class Parkingsim(Sim.Simulation):
    def run(self, aseed):
        random.seed(aseed)
        self.initialize()
        self.parkedcars = G.parkedcars   # count updated by Car.visit()
        s = Arrival(name="Arrivals", sim=self)
        self.parking = Sim.Monitor(name="Parking", ylab="cars",
                                   tlab="time", sim=self)
        self.activate(s, s.generate(), at=0.0)
        self.simulate(until=G.maxTime)

parkinglot = Parkingsim()
parkinglot.run(1234)
4.2.1
Figure 4.13: Normal probability plot of average cars parked in a day generated
by scipy.stats.probplot().
4.3
Here we consider the hospital example of Sect. 3.2, a queueing system with
Poisson arrival process, some (as yet unspecified) service-time distribution, and a
single server (either a receptionist or an electronic kiosk); in other words, an
M/G/1 queue. Patient waiting time is the key system performance measure,
and the long-run average waiting time in particular.
Recall that Lindley's Equation (3.3) provides a shortcut way to simulate
successive customer waiting times:
\[ Y_0 = 0, \quad X_0 = 0, \quad Y_i = \max\{0,\; Y_{i-1} + X_{i-1} - A_i\}, \quad i = 1, 2, \ldots \]
where $Y_i$ is the $i$th customer's waiting time, $X_i$ is that customer's service time,
and $A_i$ is the interarrival time between customers $i-1$ and $i$. Lindley's equation
avoids the need for an event-based simulation, but is limited in what it produces
(how would you track the time-average number of customers in the queue?). In
this section we will describe both recursion-based and event-based simulations
of this queue, starting with the recursion.
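Before adding randomness, the recursion can be checked by hand on a few fixed numbers. The sketch below assumes the standard form of Lindley's equation, $Y_i = \max\{0, Y_{i-1} + X_{i-1} - A_i\}$ with $Y_0 = X_0 = 0$; the function name and the input values are made up for the illustration.

```python
def lindley_waits(service, interarrival):
    """Waiting times Y_1..Y_n from Y_i = max(0, Y_{i-1} + X_{i-1} - A_i).

    service[i-1] holds X_i and interarrival[i-1] holds A_i; Y_0 = X_0 = 0.
    """
    y_prev, x_prev, waits = 0.0, 0.0, []
    for x, a in zip(service, interarrival):
        y = max(0.0, y_prev + x_prev - a)   # wait of the current customer
        waits.append(y)
        y_prev, x_prev = y, x               # becomes "previous" for the next one
    return waits

waits = lindley_waits(service=[2.0, 1.0, 3.0, 1.0],
                      interarrival=[1.0, 1.0, 5.0, 2.0])
print(waits)
```

The second customer arrives 1 time unit after the first, whose service takes 2, so it waits 1; the third arrives after the server has gone idle and waits 0.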
4.3.1
To be specific, suppose that the mean time between arrivals is 1 minute, with
the distribution being exponential, and the mean time to use the kiosk is 0.8
minutes (48 seconds), with the distribution being an Erlang-3. An Erlang-p is
the sum of p i.i.d. exponentially distributed random variables, so an Erlang-3
with mean 0.8 is the sum of 3 exponentially distributed random variables each
with mean 0.8/3.
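The sum-of-exponentials definition translates directly into code. This sketch uses only the standard library; the helper name erlang and the sample size are choices made for the illustration, and random.expovariate takes a rate, so each phase uses rate p/mean.

```python
import random

def erlang(p, mean, rng):
    """One Erlang-p variate: the sum of p i.i.d. exponentials, each with mean mean/p."""
    return sum(rng.expovariate(p / mean) for _ in range(p))

rng = random.Random(42)
sample = [erlang(3, 0.8, rng) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))
```

The sample mean should be close to the 0.8 minutes stated above.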
In Sect. 3.2 we noted that the waiting-time random variables $Y_1, Y_2, \ldots$
converge in distribution to a random variable $Y$, and it is $\mu = E(Y)$ that we will
use to summarize the performance of the queueing system. We also noted that
$\bar{Y}(m) = m^{-1} \sum_{i=1}^{m} Y_i$ converges with probability 1 to $\mu$ as the number of
customers simulated $m$ goes to infinity.
All of this suggests that we make a very long simulation run (large $m$) and
estimate $\mu$ by the average of the observed waiting times $Y_1, Y_2, \ldots, Y_m$. But
this is not what we will do, and here is why: Any $m$ we pick is not $\infty$, so the
waiting times early in the run, which will tend to be smaller than $\mu$ because
the queue starts empty, will likely pull $\bar{Y}(m)$ down. To reduce this effect, we
will let the simulation generate waiting times for a while (say $d$ of them) before
starting to actually include them in our average. We will still make $m$ large,
but our average will only include the last $m - d$ waiting times. That is, we will
use as our estimator the truncated average
\[ \bar{Y}(m, d) = \frac{1}{m - d} \sum_{i=d+1}^{m} Y_i. \tag{4.1} \]
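As a small illustration of the truncated average, the sketch below applies it to a short, made-up list of waiting times. The function name truncated_average is an assumption for this example.

```python
def truncated_average(y, d):
    """Average of Y_{d+1}, ..., Y_m, where the list y holds Y_1, ..., Y_m."""
    m = len(y)
    if not 0 <= d < m:
        raise ValueError("need 0 <= d < m")
    return sum(y[d:]) / float(m - d)

# Made-up waiting times that start low because the queue starts empty.
waits = [0.0, 0.0, 0.5, 2.0, 3.0, 2.5]
print(truncated_average(waits, 0), truncated_average(waits, 2))
```

Dropping the first two observations raises the average from about 1.33 to 2.0, which is the direction of the initial-condition effect described above.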
Table 4.1: Ten replications of the M/G/1 queue using Lindley's equation.
replication    $\bar{Y}(55000, 5000)$
1              2.057990460
2              2.153527560
3              2.172301541
4              2.339207242
5              2.094318451
6              2.137171478
7              2.168302534
8              2.100971509
9              2.117776760
10             2.220187818
average        2.156175535
std dev        0.075074639
In addition, we will not make a single run of $m$ customers, but instead will make
$n$ replications. This yields $n$ i.i.d. averages $\bar{Y}_1(m, d), \bar{Y}_2(m, d), \ldots, \bar{Y}_n(m, d)$ to
which we can apply standard statistical analysis. This avoids the need to directly
estimate the asymptotic variance $\gamma^2$, a topic we defer to later chapters.
Figure 4.14 shows a Python simulation of the M/G/1 queue using Lindley's
equation. In this simulation $m = 55{,}000$ customers, we discard the first $d = 5000$
of them, and make $n = 10$ replications.
The ten replication averages can then be individually written to a
comma-separated value (csv) file named lindley.csv (Fig. 4.15) and also printed out
to the screen with the mean and standard deviation as in Table 4.1.
Notice that the average waiting time is a bit over 2 minutes, and that Python,
like all programming languages, could display a very large number of output
digits. How many are really meaningful? A confidence interval is one way to
provide an answer.
Since the across-replication averages are i.i.d., and since each across-replication
average is itself the within-replication average of a large number of individual
waiting times (50,000 to be exact), the assumption of independent, normally
distributed output data is reasonable. This justifies a t-distribution confidence
interval on $\mu$. The key ingredient is $t_{1-\alpha/2,\,n-1}$, the $1-\alpha/2$ quantile of the
t distribution with $n-1$ degrees of freedom. If we want a 95% confidence interval, then
$1-\alpha/2 = 0.975$, and our degrees of freedom are $10 - 1 = 9$. Since $t_{0.975,9} = 2.26$,
we get $2.156175535 \pm (2.26)(0.075074639)/\sqrt{10}$ or $2.156175535 \pm 0.053653949$.
This implies that we can claim with high confidence that $\mu$ is around 2.1 minutes,
or we could give a little more complete information as $2.15 \pm 0.05$ minutes.
Any additional digits are statistically meaningless.
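The interval arithmetic can be reproduced from the ten averages in Table 4.1. In this sketch the t quantile 2.26 is the value quoted in the text rather than one computed by a library. The tabulated standard deviation appears to use the divisor n form, so both forms are shown.

```python
import math
import statistics

reps = [2.057990460, 2.153527560, 2.172301541, 2.339207242, 2.094318451,
        2.137171478, 2.168302534, 2.100971509, 2.117776760, 2.220187818]
n = len(reps)
t_quantile = 2.26                     # t_{0.975, 9} as quoted in the text

mean = statistics.mean(reps)
sd_pop = statistics.pstdev(reps)      # divisor n; matches the tabulated std dev
sd_sample = statistics.stdev(reps)    # divisor n - 1; the usual choice for a t interval

half_width = t_quantile * sd_pop / math.sqrt(n)
print("%.3f +/- %.3f" % (mean, half_width))
print("half-width with the sample std dev: %.3f"
      % (t_quantile * sd_sample / math.sqrt(n)))
```

With only ten replications the two standard deviations differ by a factor of sqrt(10/9), so the interval based on the sample standard deviation is slightly wider.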
Is an average of 2 minutes too long to wait? To actually answer that question
would require some estimate of the corresponding wait to see the receptionist,
import numpy as np
# Use scipy.stats because it includes the Erlang distribution
from scipy.stats import expon, erlang
import matplotlib.pyplot as plt

def lindley(m=55000, d=5000):
    """ Estimates waiting time with m customers, discarding the first d
        Lindley approximation for waiting time in a M/G/1 queue
    """
    replications = 10
    averages = []
    for rep in range(replications):
        y = 0
        SumY = 0
        # warm-up: generate the first d waiting times without recording them
        for i in range(d):
            # interarrival time: exponential with loc = 0, scale (mean) = 1
            a = expon.rvs(0, 1)
            # service time: Erlang with shape = 3, scale = 0.8/3
            x = erlang.rvs(3, scale=0.8/3)
            y = max(0, y + x - a)
        # accumulate the remaining m - d waiting times
        for i in range(d, m):
            a = expon.rvs(0, 1)
            x = erlang.rvs(3, scale=0.8/3)
            y = max(0, y + x - a)
            SumY += y
        result = SumY / (m - d)
        averages.append(result)
    return averages
Figure 4.14: Simulation of the M/G/1 queue using Lindley's equation.
import csv

result = lindley()
# "wb" suits Python 2.7; under Python 3 use open("lindley.csv", "w", newline="")
with open("lindley.csv", "wb") as myFile:
    lindleyout = csv.writer(myFile)
    lindleyout.writerow(["Waitingtime"])
    for row in result:
        lindleyout.writerow([row])
for i in range(len(result)):
    print("%1d & %11.9f " % (i+1, result[i]))
print("average & %11.9f" % (np.mean(result)))
print("std dev & %11.9f" % (np.std(result)))
Figure 4.15: Simulation of the M/G/1 queue using Lindley's equation.
import SimPy.Simulation as Sim
import numpy as np
from scipy.stats import expon, erlang
from random import seed

class G:
    maxTime = 55000.0        # minutes
    warmuptime = 5000.0
    timeReceptionist = 0.8   # mean, minutes
    phases = 3
    ARRint = 1.0             # mean, minutes
    theseed = 99999
Figure 4.16: Declarations for the hospital simulation.
either from observational data or a simulation of the current system. Statistical
comparison of alternatives is a topic of Chap. 8.
4.3.2
The simulation program consists of some global declarations (Fig. 4.16), the main
simulation class (Fig. 4.17), some event routines (Fig. 4.18), and code that
initializes and runs the simulation (Fig. 4.19); reporting is provided by SimPy.
This model will illustrate the Process, Resource, and Monitor classes.
At a high level, here is what they do:
Process objects are used to generate entities or to govern the behavior of
entities in the system.
Resource objects including Level and Store are used to designate re-
class Hospitalsim(Sim.Simulation):
    def run(self, theseed):
        np.random.seed(theseed)
        self.receptionist = Sim.Resource(name="Reception", capacity=1,
                                         unitName="Receptionist", monitored=True, sim=self)
        s = Arrivals(name="Source", sim=self)
        self.initialize()
        self.activate(s, s.generate(meanTBA=G.ARRint,
                                    resource=self.receptionist), at=0.0)
        self.simulate(until=G.maxTime)
        # time-average number in service, i.e. the utilization of the single server
        avgutilization = self.receptionist.actMon.timeAverage()[0]
        avgwait = self.receptionist.waitMon.mean()
        avgqueue = self.receptionist.waitMon.timeAverage()[0]
        # last recorded queue length: patients still waiting at the end of the run
        leftinqueue = self.receptionist.waitMon.yseries()[-1]
        return [avgwait, avgqueue, leftinqueue, avgutilization]
class Arrivals(Sim.Process):
    """ Source generates customers randomly """
    def generate(self, meanTBA, resource):
        i = 0
        while self.sim.now() < G.maxTime:
            c = Patient(name="Patient%02d" % (i), sim=self.sim)
            self.sim.activate(c, c.visit(b=resource))
            # exponential time between arrivals; scale is the mean
            t = expon.rvs(0, meanTBA)
            yield Sim.hold, self, t
            i += 1

class Patient(Sim.Process):
    """ Patient arrives, is served and leaves """
    def visit(self, b):
        arrive = self.sim.now()
        yield Sim.request, self, b
        wait = self.sim.now() - arrive
        # Erlang service time with G.phases phases and mean G.timeReceptionist
        tib = erlang.rvs(G.phases,
                         scale=float(G.timeReceptionist)/G.phases)
        yield Sim.hold, self, tib
        yield Sim.release, self, b
Figure 4.18: Event routines for the hospital simulation.
from scipy.stats import t

reps = 10
hospitalreception = []
for i in range(reps):
    mg1 = Hospitalsim()
    mg1.startCollection(when=G.warmuptime,
                        monitors=mg1.allMonitors)
    result = mg1.run(4321 + i)
    hospitalreception.append(result)

hospitalaverage = np.mean(hospitalreception, axis=0)
hospitalstd = np.std(hospitalreception, axis=0)
# t quantile times the std dev (not divided by sqrt(reps)), as tabulated in Table 4.2
hospital95 = [hstd * t.ppf(0.975, reps-1) for hstd in hospitalstd]
Figure 4.19: Initializing and running the hospital simulation.
Table 4.2: Ten replications of the M/G/1 queue using the event-based simulation.
Rep        Total Wait     Queue          Remaining      Utilization
1          3.213462878    2.160833599    0              0.798567300
2          3.280708784    2.246598372    1              0.808480012
3          3.165536548    2.108143837    12             0.793258928
4          2.879284462    1.919272701    6              0.794702843
5          3.222073305    2.176042086    0              0.800707100
6          3.179741189    2.145124000    4              0.800669924
7          3.201467600    2.170802331    1              0.801600605
8          3.095751521    2.062482343    3              0.795603232
9          3.445755412    2.331405016    5              0.802041612
10         3.039486198    2.032295429    2              0.796786730
average    3.172326790    2.135299972    3.400000000    0.799241829
std dev    0.141778429    0.108549951    3.469870315    0.004231459
95% C.I.   0.320725089    0.245557048    7.849391986    0.009572226
Again there are meaningless digits, but the confidence intervals can be used
to prune them. For instance, for the mean total wait we could report 3.17 ± 0.32
minutes. How does this statistic relate to the 2.16 ± 0.08 minutes reported for the
simulation via Lindley's equation? Mean total time in the kiosk system (which
is what the event-based simulation estimates) consists of mean time waiting
to be served (which is what the Lindley simulation estimates) plus the mean
service time (which we know to be 0.8 minutes). So it is not surprising that
these two estimates differ by about 0.8 minutes.
4.3.3
1. In what situations does it make more sense to compare the simulated kiosk
system to simulated data from the current receptionist system rather than
real data from the current receptionist system?
2. It is clear that if all we are interested in is mean waiting time, defined
either as time until service begins or total time including service, the
Lindley approach is superior (since it is clearly faster, and we can always
add in the mean service time to the Lindley estimate). However, if we
are interested in the distribution of total waiting time, then adding in
the mean service time does not work. How can the Lindley recursion be
modified to simulate total waiting times?
3. How can the event-based simulation be modified so that it also records
waiting time until service begins?
4.4
4.4.1
Figure 4.20 shows a Python implementation of the algorithm displayed in Sect. 3.4
and repeated here:
Since $\Pr\{Y \le t_p\}$ is known for this example (see Eq. 3.12), the true $\theta =
\Pr\{Y > t_p\} = 0.16533$ when $t_p = 5$ is also computed by the program so that
we can compare it to the simulation estimate. Of course, in a practical problem
we would not know the answer, and we would be wasting our time simulating
it if we did. Notice that even if all of the digits in this probability estimate are
correct, they certainly are not practically useful.
The simulation estimate turns out to be $\hat{\theta} = 0.15400$. A nice feature of a
probability estimate that is based on i.i.d. outputs is that an estimate of its
standard error is easily computed:
\[ \widehat{se} = \sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}}. \]
Thus, $\widehat{se}$ is approximately 0.011, and the simulation has done its job since the
true value $\theta$ is well within $\pm 1.96\,\widehat{se}$ of $\hat{\theta}$. This is a reminder that
simulations do not deliver the answer, like Eq. 3.12, but do provide the capability
to estimate the simulation error, and to reduce that error to an acceptable level
by increasing the simulation effort (number of replications).
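The standard-error calculation can be checked with the numbers quoted in this section. The number of replications is not stated here; the sketch assumes n = 1000, which is consistent with the reported standard error of about 0.011.

```python
import math

theta_hat = 0.154      # simulation estimate quoted in the text
theta_true = 0.16533   # known value quoted in the text
n = 1000               # assumed number of replications

se = math.sqrt(theta_hat * (1.0 - theta_hat) / n)
low, high = theta_hat - 1.96 * se, theta_hat + 1.96 * se
print("se = %.4f, interval = (%.4f, %.4f)" % (se, low, high))
print(low <= theta_true <= high)
```

Because the standard error shrinks like one over the square root of n, halving it would take roughly four times as many replications.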
4.4.2
This section uses the NetworkX graph library and may be skipped without loss
of continuity.
As noted in Sect. 3.4, we can think of the completion of a project activity
as an event, and when all of the inbound activities $I(j)$ to a milestone $j$ are
completed then the outbound activities $i \in O(j)$ can be scheduled, where the
destination milestone of activity $i$ is $D(i)$. Thus, the following generic milestone
event is the only one needed:
event milestone (activity $\ell$ inbound to node $j$)
    $I(j) = I(j) - \ell$
    if $I(j) = \emptyset$ then
        for each activity $i \in O(j)$
            schedule milestone(activity $i$ inbound to node $D(i)$ to occur $X_i$ time
            units later)
    end if
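The milestone event can be sketched in plain Python with a heap as the event calendar. This is an illustration only, not the SimPy implementation developed in this section: the function simulate_san, the string node labels, and the duration callback are assumptions made for the example, and the edge list matches the five-activity network used later in the section.

```python
import heapq
import random

def simulate_san(edges, source, sink, duration):
    """Return the time at which the sink milestone is reached."""
    inbound = {}    # I(j): activities into node j that are not yet complete
    outbound = {}   # O(j): activities leaving node j
    for i, (tail, head) in enumerate(edges):
        inbound.setdefault(head, set()).add(i)
        outbound.setdefault(tail, []).append(i)

    calendar = []   # entries are (completion time, activity index)

    def start_activities(node, now):
        for i in outbound.get(node, []):
            heapq.heappush(calendar, (now + duration(i), i))

    start_activities(source, 0.0)
    while calendar:
        now, i = heapq.heappop(calendar)   # milestone event for activity i
        node = edges[i][1]                 # D(i)
        inbound[node].discard(i)           # I(j) = I(j) - i
        if not inbound[node]:              # I(j) is empty: milestone reached
            if node == sink:
                return now
            start_activities(node, now)
    return None

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
rng = random.Random(1)
print(simulate_san(edges, "a", "d", lambda i: rng.expovariate(1.0)))
print(simulate_san(edges, "a", "d", lambda i: 1.0))
```

With every activity taking exactly one time unit, the project finishes at time 3, the length of the longest path a, b, c, d. One routine handles every activity because the activity index is passed along with the event.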
import random
import SimPy.Simulation as Sim
import networkx as nx

class SANglobal:
    F = nx.DiGraph()
    a = 0
    b = 1
    c = 2
    d = 3
    inTo = 0
    outOf = 1
    F.add_nodes_from([a, b, c, d])
    F.add_edges_from([(a,b), (a,c), (b,c), (b,d), (c,d)])
Figure 4.21: Network description for the discrete-event SAN simulation.
Of course, this approach shifts the effort from enumerating all of the paths
through the SAN to creating the sets I, O, D, but these sets have to be either
explicitly or implicitly defined to define the project itself. The key lesson from
this example, which applies to many simulations, is that it is possible to program
a single event routine to handle many simulation events that are conceptually
distinct, and this is done by passing event-specific information to the event
routine.
In this case we need to develop the configuration of that activity network
and use that description to direct the simulation. To do so we will use the
NetworkX graph library (http://networkx.github.io/). NetworkX is a Python
language software library for the creation, manipulation, and study of the
structure, dynamics, and functions of complex networks. While it includes a wide
range of graph algorithms, we will use it as a standard representation of graphs
such as the stochastic activity network.
Using NetworkX, we create a directed graph (nx.DiGraph()) with four nodes
and five directed edges (Figure 4.21). Implicitly, it also creates predecessor
and successor lists for each node that can be accessed using F.predecessors(i)
or F.successors(i).
We then define events as the completion of activities that go into a given
node. Events trigger a signal that can be used to trigger other activities.
In the first block of code in Figure 4.22 an event is defined for each node in
the network and added to a list of events (nodecomplete) which corresponds to
the list of nodes. Then, for each node, the list of predecessor events is created
(preevents) and the node is created as an ActivityProcess (Figure 4.23).
For each ActivityProcess, the waitup() function is built around the waitevent
command. It takes as an argument myEvent, which is the list of predecessor
11 http://networkx.github.io/
SANglobal.finishtime = 0
Sim.initialize()
# one completion event per node
SANglobal.F.nodecomplete = []
for i in range(len(SANglobal.F.nodes())):
    eventname = "Complete%1d" % i
    SANglobal.F.nodecomplete.append(Sim.SimEvent(eventname))

activitynode = []
for i in range(len(SANglobal.F.nodes())):
    activityname = "Activity%1d" % i
    activitynode.append(ActivityProcess(activityname))
# every node except the initial one waits for its predecessors' events
for i in range(len(SANglobal.F.nodes())):
    if i != SANglobal.inTo:
        prenodes = SANglobal.F.predecessors(i)
        preevents = [SANglobal.F.nodecomplete[j] for j in prenodes]
        Sim.activate(activitynode[i], activitynode[i].waitup(i, preevents))
# the initial node waits for the start event instead
startevent = Sim.SimEvent("Start")
Sim.activate(activitynode[SANglobal.inTo],
             activitynode[SANglobal.inTo].waitup(SANglobal.inTo, startevent))
sstart = StartSignaller("Signal")
Sim.activate(sstart, sstart.startSignals())
Sim.simulate(until=50)
Figure 4.22: Main SAN DES simulation.
class ActivityProcess(Sim.Process):
    def waitup(self, node, myEvent):
        # PEM illustrating "waitevent"
        # wait for "myEvent" to occur
        yield Sim.waitevent, self, myEvent
        tis = random.expovariate(1.0)
        print("The activating event(s) were %s" %
              ([x.name for x in self.eventsFired]))
        yield Sim.hold, self, tis
        finishtime = Sim.now()
        if finishtime > SANglobal.finishtime:
            SANglobal.finishtime = finishtime
        SANglobal.F.nodecomplete[node].signal()
Figure 4.23: Activity process waits for events to be cast.
class StartSignaller(Sim.Process):
    # here we just schedule some events to fire
    def startSignals(self):
        yield Sim.hold, self, 0
        startevent.signal()
Figure 4.24: StartSignaller class for initiating the SAN simulation.
events that were identified in the main simulation. As each event in the myEvent
list occurs, it broadcasts its associated signal using the signal() function. When all
the events in an ActivityProcess waitevent list (myEvent in Figure 4.23) have
occurred, the yield condition is met and the next line in ActivityProcess
begins.
To start the simulation, we create a Process that will provide the initiating
event of the simulation. (Figure 4.24) Similarly, we treat the initial node of the
simulation differently by having it wait for the start signal to begin instead of
waiting for predecessor events like the other nodes.
Notice (see Fig. 4.22) that the simulation ends when there are no additional
activities remaining to be completed.
A difference between this implementation of the SAN simulation and the one
in Sect. 4.4.1 is that here we write out the actual time the project completes on
each replication. By doing so, we can estimate $\Pr\{Y > t_p\}$ for any value of $t_p$ by
sorting the data and counting how many out of 1000 replications were greater
than $t_p$. Figure 4.25 shows the empirical cdf of the 1000 project completion
times, which is the simulation estimate of Eq. (3.12).
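Once the completion times are saved, the sort-and-count estimate takes only a few lines. In this sketch the data are hypothetical stand-in draws, not output of the SAN model, and the function name tail_probability is introduced for the illustration.

```python
import bisect
import random

def tail_probability(sorted_times, tp):
    """Fraction of completion times strictly greater than tp."""
    n = len(sorted_times)
    return (n - bisect.bisect_right(sorted_times, tp)) / float(n)

# Hypothetical stand-in data: 1000 draws, not output of the SAN simulation.
rng = random.Random(7)
times = sorted(rng.gammavariate(3.0, 1.0) for _ in range(1000))

for tp in (2.0, 5.0, 8.0):
    print(tp, tail_probability(times, tp))
```

Sorting once and using a binary search means the estimate for any additional value of tp costs only a lookup, and one minus this quantity traces out the empirical cdf.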
4.4.3
1. In real projects there are not only activities, but also limited and often
shared resources that are needed to complete the activities. Further, there
may be specific resource allocation rules when multiple activities contend
for the same resource. How might this be modeled in SimPy?
2. Time to complete the project is an important overall measure, but at the
planning stage it may be more important to discover which activities or
resources are the most critical to on-time completion of the project. What
additional output measures might be useful for deciding which activities
are critical?
4.5
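\[ \widehat{\bar{X}}(T) = \frac{1}{m} \sum_{i=1}^{m} X(i \Delta t). \]
This makes simulation possible, since
\[ X(t_{i+1}) = X(t_i) \exp\left\{ \left(r - \tfrac{1}{2}\sigma^2\right)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\; Z_{i+1} \right\} \]
for any increasing sequence of times $\{t_0, t_1, \ldots, t_m\}$, where $Z_1, Z_2, \ldots, Z_m$ are
i.i.d. N(0, 1).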
Figure 4.26 is Python code that uses m = 32 steps in the approximation
and makes 10,000 replications to estimate the value of the option. Discrete-event
structure would slow execution without any obvious benefit, so a simple loop is
used to advance time. The value of the option from each replication is saved to a
list for post-simulation analysis.

Period   Rate (faxes/minute)
1        4.37
2        6.24
3        5.29
4        2.97
5        2.03
6        2.79
7        2.36
8        1.04
The estimated value of the option is $2.20 with a relative error of just over 2%
(recall that the relative error is the standard error divided by the mean). As the
histogram in Fig. 4.27 shows, the option is frequently worthless (approximately
68% of the time), but the average payoff, conditional on the payoff being positive,
is approximately $6.95.
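The summary statistics quoted above can all be computed from the list of per-replication option values. The following is a minimal sketch, assuming the values are held in a plain Python list; the function name summarize_payoffs and the five sample payoffs are hypothetical, and the sample standard deviation here divides by n - 1, which may differ slightly from the std() call used in the figure.

```python
import math


def summarize_payoffs(values):
    """Mean, relative error, and conditional payoff statistics for a list of payoffs."""
    n = len(values)
    avg = sum(values) / float(n)
    # Sample variance with n - 1 in the denominator
    var = sum((v - avg) ** 2 for v in values) / (n - 1)
    std_err = math.sqrt(var / n)
    positive = [v for v in values if v > 0.0]
    return {
        "mean": avg,
        # Relative error = standard error / mean (undefined if the mean is 0)
        "relative_error": std_err / avg,
        "fraction_worthless": 1.0 - len(positive) / float(n),
        "mean_if_positive": sum(positive) / len(positive) if positive else 0.0,
    }


# Hypothetical payoffs (illustration only, not simulation output)
payoffs = [0.0, 0.0, 0.0, 4.0, 8.0]
print(summarize_payoffs(payoffs))
```

Note that the overall mean equals the fraction of positive payoffs times the mean payoff given that it is positive, which gives a quick consistency check on reported results.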
4.6
import math
import random
from numpy import mean, std  # assumed source of mean() and std()

def Asianoption(interestRate, sigma, steps, initialValue, strikePrice, maturity, seed):
    """Return the discounted payoff of the Asian option from one replication."""
    sumx = 0.0
    # Create instance of a random number object
    generator = random.Random()
    generator.seed(seed)
    x = initialValue
    sigma2 = (sigma*sigma)/2.0
    # interval is the module-level step length defined before the first call
    for j in range(steps):
        z = generator.normalvariate(0, 1)
        x = x * math.exp((interestRate - sigma2) * interval + sigma * math.sqrt(interval) * z)
        sumx = sumx + x
    value = math.exp(-interestRate * maturity) * max(sumx / float(steps) - strikePrice, 0.0)
    return value

replications = 10000
initialSeed = 1234
maturity = 1.0
steps = 32
sigma = 0.3
interestRate = 0.05
initialValue = 50.0
strikePrice = 55.0
interval = maturity / float(steps)
values = [Asianoption(interestRate, sigma, steps, initialValue, strikePrice, maturity,
                      i + initialSeed) for i in range(replications)]
print(mean(values))
print(std(values)/math.sqrt(replications)) # standard error
print(std(values)/math.sqrt(replications)/mean(values)) # relative error
Figure 4.27: Histogram of the realized value of the Asian option from 10,000
replications.
import math
import random
import SimPy.Simulation as Sim

class F:
    maxTime = 100  # hours
    seedval = 1234
    period = 60.0  # minutes per hour
    nPeriods = 8  # periods per day
    meanRegular = 2.5/period  # hours
    varRegular = 1.0/period**2  # hours^2 (unused; consistent with stdRegular)
    stdRegular = math.sqrt(1.0)/period  # hours
    meanSpecial = 4.0/period  # hours
    varSpecial = 1.0/period**2  # hours^2 (unused; consistent with stdSpecial)
    stdSpecial = math.sqrt(1.0)/period  # hours
    tPMshiftchange = 4.0  # hours
    numAgents = 15
    numAgentsPM = 9
    numSpecialists = 6
    numSpecialistsPM = 3
    maxRate = 6.24  # faxes per minute
    aRate = [4.37, 6.24, 5.29, 2.97, 2.03, 2.79, 2.36, 1.04]  # faxes per minute
    aRateperhour = [aRate[i] * period for i in range(len(aRate))]  # faxes per hour
    meanTBA = 1/(maxRate * period)  # hours
    pSpecial = 0.20
The main program for the simulation is in Fig. 4.29. Of particular note are
the two Monitor statements defining Regularwait and Specialistwait. These
record the total time each regular and special fax spends in the system, from
which reporting() obtains the fraction of faxes that are processed within the
10-minute requirement.
Also notice the condition that ends the main simulation:
self.simulate(until=F.maxTime)
Because F.maxTime is well after the arrivals cease, any faxes still in
the queue will be completed prior to the end of the simulation. When the event
calendar is empty, there are no additional faxes to process and no pending
arrival of a fax. This condition will only hold after 4 PM and once all remaining
faxes have been entered.
from numpy import mean, std  # assumed source of mean() and std()

class Faxcentersim(Sim.Simulation):
    def run(self, aseed):
        random.seed(aseed)
        self.agents = Sim.Resource(capacity=F.numAgents,
            name="Service Agents", unitName="Agent", monitored=True,
            qType=Sim.PriorityQ, sim=self)
        self.specialagents = Sim.Resource(capacity=F.numSpecialists,
            name="Specialist Agents", unitName="Specialist", monitored=True,
            qType=Sim.PriorityQ, sim=self)
        self.meanTBA = 0.0
        self.initialize()
        s = Source("Source", sim=self)
        a = ArrivalRate("Arrival Rate", sim=self)
        tchange = SecondShift("PM Shift", sim=self)
        self.Regularwait = Sim.Monitor(name="Regular time",
            ylab="hours", sim=self)
        self.Specialistwait = Sim.Monitor(name="Special time",
            ylab="hours", sim=self)
        self.activate(a, a.generate(F.aRateperhour))
        self.activate(s,
            s.generate(resourcenormal=self.agents,
                       resourcespecial=self.specialagents), at=0.0)
        self.activate(tchange, tchange.generate(F.tPMshiftchange,
            resourcenormal=self.agents,
            resourcespecial=self.specialagents))
        self.simulate(until=F.maxTime)

    def reporting(self):
        regularcount = self.Regularwait.count()
        regularwait = self.Regularwait.mean()
        regularQ = self.agents.waitMon.timeAverage()
        regularagents = self.agents.actMon.timeAverage()
        # 1./6 hour = 10 minutes
        regular10min = sum([1.0 if waittime < 1./6 else 0
                            for waittime in self.Regularwait.yseries()])
        fractionregular10min = regular10min/regularcount
        specialcount = self.Specialistwait.count()
        specialwait = self.Specialistwait.mean()
        specialQ = self.specialagents.waitMon.timeAverage()
        specialagents = self.specialagents.actMon.timeAverage()
        special10min = sum([1.0 if waittime < 1./6 else 0
                            for waittime in self.Specialistwait.yseries()])
        fractionspecial10min = special10min/specialcount
        result = [regularwait, regularQ, regularagents,
                  specialwait, specialQ, specialagents,
                  fractionregular10min, fractionspecial10min]
        return result

reps = 10
faxsimulation = []
for i in range(reps):
    faxsim = Faxcentersim()
    faxsim.run(F.seedval + i)
    result = faxsim.reporting()
    faxsimulation.append(result)
# Across-replication mean and standard deviation of each statistic
print(mean(faxsimulation, axis=0))
print(std(faxsimulation, axis=0))
Figure 4.29: Main program and statistics reporting for service center simulation.
Figure 4.30 includes processes that generate faxes (Source) and determine
how they are routed to agents (Fax). As faxes are routed, the simulation
determines if they require special handling after the regular agent is complete
(if (checkSpecial < F.pSpecial)). There are also two Monitors present,
Regularwait and Specialistwait, which record the total wait time (arrival to
completion) for each fax that is completed.
self.sim.Regularwait.observe(finished)
Later, in the reporting() function of the main program shown in Figure
4.29, this Monitor will be used to get the total number of faxes of this
type, the number that waited less than 10 minutes before completing processing,
and the fraction.
regularcount = self.Regularwait.count()
regular10min = sum([1.0 if waittime < 1./6 else 0
for waittime in self.Regularwait.yseries()])
fractionregular10min = regular10min/regularcount
While collecting statistics in this fashion provides the most flexibility, as it
keeps the wait time observations for future use, we could instead have recorded
only whether the wait time was less than 10 minutes.
if finished < 1.0/6: # 10 minutes
    self.sim.Regularwait.observe(1)
else:
    self.sim.Regularwait.observe(0)
Then we could have calculated the fraction by taking the sum of the observations divided by the number of observations.
fractionregular10min = (float(sum(self.Regularwait.yseries())) /
                        self.Regularwait.count())
The reporting() method in Figure 4.29 then uses the Monitors for wait time as
well as the monitors associated with the agents and specialagents resources. The
actMon monitors track how the resources are being used, while the waitMon
monitors track the waiting queue for each resource. For resources, we tend to be
interested in the time-average value of the size of the queue or of the number of
units of the resource in use, so agents.actMon.timeAverage() returns the
time-average number of agents in use while specialagents.waitMon.timeAverage()
returns the average number of special faxes in queue.
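To make the idea of a time average concrete, the sketch below computes one by hand from a list of (time, value) observations. It is a plain-Python illustration of the concept only and is not SimPy's implementation; the function name time_average and the small queue history are hypothetical.

```python
def time_average(observations, end_time):
    """Time-weighted average of a piecewise-constant series.

    observations is a list of (time, value) pairs in increasing time order;
    each value is assumed to hold until the next observation (or end_time).
    """
    start_time = observations[0][0]
    # Pair each observation with the time at which its value stops holding
    following = observations[1:] + [(end_time, None)]
    area = 0.0
    for (t, y), (t_next, _) in zip(observations, following):
        area += y * (t_next - t)
    return area / (end_time - start_time)


# Hypothetical queue length history: 0 until t=1, 2 until t=3, then 1 until t=4
queue_history = [(0.0, 0), (1.0, 2), (3.0, 1)]
print(time_average(queue_history, 4.0))
```

Unlike an ordinary sample mean, each value is weighted by how long it persisted, which is why a queue that is briefly long but usually empty has a small time average.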
Ten replications of this simulation with a staffing policy of 15 Entry Agents
in the morning and 9 in the afternoon, and 6 Specialists in the morning and 3
in the afternoon, gives 0.98 ± 0.04 for the fraction of regular faxes entered in 10
minutes or less, and 0.84 ± 0.08 for the special faxes. These are 95% confidence
intervals. This policy appears to be close to the requirements, although if we
absolutely insist on 80% for the special faxes then additional replications are
needed to narrow the confidence interval.
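A confidence interval of this kind can be formed from the across-replication results using the usual t-based half-width. The sketch below assumes ten replications, so it hard-codes the 0.975 quantile of the t distribution with 9 degrees of freedom (2.262); the function name confidence_interval and the ten fractions are hypothetical and are not the results reported above.

```python
import math

# 0.975 quantile of Student's t with 9 degrees of freedom (10 replications)
T_975_DF9 = 2.262


def confidence_interval(samples, t_quantile):
    """Return (mean, half-width) of a t-based confidence interval."""
    n = len(samples)
    avg = sum(samples) / float(n)
    # Sample variance across replications (n - 1 in the denominator)
    var = sum((x - avg) ** 2 for x in samples) / (n - 1)
    half_width = t_quantile * math.sqrt(var / n)
    return avg, half_width


# Hypothetical per-replication fractions (illustration only)
fractions = [0.80, 0.90, 0.85, 0.75, 0.95, 0.82, 0.88, 0.79, 0.91, 0.84]
center, half = confidence_interval(fractions, T_975_DF9)
print("%.3f +/- %.3f" % (center, half))
```

Since the half-width shrinks roughly in proportion to one over the square root of the number of replications, halving it takes about four times as many replications (with a correspondingly updated t quantile).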
class Source(Sim.Process):
    """ Source generates customers randomly """
    def generate(self, resourcenormal, resourcespecial):
        i = 0
        # Generate faxes until the end of the day or until the arrival rate is set to 0
        while (self.sim.now() < F.nPeriods and self.sim.meanTBA > 0):
            f = Fax(name="Fax%02d" % (i), sim=self.sim)
            self.sim.activate(f,
                f.visit(agent=resourcenormal, specialagent=resourcespecial))
            t = random.expovariate(1.0 / self.sim.meanTBA)
            yield Sim.hold, self, t
            i += 1

class Fax(Sim.Process):
    """ Fax arrives and is processed
    Processed first by regular staff, then specialized staff if needed
    Assume that anyone working on a fax will continue working until it finishes.
    """
    def visit(self, agent, specialagent):
        arrive = self.sim.now()
        yield Sim.request, self, agent
        wait = self.sim.now() - arrive
        # Regenerate the entry time until it is nonnegative
        tis = -1.0
        while (tis < 0):
            tis = random.normalvariate(F.meanRegular, F.stdRegular)
        yield Sim.hold, self, tis
        yield Sim.release, self, agent
        checkSpecial = random.random()
        if (checkSpecial < F.pSpecial):
            tspecial = -1.0
            while (tspecial < 0):
                tspecial = random.normalvariate(F.meanSpecial, F.stdSpecial)
            yield Sim.request, self, specialagent
            yield Sim.hold, self, tspecial
            yield Sim.release, self, specialagent
            finished = self.sim.now() - arrive
            self.sim.Specialistwait.observe(finished)
        else:
            finished = self.sim.now() - arrive
            self.sim.Regularwait.observe(finished)
class ArrivalRate(Sim.Process):
    """ update the arrival rate every hour
    Reads in the arrival rate table and updates the arrival rate every hour
    One hour after the last hour begins, changes arrival rate to 0
    """
    def generate(self, arrivalrate):
        for i in range(len(arrivalrate)):
            self.sim.meanTBA = 1.0/(arrivalrate[i])
            yield Sim.hold, self, 1.0
        # After the end of the day, set the arrival rate = 0
        self.sim.meanTBA = 0.0

class SecondShift(Sim.Process):
    """ Trigger the change in shifts for agents
    The effect should be to move the wait queue to the new set of agents
    """
    def generate(self, tshiftchange, resourcenormal, resourcespecial):
        yield Sim.hold, self, tshiftchange
        reduceagents = F.numAgents - F.numAgentsPM
        reducespecial = F.numSpecialists - F.numSpecialistsPM
        for i in range(reduceagents):
            a = Agentoff(name="removeagent%02d" % (i), sim=self.sim)
            self.sim.activate(a, a.visit(resourcenormal))
        for j in range(reducespecial):
            a = Agentoff(name="removespecial%02d" % (j), sim=self.sim)
            self.sim.activate(a, a.visit(resourcespecial))

class Agentoff(Sim.Process):
    def visit(self, agent):
        # Hold the unit past the end of the run so it never returns to service
        tremain = F.maxTime - self.sim.now() + 1
        # use priority 100 to ensure agent is removed as soon as current fax is done
        yield Sim.request, self, agent, 100
        yield Sim.hold, self, tremain
        yield Sim.release, self, agent
class F:
    maxTime = 100.0 # hours
    theseed = 9999
    period = 60.0
    nPeriods = 8
    meanRegular = 2.5/period # hours
    varRegular = 1.0/period # hours
    stdRegular = math.sqrt(1.0)/period
    meanSpecial = 4.0/period # hours
    varSpecial = 1.0/period # hours
    stdSpecial = math.sqrt(1.0)/period
    tPMshiftchange = 4.0
    numAgents = 15
    numAgentsPM = 9
    numSpecialists = 6
    numSpecialistsPM = 3
    maxRate = 6.24
    aRate = [4.37, 6.24, 5.29, 2.97, 2.03, 2.79, 2.36, 1.04] # per minute
    aRateperhour = [aRate[i] * period for i in range(len(aRate))] # per hour
    meanTBA = 1/(maxRate * period) # hours
    pSpecial = 0.20
4.6.1
1. There are many similarities between the programming for this simulation
and the event-based simulation of the M/G/1 queue.
2. The fax entry times were modeled as being normally distributed. However,
the normal distribution admits negative values, which certainly does not
make sense. What should be done about this? Consider mapping negative
values to 0, or generating a new value whenever a negative value occurs.
Which is more likely to be realistic and why?
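The two ways of handling negative values in item 2 can be compared directly. The sketch below is an illustration only: the function names are hypothetical, and the mean and standard deviation are deliberately exaggerated (both 1.0) so that negative draws are common, unlike the fax entry times in this chapter. The Fax process in Figure 4.30 uses the first approach, regenerating the value until it is nonnegative.

```python
import random


def service_time_resample(rng, mean, std):
    """Draw again whenever the normal variate is negative."""
    t = rng.normalvariate(mean, std)
    while t < 0.0:
        t = rng.normalvariate(mean, std)
    return t


def service_time_clamp(rng, mean, std):
    """Map any negative normal variate to 0."""
    return max(rng.normalvariate(mean, std), 0.0)


rng = random.Random(1234)
n = 10000
# Exaggerated parameters (assumption for illustration) so negatives occur often
resampled = [service_time_resample(rng, 1.0, 1.0) for _ in range(n)]
clamped = [service_time_clamp(rng, 1.0, 1.0) for _ in range(n)]

# Clamping piles probability mass on exactly zero; resampling does not
print(sum(1 for t in clamped if t == 0.0) / float(n))
print(sum(1 for t in resampled if t == 0.0) / float(n))
print(sum(clamped) / n, sum(resampled) / n)
```

Both approaches change the distribution, and in particular raise its mean above the nominal value, so it may be worth checking how often negative draws occur for the parameters actually used before deciding whether the choice matters.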
Exercises
1. For the hospital problem, simulate the current system in which the receptionist's service time is well modeled as having an Erlang-4 distribution
with mean 0.6 minutes. Compare the waiting time to the proposed electronic kiosk alternative.
2. Simulate an M(t)/G/∞ queue where G corresponds to an Erlang distribution with fixed mean but try different numbers of phases. That is, keep
the mean service time fixed but change the variability. Is the expected
number in queue sensitive to the variance in the service time?
3. Modify the SAN simulation to allow each activity to have a different mean
time to complete (currently they all have mean time 1). Use a Collection
to hold these mean times.
4. Try the following numbers of steps for approximating the value of the
Asian option to see how sensitive the value is to the step size: m =
8, 16, 32, 64, 128.
5. In the simulation of the Asian option, the sample mean of 10,000 replications was 2.198270479, and the standard deviation was 4.770393202.
Approximately how many replications would it take to decrease the relative error to less than 1%?
6. For the service center, increase the number of replications until you can
be confident that the suggested policy does or does not achieve the 80%
entry in less than 10 minutes requirement for special faxes.
7. For the service center, find the minimum staffing policy (in terms of total
number of staff) that achieves the service-level requirement. Examine the
other statistics generated by the simulation to make sure you are satisfied
with this policy.
8. For the service center, suppose that Specialists earn twice as much as
Entry Agents. Find the minimum cost staffing policy that achieves the
service-level requirement. Examine the other statistics generated by the
simulation to make sure you are satisfied with this policy.
9. For the service center, suppose that the staffing level can change hourly,
but once an Agent or Specialist comes on duty they must work for four
hours. Find the minimum staffing policy (in terms of total number of
staff) that achieves the service-level requirement.
10. For the service center, pick a staffing policy that fails to achieve the service
level requirements by 20% or more. Rerun the simulation with a replication being defined as exactly 8 hours, but do not carry waiting faxes over
to the next day. How much do the statistics differ using the two different
ways to end a replication?
11. The function NSPP_Fax is listed below. This function implements the
thinning method described in Sect. 4.2 for a nonstationary Poisson process
with piecewise-constant rate function. Study it and describe how it works.
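from math import floor
import random

# NOTE: the def line is missing from the listing; this parameter order is assumed
def NSPP_Fax(arrivalrate, MaxRate, NPeriods, periodlength, stream):
    random.seed(stream)
    # Probability of discarding a candidate arrival in each period
    pthinning = [(1-hourlyrate/MaxRate) for hourlyrate in arrivalrate]
    t = 0.0
    arrivaltimes = []
    totaltime = NPeriods * periodlength
    while t < totaltime:
        # Candidate arrivals occur at the maximum rate
        deltat = random.expovariate(MaxRate)
        t = t + deltat
        if t < totaltime:
            pthin = pthinning[int(floor(t/periodlength))]
            uthin = random.random()
            if uthin > pthin:
                arrivaltimes.append(float(t)) # add arrival since not thinned
    return arrivaltimes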
12. Beginning with the event-based M/G/1 simulation, implement the changes
necessary to make it an M/G/s simulation (a single queue with any number of servers). Keeping λ = 1 and τ/s = 0.8, simulate s = 1, 2, 3 servers
and compare the results. What you are doing is comparing queues with
the same service capacity, but with one fast server as compared to two or
more slower servers. State clearly what you observe.
13. Modify the SimPy event-based simulation of the M/G/1 queue to simulate
an M/G/1/c retrial queue. This means that customers who arrive to
find c customers in the system (including the customer in service) leave
immediately, but arrive again after an exponentially distributed amount
of time with mean MeanTR. Hint: The existence of retrial customers should
not affect the arrival process for first-time arrivals.
14. This problem assumes a more advanced background in stochastic processes. In the simulation of the M(t)/M/∞ queue there could be a very
large number of events on the event calendar: one Arrival and one Departure for each car currently in the garage. However, properties of the
exponential distribution can reduce this to no more than two events. Let
μ = 1/τ be the departure rate for a car (recall that τ is the mean parking
time). If at any time we observe that there are N cars in the garage (no
matter how long they have been there), then the time until the first of
Bibliography
[1] SimPy Development Team, SimPy Simulation Package,
http://simpy.sourceforge.net/, 2012.