Y, MONTH 1995 1
Abstract: Complex models may have model components distributed over a network and generally require significant execution times. The field of parallel and distributed simulation has grown over the past fifteen years to accommodate the need of simulating the complex models using a distributed versus sequential method. In particular, asynchronous parallel discrete event simulation (PDES) has been widely studied, and yet we envision greater acceptance of this methodology as more readers are exposed to PDES introductions that carefully integrate real-world applications. With this in mind, we present two key methodologies (conservative and optimistic) which have been adopted as solutions to PDES systems. We discuss PDES terminology and methodology under the umbrella of the personal communications services application.

Keywords: parallel algorithm, distributed simulation, synchronization, virtual time, network communications

I. Introduction

that PDES speeds up otherwise serial computations during a simulation. The second reason for PDES (distributed model constraint) is based on a situation where models for system components are stored in physically different locations. The other author (Fishwick) is building a prototype distributed simulation of a process plant where each plant component is ultimately co-located with the manufacturer of that component.

The paper proceeds as follows. First, in Section II, we define our terms within the PDES area and demonstrate the generic approach to distributed simulation. In Section III, we introduce the PCS application and demonstrate the need for synchronization of incoming messages to a given process. There are two key approaches to synchronization. Method 1, defined in Section IV, is termed the conservative method since it ensures that the causal re-
The PCS area, to be discussed in Section III, uses a spatial model in that the system is viewed as a hexagonal discretization of a large two-dimensional space representing an area where cellular communications are to be implemented. Spatial models can be executed in several ways, including time slicing, event scheduling, and parallel and distributed execution. Our approach will be to use a parallel and distributed approach to model execution, while using the concept of event scheduling within each process. Speaking of process, we must define this term appropriately. Model components for a PCS implementation will be a collection of hexagonal cells. Other model types, such as a queuing model, are composed of other components (facilities). A logical process (LP) is defined as a set containing basic model components, so a PCS logical process will be a set of hexagons, or just one hexagon. A physical process or processor is a set of logical processes mapped in a way that conforms to the architecture of the parallel/distributed system.

An LP contains several objects:
- Local Virtual Time (LVT): time associated with the LP. The LP does not know another LP's time unless communicated via a message.
- Future Event List (FEL): event list used when there are internal events posted within the LP itself.
- Event: an item within the FEL.
- Message: an item sent from one LP to another.

The FEL is composed of events, where an event combines the following objects: 1) time stamp, 2) token, 3) event type. The time stamp reflects when the event is to occur. An event's occurrence correlates with the execution of an event routine for that LP. The token is associated with whatever is flowing through the network of LPs. For the PCS application, portables (i.e., mobile phones) flow through the system. An event type specifies what will happen to the token (arrival, boundary crossing, departure, incoming call). An LP has input channels and output channels where each channel has a first-in/first-out (FIFO) buffer associated with it. A message is equivalent to an event that must be moved from one LP to another. Messages which simply enter an FEL and are processed are generally called events. When an event must be issued to another LP, it becomes a message. The relationship among the above terms is shown in Figure 1.

[Figure 1 about here.]

Messages arrive in one of several input channel buffers and are routed directly to the LP's FEL. Note that simple LPs may involve a calculation such as 1) taking the timestamp from an incoming message, 2) adding a value to this timestamp, and 3) sending the new message to the output buffers. Such an LP would not have any need of an FEL and would be a "pure" distributed simulation. This kind of technique, however, is wasteful of the computing elements since there will be a large price to pay in inter-LP communication overhead. A simple addition is not sufficient to warrant a distributed approach. On the other hand, if the processing element can be made to do work then the communications overhead becomes less critical. The kind of work ideally suited in simulation is a sequential simulation within the LP, composed of the usual FEL and event routines. Thus, the distributed simulation is hybrid in form, with sequential simulation coinciding (and synchronized) with distributed simulation. The LVT of this more substantial LP is updated by removing the highest priority event (lowest timestamp) from the FEL and executing the associated event routine. Some (or all) of these event routines will contain scheduling commands to place events with new times back into the FEL. Some event routines will involve messages to be issued through the output buffer(s) to a target LP.

B. Object Oriented Implementation

A PDES consists of several PDES objects or LPs. These LPs execute asynchronously with coordination to complete a simulation run. To implement the objects in an LP (as described in Section II-A), the attributes and methods of the LP are classified into four categories (see Table I):

[Table 1 about here.]

- A clock mechanism indicates the progress of the LP. An attribute LVT represents the timestamp of the event that just occurred in the LP. The LVTUpdate() method updates LVT to advance the "clock" of the LP.
- A FEL mechanism processes the events occurring in the LP. The FEL is basically a priority queue with one attribute and three methods. An attribute eventList maintains the events to occur in the future. The Enqueue() method inserts a time-tagged event into eventList so that eventList maintains its ordered sequence. The Dequeue() method deletes the event with the minimum timestamp in eventList. The Cancel() method deletes the event with a specified timestamp in eventList.
- A synchronization mechanism interacts with other LPs to coordinate the execution of PDES. The ReceiveMessage() method receives messages from other LPs (these messages will be inserted into the FEL for processing). The method ExecuteMessage() executes events in the FEL. The SendMessage() method sends output messages (generated by the execution of events) to their destination LPs.
  It is probably more appropriate to consider ExecuteMessage() as a method of the FEL. However, this method is affected by the PDES synchronization mechanisms to be described later. Thus the method is classified as part of the synchronization mechanism.
- An application mechanism represents a sub-model for a specific simulation application to be simulated by the LP (to be elaborated).

C. PDES Implementation Platforms

PDES systems have been implemented in different parallel architectures such as BBN Butterfly [12], [13], [14], Sequent [15], [16], [17], JPL Mark III [18], Simulated Stanford Dash Multiprocessor [19], Transputers [20], [21], CM-
LIN AND FISHWICK: ASYNCHRONOUS PARALLEL DISCRETE EVENT SIMULATION 3
1/CM-5 [22], KSR [23], and iPSC/860 [24]. PDES has also been implemented in workstations connected by a local area network [4] which is widely available in both the industrial and the academic environments.

III. Personal Communication Services

We use personal communication service (PCS) network simulation to illustrate PDES functionality. A PCS network [25], [26] provides low-power and high-quality wireless access for PCS subscribers or portables. The service area of a PCS network is populated with a number of radio ports. Every radio port covers a sub-area or cell. The port is allocated a number of channels (time slots, frequencies, spreading codes or a combination of these). A portable occupies a channel for an incoming/outgoing call. If all channels are busy in the radio port, the call is blocked. In PCS network planning, PCS network modeling (usually conducted by simulation experiments) is required to investigate the usage of radio resources. Since PCS network simulation is time-consuming, PDES effectively speeds up the process of PCS network simulation. Specifically,
- The size of the PCS network under study is usually large (e.g., thousands of cells). A typical sequential PCS simulation run takes over 20 hours, while the corresponding PCS PDES takes less than 3 hours using 8 processors [4].
- Another popular parallel approach, the parallel independent replicated simulation [27], [28], [29] (running multiple simulation replications concurrently) does not work for PCS simulation. In most cases, the PCS designer is interested only in the behavior of the PCS network at the engineered workload (e.g., the workload at which the blocking probability is 1%). To calibrate the simulation at the engineered workload, the setup of input parameters for the next simulation run is dependent on the previous run.

Now we describe the PCS model and its mapping to the corresponding PDES. For demonstration purposes, we describe a simplified PCS model without considering the details of the radio signal propagation issues (such as Rayleigh fading, co-channel interference, and so on). We assume that there are S cells in the PCS network, and on the average, there are n portables in a cell. Every port is allocated some number of channels. A portable resides at a cell for a period of time which is a random variable with some distribution (e.g., exponential [30], [31], [32]). Then the portable moves to a neighbor cell based on some routing function (e.g., equal routing probabilities for all neighbors). Call arrivals to a portable form a random process (e.g., Poisson) that is independent of the portable's movement. A call is connected if a channel is available. Otherwise, the call is blocked. When a portable moves from one cell to another while a call is in progress, the call requires a new channel (in the new cell) to continue. This procedure of changing channels is called handoff or automatic link transfer (ALT). Several handoff schemes have been proposed in the literature [33], [34], [35]. In this paper, we consider the simplest scheme, called the non-prioritized scheme. In this scheme, if no channel is available in the new cell, then the call will be dropped or forcibly terminated immediately.

The PCS example is probably more realistic to the reader if we add some geometry to these moving vehicles (portables). Unfortunately, whether a vehicle moves from one cell to another cannot be simply determined by the physical movement of the vehicle. We also need to consider the radio propagation. It is possible that the connection to a vehicle changes from one port to another even if the vehicle is stationary: the change of radio signal strength may result in re-connecting the vehicle to a different port. According to the PCS network measurement methods, we determine that the movement (in the sense of port connection) of a vehicle is best characterized by the residence time¹ distribution and the destination cell routing probability. The reader may imagine that this movement model is equivalent to a simple path approach where a vehicle moves straight with an angle. The angle determines the destination cell and the residence time is the product of a constant speed and the diameter of the cell².

To map the PCS model into PDES, the cells in the PCS network are represented by cell objects derived from the PDES objects (i.e., LPs). These LPs are then mapped to processors for execution (see Figure 2). A cell LP has the following attributes and methods (i.e., the application mechanism of a general LP): A constant attribute channelNo represents the total number of channels in a radio port. An attribute idleChannelNo represents the number of idle radio channels. A portableList collects the information of all portables that reside in the cell. There are five methods in the cell object: CallArrival(), CallCompletion(), PortableMoveIn(), PortableMoveOut(), and Handoff(). These methods will be elaborated later.

[Figure 2 about here.]

The portables in the PCS network are represented by the portable objects. A portable object consists of four attributes:
- The busy attribute indicates the status of the portable. If busy=YES then the portable is in a conversation.
- The callArrivalTime attribute represents the next call arrival time.
- The callCompletionTime attribute represents the completion time of the current phone call when busy=YES. If busy=NO, the callCompletionTime attribute is meaningless.
- The portableMoveOutTime attribute represents the time when the portable moves out of the current cell.

There are two categories of events in a PDES. An internal event is scheduled and executed at the same LP (the event represents the interaction between a cell and a portable within the cell in our PCS example), and an external event is scheduled by one LP and is executed by another LP.

¹ Residence time refers to the time that a portable resides within a cell.
² But note that our movement model is practical: it is used to approximate real radio systems, while the simple path approach cannot.
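The cell and portable objects described in this section can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the attribute names follow the text (channelNo, idleChannelNo, portableList, busy, callArrivalTime, callCompletionTime, portableMoveOutTime), but the method bodies, the `schedule` helper, and the `blocked_calls` and `dropped_calls` counters are hypothetical and are not taken from the implementation in [4]. It shows a call being blocked or connected depending on idle channels, and a non-prioritized handoff that drops the call when the new cell has no idle channel.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Portable:
    """Portable object with the four attributes named in the text."""
    name: str
    busy: bool = False                              # busy=YES: a call is in progress
    call_arrival_time: Optional[float] = None       # callArrivalTime
    call_completion_time: Optional[float] = None    # meaningful only while busy
    portable_move_out_time: Optional[float] = None  # portableMoveOutTime


@dataclass
class CellLP:
    """Cell LP sketch: channelNo, idleChannelNo, portableList and a local FEL."""
    channel_no: int
    idle_channel_no: int = field(init=False)
    portable_list: dict = field(default_factory=dict)
    fel: list = field(default_factory=list)  # heap of (timestamp, portable, event type)
    blocked_calls: int = 0  # hypothetical counter, not in the paper
    dropped_calls: int = 0  # hypothetical counter, not in the paper

    def __post_init__(self):
        self.idle_channel_no = self.channel_no

    def schedule(self, timestamp, portable, event_type):
        """Internal event: scheduled and executed at this same LP."""
        heapq.heappush(self.fel, (timestamp, portable.name, event_type))

    def call_arrival(self, p, now, holding_time):
        """Connect the call if a channel is idle; otherwise block it."""
        if self.idle_channel_no == 0:
            self.blocked_calls += 1
            return False
        self.idle_channel_no -= 1
        p.busy = True
        p.call_completion_time = now + holding_time
        self.schedule(p.call_completion_time, p, "CallCompletion")
        return True

    def call_completion(self, p):
        """Release the channel held by portable p."""
        p.busy = False
        p.call_completion_time = None
        self.idle_channel_no += 1

    def portable_move_in(self, p):
        """Non-prioritized handoff: an ongoing call needs a channel at once."""
        self.portable_list[p.name] = p
        if p.busy:
            if self.idle_channel_no > 0:
                self.idle_channel_no -= 1
            else:
                p.busy = False  # forced termination of the handoff call
                p.call_completion_time = None
                self.dropped_calls += 1


# Usage: a one-channel cell; p1 takes the channel, so p2's handoff call is dropped.
cell = CellLP(channel_no=1)
p1 = Portable("p1")
p2 = Portable("p2", busy=True, call_completion_time=30.0)
cell.portable_move_in(p1)
cell.call_arrival(p1, now=13.0, holding_time=8.0)
cell.portable_move_in(p2)
```

In this sketch the call completion is an internal event (it stays in the cell's own FEL), whereas a PortableMoveIn arriving from a neighbor cell would be an external event delivered as a message.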
4 IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS, VOL. XX, NO. Y, MONTH 1995
is scheduled and inserted in LP_A's FEL. When the LVT of LP_A advances to 13, m_2 is executed. The method LP_A.CallCompletion(p_1) is invoked and the attributes of p_1 are modified as

    busy = NO;
    callArrivalTime = 20;
    callCompletionTime = -;
    portableMoveOutTime = 16

and a new event

    m_3 = (16, p_1, PortableMoveOut)

is scheduled. At LVT 16, m_3 is executed. The method LP_A.PortableMoveOut(p_1) is invoked to determine the destination cell (which is B in Figure 3), and a message

    m_4 = (16, p_1, PortableMoveIn)

is sent from LP_A to LP_B by invoking LP_A.SendMessage(m_4, LP_B). Note that the portable p_1 migrates to LP_B when m_4 is sent. (In GIT/Bellcore's PCS implementation [4], a message is part of a portable object, and sending a message automatically migrates the corresponding portable object.) When LP_B's LVT advances to 16, it executes m_4. The next portable move time is generated (which is 24). The attributes of p_1 are modified as

    busy = NO;
    callArrivalTime = 20;
    callCompletionTime = -;
    portableMoveOutTime = 24

and a new event m_5 = (20, p_1, CallArrival) is scheduled.

A PDES is correct if the following rule is satisfied.

Local Causality Constraint: Every LP processes events in nondecreasing timestamp order.

The major problem of PDES is that the logical processes are executed at different speeds. Consider the scenario in Figure 4, in which portable p_1 moves from cell A to cell B at time 20 with an ongoing phone call (i.e., a handoff call), and portable p_2 moves from cell C to cell B at time 13 with an ongoing phone call (see Figure 4(a)).

[Figure 4 about here.]

Consider the PDES scenario in Figure 4(b). LP_A sends a PortableMoveIn event (message) m_1 (for p_1) with timestamp 20 to LP_B. Later LP_C sends m_2 (for p_2) with timestamp 13 to LP_B. If LP_B executes m_1 before m_2 arrives, then the modifications to LP_B.idleChannelNo are out of timestamp order, and the local causality rule is violated. Thus the simulation result is not correct.

To solve this problem, the executions of the logical processes must be synchronized. The remainder of this paper describes two popular asynchronous synchronization mechanisms, the conservative and the optimistic methods.

IV. Conservative Method

The conservative simulation [36] is conservative in the sense that it does not execute an event before it ensures that the local causality rule is satisfied. The conservative simulation follows two rules: the input waiting rule and the output waiting rule. It also assumes that
- the messages are received in the order they are sent (the FIFO communication property), and
- the communication channels among LPs are fixed and never change during the simulation. In Figure 4(b), LP_A (LP_C) has one output channel directed to LP_B, and LP_B has two input channels (one from LP_A and one from LP_C).

A. Basic Synchronization Mechanism

In a conservative simulation, every logical process LP repeats the following two steps.

Step 1. LP waits to select an input message m from its input channels (extra data structures are required to implement input channels in a logical process) by invoking LP.ReceiveMessage(). This method is implemented based on the input waiting rule to be described. The method inserts m into LP's FEL.

Step 2. Let ts be the timestamp of m. LP.ExecuteMessage() is invoked to process all events in the FEL with timestamps no larger than ts in non-decreasing timestamp order. The execution may invoke LP.SendMessage() to send output messages. This method is implemented based on the output waiting rule to be described. If the termination condition is satisfied (e.g., LP.LVT > 5000), then exit the loop. Otherwise go to Step 1.

The waiting rules are described as follows.

The input waiting rule. An LP does not process any input message until it has received at least one message from each of its input channels. The input message with the smallest timestamp is selected for processing. Figure 5 shows how the input message is selected for the PCS simulation.

[Figure 5 about here.]

Figure 5(a) illustrates a PCS system where 6 portables p_1, p_2, p_3, p_4, p_5, and p_6 move from cells B, C, D, E, F, and G to cell A at times 30, 10, 26, 4, 12, and 14, respectively. In the PDES model (see Figure 5(b)), the PortableMoveIn events of p_1, ..., p_6 are represented by the messages m_1, ..., m_6 sent to LP_A. By the input waiting rule, m_4 is the next message to be executed in LP_A.

Assume that all messages sent from one LP to another are in non-decreasing timestamp order (this property will be guaranteed by the output waiting rule to be described next). Then the input waiting rule ensures that the timestamp of the selected message is no larger than that of any input message to be processed in the future.

The output waiting rule. An LP does not send an output message to another LP until it ensures that no output messages with smaller timestamps will be scheduled (at LP) in the future. Assume that all input messages are
handled in non-decreasing timestamp order (the property is guaranteed by the input waiting rule). The output waiting rule is satisfied if an LP only sends output messages with timestamps no larger than its current LVT value.

Consider the following PCS example. Portables p_1, p_2 and p_3 move into cell A at times 10, 20, and 30, and move out of the cell at times 29, 24, and 36, respectively (see Figure 6(a)). This situation occurs since a portable, once inside cell A, may take a dramatically different path from other portables. Some portables may stay in the same physical location for a period while other portables continue moving towards a cell adjacent to A.

[Figure 6 about here.]

In PDES, m_1, m_2, and m_3 are input messages representing the arrivals of p_1, p_2 and p_3, respectively (see Step 1 in Figure 6(b)). When m_1 is processed, a move event m'_1 for p_1 is scheduled with timestamp 29 (see Step 2 in Figure 6(b)). In the conservative simulation, m'_1 cannot be sent to the destination LP immediately, or the output waiting rule may be violated. In our PCS PDES implementation, the portable move is simulated by two types of events: a PortableMoveOut event and a PortableMoveIn event. In Figure 6(b), m'_i and m''_i represent the PortableMoveOut event and PortableMoveIn event of portable p_i, respectively. When the event m'_i is scheduled, it is inserted in LP_A's FEL. When the LVT of LP_A advances to the portable "move time" (i.e., the timestamp of m'_i), m'_i is processed, which results in sending the PortableMoveIn event m''_i (with the timestamp of m'_i) to the destination. In Figure 6(b), m''_2 and m''_1 are sent after Step (3) and before Step (4); i.e., when LP_A is sure that the next input message to be handled has a timestamp larger than those of m'_1 and m'_2. Note that m''_2 is sent before m''_1 is.

Since the output waiting rule is guaranteed by using the two "move" event types, the conservative SendMessage() method simply sends the output message to the destination. Note that for other applications, a different conservative SendMessage() method may be required to implement the output waiting rule.

The correctness of the conservative simulation can be proved by induction on the interaction of the two waiting rules.

B. Deadlock and Deadlock Avoidance

The input waiting rule may result in deadlock (LPs are waiting for input messages from each other and cannot progress) even if the simulated system is deadlock free. Consider a three-cell PCS network (see Figure 7(a)).

[Figure 7 about here.]

There is one portable in the network, and the portable moves in the path A → B → C → A. At time 0, the portable is in cell A. The portable moves from cell A to cell B at time 8. In the conservative simulation, a PortableMoveOut event m_1 is scheduled in LP_A initially (see Figure 7(b)). By the input waiting rule, LP_A waits for an input message from LP_C before it can process m_1. Similarly, LP_C does not send out any output message before it receives an input message from LP_B, and LP_B does not send out any output message before it receives an input message from LP_A (i.e., before m_1 is processed). Thus the PDES is in a deadlock situation. Two deadlock resolutions have been proposed: deadlock avoidance [36] and deadlock detection/recovery [37], [38]. It has been shown [39] that the cost of deadlock detection/recovery is much higher than that of deadlock avoidance. This article will focus on the deadlock avoidance mechanism.

In a PCS network, a portable is expected to reside in a cell for a period of time before it moves. Assume that every portable resides in a cell for at least six time units before it moves to a new cell. The information that "a portable resides in a cell for at least 6 time units" is used in the deadlock avoidance mechanism to predict when an LP will receive an input message, and "6 time units" is referred to as the lookahead value. The lookahead information is carried by the control messages called null messages. A null message does not represent any event in the simulated system. Instead, it is used to break deadlock as well as to improve the progress of a conservative simulation.

In Figure 7(b), at the beginning of PDES, the LVTs of the three LPs are 0, and a PortableMoveOut event m_1 with timestamp 8 is in LP_A's FEL. At time 0, LP_A sends a null message with timestamp 0+6=6 (the LVT value plus the lookahead value) to LP_B (see Figure 7(c)). The null message implies that no portable will move into cell B earlier than time 6. Thus, the LVT of LP_B advances to 6 when the null message arrives (Figure 7(d)). Since no portable arrives at cell B before time 6, no portable will move out of cell B before time 12, and LP_B sends a null message with timestamp 12 to LP_C. After the sending of several null messages, LP_A will eventually receive a null message with timestamp larger than 8 (see Figure 7(e)), and by the input and output waiting rules, m_1 is sent from LP_A to LP_B and the deadlock is avoided (see Figure 7(f)).

C. Exploiting Lookahead

It is important to exploit the lookahead to improve the progress of a conservative simulation. Experimental studies have indicated that the larger the lookahead values, the better the performance of the conservative simulation [39]. Based on the techniques proposed in [40], [41], [42], we give three PCS examples for lookahead exploitation. The first two examples assume single cell entrance and exit. The single entrance/exit PCS model has been used in modeling highway cellular phone systems [43]. The results can be easily generalized for multiple entrances and exits. The techniques introduced can be combined to exploit greater lookahead.

1. Lookahead Method 1 (FIFO): In a large scale PCS network, a cell may only cover a street, and the portables leave the cell in the order they move in (the FIFO property; see Figure 8(a)).

[Figure 8 about here.]

Consider the corresponding FIFO LP for cell A in PDES. The lookahead for the LP can be derived by
a presampling technique proposed by Nicol [41]. The idea is to presample the residence times of the arriving portables.
- If the FEL is not empty, then the next departure time can be easily computed. In the PCS PDES, the move-out timestamp of a portable is computed and stored in portableMoveOutTime of the portable object at the time when the PortableMoveIn event is processed. The FIFO property guarantees that the next departure time is the minimum of the portableMoveOutTime values of portable objects in the FEL. Thus, the pre-computed next departure times can be used as the lookahead.
- If the FEL of the LP is empty at timestamp LP.LVT, then the lookahead can be generated by the same presampling technique. Since the portable will arrive at the cell later than LP.LVT, it will leave the cell later than LP.LVT + t (where t is the presampled portable residence time). The FIFO property guarantees that after time LP.LVT, no portables will depart earlier than LP.LVT + t, and the LP may send null messages with this timestamp to the downstream LPs.

2. Lookahead Method 2 (Minimum Inter-Boundary Crossing Time): Consider the example in Figure 8(b) where the FIFO portable movement property in the previous example does not hold. In practice, the inter-arrival times to the cell (for the portables from the same entrance) cannot be arbitrarily small. Instead, a minimum inter-boundary crossing time $\tau$ is assumed. Let $p_i$ ($i = 1, 2, 3, \ldots$) be the $i$th portable arrival after time LP.LVT. The portable residence time for $p_i$ is $t_i$. Then the departure time of $p_i$ is later than LP.LVT $+ (i-1)\tau + t_i$, and the next departure time at the cell after LP.LVT is later than LP.LVT $+ \delta$ where

$$\delta = \min_{1 \le i < \infty} \left[ (i-1)\tau + t_i \right] \qquad (1)$$

Since $\tau > 0$, there exists $j$ such that

$$j\tau \ge \min_{1 \le i \le j} \left[ (i-1)\tau + t_i \right] = \delta$$

In other words, to compute $\delta$ it suffices to consider the first $j$ presampled residence times in (1). Figure 9 displays a situation where we employ formula (1).

[Figure 9 about here.]

Four portables arrive at times 10, 14, 19 and 22. Let $\tau = 3$ so that we know that no two consecutive portable arrivals will be less than 3 time units apart. The residence times for the portables are placed in parentheses in Figure 9. The variable $j$ is increased by 1 until the above inequality is satisfied. Suppose that LP_A needs to send a null message to its downstream before it receives the PortableMoveIn event for p_1. The residence times of the subsequent arriving portables are presampled as $t_1 = 9$, $t_2 = 4$, $t_3 = 1$, $t_4 = 5$, ... Our algorithm proceeds as follows:
(a) For $j = 1$, $1 \times 3 \ge \min[0 + 9] = 9$? No.
(b) For $j = 2$, $2 \times 3 \ge \min[0 + 9, 3 + 4] = 7$? No.
(c) For $j = 3$, $3 \times 3 \ge \min[0 + 9, 3 + 4, 6 + 1] = 7$? Yes.
From this procedure, we derive $\delta = 7$ by using the first three presampled residence times.

3. Lookahead Method 3 (Minimum Residence Time): If the FIFO portable movement property does not hold, and $\tau$ does not exist (or is too small to be useful), then the technique proposed in the previous example may not work. In a PCS simulation, the total number $N = S \times n$ of portables is an input parameter. To compute the next lookahead value for an LP, it suffices to sample the next $N$ portable residence times, and (1) is re-written as [42]

$$\delta = \min_{1 \le i \le N} t_i$$

The last two examples may require a large number of operations to generate a lookahead value. In [40], O(1) algorithms have been proposed to generate the lookahead values.

When the ExecuteMessage() method processes a null message in an LP, it invokes a method ComputeLookahead() to compute the timestamp of the output (null) messages. The ComputeLookahead() method may implement the lookahead exploiting techniques described above. Then the new null message is sent to some or all output channels by invoking the SendMessage() method.

V. Optimistic Method

The optimistic simulation [44] is optimistic in the sense that it handles the arrival events aggressively. When a message m arrives at an LP, LP.ReceiveMessage() simply inserts m in the input queue (the optimistic simulation terminology for the FEL). The logical process assumes that the events already in its input queue are the "true" next events. The ExecuteMessage() method proceeds to execute these events in timestamp order, and SendMessage() is invoked whenever an output message is scheduled. When a message arrives at the LP, the timestamp of the message may be less than those of some of the events already executed. (This arrived message is referred to as a straggler.) In that case the optimism was unjustified, and therefore a method Rollback() is invoked by ExecuteMessage() to cancel the erroneous computation. To support rollback, data structures such as the state queue and the output queue are required (to be elaborated).

Several strategies for cancelling incorrect computation were surveyed by Fujimoto [45]. Two popular cancellation strategies called aggressive cancellation [44] and lazy cancellation [46] are described in this section.

A. Cancellation Strategies

Consider the example in Figure 10.

[Figure 10 about here.]

For simplicity, assume that cell C has one radio channel (i.e., LP_C.channelNo=1 in PDES). In this example,
portable p2 moves from cell B to cell C at time 10 (event 1), and makes a phone call at time 13. The call is completed at time 21. Portable 1 moves from cell A to cell C at time 16 (event 2), and attempts to make a phone call at time 20. Since the only radio channel is used by portable 2, the call attempt from portable 1 is blocked. Portable 1 moves from cell C to cell D at time 24. Figures 11, 12, and 13 illustrate the data structures of LPC (the logical process corresponding to cell C) assuming that message m1 (the message that represents event 2) arrives at LPC earlier than message m5 (the message that represents event 1) does. In LPC, a state queue and an output queue are maintained to support rollback. In our example, the state variable (attribute) for LPC is the number of idle channels, LPC.idleChannelNo. The state variable is checkpointed and saved in the state queue from time to time. The snapshots in the state queue are used to recover the state of LPC when rollbacks occur. The output queue records the anti-messages of the output messages that have been sent from LPC. The anti-messages are used to annihilate false messages sent in the incorrect computation.

[Figure 11 about here.]

In Figure 11(a), LPC receives m1, which is inserted in LPC's input queue. Initially, the output queue of LPC is empty, and the value of LPC.idleChannelNo at timestamp 0 is saved. After m1 is executed, the system state at timestamp 16 is checkpointed, and a call arrival event (message m2) is scheduled for LPC itself (see Figure 11(b)). Note that after its execution, m1 is kept in the input queue (this message may be re-executed if a rollback occurs). A pointer in the input queue indicates the next event to be executed. The anti-message m2⁻ of m2 is saved in LPC's output queue. The message m2⁻ is identical to m2 except that it includes a destination field (in the original optimistic or Time Warp algorithm [44], the sender and the destination are recorded in both the output message and the corresponding anti-message for flow control). To summarize, the ExecuteMessage() method for the optimistic simulation saves the system state after an event execution (note that the state may be saved after several event executions), and the executed event is not deleted from the input queue. The SendMessage() method saves the anti-messages in the output queue when it sends an output message.

After m2 is executed, the number of idle channels is decremented by 1, and

    LPC.idleChannelNo = 0

is saved in the state queue. A PortableMoveOut event m3 is scheduled at timestamp 24, and its anti-message m3⁻ is stored in the output queue (see Figure 12(a)).

[Figure 12 about here.]

When m3 is executed, a PortableMoveIn message m4 is sent to LPD (see Figure 12(b)). After m4 is sent, the straggler m5 (the event that p2 moves into LPC at timestamp 10) arrives. Since LPC.LVT = 24, the out-of-order execution is detected (see Figure 13(a)) by LPC.ReceiveMessage(), and LPC.Rollback() is invoked. Two strategies for cancelling incorrect computation are described below.

Aggressive Cancellation. When a straggler arrives, aggressive cancellation assumes that the out-of-order computation, as well as all other computations that may have been affected by this computation, are not correct. Thus, the out-of-order computation is recomputed, and LPC.Rollback() cancels the affected computations immediately by sending anti-messages. In our example, a rollback of LPC at timestamp 10 occurs. In Figure 13(b), the anti-messages m2⁻, m3⁻, and m4⁻ are deleted from the output queue, and are sent to their destinations to annihilate the false messages m2, m3, and m4, respectively. After the rollback (see Figure 13(c)), messages m2 and m3 (and m4 in LPD) are removed from the input queue. The state of LPC at timestamp 0 is restored. Then LPC.ExecuteMessage() resumes the simulation by executing m5.

[Figure 13 about here.]

Lazy Cancellation. It is possible that the erroneous computation still generated correct output messages. In that case, it is not necessary to cancel the original message that was sent. In lazy cancellation, logical processes do not immediately send the anti-messages for any rolled-back computation. Instead, they wait to see if the re-execution of the computation causes any of the same messages to be regenerated. If the same message is recreated, there is no need to cancel the original. Otherwise, an anti-message is sent. In our example, lazy cancellation applies to three situations.

1. If portable p2 arrives at cell C (LPC) at time 10 and leaves cell C at time 28 without making any phone call (see Figure 14(a)), then the arrival of m5 in Figure 13(a) will not affect the executions of m1, m2, and m3. (Note that in PDES, whether a call for p2 occurs in the interval [10, 28] can be detected in the portable object.) Thus messages m1, m2, and m3 do not need to be re-executed after m5 is executed. This is called jump forward or lazy reevaluation [1]. In this case, LPC.ReceiveMessage() simply inserts m5 in the input queue, and the pointer of the input queue points to m5. LPC.ExecuteMessage() executes m5 and the pointer jumps directly after m3 without re-executing m1, m2, and m3.

[Figure 14 about here.]

2. The call for p2 does not block the call for p1 if p2's call completes before p1's call arrives (see Figure 14(b)) or p2's call overlaps p1's call but LPC has two or more radio channels (i.e., LPC.channelNo ≥ 2; see Figure 14(c)). In these cases, the channel utilization (not shown as a state variable in our example) changes, but the subsequent messages (i.e., m2, m3, and m4) scheduled due to the execution of m1 are not affected. Thus, messages m1, m2, and m3 are re-executed to reflect the correct channel utilization. No anti-messages need to be sent (i.e., m2⁻, m3⁻, and m4⁻ are not sent out). As in the previous case, LPC.ReceiveMessage() simply inserts m5 in the input queue. After m5 has been executed, LPC.ExecuteMessage() will re-execute m1, m2, and m3 without regenerating any output messages.

If lazy cancellation succeeds most of the time, then the performance of the optimistic simulation is improved by eliminating the cost of cancelling computation that would have to be re-executed. If lazy cancellation fails, then the performance degrades, because erroneous computations are not cancelled as early as possible. In our PCS simulation, we may exploit situations in which lazy cancellation does not fail (as described above), and a logical process can be switched between aggressive cancellation and lazy cancellation to reduce the rollback cost.

B. Memory Management

To support rollback, it is necessary to save the "history" (the already executed elements in the input, the output, and the state queues) of a logical process. However, it may not be practical to save the whole history of a logical process because memory is likely to be exhausted before the simulation completes. Thus, it is important that we save only the "recent history" of logical processes to reduce the memory usage.

Memory management for the optimistic simulation is based on the concept of global virtual time (GVT). The GVT at (execution) time t is the minimum of the timestamps of the not-yet-executed messages (these messages are either in the input queue or in transit) in the optimistic simulation at time t. (Several other operational definitions of GVT are given in [47], [48].) It has been pointed out [44] that at any given time t, a logical process cannot be rolled back to a timestamp earlier than the GVT at t. Therefore the storage for all messages with timestamps smaller than the GVT value can be reclaimed for other usage. The process of reclaiming the storage for the obsolete elements is called fossil collection.

The GVT computation is not trivial in a distributed system because it may be difficult to capture the messages in transit. Several GVT algorithms have been developed for systems with the FIFO communication property [49] or without the FIFO communication property [50], [51]. In GIT/Bellcore PCS PDES (where eight workstations are connected by a local area network), all logical processes are frozen during GVT computation. By utilizing the low-level communication mechanism, all transient messages are guaranteed to arrive at their destinations before the GVT computation starts. The fossil collection procedure works as follows. A coordinator initiates the procedure by freezing the execution of every logical process. After all transient messages arrive at their destinations, every logical process reports its local minimum value (the minimum of the timestamps of all unprocessed messages in the input queue) to the coordinator. The coordinator then computes the GVT value as the minimum of the received local minimums. The GVT value is broadcast to all logical processes for fossil collection.

To illustrate the storage reclaimed in fossil collection, consider the example in Figure 15. In this example, we ignore the phone call events and assume that all PortableMoveIn/PortableMoveOut events must be executed in their timestamp order in the optimistic simulation. We further assume that the state variable of a logical process is the number of portables that have moved into the corresponding cell after time 0.

[Figure 15 about here.]

Portable 1 moves from cell C to cell A at time 4 and moves from cell A to cell B at time 60. Portable 2 moves from cell C to cell B at time 10. Portable 3 moves from cell B to cell A at time 20. Portable 4 moves from cell A to cell C at time 8. Portable 5 moves from cell A to cell C at time 7. Portable 6 moves from cell A to cell B at time 1 and moves from cell B to cell C at time 15.

Figure 16 illustrates the elements in the input/output/state queues of LPA, LPB, and LPC after all transient messages arrive at their destinations, and the GVT value (which is 8 = min(60, 20, 8)) is found.

[Figure 16 about here.]

Figure 17 illustrates the elements in the input/output/state queues of LPA, LPB, and LPC after the fossil collection procedure is completed.

[Figure 17 about here.]

All messages with timestamps smaller than 8 were fossil collected. Note that fossil collection for the state queue is not exactly the same as that for the input/output queues. In the state queue, the element with the largest timestamp smaller than the GVT value (i.e., 8) must not be removed (see Figure 17), because it is the checkpoint that would be restored by a rollback to the GVT. The other elements with timestamps smaller than 8 are removed.

C. Performance Evaluation

The performance of an optimistic PCS PDES implementation has been investigated in [4]. In this study, a version of Time Warp has been developed that executes on 8 DEC 5000 workstations connected by an Ethernet.

In the experiments, speedup was used as the output measure, where the sequential simulator used the same priority queue mechanism as that of PDES for managing the pending set of events, but did not have the state saving, rollback, and fossil collection overheads associated with the PDES implementation. 1024 cells are simulated for 2.5 × 10^5 simulated seconds. Figure 18 shows the performance of the optimistic PDES.

[Figure 18 about here.]

The figure indicates good performance of PDES for the PCS application. PDES is particularly efficient when the number of portables is large, the cell residence time is long, and the call interarrival time is short.
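The rollback of LPC with aggressive cancellation (Section V-A) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the GIT/Bellcore implementation: the class `OptimisticLP`, the `Message` record, the `network` list standing in for the transport, and the scripted `cell_c_handler` are all hypothetical. It also assumes that each anti-message is stored together with the LVT at which it was sent, and that every output sent at or after the straggler's timestamp is cancelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    timestamp: int
    name: str
    dest: str


class OptimisticLP:
    """Illustrative Time Warp logical process; all names are hypothetical."""

    def __init__(self, name, state, handler):
        self.name = name
        self.handler = handler            # (state, msg) -> (new state, outputs)
        self.state = state                # e.g. the number of idle channels
        self.lvt = 0
        self.input_queue = []             # executed and pending messages
        self.pointer = 0                  # index of the next message to execute
        self.state_queue = [(0, state)]   # checkpoints: (timestamp, state)
        self.output_queue = []            # (send time, anti-message)
        self.network = []                 # stand-in transport: (sign, message)

    def _insert(self, msg):
        self.input_queue.append(msg)
        self.input_queue.sort(key=lambda m: m.timestamp)   # stable sort

    def receive_message(self, msg):
        if msg.timestamp < self.lvt:      # straggler detected
            self.rollback(msg.timestamp)
        self._insert(msg)

    def send_message(self, msg):
        self.output_queue.append((self.lvt, msg))   # kept for a later rollback
        if msg.dest == self.name:
            self._insert(msg)
        else:
            self.network.append(("+", msg))

    def execute_message(self):
        msg = self.input_queue[self.pointer]
        self.pointer += 1                 # the executed message stays queued
        self.lvt = msg.timestamp
        self.state, outputs = self.handler(self.state, msg)
        self.state_queue.append((self.lvt, self.state))
        for out in outputs:
            self.send_message(out)

    def rollback(self, ts):
        # Restore the latest checkpoint taken before the straggler's timestamp.
        while len(self.state_queue) > 1 and self.state_queue[-1][0] >= ts:
            self.state_queue.pop()
        self.lvt, self.state = self.state_queue[-1]
        # Aggressive cancellation: cancel every output sent at or after ts.
        kept = []
        for send_time, anti in self.output_queue:
            if send_time < ts:
                kept.append((send_time, anti))
            elif anti.dest == self.name:
                self.input_queue.remove(anti)       # annihilate a local message
            else:
                self.network.append(("-", anti))    # anti-message to another LP
        self.output_queue = kept
        self.pointer = sum(m.timestamp < ts for m in self.input_queue)


def cell_c_handler(idle, msg):
    """Scripted stand-in for cell C that reproduces m2, m3, and m4."""
    if msg.name == "m1":                  # p1 moves in: schedule its call arrival
        return idle, [Message(20, "m2", "LPC")]
    if msg.name == "m2":                  # call arrival occupies the only channel
        return idle - 1, [Message(24, "m3", "LPC")]
    if msg.name == "m3":                  # p1 moves out: notify cell D
        return idle, [Message(24, "m4", "LPD")]
    return idle, []


lpc = OptimisticLP("LPC", state=1, handler=cell_c_handler)
lpc.receive_message(Message(16, "m1", "LPC"))
for _ in range(3):
    lpc.execute_message()                 # executes m1, m2, m3; LVT becomes 24
lpc.receive_message(Message(10, "m5", "LPC"))   # straggler triggers the rollback
```

After the last call, the input queue holds m5 followed by m1 with the pointer at m5, the state saved at timestamp 0 (one idle channel) is restored, and an anti-message for m4 has been handed to the transport for LPD, which is the situation described for Figure 13(c). Lazy cancellation would differ only in `rollback`: the anti-messages would be held back and compared with the outputs regenerated during re-execution.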
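The coordinator-based GVT computation and the fossil collection rule of Section V-B can likewise be sketched as follows. The function names are hypothetical, and the queue contents are the timestamps as we read them from the example of Figures 15 to 17 (local minimums 60, 20, and 8 for LPA, LPB, and LPC); treat them as illustrative values rather than as the output of the actual simulator.

```python
def compute_gvt(local_minimums):
    """Coordinator step: GVT is the minimum of the reported local minimums."""
    return min(local_minimums.values())


def fossil_collect_messages(timestamps, gvt):
    """Input/output queues: reclaim every element with a timestamp below GVT."""
    return [t for t in timestamps if t >= gvt]


def fossil_collect_states(timestamps, gvt):
    """State queue: additionally keep the latest checkpoint below GVT.

    Assumes the timestamps are sorted in ascending order.
    """
    older = [t for t in timestamps if t < gvt]
    newer = [t for t in timestamps if t >= gvt]
    return older[-1:] + newer


# Each logical process reports the smallest timestamp among its unprocessed
# input messages once all transient messages have arrived.
local_minimums = {"LPA": 60, "LPB": 20, "LPC": 8}
gvt = compute_gvt(local_minimums)         # min(60, 20, 8)

# Timestamps in the queues before fossil collection (as read from Figure 16).
input_queue_a = [1, 4, 7, 8, 60]
state_queues = {"LPA": [0, 1, 4, 7, 8], "LPB": [0, 1, 15], "LPC": [0, 4, 7]}

input_after = fossil_collect_messages(input_queue_a, gvt)
states_after = {lp: fossil_collect_states(q, gvt) for lp, q in state_queues.items()}
```

The one checkpoint kept below the GVT in each state queue is the state a process would restore if it were later rolled back to exactly the GVT, which is why the state queue cannot be trimmed as aggressively as the input and output queues.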
VI. Future Directions for PDES

This tutorial describes the asynchronous parallel discrete event simulation (PDES) mechanisms and optimization techniques through examples of personal communications services (PCS) network simulation. We described the conservative and the optimistic PDES mechanisms and several optimizations tailored for the PCS simulation. The performance of the optimistic method was briefly discussed. Since the conservative optimizations (tailored for PCS) introduced in this paper are new and were not previously reported, no performance studies have been conducted. Investigating the performance of these optimizations will be one of our future research directions.

The optimization techniques described in the paper are general and apply to other simulation applications such as battlefield simulation, VLSI simulation, queueing network simulation, and computer architecture simulation. However, these optimization techniques may need to be tailored for specific applications. Many studies have been devoted to this issue (see [1], [2], [52], [53], [54] and references therein). The PCS example can be seen as a member of a larger class of simulation models where one first discretizes the spatial domain into a grid, and then simulates entities moving from one grid cell to another. In this sense, the PCS problem is isomorphic to the problems of particle/n-body simulation.

An important research direction that has not been fully explored is the building of user-friendly PDES environments. Such an environment should provide convenient tools to develop simulation applications. Methods should also be provided to tailor general optimization techniques to fit a specific simulation application. We anticipate that these user-friendly environments can be constructed using the object-oriented models described in [6].

Acknowledgments

C. Carothers and Y.C. Wong provided useful comments to improve the quality of this paper.

References

[1] Fujimoto, R.M., "Parallel Discrete Event Simulation", Communications of the ACM, vol. 33, no. 10, pp. 31–53, October 1990.
[2] Nicol, D.M. and Fujimoto, R.M., "Parallel simulation today", Annals of Operations Research, vol. 53, pp. 249–286, December 1994.
[3] Richter, R. and Walrand, J.C., "Distributed Simulation of Discrete Event Systems", Proceedings of the IEEE, vol. 77, no. 1, pp. 99–113, January 1989.
[4] Carothers, C., Fujimoto, R.M., Lin, Y.-B. and England, P., "Distributed Simulation of PCS Networks Using Time Warp", Proc. International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 2–7, 1994.
[5] Carothers, C., Lin, Y.-B. and Fujimoto, R.M., "A Re-dial Model for Personal Communications Services Network", To appear in 45th Vehicular Technology Conference, 1995.
[6] Fishwick, P.A., Simulation Model Design and Execution: Building Digital Worlds, Prentice Hall, 1995.
[7] Peterson, J.L., Petri Net Theory and the Modeling of Systems, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1981.
[8] Law, Averill M. and Kelton, David W., Simulation Modeling & Analysis, McGraw Hill, 1991, Second edition.
[9] Toffoli, T. and Margolus, N., Cellular Automata Machines: A New Environment for Modeling, MIT Press, 2nd edition, 1987.
[10] Fishwick, P.A. and Zeigler, B.P., "A Multimodel Methodology for Qualitative Model Engineering", ACM Transactions on Modeling and Computer Simulation, vol. 2, no. 1, pp. 52–81, 1992.
[11] Fishwick, P.A., "A simulation environment for multimodeling", Discrete Event Dynamic Systems: Theory and Applications, vol. 3, pp. 151–171, 1993.
[12] Ebling, M., Di Loreto, M., Presley, M., Wieland, F. and Jefferson, D., "An Ant Foraging Model Implemented on the Time Warp Operating System", Proc. 1991 SCS Multiconference on Distributed Simulation, pp. 21–26, March 1991.
[13] Hontalas, P., Beckman, B., Diloreto, M., Blume, L., Reiher, P., Sturdevant, K., Warren, L., Wedel, J., Wieland, F. and Jefferson, D., "Performance of the Colliding Pucks Simulation on the Time Warp Operating Systems (Part 1: Asynchronous Behavior & Sectoring)", Proc. 1989 SCS Multiconference on Distributed Simulation, pp. 3–7, March 1989.
[14] Fujimoto, R.M., "Time Warp on a Shared Memory Multiprocessor", Proc. 1989 International Conference on Parallel Processing, vol. III, pp. 242–249, August 1989.
[15] Ayani, R. and Rajaei, H., "Parallel simulation of a generalized cube multistage interconnection network", Proc. 1990 SCS Multiconference on Distributed Simulation, pp. 60–63, January 1990.
[16] Thomas, G.S. and Zahorjan, J., "Parallel simulation of performance Petri Net: Extending the domain of parallel simulation", Proc. 1991 Winter Simulation Conference, pp. 564–573, 1991.
[17] Reed, D.A. and Malony, A., "Parallel Discrete Event Simulation: The Chandy-Misra Approach", Proc. 1988 SCS Multiconference on Distributed Simulation, pp. 8–13, February 1988.
[18] Wieland, F., Hawley, L., Feinberg, A., Di Loreto, M., Blume, L., Reiher, P., Beckman, B., Hontalas, P., Bellenot, S. and Jefferson, D., "Distributed Combat Simulation and Time Warp: The Model and Its Performance", Proc. 1989 SCS Multiconference on Distributed Simulation, pp. 14–20, March 1989.
[19] Soule, L. and Gupta, A., "An Evaluation of the Chandy-Misra-Bryant Algorithm for Digital Logic Simulation", ACM Transactions on Modeling and Computer Simulation, vol. 1, no. 4, pp. 308–347, 1991.
[20] Beazner, D., Lomow, G. and Unger, B., "A parallel simulation environment based on Time Warp", To appear in International Journal in Computer Simulation, 1995.
[21] Turner, S. and Xu, M., "Performance evaluation of the bounded Time Warp algorithm", The 6th Workshop on Parallel and Distributed Simulation, 1992.
[22] Lubachevsky, B., "Efficient Distributed Event-Driven Simulations of Multiple-Loop Networks", Communications of the ACM, vol. 21, no. 2, March 1989.
[23] Ghosh, K., Panesar, K., Fujimoto, R.M. and Schwan, K., "PORTS: A parallel, optimistic, real-time simulator", Proc. 8th Workshop on Parallel and Distributed Simulation, 1994.
[24] Gaujal, G., Greenberg, A.G. and Nicol, D.M., "A sweep algorithm for massively parallel simulation of circuit-switched networks", Journal of Parallel and Distributed Computing, vol. 18, no. 4, pp. 484–500, 1993.
[25] Cox, D.C., "Personal communications – A viewpoint", IEEE Commun. Mag., vol. 128, no. 11, pp. 8–20, 1990.
[26] Cox, D.C., "A radio system proposal for widespread low-power tetherless communications", IEEE Trans. Commun., vol. 39, no. 2, pp. 324–335, February 1991.
[27] Glynn, P.W. and Heidelberger, P., "Analysis of Initial Transient Deletion for Parallel Steady-State Simulation", SIAM Journal on Scientific and Statistical Computing, vol. 13, no. 4, pp. 904–922, 1992.
[28] Heidelberger, P., "Discrete Event Simulations and Parallel Processing: Statistical Properties", SIAM Journal on Scientific and Statistical Computing, vol. 9, no. 6, pp. 1114–1132, November 1988.
[29] Lin, Y.-B., "Parallel Independent Replicated Simulation on A Network of Workstations", To appear in SIMULATION, 1995.
[30] Wong, W.C., "Packet reservation multiple access in a metropolitan microcellular radio environment", IEEE J. Select. Areas Commun., vol. 11, no. 6, pp. 918–925, 1993.
[31] Wong, W.C., "Dynamic allocation of packet reservation multiple access carriers", IEEE Trans. Veh. Technol., vol. 42, no. 4, 1993.
[32] Lin, Y.-B., "Determining the user locations for personal communications networks", IEEE Trans. Veh. Technol., vol. 43, no. 3, pp. 466–473, 1994.
[33] Lin, Y.-B., Mohan, S. and Noerpel, A., "Channel Assignment Strategies for Hand-off and Initial Access for a PCS Network", IEEE Personal Communications Magazine, vol. 1, no. 3, pp. 47–56, 1994.
[34] Lin, Y.-B., Mohan, S. and Noerpel, A., "Queueing Priority Channel Assignment Strategies for Handoff and Initial Access for a PCS Network", IEEE Trans. Veh. Technol., vol. 43, no. 3, pp. 704–712, 1994.
[35] Lin, Y.-B., Noerpel, A. and Harasty, D., "Sub-rating Channel Assignment Strategy for Hand-offs", To appear in IEEE Trans. Veh. Technol., 1995.
[36] Chandy, K.M. and Misra, J., "Distributed Simulation: A Case Study in Design and Verification of Distributed Programs", IEEE Trans. on Software Engineering, vol. SE-5, no. 5, pp. 440–452, September 1979.
[37] Chandy, K.M. and Misra, J., "Asynchronous Distributed Simulation via a Sequence of Parallel Computations", Communications of the ACM, vol. 24, no. 11, pp. 198–206, April 1981.
[38] Misra, J., "Distributed Discrete-Event Simulation", Computing Surveys, vol. 18, no. 1, pp. 39–65, March 1986.
[39] Fujimoto, R.M., "Performance Measurements of Distributed Simulation Strategies", Proc. 1988 SCS Multiconference on Distributed Simulation, pp. 14–20, February 1988.
[40] Lin, Y.-B. and Lazowska, E.D., "Exploiting Lookahead in Parallel Simulation", IEEE Trans. on Parallel and Distributed Systems, vol. 1, no. 4, pp. 457–469, October 1990.
[41] Nicol, D.M., "Parallel Discrete-Event Simulation of FCFS Stochastic Queueing Networks", Proc. ACM SIGPLAN Symposium on Parallel Programming: Experience with Applications, Languages and Systems, pp. 124–137, 1988.
[42] Wagner, D.B. and Lazowska, E.D., "Parallel Simulation of Queueing Networks: Limitations and Potentials", Proc. 1989 ACM SIGMETRICS and Performance '89 Conference, pp. 146–155, 1989.
[43] Kuek, S.S. and Wong, W.C., "Ordered Dynamic Channel Assignment Scheme with Reassignment in Highway Microcells", IEEE Trans. Veh. Technol., vol. 41, no. 3, pp. 271–277, 1992.
[44] Jefferson, D., "Virtual Time", ACM Transactions on Programming Languages and Systems, vol. 7, no. 3, pp. 404–425, July 1985.
[45] Fujimoto, R.M., "Optimistic Approaches to Parallel Discrete Event Simulation", Transactions of the Society for Computer Simulation, vol. 7, no. 2, pp. 153–191, June 1990.
[46] Gafni, A., "Rollback Mechanisms for Optimistic Distributed Simulation", Proc. 1988 SCS Multiconference on Distributed Simulation, pp. 61–67, February 1988.
[47] Jefferson, D., "Virtual Time II: The Cancelback Protocol for Storage Management in Time Warp", Proc. 9th Annual ACM Symposium on Principles of Distributed Computing, pp. 75–90, August 1990.
[48] Lin, Y.-B., "Memory Management Algorithms for Parallel Simulation", Information Sciences, vol. 77, no. 1, pp. 119–140, 1994.
[49] Lin, Y.-B., "Determining the Global Progress of Parallel Simulation", Information Processing Letters, vol. 50, 1994.
[50] Samadi, B., Distributed Simulation, Algorithms and Performance Analysis, PhD thesis, Computer Science Department, University of California, Los Angeles, 1985.
[51] Mattern, F., "Efficient Distributed Snapshots and Global Virtual Time Algorithms for Non-FIFO Systems", Journal of Parallel and Distributed Computing, vol. 18, no. 4, pp. 423–434, 1993.
[52] Fujimoto, R.M., "Parallel Discrete Event Simulation: Will the Field Survive?", ORSA Journal on Computing, vol. 5, no. 3, 1993.
[53] Arvind, D., Bagrodia, R. and Lin, Y.-B., Ed., Proc. 8th Workshop on Parallel and Distributed Simulation. ACM, 1994.
[54] Bailey, M. and Lin, Y.-B., Ed., Proc. 9th Workshop on Parallel and Distributed Simulation. ACM, 1995.

Yi-Bing Lin received his BSEE degree from National Cheng Kung University in 1983, and his Ph.D. degree in Computer Science from the University of Washington in 1990. Between 1990 and 1995, he was with the Applied Research Area at Bell Communications Research (Bellcore), Morristown, NJ. In 1995, he was appointed full professor of Department and Institute of Computer Science and Information Engineering, National Chiao Tung University. His current research interests include design and analysis of personal communications services network, distributed simulation, and performance modeling. He is a subject area editor of the Journal of Parallel and Distributed Computing, an associate editor of the International Journal in Computer Simulation, an associate editor of SIMULATION, a member of the editorial board of International Journal of Communications, a member of the editorial board of Computer Simulation Modeling and Analysis, Program Co-Chair for the 8th Workshop on Distributed and Parallel Simulation, and General Chair for the 9th Workshop on Distributed and Parallel Simulation.

Paul A. Fishwick is an associate professor in the Department of Computer and Information Sciences at the University of Florida. He received the BS in Mathematics from the Pennsylvania State University, MS in Applied Science from the College of William and Mary, and PhD in Computer and Information Science from the University of Pennsylvania in 1986. He also has six years of industrial/government production and research experience working at Newport News Shipbuilding and Dry Dock Co. (doing CAD/CAM parts definition research) and at NASA Langley Research Center (studying engineering data base models for structural engineering). His research interests are in computer simulation modeling and analysis methods for complex systems. He is a senior member of the IEEE and the Society for Computer Simulation. He is also a member of the IEEE Society for Systems, Man and Cybernetics, ACM and AAAI. Dr. Fishwick founded the comp.simulation Internet news group (Simulation Digest) in 1987, which now serves over 15,000 subscribers. He was chairman of the IEEE Computer Society technical committee on simulation (TCSIM) for two years (1988-1990) and he is on the editorial boards of several journals including the ACM Transactions on Modeling and Computer Simulation, IEEE Transactions on Systems, Man and Cybernetics, The Transactions of the Society for Computer Simulation, International Journal of Computer Simulation, and the Journal of Systems Engineering.
List of Figures

1 Anatomy of a logical process (LP).
2 Cells, logical processes, and processors. A PCS cell is represented by a logical process (LP) in PDES. More than one LP may be mapped to a processor for execution.
3 A simple PCS example.
4 PDES synchronization problem.
5 The input waiting rule. In (a), the number below a car represents the time when the portable crosses the cell boundary.
6 The output waiting rule.
7 Deadlock and deadlock resolution.
8 Examples for lookahead exploiting.
9 Portables entering and leaving cell A.
10 A PCS example for optimistic PDES. Events 1 and 2 will be represented by messages m5 and m1 respectively in the optimistic PDES (see the next figures).
11 The data structures of LPC before/after rollback.
12 The data structures of LPC before/after rollback (cont.).
13 The data structures of LPC before/after rollback (cont.).
14 Situations when lazy cancellation applies (in these situations, t2 > t1).
15 A PCS example for fossil collection in optimistic PDES.
16 The optimistic PDES before fossil collection.
17 The optimistic PDES after fossil collection.
18 Speedup of the Optimistic PDES (The call holding time is exponentially distributed with mean 3 minutes. Eight processors are used in the parallel simulation.)
Fig. 1. Anatomy of a logical process (LP).

Fig. 2. Cells, logical processes, and processors. A PCS cell is represented by a logical process (LP) in PDES. More than one LP may be mapped to a processor for execution.
Fig. 3. A simple PCS example.

Fig. 4. PDES synchronization problem.

Fig. 5. The input waiting rule. In (a), the number below a car represents the time when the portable crosses the cell boundary.
Fig. 6. The output waiting rule.

Fig. 7. Deadlock and deadlock resolution. Recovered panel titles: (e) LPC sends a null message to LPA. (f) m1 is processed and sent to LPB.

Fig. 8. Examples for lookahead exploiting.

Fig. 9. Portables entering and leaving cell A.

Fig. 10. A PCS example for optimistic PDES. Events 1 and 2 will be represented by messages m5 and m1 respectively in the optimistic PDES (see the next figures).
Fig. 11. The data structures of LPC before/after rollback.

Fig. 12. The data structures of LPC before/after rollback (cont.).

Fig. 13. The data structures of LPC before/after rollback (cont.).

Fig. 14. Situations when lazy cancellation applies (in these situations, t2 > t1).
28 IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS, VOL. XX, NO. Y, MONTH 1995
B
p1
60
1
7 p3
8
20
C 4 p5 p4
10 p2
p6
15
LPA LPB
pointer
pointer
Input Queue
Input Queue
m0 m1 m2 m3 m4
m5 m6 m7
timestamp 1 4 7 8 60
timestamp 1 15 20
portable p6 p1 p5 p4 p1
portable portable portable portable portable p6 p6 p3
portable
move move move move move portable
event out in out out out portable
move
portable
move move
event in out out
State Queue
State Queue
timestamp 0 1 4 7 8 timestamp 0 1 15
move-in count 0 0 1 1 1
move-in count 0 1 1
Output Queue
Output Queue
m-5 m-9 m-10 m-4
m-6 m-11
timestamp 1 7 8 60
timestamp 15 15
portable p6 p5 p4 p1
portable portable portable portable portable p6 p6
event move
in
move move
in
move
out portable portable
in event move
out
move
in
destination LPB LPC LPC LPA
destination LPB LPC
LPC
pointer
Input Queue
timestamp 0 4 7
move-in count 0 0 1
pointer pointer
pointer
Input Queue
m6 m7 m10 m12 m11
m3 m4
timestamp 15 20 8 10 15
8 60
p6 p3 p4 p2 p6
portable p4 p1
portable portable portable portable portable
portable portable move move move move move
move move in out in
event out out out out
State Queue
1 15 7
timestamp 7 8
1 1 1
move-in count 1 1
Output Queue
m-10 m-4 m-6 m-11
null
8 60 15 15
timestamp
p4 p1 p6 p6
portable
portable portable portable portable
move move move move
event in out out in
[Two plot panels, (a) and (b), each showing speedup (vertical axis) versus call interarrival time in minutes (horizontal axis, 5 to 30), with one curve each for mean cell residence times of 15, 45, and 75 minutes.]
(b) The expected number of portables per cell is 75.
Fig. 18. Speedup of the Optimistic PDES (The call holding time is exponentially distributed with mean 3 minutes. Eight processors are
used in the parallel simulation.)
List of Tables
I Attributes and methods of an LP
TABLE I
Attributes and methods of an LP.