Article · June 2009 · DOI: 10.7148/2009-0630-0635


SEQUENTIAL ESTIMATION OF THE STEADY-STATE
VARIANCE IN DISCRETE EVENT SIMULATION

Adriaan Schmidt
Institute for Theoretical Information Technology
RWTH Aachen University
D-52056 Aachen, Germany
Email: Adriaan.Schmidt@rwth-aachen.de

Krzysztof Pawlikowski
Computer Science and Software Engineering
University of Canterbury
Christchurch, New Zealand
Email: Krys.Pawlikowski@canterbury.ac.nz
Don McNickle
Department of Management
University of Canterbury
Christchurch, New Zealand
Email: Don.McNickle@canterbury.ac.nz

KEYWORDS

Steady-State Simulation, Variance Estimation, Sequential Output Analysis, Batch Means

ABSTRACT

For sequential output data analysis in non-terminating discrete-event simulation, we consider three methods of point and interval estimation of the steady-state variance. We assess their performance in the analysis of the output of queueing simulations by means of experimental coverage analysis. Over a range of models, estimating variances turns out to involve considerably more observations than estimating means. Thus, selecting estimators with good performance characteristics is even more important.

INTRODUCTION

The output sequence {x} = x₁, x₂, … of a simulation program is usually regarded as the realisation of a stochastic process {X} = X₁, X₂, …. In the case of steady-state simulation, we assume this process to be stationary and ergodic.

Current analysis of output data from discrete event simulation focuses almost exclusively on the estimation of mean values. Thus, the literature on "variance estimation" mostly deals with the estimation of the variance of the mean, which is needed to construct a confidence interval of the estimated mean values.

In this paper, we are interested in finding point and interval estimates of the steady-state variance σ² = Var[Xᵢ] and of the variance of the variance, from which we can construct confidence intervals for the variance estimates. As in the estimation of mean values, one problem in variance estimation is caused by the fact that output data from steady-state simulation are usually correlated.

The variance we estimate is not to be confused with the quantity σ₀² = lim_{n→∞} n·Var[x̄(n)], sometimes referred to as the variance parameter (Chen and Sargent, 1990) or the steady-state variance constant (Steiger and Wilson, 2001), which is important in the method of standardized time series (Schruben, 1983) and the various methods using this concept.

Applications for the estimators we propose can be found in the performance analysis of communication networks. In audio or video streaming applications, for example, the actual packet delay is less important than the packet delay variation, or jitter (see e.g. Tanenbaum, 2003). Other applications include the estimation of safety stock or buffer sizes, and statistical process control.

Our estimation procedures are designed for sequential estimation: as more observations are generated, the estimates are continually updated, and the simulation is stopped upon reaching the required precision.

In simulation practice, one observes an initial transient phase of the simulation output due to the initial conditions of the simulation program, which are usually not representative of its long-run behaviour. It is common practice to let the simulation "warm up" before collecting observations for analysis. For many processes, σ² converges to its steady-state value more slowly than the process mean; therefore, existing methods of detection of the initial transient period with regard to the mean value may sometimes not be applicable to variance estimation. A method based on distributions, which includes the variance, is described in (Eickhoff et al., 2007). This is, however, not the focus of this paper, so we use a method described in (Pawlikowski, 1990).

In the next section we present three different methods of estimating the steady-state variance. We assessed these estimators experimentally in terms of the coverage of confidence intervals. The results of the experiments are presented in Section 3. The final section of the paper summarises our findings and gives an outlook on future research.

ESTIMATING THE STEADY-STATE VARIANCE

In the case of independent and identically distributed random variables, the well-known consistent estimate of the

Proceedings 23rd European Conference on Modelling and Simulation ©ECMS
Javier Otamendi, Andrzej Bargiela, José Luis Montes, Luis Miguel Doncel Pedrera (Editors)
ISBN: 978-0-9553018-8-9 / ISBN: 978-0-9553018-9-6 (CD)
variance is

  σ̂²(n) = s²(n) = (1/(n−1)) Σ_{i=1}^{n} (xᵢ − x̄(n))²,   (1)

and its variance is known to be (see e.g. Wilks, 1962, p. 200)

  Var[s²(n)] = (1/n) (μ₄ − ((n−3)/(n−1)) σ⁴),   (2)

where x̄(n) = (1/n) Σ_{i=1}^{n} xᵢ is the sample mean of the first n observations, and μ₄ is the fourth central moment of the steady-state distribution.

In the case of correlated observations usually encountered in simulation output data, s² is no longer unbiased, but in fact (see Anderson, 1971, p. 448)

  E[s²(n)] = σ² (1 − (2/(n−1)) Σ_{j=1}^{n−1} (1 − j/n) ρⱼ),   (3)

where ρⱼ is the lag-j autocorrelation coefficient of the sequence {x}.

We propose three estimators of σ²: Method 1 treats it as a mean value and thus avoids dealing with the statistical problems associated with the estimate s², Method 2 uses uncorrelated observations to overcome the bias problem of s², and Method 3 compensates for this bias.

Method 1: Variance as a Mean Value

Variance is defined as the mean squared deviation of a random variable from its mean value: Var[X] = E[(X − E[X])²]. This suggests that it should be possible to estimate the variance as a mean value. We estimate the variance of a sequence {x} as the mean of the new sequence {y}, defined by yᵢ = (xᵢ − x̄(i))². This makes the point estimate

  σ̂₁²(n) = (1/n) Σ_{i=1}^{n} yᵢ.

To obtain a confidence interval, any existing procedure of mean value estimation can be used. We use the method of spectral analysis, as proposed by (Heidelberger and Welch, 1981).

Method 2: Uncorrelated Observations

Processes typically encountered as simulation output data have a monotonically decreasing autocorrelation function; so, observations that are spaced "far apart" are less correlated, and if we choose a sufficiently large spacing, we can assume that the observations are (almost) uncorrelated. This enables us to use (1) as a point estimate of the variance.

We define a secondary sequence {y} as yᵢ = x_{k₀·i}, where k₀ is the spacing distance needed to make the observations approximately uncorrelated. The ergodic property of the process {X} ensures that Var[Xᵢ] = Var[Yᵢ] = σ².

As point estimate we apply (1) to the sequence {y}:

  σ̂₂²(n) = (1/(m−1)) Σ_{i=1}^{m} (yᵢ − ȳ(m))²,

where m = ⌊n/k₀⌋ is the length of the sequence {y}. To calculate the variance of the estimate, we use (2), replacing the actual values of σ⁴ and μ₄ with their estimates:

  Var[σ̂₂²(n)] = (1/m) ((1/m) Σ_{i=1}^{m} (yᵢ − ȳ(m))⁴ − ((m−3)/(m−1)) σ̂₂⁴(n)).

The half-width of the confidence interval is then

  Δ₂ = z_{1−α/2} √Var[σ̂₂²].

We do not know of any general results on the distribution of s²(n); however, we can use a normal distribution because we assume the yᵢ to be uncorrelated, and their sample size to be large.

To find an appropriate value for the spacing k₀, we successively test values by extracting the respective subsequences, and analysing their autocorrelation.

Method 3: Batch Means

We have seen in (3) that the sample variance of a finite, correlated sample is a biased estimate of σ². When considering batches of observations of size m, we know that their means have a variance of (see Law and Kelton, 1991, p. 285)

  Var[x̄(m)] = (σ²/m) (1 + 2 Σ_{j=1}^{m−1} (1 − j/m) ρⱼ).

Based on an unpublished paper by (Feldman et al., 1996), we show that adding s²(m) and Var[x̄(m)] yields a consistent estimate of σ². One can think of this as splitting the variance into two components: a local variance describing the short-term variations of the process, and a global variance representing the long-term variations. The local variance is calculated as the mean variance inside equal-sized batches, and the global variance is the variance of the means of the same batches.

To this end, we define a number of statistics: Given b equal-sized batches of size m, we calculate the sample mean and sample variance of each batch j, containing the observations x_{j,1}, x_{j,2}, …, x_{j,m}, as

  x̄ⱼ = (1/m) Σ_{k=1}^{m} x_{j,k}

  s²ⱼ = (1/m) Σ_{k=1}^{m} (x_{j,k} − x̄ⱼ)².
Furthermore, we calculate

  x̄̄ = (1/b) Σ_{j=1}^{b} x̄ⱼ

  s²_X̄ = (1/(b−1)) Σ_{j=1}^{b} (x̄ⱼ − x̄̄)²

  v̄ = (1/b) Σ_{j=1}^{b} s²ⱼ.

x̄̄ and s²_X̄ are the sample mean and sample variance of the batch means, and v̄ is the sample mean of the batch variances.

We now define the point estimator as

  σ̂₃² = v̄ + s²_X̄.

We know that v̄ is an unbiased estimator of E[S²ⱼ] = E[((m−1)/m) S²(m)], and that s²_X̄ is an unbiased estimator of Var[x̄(m)]. So we can see that

  E[σ̂₃²] = E[v̄] + E[s²_X̄]
          = E[((m−1)/m) S²(m)] + Var[x̄(m)]
          = σ².

To obtain a confidence interval, we consider the statistic

  yⱼ = s²ⱼ + (b/(b−1)) (x̄ⱼ − x̄̄)²,

whose sample mean is equal to the point estimate σ̂₃². Because we can express the variance estimate as the sample mean of a random sample, we can justify calculating the confidence interval using the sample variance of this sample:

  s²_Y = (1/(b−1)) Σ_{j=1}^{b} (yⱼ − σ̂₃²)².

The half-width of the confidence interval is then

  Δ₃ = t_{b−1,1−α/2} √(s²_Y / b).

Finding an appropriate batch size m is crucial in this estimator. If the batches are too small, the yⱼ will be correlated, and their sample variance s²_Y does not accurately describe the variance of the estimate σ̂₃². To determine the batch size we use a simple procedure which tests the first autocorrelation coefficients of the sequence of batch means, continually increasing the batch size until the correlation becomes negligible.

EXPERIMENTAL COVERAGE ANALYSIS

To assess the performance of the proposed estimators, we use the method of sequential coverage analysis, as described in (Pawlikowski et al., 1998). The estimators were implemented using the Akaroa2 framework (Ewing et al., 1999).

We analyse the steady-state variance of the waiting times of customers in the single-server queueing models M/M/1, M/E₂/1, and M/H₂/1. The model parameters are selected such that the coefficient of variation of the service time is 1 for the M/M/1 model, √0.5 for M/E₂/1, and √5 for M/H₂/1.

Simulations are run sequentially and stopped upon reaching a relative precision of 0.05 at the 0.95 confidence level. The coverage of confidence intervals is then calculated from the frequency with which the generated confidence interval covers the actual (theoretical) variance, and a confidence interval for the coverage is calculated at the 0.95 confidence level. Our procedure to analyse coverage follows the rules proposed in (Pawlikowski et al., 1998):

1. Coverage analysis is done sequentially.

2. Our results include a minimum number of 200 "bad" confidence intervals.

3. Simulation runs shorter than the mean run length minus one standard deviation are discarded.

Each of the estimators is used on the three models at different system loads, ranging from 0.1 to 0.9. To deal with bias due to the initial transient period of the simulation we use the method described in (Pawlikowski, 1990), which is a combination of a heuristic and a statistical test for stationarity of the output sequence.

The estimator σ̂₁² (Figure 1) shows a coverage of around 0.93, which is slightly lower than the required 0.95, but still acceptable for most practical purposes. Because of its easy implementation (using an existing method of mean value estimation), this estimator can be interesting for practical use.

The estimator σ̂₂² has generally good coverage (Figure 2), which shows that the method of "far apart" uncorrelated observations works.

Of the three estimators proposed, σ̂₃² has the best coverage of confidence intervals (Figure 3).

In addition to the coverage of confidence intervals, we examine the convergence of the estimators on the basis of the number of observations needed to reach the required precision. Table 1 shows the mean number of observations collected before reaching that precision. Overall, Method 3 needs the fewest observations before the stopping criterion is satisfied, and Method 2 needs the most. With increasing system load the simulation run lengths of all three estimators show a similar behaviour. It is worth noting that the run lengths in Table 1 are roughly an order of magnitude greater than those that would be required for estimating the mean waiting times to the same precision.

For the estimator σ̂₂², we analyse the value of the parameter k₀, which is automatically determined during the estimation process. Table 2 shows the average spacing used to produce almost independent observations. We see that the models with a higher coefficient of variation of the service

[Figure 1: Coverage of Estimator σ̂₁² — panels (a) M/M/1 Queue, (b) M/E₂/1 Queue, (c) M/H₂/1 Queue; actual vs. required coverage plotted against system load ρ]

[Figure 2: Coverage of Estimator σ̂₂² — panels (a) M/M/1 Queue, (b) M/E₂/1 Queue, (c) M/H₂/1 Queue; actual vs. required coverage plotted against system load ρ]

[Figure 3: Coverage of Estimator σ̂₃² — panels (a) M/M/1 Queue, (b) M/E₂/1 Queue, (c) M/H₂/1 Queue; actual vs. required coverage plotted against system load ρ]
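The three point estimators compared above can be sketched in code. This is a minimal sketch using NumPy, assuming a stationary input series; the function names are ours, and the fixed spacing k₀ and batch size m stand in for the adaptive selection procedures the paper describes:

```python
import numpy as np

def sigma2_method1(x):
    # Method 1: mean of y_i = (x_i - xbar(i))^2, where xbar(i) is the
    # running sample mean of the first i observations.
    x = np.asarray(x, dtype=float)
    running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)
    return np.mean((x - running_mean) ** 2)

def sigma2_method2(x, k0):
    # Method 2: thin the series with spacing k0 so the retained points
    # are approximately uncorrelated, then apply estimator (1).
    y = np.asarray(x, dtype=float)[::k0]
    return y.var(ddof=1)

def sigma2_method3(x, m):
    # Method 3: local variance (mean within-batch variance, divisor m,
    # as in the paper) plus global variance (sample variance of the
    # batch means); the bias terms cancel, giving E[estimate] = sigma^2.
    x = np.asarray(x, dtype=float)
    b = len(x) // m
    batches = x[:b * m].reshape(b, m)
    v_bar = batches.var(axis=1, ddof=0).mean()      # mean of s_j^2
    s2_means = batches.mean(axis=1).var(ddof=1)     # s^2 of batch means
    return v_bar + s2_means
```

On a correlated test series such as an AR(1) process, all three estimates approach the true steady-state variance, while a naive sample variance over short runs would be biased per equation (3).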


         M/M/1 Queue         M/E₂/1 Queue        M/H₂/1 Queue
ρ      σ̂₁²   σ̂₂²   σ̂₃²     σ̂₁²   σ̂₂²   σ̂₃²     σ̂₁²    σ̂₂²    σ̂₃²
0.1    160   570   155     117   407   112     577    1736   552
0.2    131   365   123      94   265    89     556    1302   517
0.3    140   339   131     101   248    95     624    1291   567
0.4    172   372   158     125   275   118     752    1437   672
0.5    234   461   214     172   346   160     973    1744   862
0.6    359   652   325     266   494   245    1383    2342  1222
0.7    636  1078   571     481   826   435    2317    3646  2010
0.8   1491  2331  1322    1134  1776   996    5019    7282  4317
0.9   6237  9096  5509    5079  6942  4146   22044   26930 16750

Table 1: Mean Number of Observations Needed to Reach Required Precision (in 1000 Observations)
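The stopping rule behind the run lengths in Table 1 (collect observations until the confidence-interval half-width falls within 5% of the point estimate) can be sketched as a generic loop. This is a sketch under our own assumptions, not the Akaroa2 implementation; `make_obs` and `estimate` are placeholder callables, and the checkpoint spacing is arbitrary:

```python
import numpy as np

def run_sequential(make_obs, estimate, rel_precision=0.05,
                   checkpoint=1000, max_obs=1_000_000):
    """Collect observations until estimate() reports a confidence
    interval whose half-width is within rel_precision of the point
    estimate. make_obs(k) returns k new observations; estimate(x)
    returns a (point_estimate, half_width) pair."""
    x = np.empty(0)
    point = half = float("nan")
    while len(x) < max_obs:
        x = np.concatenate([x, make_obs(checkpoint)])
        point, half = estimate(x)
        if point > 0 and half <= rel_precision * point:
            break
    return point, half, len(x)
```

With independent observations and the classical estimators (1) and (2) plugged in as `estimate`, the loop stops once the normal-theory interval for the variance is tight enough.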

time need larger values of k₀ due to the higher correlation of the output sequence.

ρ      M/M/1   M/E₂/1   M/H₂/1
0.1      6.6      6.2     10.7
0.2      8.6      7.8     17.4
0.3     11.9     10.4     27.4
0.4     17.4     14.6     43.3
0.5     26.4     21.7     70.2
0.6     43.5     35.1    119.8
0.7     80.5     64.0    228.3
0.8    188.2    146.5    535.8
0.9    769.4    589.5   2213.5

Table 2: Mean Value of k₀ in Estimator σ̂₂²

CONCLUSIONS AND FUTURE WORK

In this paper, we introduced three different approaches for estimating the steady-state variance in discrete event simulation. We applied the estimators to single-server queueing models and compared them in terms of coverage and the sample size needed to obtain confidence intervals of a certain precision.

So far, the proposed estimators have only been tested on single-server queues. Although we expect them to be applicable to a much broader range of models, further study is required to confirm this.

Methods 1 and 3 estimate the variance as the mean of a secondary sequence of observations. This suggests that the method of Multiple Replications in Parallel, as proposed in (Pawlikowski et al., 1994), can be used to speed up variance estimation by utilising multiple computers in parallel. While this is indeed confirmed by preliminary experiments reported in (Schmidt, 2008), further study is needed.

Method 2 has the obvious weakness of only using a fraction of the generated observations. Investigation is needed to determine whether the method can be improved so that it does not discard as many observations.

Method 3 required the smallest number of observations and produced estimates with good coverage properties.

The sample sizes required for a certain precision turned out to be substantially larger than those needed to estimate comparable means, so selecting the best estimator is an important problem.

The problem of the initial transient period has not been addressed with regard to the steady-state variance. The method used in this paper (Pawlikowski, 1990) was designed for the estimation of mean values. Analysis of the warm-up period in estimation of steady-state variance is a subject of our future studies.

ACKNOWLEDGEMENTS

The research leading to this paper was in part supported by the Summer Scholarship Programme of the University of Canterbury in Christchurch, New Zealand. In addition, the first author received financial support from the German Academic Exchange Service (DAAD).

References

Anderson, T. (1971). The Statistical Analysis of Time Series. John Wiley & Sons.

Chen, B. and Sargent, R. (1990). Confidence interval estimation for the variance parameter of stationary processes. Management Science, 36(2):200–211.

Eickhoff, M., Pawlikowski, K., and McNickle, D. (2007). Detecting the duration of initial transient in steady state simulation of arbitrary performance measures. In Proceedings of ACM ValueTools'07 (Nantes).

Ewing, G., Pawlikowski, K., and McNickle, D. (1999). Akaroa2: Exploiting network computing by distributing stochastic simulation. In Proceedings of the European Simulation Multiconference (ESM'99, Warsaw), pages 175–181. International Society for Computer Simulation.

Feldman, R., Deuermeyer, B., and Yang, Y. (1996). Estimation of process variance by simulation. Unpublished paper.

Heidelberger, P. and Welch, P. (1981). A spectral method for confidence interval generation and run length control in simulations. Communications of the ACM, 25:233–245.

Law, A. and Kelton, W. (1991). Simulation Modeling and Analysis. McGraw-Hill, 2nd edition.

Pawlikowski, K. (1990). Steady-state simulation of queueing processes: A survey of problems and solutions. ACM Computing Surveys, 22(2):123–170.

Pawlikowski, K., Ewing, G., and McNickle, D. (1998). Coverage of confidence intervals in sequential steady-state simulation. Journal of Simulation Practice and Theory, 6(3):255–267.

Pawlikowski, K., Yau, V., and McNickle, D. (1994). Distributed stochastic discrete-event simulation in parallel time streams. In Proceedings of the 1994 Winter Simulation Conference, pages 723–730.

Schmidt, A. (2008). Automated analysis of variance in quantitative simulation by means of multiple replications in parallel. Master's thesis, RWTH Aachen University.

Schruben, L. (1983). Confidence interval estimation using standardized time series. Operations Research, 31(6):1090–1108.

Steiger, N. and Wilson, J. (2001). Convergence properties of the batch means method for simulation output analysis. INFORMS Journal on Computing, 13(4):277–293.

Tanenbaum, A. (2003). Computer Networks. Prentice Hall, 4th edition.

Wilks, S. (1962). Mathematical Statistics. John Wiley & Sons, 2nd edition.

AUTHOR BIOGRAPHIES

ADRIAAN SCHMIDT is a 2008 graduate of Electrical Engineering, with a specialisation in Technical Computer Science, from RWTH Aachen University. His research interests include software for embedded and real-time systems, stochastic simulation, and sub-symbolic AI. His email address is <adriaan.schmidt@rwth-aachen.de>.

KRZYSZTOF PAWLIKOWSKI is a Professor of Computer Science and Software Engineering at the University of Canterbury. His research interests include quantitative stochastic simulation and performance modelling of telecommunication networks. He received a PhD in Computer Engineering from the Technical University of Gdansk, Poland. He is a Senior Member of IEEE and a member of SCS and ACM. His email address is <krys.pawlikowski@canterbury.ac.nz> and his webpage is <http://www.cosc.canterbury.ac.nz/krys/>.

DON MCNICKLE is an Associate Professor of Management Science in the Department of Management at the University of Canterbury. His research interests include queueing theory, networks of queues, and statistical aspects of stochastic simulation. He is a member of INFORMS. His email address is <don.mcnickle@canterbury.ac.nz>.
