Combination of Two Simple Moving Averages

Journal of the Operational Research Society (1999) 50, 1199–1204 ©1999 Operational Research Society Ltd. All rights reserved. 0160-5682/99 $15.00

http://www.stockton-press.co.uk/jors
A robust forecasting system, based on the
combination of two simple moving averages
FR Johnston¹*, JE Boylan², E Shale¹ and M Meadows¹
¹University of Warwick and ²Buckinghamshire University College, UK
For series with negligible growth and seasonality, simple moving averages are frequently used to estimate the current
level of a process, and the resultant value projected as a forecast for future observations. This paper shows that a linear
combination of two simple moving averages (SMA) can provide an improved estimate of the underlying level of the
process. The proposition is demonstrated by simulation, and good combinations are listed. The theory underlying the
improvement is developed. The general rules are then illustrated through an application in an inventory situation.
Keywords: forecasting; time series; moving averages; combinations of forecasts
Introduction
Many inventory systems, and especially those where issues
are stored by month, frequently use simple (equally
weighted) moving averages as a basis for the forecast of
demand. Such systems are often handling many thousands
of stock keeping units (SKUs) and a robust method of
forecasting is required. Employing simple moving averages
(SMA) requires only one control value to be selected,
namely the number of data points to include in the average.
In principle, the optimal number of points in the average
would vary from one item line to another, but in most
practical systems a single common value is selected for all
SKUs. A suitable length of the average, for typical data,
employing monthly revision, might be somewhere between
3 and 12 points, and if using weekly revision, between 8
and 24 points. The selected single value is a compromise,
and this paper demonstrates that combinations, in roughly
equal proportions, of the averages at the extremes of the
reasonable range give significantly better performance than
any single length average. For example, for monthly revision,
an equal combination of a 3 period and a 12 period
average outperforms all single length moving averages.
This paper first looks at the properties of the single
length SMA. Simulation is used to illustrate the sampling
errors on artificial data. The simulation is then extended to
combinations of averages and the improvement is
illustrated. A formal definition follows and the sampling
properties are then derived. The practical implications of
the proposition are shown, followed by an application in
practice.
Simple moving averages
Simple moving averages (SMA) are frequently used in
forecasting systems when the data exhibit negligible
growth and seasonality, but when the underlying level
changes through time. The SMA is employed as an estimate
of the current level of the process, which is used as the
expected value of future observations. A key measure of
accuracy is the variance of the sampling error of the
mean, which will be abbreviated as SEV.
The steady state model¹ (SSM) provides a realistic
characterisation of data generation. In this formulation,
observations are treated as stochastic disturbances about
an unknown mean which itself undergoes a random walk. It
can be represented by the following two equations

Observation equation
$$y_t = \theta_t + v_t, \qquad v_t \sim [0, V]$$
System equation
$$\theta_t = \theta_{t-1} + w_t, \qquad w_t \sim [0, W]$$

where $y_t$ is the observation at time $t$; $\theta_t$ is the underlying
level at time $t$; $v_t$ and $w_t$ are stochastic terms from distributions
with zero means and fixed variances $V$ and $W$ respectively.
An important determinant of the model is the ratio of the
two variances $V/W$, often denoted by $r$, the noise to signal
ratio.
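As an illustration, the two equations can be simulated directly. The following is a minimal Python sketch rather than the simulation code used for this paper; the function name `simulate_ssm`, the starting level and the use of normally distributed disturbances are assumptions, since the model itself only fixes the means and variances of the stochastic terms.

```python
import random


def simulate_ssm(n_periods, V, W, level0=100.0, seed=None):
    """Generate (levels, observations) from the steady state model.

    level_t = level_{t-1} + w_t,  w_t with mean 0 and variance W
    y_t     = level_t + v_t,      v_t with mean 0 and variance V
    Normal disturbances are an assumption made for this sketch.
    """
    rng = random.Random(seed)
    sd_v, sd_w = V ** 0.5, W ** 0.5
    levels, observations = [], []
    level = level0
    for _ in range(n_periods):
        level += rng.gauss(0.0, sd_w)                       # system equation
        levels.append(level)
        observations.append(level + rng.gauss(0.0, sd_v))   # observation equation
    return levels, observations


# Noise to signal ratio r = V / W of about 85
levels, observations = simulate_ssm(10, V=88.4, W=1.04, seed=1)
print(len(levels), len(observations))
```

A large ratio V/W produces a slowly wandering level hidden under heavy noise, which favours long averages; a small ratio favours short ones.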
An accompanying paper² has examined the properties of
a SMA for this model. For a known SSM ($V$ and $W$ known),
the variance of the estimate around the true mean, the SEV,
which is denoted here as $C_n$, depends on the number of
terms in the average ($n$), the relationship being
$$C_n = \frac{V}{n} + \frac{W(2n^2 - 3n + 1)}{6n} \qquad (1)$$
*Correspondence: Dr FR Johnston, The University of Warwick, Coventry
CV4 7AL, UK.
E-mail: ORSRJ@wbs.warwick.ac.uk
The optimal length of the average ($N$) is related to $r$ by
$$N^2 = 3r + \tfrac{1}{2}$$
and using this optimal length the resulting SEV, denoted $C_N$, is
$$C_N = \frac{V(2N - 3/2)}{N^2 - 1/2} \qquad (2)$$
or, expressed in terms of the more measurable one-step
ahead forecast error ($Q_N$),
$$C_N = \frac{Q_N (2N - 3/2)}{(N + 1)^2}$$
Figure 1 illustrates the SEV observed after simulation
employing SMA of different lengths on data from a SSM
with a one-step ahead error variance of 100 and designed to
have an optimal length average of 16 points. The necessary
values of V and W come from the equations above and are
approximately 88.4 and 1.04 respectively.
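The quoted values of V and W can be reproduced from the equations above. The sketch below assumes that the one-step ahead error variance decomposes as Q_N = C_N + V + W (the estimation error plus one further system disturbance and one observation disturbance), which is consistent with the expression for C_N in terms of Q_N; the function names are illustrative only.

```python
def sev_single(n, V, W):
    """Equation (1): sampling error variance of an SMA of length n."""
    return V / n + W * (2 * n * n - 3 * n + 1) / (6 * n)


def ssm_variances(N, Q=100.0):
    """V and W for which the optimal SMA length is N and the one-step
    ahead error variance is Q.

    Uses N^2 = 3r + 1/2 with r = V/W, together with the assumed
    decomposition Q = C_N + V + W.
    """
    V = Q * (N * N - 0.5) / (N + 1) ** 2
    W = 3.0 * V / (N * N - 0.5)
    return V, W


V, W = ssm_variances(16)
print(round(V, 1), round(W, 2))   # 88.4 1.04

# Equation (1) is minimised at the designed length
best_n = min(range(1, 60), key=lambda n: sev_single(n, V, W))
print(best_n)                     # 16
```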
With a realistic length of real or artificial data, it is
frequently difficult to estimate the optimal length of a SMA
as the relationship between the SEV and the number of
terms is not smooth (unlike the relationship observed with
different smoothing constants and EWMA). However,
using simulation, it is possible to generate very long data
series from the SSM.
The difference between the SMA and the (known) true
level of the mean can be measured. In order to eliminate
any autocorrelation effects these errors were measured only
every 25 periods (or longer if longer averages were being
considered). These deviations directly lead to the SEV
illustrated in Figure 1.
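The measurement procedure described above might be sketched as follows. This is an assumed reconstruction, not the original simulation program: disturbances are taken to be normal, the series length and seed are arbitrary, and the function name `simulated_sev` is hypothetical. Errors are sampled every 25 periods, or every n periods when the average is longer than that, so that successive windows do not overlap.

```python
import random
from collections import deque


def simulated_sev(n, V, W, n_periods=200_000, spacing=25, seed=0):
    """Estimate the SEV of an SMA of length n on simulated SSM data."""
    rng = random.Random(seed)
    sd_v, sd_w = V ** 0.5, W ** 0.5
    step = max(spacing, n)           # longer gaps for longer averages
    level = 0.0
    window = deque(maxlen=n)         # holds the latest n observations
    squared_errors = []
    for t in range(n_periods):
        level += rng.gauss(0.0, sd_w)
        window.append(level + rng.gauss(0.0, sd_v))
        if len(window) == n and t % step == 0:
            sma = sum(window) / n
            squared_errors.append((sma - level) ** 2)
    # The error has zero mean, so the mean square estimates the variance
    return sum(squared_errors) / len(squared_errors)


estimate = simulated_sev(16, V=88.4, W=1.04)
theory = 88.4 / 16 + 1.04 * (2 * 16 ** 2 - 3 * 16 + 1) / (6 * 16)
print(round(estimate, 2), round(theory, 2))
```

With about 8000 sampled errors the estimate should sit within a few per cent of the value given by equation (1), though the exact figure depends on the seed.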
Combinations of averages
In order to see if a smooth curve might join the points
displayed in Figure 1, the simulation was re-run employing
combinations of two adjacent averages. The combination of
two averages of length $n$ and $n + 1$, in proportions $1 - f$ and $f$
respectively, was treated as having a length $n + f$. The
resultant SEV values are on the continuous line in Figure
2. Surprisingly, the SEV is less than the weighted average of
the integer values.
Increasing the separation of the averages used in combination
generated the SEV illustrated in Figure 2. The
hatched line illustrates the effect of using combinations, in
differing proportions, of averages two apart ($n$ and $n + 2$)
and the dotted line the combinations four apart ($n$ and
$n + 4$). Again, improved performance over simple averages
is noticed.
Definitions and derivations
Consider two SMA of length $s$ and $g$, where $s$ is shorter than
$g$. If they were used separately to estimate the current mean
for a known SSM, the resultant variances of the estimate
(the SEV values), using the notation of the previous section,
will be $C_s$ and $C_g$ respectively. Let the two averages be
combined in proportions $(1 - f)$ and $f$. When these combinations
are used to estimate the level of the process the
resultant SEV will be denoted as $C_{s,g,f}$.
Basic probability theory requires
$$C_{s,g,f} = (1 - f)^2 C_s + f^2 C_g + 2(1 - f) f B$$
where $B$ is the covariance between the two estimates. If the
two estimates were perfectly correlated the combined
variance would always be greater than the lower single
value. However, although $s$ of the terms will be common
to both estimates, $g - s$ will not. As a result, the combined
variance can be less than the linear combination, leading to
the results described in the previous section. Of course,
combining two EWMA having different smoothing
constants (both less than unity) does not generate a similar
improvement, for the averages are perfectly correlated as all
previous observations are present in both values.
Figure 1 SEV for SMA of different integer lengths.
Figure 2 SEV for combinations of SMA.
Journal of the Operational Research Society Vol. 50, No. 12

The accompanying paper² specified some properties of
the moving average and drew attention to the correlation
which exists between sums of previous observations. Using
these results and simply counting the terms involved (see
Appendix) gives
$$B = \frac{1}{sg}\left[ sV + W \sum_{i=1}^{s-1} (s - i)(g - i) \right] = \frac{V}{g} + \frac{W\left(s(3g - s) - 3g + 1\right)}{6g}$$
Employing the relationships from equation (1) results in
$$C_{s,g,f} = (1 - f)^2 \left[ \frac{V}{s} + \frac{W(2s^2 - 3s + 1)}{6s} \right] + f^2 \left[ \frac{V}{g} + \frac{W(2g^2 - 3g + 1)}{6g} \right] + 2(1 - f) f \left[ \frac{V}{g} + \frac{W\left(s(3g - s) - 3g + 1\right)}{6g} \right]$$
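The expression for the combined variance translates directly into code. The sketch below uses illustrative function names and the V and W values of the earlier simulation example; it checks the observation made for adjacent averages, namely that the combined SEV lies below the weighted average of the two single SEV values.

```python
def sev_single(n, V, W):
    """Equation (1): SEV of a single SMA of length n."""
    return V / n + W * (2 * n * n - 3 * n + 1) / (6 * n)


def covariance(s, g, V, W):
    """Covariance B between the estimates from SMAs of length s < g."""
    return V / g + W * (s * (3 * g - s) - 3 * g + 1) / (6 * g)


def sev_combined(s, g, f, V, W):
    """SEV when lengths s and g are combined in proportions (1 - f) and f."""
    return ((1 - f) ** 2 * sev_single(s, V, W)
            + f ** 2 * sev_single(g, V, W)
            + 2 * (1 - f) * f * covariance(s, g, V, W))


V, W = 88.4, 1.04
for f in (0.25, 0.5, 0.75):
    combined = sev_combined(16, 17, f, V, W)
    interpolated = (1 - f) * sev_single(16, V, W) + f * sev_single(17, V, W)
    print(f, round(combined, 3), round(interpolated, 3))
```

The gap between the two columns equals f(1 - f) times the variance of the difference between the two estimates, which is why it is largest near equal proportions.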
Work is continuing to derive a general relationship for the
optimal values of $s$, $g$ and $f$ for any combination. However, a
simple search over all reasonable values in the above
relationship enables the construction of the values presented
in Table 1.
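Such a search might look like the sketch below. It is not the authors' program: the search range (lengths up to 3N + 5) is an assumption, and for each pair of lengths the weight is found from the closed-form minimum of the quadratic in f rather than by stepping through values of f. Only the ratio V/W matters, so W is set to 1. Evaluating the relationship exactly as written, the minimising value of f, the weight applied to the longer average, is about 0.66 for N = 5; the sketch prints both weights so that they can be compared with the proportion column of Table 1.

```python
def sev_single(n, V, W):
    return V / n + W * (2 * n * n - 3 * n + 1) / (6 * n)


def covariance(s, g, V, W):
    return V / g + W * (s * (3 * g - s) - 3 * g + 1) / (6 * g)


def best_weight(s, g, V, W):
    """Return (f, SEV) where f, the weight on the longer average g,
    minimises (1-f)^2 C_s + f^2 C_g + 2(1-f) f B."""
    cs, cg, b = sev_single(s, V, W), sev_single(g, V, W), covariance(s, g, V, W)
    f = (cs - b) / (cs + cg - 2 * b)
    f = min(1.0, max(0.0, f))
    return f, (1 - f) ** 2 * cs + f ** 2 * cg + 2 * (1 - f) * f * b


def best_combination(N, max_len=None):
    """Search pairs s < g for the lowest combined SEV when the optimal
    single length is N. Returns (s, g, f, percentage improvement)."""
    V, W = (N * N - 0.5) / 3.0, 1.0       # from N^2 = 3r + 1/2
    c_opt = sev_single(N, V, W)
    max_len = max_len or 3 * N + 5        # assumed 'reasonable' range
    best = None
    for s in range(1, max_len):
        for g in range(s + 1, max_len + 1):
            f, sev = best_weight(s, g, V, W)
            if best is None or sev < best[3]:
                best = (s, g, f, sev)
    s, g, f, sev = best
    return s, g, f, 100.0 * (c_opt - sev) / c_opt


for N in (2, 5, 8):
    s, g, f, gain = best_combination(N)
    print(N, s, g, "weight on s:", round(1 - f, 2),
          "weight on g:", round(f, 2), "gain %:", round(gain, 2))
```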
The table assumes that the SSM has a noise to signal
ratio which would lead to the tabulated 'optimal' value for
the length of a single SMA. The 'best' combination of any
two averages is listed: first the lengths of the shorter ($s$) and
the longer average ($g$), followed by the proportion of the
shorter ($1 - f$) to yield minimum variance of the estimate of
the true mean. Using the optimal SMA would yield a SEV of
$C_N$. Let the equivalent variance when using the best combination
be $C_C$. The final column lists the percentage improvement
the combination yields over the best single length
average, that is $100 (C_N - C_C)/C_N$.
From observation of the values in the table a useful
working rule can be proposed. If the optimal length of an
average is known (N) then a combination of two averages of
length s and g, where s is approximately half N, and g is N
plus s, weighting the shorter average by 0.6 and the longer
by 0.4, improves the sampling error of the average by about
10%.
A practitioner could be concerned about how this system
would behave when the assumed model was not suitable,
for example, when the data exhibited growth. Under these
conditions, the average will lag behind the data. A practitioner
has to balance the effects of this lag with the
sampling error to arrive at a suitable degree of smoothing.
The magnitude of the lag for the weighted sum of simple
averages will be exactly equivalent to the lag of the single
average with length equal to the weighted sum of the
component lengths. However, whilst this delay effect does not
change, the variance decreases, providing a bonus on using
this method.
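The statement about the lag can be checked on a noise-free trend. In the sketch below the series, its slope and the function names are made up for illustration. An SMA of length n lags a linear trend of slope b by b(n - 1)/2, so an equal mix of lengths 3 and 12 should lag by the same amount as a notional average of length 7.5.

```python
def sma(values, n):
    """Simple moving average of the latest n values."""
    return sum(values[-n:]) / n


def combined_sma(values, s, g, f):
    """Weighted combination: (1 - f) on length s, f on length g."""
    return (1 - f) * sma(values, s) + f * sma(values, g)


slope = 2.0
trend = [10.0 + slope * t for t in range(40)]   # hypothetical noise-free growth
current = trend[-1]

lag_3 = current - sma(trend, 3)
lag_12 = current - sma(trend, 12)
lag_mix = current - combined_sma(trend, 3, 12, 0.5)

weighted_length = 0.5 * 3 + 0.5 * 12            # 7.5
print(lag_3, lag_12, lag_mix, slope * (weighted_length - 1) / 2)
```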
Practical implications
The previous section assumed that for the data series under
examination, the optimal average length was known. The
combination of averages was used to improve the forecasting
performance over the single model. In the real world,
the problem is rather different as the optimal single length
is probably never known, and the real task is to find a
robust forecasting system.
As a broad generalisation, simple averages are often to
be found in forecasting systems employing monthly data. In
practice, a decision has to be made on the number of
periods to include in the average and, ideally, each individual
data series should use its own optimal value. The
optimal length will depend on the characteristics of the
data, and the noise to signal ratio of the SSM which
underpins the estimation procedure. However, this ratio
($V/W$) will be unknown, and the analysis of any historical
data, for each and every data series, would be time
consuming, fraught with error, and unlikely to be undertaken.
Therefore, in practice, a single, slightly arbitrary or
compromise value, $N'$, has to be selected, which may or
may not be ideal for any individual series, but which is
satisfactory over all the series. The equations for the SEV
enable the optimality of this process to be investigated, and
show that the combination of averages leads to an
improved and robust procedure.
Table 1 The best combinations of averages

Optimal single   Length of   Length of   Proportion   Percentage
length           shorter     longer      shorter      improvement
N                s           g           1 - f        %
2                1           3           0.58         14.44
3                1           4           0.71         11.46
4                2           6           0.58         11.11
5                2           7           0.66         10.56
6                3           9           0.58         10.32
7                3           10          0.64         10.14
8                4           12          0.58         9.96
10               5           15          0.58         9.76
12               6           18          0.58         9.63
16               7           23          0.63         9.51
20               9           29          0.62         9.43
24               11          35          0.61         9.37
30               14          44          0.61         9.31
40               18          58          0.62         9.25

Suppose the mean of a specific data series would best be
estimated by an average of length $N$. Equation (1) enables
the SEV to be calculated using an average with the optimal
length and the inefficiency to be determined for other non-optimal
values. Illustrative results are shown in Figure 3, wherein the
points joined by a hatched line are the SEV employing
different lengths of averages. In this example, the optimal
length of the average was 5, yielding a lowest value of the
SEV of 23.6. Employing any other length produces higher
error; at the extreme, a length of 12 leads to an SEV of 35.0.
Combining a SMA of length 3 with that of length 12, in
differing proportions, so as to produce a weighted length
displayed on the X-axis, produces a pattern of SEV generally
lower than that obtained from any single average. The
equal combination of the two extreme values has a weighted
length of 7.5 and, whilst not the best, is better than any
single length average. The SEV is 22.1, an improvement of
6% on the best single length average.
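The figures in this example can be reproduced from the earlier equations. The sketch below assumes, as elsewhere in the illustrations, a one-step ahead error variance of 100 and the decomposition Q = C_N + V + W; the function names are illustrative.

```python
def sev_single(n, V, W):
    return V / n + W * (2 * n * n - 3 * n + 1) / (6 * n)


def covariance(s, g, V, W):
    return V / g + W * (s * (3 * g - s) - 3 * g + 1) / (6 * g)


def sev_combined(s, g, f, V, W):
    return ((1 - f) ** 2 * sev_single(s, V, W)
            + f ** 2 * sev_single(g, V, W)
            + 2 * (1 - f) * f * covariance(s, g, V, W))


def ssm_variances(N, Q=100.0):
    """V and W for optimal length N and one-step ahead error variance Q."""
    V = Q * (N * N - 0.5) / (N + 1) ** 2
    return V, 3.0 * V / (N * N - 0.5)


V, W = ssm_variances(5)
best_single = sev_single(5, V, W)
worst_single = sev_single(12, V, W)
equal_mix = sev_combined(3, 12, 0.5, V, W)

print(round(best_single, 1), round(worst_single, 1), round(equal_mix, 1))
# 23.6 35.0 22.1
print(round(100 * (best_single - equal_mix) / best_single, 1))   # about 6%
```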
The choice of the lengths of 3 and 12 for the averages is
not optimal for this model, as Table 1 indicates. They have
been selected as many managers have an intuitive feel for a
three month and a twelve month average. Again, an equal
(50 : 50) combination is not optimal, but experience has
shown that this mixture is a good compromise combination.
The valuable (6%) improvement obtained from this slightly
non-optimal choice gives some indication of the robustness
of the method.
Different data series would have different best single
average lengths, and be modelled by SSM with different
values of the ratio $V/W$. Unfortunately, in a real situation,
these values are neither known, nor easily assessed. Some
SKUs would have a best length of 5, whilst a different
group would have a best length of 6, etc.
However, let us suppose that all the SKUs in an inventory
system can be placed in one of ten groups whose best
values lie between 3 and 12. If one knew which items
belong to the group with a specific best length and if this
value were employed for this series, then the resultant SEV
of the estimate would be inversely related to the optimal
length, and the values are illustrated in Figure 4 as points
joined by a hatched line. For example, those items whose
optimal length of average is 5 would best be estimated
using $N'$ of five, and the resultant error of 23.6 is on the
hatched line above 5. If the worst length of average had
been chosen (but still in the range 3 to 12), the resultant
value of the SEV is displayed at the upper end of the bar,
and joined by the dotted points for the ten different groups
of items. Again, for the previously considered group, the
worst choice would be an average of length 12, giving an
SEV of 35.0.
In practice some single compromise value $N'$ is used for
all items, each of which has its own optimal value, and
the resultant SEV values would lie along the bar between
the best and worst values. For example, if the compromise
value had been 5, then for those items whose optimal is
five, the lowest point on the bar would be reached, but for
all other groups of items a non-optimal value would be
recorded. A length of 5 is never the worst value for any
group of items, that being reserved for the extreme values
of 3 and 12. If the compromise value had been set at 12,
then those items whose real optimal length of average is 3
would have very high SEV.
The continuous line in Figure 4 represents the SEV
obtained by using a 50 : 50 combination of averages of length
3 and 12. It is frequently better than the unknown single
optimal value, and, even at the extremes, always reasonably
close to the best result given perfect information. Given
that the results using any compromise figure $N'$ will lie
away from the optimal value for all but one group of items,
the combination of averages will outperform any single
figure.
The gain on using the combined method will depend on
the distribution of the number of SKUs in each group. If
there were an equal number of items in each group, a
50 : 50 mix of lengths 3 and 12 has an average improvement
of 9% over the best single length average (which is length
6).
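The equal-groups comparison can be sketched as follows. Two assumptions are made here: each group has a one-step ahead error variance of 100 at its own optimal length, and the 'average improvement' is measured as the reduction in the mean SEV across the ten groups. Function names are illustrative.

```python
def sev_single(n, V, W):
    return V / n + W * (2 * n * n - 3 * n + 1) / (6 * n)


def covariance(s, g, V, W):
    return V / g + W * (s * (3 * g - s) - 3 * g + 1) / (6 * g)


def sev_combined(s, g, f, V, W):
    return ((1 - f) ** 2 * sev_single(s, V, W)
            + f ** 2 * sev_single(g, V, W)
            + 2 * (1 - f) * f * covariance(s, g, V, W))


def ssm_variances(N, Q=100.0):
    V = Q * (N * N - 0.5) / (N + 1) ** 2
    return V, 3.0 * V / (N * N - 0.5)


groups = range(3, 13)                       # optimal lengths 3 to 12
params = [ssm_variances(N) for N in groups]


def mean_sev_single(n):
    """Mean SEV over all groups if every item uses the single length n."""
    return sum(sev_single(n, V, W) for V, W in params) / len(params)


mean_sev_mix = sum(sev_combined(3, 12, 0.5, V, W) for V, W in params) / len(params)
best_compromise = min(groups, key=mean_sev_single)
gain = 100 * (mean_sev_single(best_compromise) - mean_sev_mix) / mean_sev_single(best_compromise)
print(best_compromise, round(gain, 1))
```

Under these assumptions the best compromise length comes out as 6 and the gain is close to the 9% quoted above.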
One final note on the figures presented in this illustration.
Clearly the actual numbers depend on the choice of $V$
within the steady state model. In all the examples, values
have been selected which result in the one-step ahead
forecast error variance (using the best integer length average) being
100. The numbers presented in the illustrations above are
therefore percentages of the one-step ahead forecast error variance,
and with that interpretation can be generalised to any group
of items.
Figure 3 SEV for integer SMA and combinations.
Figure 4 SEV for SMA and combined averages, employing values from 'correct' length to 'most incorrect' length.

Proposal
If simple moving averages (SMA) are to be used as the
basis for a forecasting scheme, improved performance can
be obtained by using combinations of two separate SMAs.
If the optimal length for a single series is known then
employing the pair of lengths given in Table 1 will lead to
about a 10% reduction in the uncertainty of the underlying
mean. When the optimal value for a series is unknown, or a
collection of data series is being examined, a range of
suitable lengths of individual averages can usually be
specified. Under these conditions, employing an equal
combination of the averages of the two extreme lengths
will have better properties than any single individual value. For
monthly data, an equal combination of a three-month and a
twelve-month average is recommended. For weekly revision
a mixture of lengths 8 and 24 is proposed.
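In code the proposal amounts to very little. The sketch below is one possible form; the function name, its parameters and the demand figures are hypothetical, and the defaults correspond to the monthly recommendation (passing `short=8, long=24` would give the weekly one).

```python
def combined_sma_forecast(history, short=3, long=12, weight_long=0.5):
    """Forecast the level as a weighted mix of a short and a long SMA.

    history: past demands, oldest first; needs at least `long` values.
    """
    if len(history) < long:
        raise ValueError("need at least %d observations" % long)
    short_avg = sum(history[-short:]) / short
    long_avg = sum(history[-long:]) / long
    return (1 - weight_long) * short_avg + weight_long * long_avg


# Hypothetical monthly demand for one SKU, oldest first
monthly_demand = [12, 9, 14, 11, 10, 13, 8, 12, 15, 11, 9, 13]
print(round(combined_sma_forecast(monthly_demand), 2))
```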
Application
Ipeco Europe are an engineering company making seats for
aircrew. They operate a computer information system at
whose heart is a commercial package (MANMAN, supplied
by Computer Associates). It provides all the accounting
and production control facilities required (order processing,
MRP etc). However, an increasing part of the company's
business was supplying spare parts for repair and
refurbishment activities. These requests had been passed
on to the production facilities as and when they were
received, resulting in many small jobs, with consequent
interruptions to the scheduled activity of the factory.
Consideration was given to holding some stock for the
spares items. An initial exercise investigated the costs and
consequences of holding stock for all, or for subsets of the
range. From this work, holding stock was adjudged to be
advantageous, the necessary investment being outweighed
by the improvement in service to customers and the
batching of requirements on the factory. Initially it was
decided to invest money into a stock for the fastest moving
500 parts, with the possibility of extending this across the
whole range of spares items at a later date.
The computer package used by the Company does not
have a facility for handling unscheduled demand, nor for
recognising unallocated stock. Modifying the package
would have been both costly and inflexible to changes
which may be required as experience in stock management
is gained. Therefore it was decided to produce a free-standing
stock management package, running on a PC,
linked to interface with the standard system. This stock
system provides standard replenishment reports to create
the requirements on the production facility or on external
sources of supply. A further important feature is the
management control report, showing the investment necessary
for different levels of service and the impact on stock
of changing the number of orders on the production facilities.
Within the stock module is a standard re-order level,
re-order quantity stock system which required a simple
robust forecasting system. The remainder of this note
describes the selection, design and properties of the forecasting
system.
The choice of forecasting system
The pattern of orders for the spares items was virtually non-seasonal,
and though the figures showed statistically significant
growth when aggregated together, at item level the
forecasts proved to be better when this term was ignored.
Although the initial provision of stock for non-production
line orders was restricted to the faster selling lines, many of
these exhibited irregular (lumpy) demand patterns. As the
intention was to extend the system over most of the range,
lumpy demand patterns would be the norm. Eighty per cent
of all the SKUs moved three or fewer times per year. Under these
conditions, EWMA is likely to produce biased forecasts³⁻⁵
and recent work by Sani and Kingsman⁶ has shown the
superiority of a simple moving average. The absence of
accurate information on the time between orders prohibited
any use of methods based on the size and interval between
demands,⁷ though the distribution of size was taken into
account in setting re-order levels.
Given that a simple moving average was to be used, a
single value for the length of the SMA had to be selected. A
comparison was made, based on the errors of one step
ahead forecasts, employing different lengths of averages for
500 parts. The average SEV for all items is plotted against
the length of the average and displayed as Figure 5. A very
similar graph was obtained if standardised errors were
evaluated. The optimal length appeared to be five points.
The second, and lower, curve on Figure 5 illustrates the
SEV obtained by combining a three and a twelve point
average in different proportions. The proportions are
chosen so that the weighted average of the lengths corresponds
to the values on the X-ordinate. An equal mix of the
two single averages is 4% better than the best single value
and 9% better than using an average of 12 months (a figure
for which there had been some intuitive support).
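A comparison of this kind could be organised as in the sketch below, which scores a forecasting rule by its mean squared one-step ahead error across a set of series. It is an assumed reconstruction: the Ipeco data are not available, so the two demand series are invented, and the function names and the 12-period warm-up are choices made for the illustration.

```python
def sma(history, n):
    return sum(history[-n:]) / n


def single_rule(n):
    """Forecasting rule: SMA of length n."""
    return lambda history: sma(history, n)


def mix_rule(f, short=3, long=12):
    """Forecasting rule: (1 - f) on the short SMA, f on the long SMA."""
    return lambda history: (1 - f) * sma(history, short) + f * sma(history, long)


def mean_squared_error(series_list, rule, warmup=12):
    """Mean squared one-step ahead error over all series.

    Forecasts start after `warmup` periods so every rule has enough history.
    """
    total, count = 0.0, 0
    for series in series_list:
        for t in range(warmup, len(series)):
            error = series[t] - rule(series[:t])
            total += error * error
            count += 1
    return total / count


# Invented monthly demand histories for two items (illustration only)
items = [
    [4, 0, 7, 3, 5, 0, 6, 2, 4, 8, 1, 5, 3, 6, 0, 7, 4, 5],
    [20, 25, 18, 22, 30, 19, 24, 21, 27, 23, 26, 20, 28, 22, 25, 24, 29, 21],
]
for n in (3, 5, 12):
    print("single", n, round(mean_squared_error(items, single_rule(n)), 2))
print("50:50 mix", round(mean_squared_error(items, mix_rule(0.5)), 2))
```

With only two short invented series the ranking of the rules carries no weight; the point is the shape of the calculation, which extends directly to several hundred items.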
The results quoted above, and illustrated in Figure 5, are
from 500 items over 36 months. The particular application
imposed this restriction on the comparison as the computer
package used by Ipeco only stored data for three years and
the 500 items were the only ones intended to be stocked.
Figure 5 SEV for SMA and combined averages of 504 SKUs.
The variance of the estimate has been used as a method of
comparison as it is directly related to the decision process.
The aim of the exercise was to keep inventory low whilst
achieving a nominated service level. For any item the buffer
stock is some multiple of the standard deviation of the
forecast error over the relevant lead time. The idle resource
is represented by the sum of the buffer stocks, which is related to
the sum of the individual standard deviations. When
comparing two forecasting systems the decision maker
will select the one with the lowest implications for total
stock, that is, the lowest sum of the standard deviations. The
graph, Figure 5, uses variance rather than standard deviation,
because the variance would be necessary to calculate the
figure for the lead time value (if it is not exactly one month)
and to link with the theory in the earlier part of the paper. (As
would be expected, the graph using standard deviation is
very similar to the one presented using the variance.)
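The link between forecast error and stock can be made concrete with a small sketch. Several assumptions are involved that the text does not specify: a common safety factor for all items, and a lead-time variance obtained by scaling the one-period error variance by the lead time (which treats successive forecast errors as independent). The function name and all figures are hypothetical.

```python
import math


def total_buffer_stock(error_variances, safety_factor, lead_time=1.0):
    """Sum over items of safety_factor * standard deviation over the lead time.

    Assumes lead-time variance = one-period variance * lead_time.
    """
    return sum(safety_factor * math.sqrt(v * lead_time) for v in error_variances)


# Hypothetical one-period forecast error variances for three items
system_a = [25.0, 9.0, 16.0]    # e.g. a single length average
system_b = [23.0, 8.5, 15.0]    # e.g. a combination of averages

k = 1.64                        # hypothetical safety factor
print(round(total_buffer_stock(system_a, k, lead_time=2.0), 2))
print(round(total_buffer_stock(system_b, k, lead_time=2.0), 2))
```

The system with the lower sum of standard deviations needs less buffer stock for the same nominated service level.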
Within the implementation of the inventory system it
was not practical to conduct a test on the relative efficiency
of the two forecasting methods. However, the view of the
users of the system is that the forecasts were 'as good as
could be expected'. It is hoped this endorsement, together
with the results reported above, will encourage others to
implement the suggested combination, and if possible
measure the benefits over a larger sample and longer time
period.
Conclusions
A considerable literature exists on combining different
forecasting methods, see for example the review by
Clemen,⁸ but combinations of simple moving averages
seem to have been ignored. This paper demonstrates
theoretically, then by simulation and finally in practice
the improvement which can be obtained from such a
combination. Most importantly, the method is extremely
robust, giving several percentage points improvement even
over the optimal single length average.
It is suggested that when handling monthly data, an equal
combination of a three-month and a twelve-month average
should be used. For systems employing a weekly revision
of the underlying mean, a mixture of lengths 8 and 24 is
recommended.
Acknowledgements
The authors would like to express their thanks to the
Directors of Ipeco Europe for permission to publish the
results quoted in this paper.
Appendix
The covariance between two averages
The derivation of the covariance term requires the equations
of the SSM. Let the current time be 0 ($t = 0$), when the
(unknown) underlying level is $\theta_0$ with observation $y_0$,
observational noise $v_0$, etc. However, it is convenient to
adopt a different suffix notation for earlier time periods.
Designate an observation $i$ periods earlier (conventionally $t - i$) as
$y_i$, the observational noise as $v_i$, etc.
Consider two averages ($m_s$ and $m_g$) of length $s$ and $g$
(where $g > s$), each used to estimate the current level ($\theta_0$),
then
$$m_s = \frac{1}{s} \sum_{i=0}^{s-1} y_i \quad \text{and} \quad m_g = \frac{1}{g} \sum_{i=0}^{g-1} y_i$$
Using the equations of the SSM
$$\begin{aligned}
y_0 &= \theta_0 + v_0 \\
y_1 &= \theta_0 + v_1 + w_1 \\
&\;\;\vdots \\
y_{s-1} &= \theta_0 + v_{s-1} + w_1 + w_2 + \cdots + w_{s-1} \\
y_s &= \theta_0 + v_s + w_1 + w_2 + \cdots + w_{s-1} + w_s \\
&\;\;\vdots \\
y_{g-1} &= \theta_0 + v_{g-1} + w_1 + w_2 + \cdots + w_{s-1} + w_s + \cdots + w_{g-1}
\end{aligned}$$
When looking at the covariance of $m_s$ with $m_g$, the
disturbances $v_0, \ldots, v_{s-1}$ and $w_1, \ldots, w_{s-1}$ are common to
both averages. The observational noise terms $v_i$ occur
only once in each value but the system disturbances are
repeated. For example, the disturbance $w_1$ occurs $s - 1$
times in $m_s$ and $g - 1$ times in $m_g$. Assuming all the stochastic
terms $w_i$ have variance $W$, the common terms are
$$\sum_{i=1}^{s-1} (s - i)(g - i)$$
Therefore the covariance of $m_s$ with $m_g$ is
$$\frac{1}{sg}\left[ Vs + W \sum_{i=1}^{s-1} (s - i)(g - i) \right]$$
or
$$\frac{V}{g} + \frac{W\left(s(3g - s) - 3g + 1\right)}{6g}$$
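The step from the counted sum to the closed form can be verified with exact arithmetic. The sketch below uses Python's `fractions` module and illustrative function names, with integer values of V and W chosen purely for the check.

```python
from fractions import Fraction


def covariance_by_counting(s, g, V, W):
    """(1/(s g)) [V s + W * sum_{i=1}^{s-1} (s - i)(g - i)]"""
    common_w = sum((s - i) * (g - i) for i in range(1, s))
    return Fraction(V * s + W * common_w, s * g)


def covariance_closed_form(s, g, V, W):
    """V/g + W (s(3g - s) - 3g + 1) / (6g)"""
    return Fraction(V, g) + Fraction(W * (s * (3 * g - s) - 3 * g + 1), 6 * g)


V, W = 7, 3     # arbitrary integer variances for the exact comparison
mismatches = [
    (s, g)
    for s in range(1, 13)
    for g in range(s + 1, 30)
    if covariance_by_counting(s, g, V, W) != covariance_closed_form(s, g, V, W)
]
print(len(mismatches))   # 0
```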
References
1 Harrison PJ (1967). Exponential smoothing and short-term sales forecasting. Mgmt Sci 13: 821–842.
2 Johnston FR, Boylan JE, Meadows M and Shale E (1998). Some properties of a simple moving average when applied to forecasting a time series. J Opl Res Soc 50: 1261–1265.
3 Johnston FR and Boylan JE (1996). Forecasting for items with intermittent demand. J Opl Res Soc 47: 113–121.
4 Willemain TR, Smart CN, Shocker JH and Desautels PA (1994). Forecasting intermittent demand: a comparative evaluation of Croston's method. Int J of Forecasting 10: 529–538.
5 Johnston FR and Boylan JE (1996). Forecasting for intermittent demand, a comparative evaluation of Croston's method. Int J of Forecasting 12: 297–298.
6 Sani B and Kingsman BG (1997). Selecting the best periodic inventory control and demand forecasting methods for low demand items. J Opl Res Soc 48: 700–713.
7 Croston JD (1972). Forecasting and stock control for intermittent demands. Oper Res Q 23: 289–304.
8 Clemen RJ (1989). Combining forecasts: a review and annotated bibliography. Int J of Forecasting 5: 559–584.
Received May 1998; accepted July 1999 after two revisions
