
Economics Letters 113 (2011) 259–262

Contents lists available at SciVerse ScienceDirect

Economics Letters
journal homepage: www.elsevier.com/locate/ecolet

Mean absolute percentage error and bias in economic forecasting


Jordi McKenzie ∗
School of Economics, Merewether Building H04, University of Sydney, Sydney 2006, Australia

abstract

This article develops a simple theoretical framework to show how forecasters may bias downward point predictions under the assumption that the asymmetric loss function is directly related to the (Mean) Absolute Percentage Error (M)APE.

© 2011 Elsevier B.V. All rights reserved.

article info

Article history:
Received 17 August 2010
Received in revised form 18 July 2011
Accepted 18 August 2011
Available online 25 August 2011

JEL classification:
C53
C70
D80

Keywords:
Mean absolute percentage error
Asymmetric loss
Forecasting bias

1. Introduction

Forecasting plays an important role in many areas of the economy. For example, GDP, inflation, company profits, unemployment, housing starts, and interest rates, to name but a few, are all important variables commonly considered by forecasters. In practical applications, a widely used metric for evaluating forecast accuracy is the Mean Absolute Percentage Error (MAPE), or sometimes just the Absolute Percentage Error (APE) if only the relative performance of a single forecast is considered. This measure is intuitively appealing as it penalises under- and over-prediction, relative to the actual outcome, in a symmetrical way. However, it is also well known that for a given prediction, actual outcomes above and below the prediction are treated asymmetrically; hence related measures of accuracy such as the symmetrical MAPE (sMAPE) are used. Given its relative simplicity, however, the MAPE remains a common method for assessing forecast accuracy.1 In light of this, it is possible that forecasters respond directly to this metric and pursue forecast strategies that are somehow related to the (M)APE measure.

This paper is certainly not the first to realise the inherent weakness in the MAPE; see, for example, Armstrong and Collopy (1992), Makridakis (1993), and Goodwin and Lawton (1999). More generally, asymmetric loss functions have been studied by a number of authors and a useful survey of both theoretical and empirical contributions is provided by Elliot and Timmermann (2008). It is well known that asymmetric loss functions can lead to biased forecasts and a number of empirical studies have sought to explain over-prediction bias.2 This paper, however, explores bias in the opposite direction, that is, under-prediction, by showing that forecasters bias downward point predictions under three strategy scenarios related to the (M)APE loss function: (i) they minimise expected loss, (ii) they equate expected loss above and below their point prediction, or (iii) they minimise the maximum of possible loss.

2. Model

Suppose there is an actual outcome, $A \in \mathbb{R}_+$, of a random variable. Prior to the realisation of $A$, a forecaster holds the belief with probability 1 that $A$ will lie on the interval $[A_L, A_H]$, over which there is a uniform distribution. The forecaster is then assumed to make a single point prediction, $P$, such that $P \in [A_L, A_H]$.

Assuming there is only one (outcome) period, the forecaster formulates their prediction in one of three ways.3

∗ Tel.: +61 2 90367816; fax: +61 2 93514341.
E-mail address: Jordi.mckenzie@sydney.edu.au.
1 Other common metrics of forecast accuracy include Mean Absolute Error (MAE) and Mean Square Error (MSE), both of which imply symmetric loss functions for which the results of this paper do not apply.
2 Lim (2001) and Gu and Wu (2003) provide examples on earnings forecasts.
3 In the following exposition, it is assumed the forecaster makes only one forecast so that APE is the appropriate metric to measure accuracy.

0165-1765/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.econlet.2011.08.010
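The asymmetry described in the Introduction can be illustrated with a short numerical sketch. It uses the standard definition APE = |P − A| / A; the function name `ape` and the numbers (a prediction of 100 with outcomes 20 units either side) are illustrative assumptions, not values taken from the paper.

```python
def ape(prediction: float, actual: float) -> float:
    """Absolute percentage error of a single forecast, |P - A| / A, for A > 0."""
    if actual <= 0:
        raise ValueError("actual outcome must be positive")
    return abs(prediction - actual) / actual


# Hypothetical numbers: one prediction, two outcomes equally far from it.
prediction = 100.0
outcome_below = 80.0   # the prediction turned out 20 too high
outcome_above = 120.0  # the prediction turned out 20 too low

loss_below = ape(prediction, outcome_below)  # 20 / 80
loss_above = ape(prediction, outcome_above)  # 20 / 120

print(f"APE when the outcome is below the prediction: {loss_below:.4f}")
print(f"APE when the outcome is above the prediction: {loss_above:.4f}")
```

The same absolute miss is penalised more heavily when the outcome falls below the prediction, which is the source of the incentive to shade point predictions downward.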

First, the forecaster who minimises expected total loss pursues the following objective

\[
P^{(i)} = \arg\min_{P} \int_{A_L}^{A_H} APE \, dF(A). \tag{1}
\]

Second, the forecaster who equates expected loss above and below their point prediction solves

\[
P^{(ii)} = \arg\min_{P} \max\left\{ \int_{A_L}^{P^{(ii)}} APE \, dF(A),\ \int_{P^{(ii)}}^{A_H} APE \, dF(A) \right\}. \tag{2}
\]

And third, the forecaster who minimises the maximum possible APE solves

\[
P^{(iii)} = \arg\min_{P} \max\{APE\}. \tag{3}
\]

In (1)–(3), APE is defined as follows,

\[
APE = \frac{|P - A|}{A} \quad \text{and} \quad A_L \le P \le A_H.
\]

With a standard minimum distance (or mean square error) loss function, and the assumption of uniformity over the interval $[A_L, A_H]$, the optimal point prediction, and the solution to (1)–(3) under such a loss, is well known to be the mid-point of the interval. Define this as

\[
P^M \equiv \frac{A_L + A_H}{2}. \tag{4}
\]

However, under the APE, forecast losses are defined asymmetrically for outcomes above and below $P^M$, as Lemma 1 states.

Lemma 1. Suppose a forecaster predicts $P'$. If there is an outcome $\underline{A}$ below $P'$, and an outcome $\bar{A}$ above $P'$, and if $|P' - \underline{A}| = |\bar{A} - P'|$, then it is always the case that $APE_{\underline{A},P'} > APE_{\bar{A},P'}$.

Proof. This is straightforward provided that $\bar{A} > \underline{A}$. □

This means that the mid-point is not an optimal prediction to satisfy (1), (2) or (3). Fig. 1(a) demonstrates this, where the non-linear loss function increases at an increasing rate for outcomes below $P^M$, and increases at a decreasing rate for outcomes above $P^M$. At outcome $A_L$, the loss would be

\[
\frac{|P^M - A_L|}{A_L} = \frac{A_H - A_L}{2 A_L},
\]

and at outcome $A_H$, the loss would be

\[
\frac{|P^M - A_H|}{A_H} = \frac{A_H - A_L}{2 A_H}.
\]

The difference between the two can be shown to simplify to $(A_H - A_L)^2 / (2 A_L A_H)$.

3. Strategies

The forecaster is assumed to formulate their optimal point prediction using one of the three strategies related to the (M)APE asymmetric loss function:

(i) Minimise total expected loss

First, rewrite (1) as follows

\[
\begin{aligned}
&\min_{P^{(i)}} \left[ \int_{A_L}^{P^{(i)}} \frac{P^{(i)} - A}{A} f(A)\,dA + \int_{P^{(i)}}^{A_H} \frac{A - P^{(i)}}{A} f(A)\,dA \right] \\
&\quad = \min_{P^{(i)}} \frac{1}{A_H - A_L} \left[ \left( P^{(i)} \ln A \,\Big|_{A_L}^{P^{(i)}} - A \,\Big|_{A_L}^{P^{(i)}} \right) + \left( A \,\Big|_{P^{(i)}}^{A_H} - P^{(i)} \ln A \,\Big|_{P^{(i)}}^{A_H} \right) \right]
\end{aligned}
\tag{5}
\]

where $f(A)$ is the uniform density. The first order condition from (5) gives

\[
\frac{1}{A_H - A_L} \left[ 2 \ln P^{(i)} - (\ln A_H + \ln A_L) \right] = 0
\quad \therefore \quad
P^{(i)} = \exp\left( \frac{\ln A_H + \ln A_L}{2} \right). \tag{6}
\]

The minimisation implicitly balances the marginal expected loss/gain from outcomes above and below $P^{(i)}$. Because outcomes below $P^M$ are associated with greater expected loss than outcomes above $P^M$, the forecaster will reduce their point prediction, thereby reducing expected loss for outcomes below $P^{(i)}$, until the marginal reduction in expected loss is equal to the marginal increase in expected loss for outcomes above $P^{(i)}$. This is illustrated in Fig. 1(b).

(ii) Equate expected loss above and below point prediction

This strategy implicitly balances expected loss above and below $P^{(ii)}$. This implies

\[
\begin{aligned}
\int_{A_L}^{P^{(ii)}} \frac{P^{(ii)} - A}{A} f(A)\,dA &= \int_{P^{(ii)}}^{A_H} \frac{A - P^{(ii)}}{A} f(A)\,dA \\
\frac{1}{A_H - A_L} \left( P^{(ii)} \ln A \,\Big|_{A_L}^{P^{(ii)}} - A \,\Big|_{A_L}^{P^{(ii)}} \right) &= \frac{1}{A_H - A_L} \left( A \,\Big|_{P^{(ii)}}^{A_H} - P^{(ii)} \ln A \,\Big|_{P^{(ii)}}^{A_H} \right) \\
\therefore \quad P^{(ii)} &= \frac{A_H - A_L}{\ln A_H - \ln A_L}.
\end{aligned}
\tag{7}
\]

Once again, because there is greater expected loss below $P^M$ than above $P^M$, the forecaster will reduce their point prediction until the expected loss below $P^{(ii)}$ is exactly matched by the expected loss above $P^{(ii)}$. This is shown in Fig. 1(c).

(iii) Min–max strategy

The forecaster who pursues the min–max strategy reduces their prediction such that the maximum possible APE falls if the outcome $A_L$ is realised. However, by reducing the point forecast, the APE under outcome $A_H$ implicitly increases. Equilibrium occurs at the point where the APEs from outcomes $A_H$ and $A_L$ are equalised, implying

\[
\frac{|P^{(iii)} - A_L|}{A_L} = \frac{|A_H - P^{(iii)}|}{A_H}
\quad \therefore \quad
P^{(iii)} = \frac{2 A_L A_H}{A_L + A_H}. \tag{8}
\]

If the forecaster selects $P^{(iii)}$, then the losses at $A_L$ and $A_H$ are necessarily the same, and are given by

\[
\frac{|P^{(iii)} - A_L|}{A_L} = \frac{|A_H - P^{(iii)}|}{A_H} = \frac{A_H - A_L}{A_L + A_H}. \tag{9}
\]

This outcome is shown in Fig. 1(d).

The relationship between $P^{(i)}$, $P^{(ii)}$, $P^{(iii)}$, and $P^M$ is summarised in Proposition 1.

Proposition 1. The optimal point prediction from the min–max strategy is always less than the optimal point prediction from expected total loss minimisation, which is always less than the prediction that balances expected loss above and below the point prediction. Also, all of these are always less than the mid-point. That is, $P^{(iii)} < P^{(i)} < P^{(ii)} < P^M$ for all $A_H > A_L > 0$.

Proof. See Appendix. □

Fig. 1. Forecaster loss functions (panels a–d).

4. Beyond non-uniform distributions

The preceding results have been premised on the assumption of a uniform distribution,4 but the broader conclusion, that optimal predictions are below the midpoint of the interval, extends to any symmetrical distribution over expected outcomes. Consider, for example, a symmetrical triangular distribution with support $[A_L, A_H]$. Given the prediction $P^M = (A_L + A_H)/2$, it must always be the case that

\[
E(\text{Loss} \mid A < P^M) > E(\text{Loss} \mid A > P^M), \tag{10}
\]

where

\[
E(\text{Loss} \mid A < P^M) = \int_{A_L}^{P^M} APE \, f(A)^{-} \, dA
\quad \text{and} \quad
E(\text{Loss} \mid A > P^M) = \int_{P^M}^{A_H} APE \, f(A)^{+} \, dA,
\]

and where $f(A)^{-} = 4(A - A_L)/(A_H - A_L)^2$ and $f(A)^{+} = 4(A_H - A)/(A_H - A_L)^2$. The symmetry of the expected outcome distribution, combined with the asymmetry of the loss function (as described in Lemma 1), implies that it is always optimal to shade predictions below $P^M$ to some degree depending on strategic attitude. Even with the relatively straightforward triangular density, however, closed form solutions analogous to those outlined above are analytically intractable. Numerical techniques, on the other hand, could readily be employed to further investigate the relationships between strategies and the extent of the prediction bias, but this exercise is not pursued further here for reasons of space.

4 Although such an assumption may be unrealistic for many situations, there may similarly be situations where it is not so unrealistic. For example, participants in the Survey of Professional Forecasters (http://www.philadelphiafed.org/) are required to provide probability estimates over outcome intervals (in addition to point predictions) and it is not uncommon for individual forecasters to place probability of 100% over a particular interval. Although this of course does not imply that an individual has a uniform distribution over the interval, it also does not imply they do not assume such a distribution over the interval, or a subset thereof, when formulating their prediction. Indeed, Zarnowitz and Lambros (1987) employ the assumption of uniformity over intervals in their empirical analysis of GNP and inflation forecasting accuracy. Engelberg et al. (2009), on the other hand, employ a triangular (or beta) distribution for predictions over one, two (or three) intervals.

5. Simulations of forecast strategies

The relationships between the optimal point predictions discussed in Section 3 are now investigated in relation to different values of $A_L$ and $A_H$. $P^{(i)}$, $P^{(ii)}$, and $P^{(iii)}$ are depicted in Fig. 2 as ratios $P^{(\cdot)}/(A_L + A_H)$, relative to the mid-point prediction ratio $P^M/(A_L + A_H) = 0.5$. In all six simulations, the initial value of $A_L$ is one and the initial value of $A_H$ is five. The simulations are now summarised.

(S1) $\Delta A_H / \Delta A_L = 1$ as $A_L, A_H \to \infty$. $A_H / A_L \to 1$ as $A_L, A_H \to \infty$.
(S2) $\Delta A_H / \Delta A_L = 8$ as $A_L, A_H \to \infty$. $A_H / A_L \to 8$ as $A_L, A_H \to \infty$.
(S3) $\Delta A_H / \Delta A_L = 2$ as $A_L, A_H \to \infty$. $A_H / A_L \to 2$ as $A_L, A_H \to \infty$.
(S4) $\Delta A_H / \Delta A_L = 5$ as $A_L, A_H \to \infty$. $A_H / A_L = 5$ as $A_L, A_H \to \infty$.
(S5) $\Delta A_H / \Delta A_L$ is increasing as $A_L, A_H \to \infty$. $A_H / A_L$ is 'U'-shaped as $A_L, A_H \to \infty$.
(S6) $\Delta A_H / \Delta A_L$ is decreasing as $A_L, A_H \to \infty$. $A_H / A_L \to 1$ as $(A_H - A_L) \to 0$.

The simulations demonstrate how the optimal point prediction varies considerably under different strategies and parameters. Whilst one may envisage certain circumstances where prediction intervals remain relatively constant at higher $A_L$ and $A_H$ (as described in Simulation 1, for example), it is also possible, and indeed likely, that higher values of $A_L$ and $A_H$ will be associated with wider intervals for prediction. With increasing intervals at higher values of $A_L$ and $A_H$ (as described in Simulations 2, 3, 4, and 5), it is non-trivial how optimal point predictions are affected.

6. Conclusions

This paper has examined optimal point forecasts under the assumption that forecasters formulate predictions in relation to a (M)APE loss function. Under the assumption that the forecaster considers a uniform distribution over an interval of (positive) outcomes, the theoretical models show that the optimal forecast from a min–max strategy is lower than that minimising expected total loss, which in turn is lower than the forecast that equalises expected loss above and below the point prediction. Also, all of the point predictions are necessarily less than the mid-point prediction which would be optimal under a symmetrical loss function. Simulations reveal that the various point predictions approach $P^M$ for higher $A_L$ and $A_H$ when the prediction interval remains constant or decreases, but behave differently when the interval increases. The extent and magnitude of downward bias under each forecast strategy is therefore best judged on a case by case basis.
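The closed-form predictions in (4), (6), (7) and (8), and the ordering in Proposition 1, can be checked numerically. The following is a minimal sketch under the paper's uniform-distribution assumption; the function names and the example interval [1, 5] (the starting values used in Section 5) are choices made here for illustration, and `expected_ape` is the evaluated integral from (5).

```python
import math


def midpoint(a_low: float, a_high: float) -> float:
    """Mid-point prediction P^M, Eq. (4)."""
    return (a_low + a_high) / 2


def min_expected_loss(a_low: float, a_high: float) -> float:
    """Strategy (i), Eq. (6): equals the geometric mean of the end-points."""
    return math.exp((math.log(a_high) + math.log(a_low)) / 2)


def equal_expected_loss(a_low: float, a_high: float) -> float:
    """Strategy (ii), Eq. (7)."""
    return (a_high - a_low) / (math.log(a_high) - math.log(a_low))


def min_max(a_low: float, a_high: float) -> float:
    """Strategy (iii), Eq. (8): equals the harmonic mean of the end-points."""
    return 2 * a_low * a_high / (a_low + a_high)


def expected_ape(p: float, a_low: float, a_high: float) -> float:
    """Expected APE of prediction p under a uniform density, from Eq. (5)."""
    below = p * math.log(p / a_low) - (p - a_low)    # outcomes in [a_low, p]
    above = (a_high - p) - p * math.log(a_high / p)  # outcomes in [p, a_high]
    return (below + above) / (a_high - a_low)


a_low, a_high = 1.0, 5.0  # illustrative interval
p_i = min_expected_loss(a_low, a_high)
p_ii = equal_expected_loss(a_low, a_high)
p_iii = min_max(a_low, a_high)
p_m = midpoint(a_low, a_high)

print(f"P(iii)={p_iii:.4f}  P(i)={p_i:.4f}  P(ii)={p_ii:.4f}  P^M={p_m:.4f}")
print("Ordering of Proposition 1 holds:", p_iii < p_i < p_ii < p_m)

# P(i) should give a lower expected APE than nearby predictions.
step = 1e-3
loss_at_p_i = expected_ape(p_i, a_low, a_high)
neighbours = min(expected_ape(p_i - step, a_low, a_high),
                 expected_ape(p_i + step, a_low, a_high))
print("P(i) is a local minimum of expected APE:", loss_at_p_i < neighbours)
```

For the interval [1, 5] the four predictions are approximately 1.667, 2.236, 2.485 and 3, so every strategy sits below the mid-point, with the min–max prediction lowest.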

Fig. 2. Simulations. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
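The ratios plotted in Fig. 2 can be approximated for the four simulations that have a constant increment ratio. The sketch below assumes linear paths $A_H = 5 + k(A_L - 1)$ starting from $A_L = 1$, $A_H = 5$, with $k = \Delta A_H / \Delta A_L$ taken from (S1)–(S4). The grid of $A_L$ values and the function names are assumptions made here, and the non-linear paths (S5) and (S6) are omitted because their functional forms are not given in the text.

```python
import math


def prediction_ratios(a_low: float, a_high: float) -> dict:
    """Return each optimal prediction divided by (A_L + A_H)."""
    total = a_low + a_high
    return {
        # Eq. (6), written as a square root (the geometric mean)
        "P(i)": math.sqrt(a_low * a_high) / total,
        # Eq. (7)
        "P(ii)": (a_high - a_low) / math.log(a_high / a_low) / total,
        # Eq. (8)
        "P(iii)": 2 * a_low * a_high / total ** 2,
    }


# Increment ratios dA_H/dA_L for simulations S1-S4 as listed in Section 5.
increment_ratio = {"S1": 1, "S2": 8, "S3": 2, "S4": 5}

# Assumed grid of A_L values; the paper does not state the range used.
a_low_grid = [1, 10, 100, 1000]

for name, k in increment_ratio.items():
    print(name)
    for a_low in a_low_grid:
        a_high = 5 + k * (a_low - 1)  # assumed linear path through (1, 5)
        ratios = prediction_ratios(a_low, a_high)
        row = "  ".join(f"{label}={value:.4f}" for label, value in ratios.items())
        print(f"  A_L={a_low:>5}  A_H={a_high:>6}  {row}")
```

Along the S1 path the interval width stays at 4, so all three ratios move towards 0.5 as $A_L$ grows; along S4, where $A_H / A_L$ stays at 5, the ratios do not change.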

Acknowledgements

I would like to thank Vladimir Smirnov and an anonymous referee for useful suggestions. All remaining errors are my own.

Appendix

Proof of Proposition 1. For Proposition 1 to hold requires $P^{(i)} > P^{(iii)}$. This implies

\[
\begin{aligned}
P^{(i)} - P^{(iii)} = \exp\left( \frac{\ln A_H + \ln A_L}{2} \right) - \frac{2 A_L A_H}{A_L + A_H} &> 0 \\
\Rightarrow \quad \frac{\ln A_H A_L}{2} - \ln 2 - \ln A_H A_L + \ln(A_H + A_L) &> 0 \\
\ln\left[ \frac{4 A_H A_L}{(A_H + A_L)^2} \right] &< 0 \\
(A_H - A_L)^2 &> 0.
\end{aligned}
\]

Similarly, Proposition 1 requires $P^{(ii)} > P^{(i)}$. This implies

\[
\begin{aligned}
P^{(ii)} - P^{(i)} = \frac{A_H - A_L}{\ln A_H - \ln A_L} - \exp\left( \frac{\ln A_H + \ln A_L}{2} \right) &> 0 \\
\Rightarrow \quad A_H - A_L - \sqrt{A_H A_L} \, \ln\left( \frac{A_H}{A_L} \right) &> 0 \\
\sqrt{\frac{A_H}{A_L}} - \sqrt{\frac{A_L}{A_H}} - \ln\left( \frac{A_H}{A_L} \right) &> 0.
\end{aligned}
\]

Define $Y = \sqrt{A_H / A_L}$, and define $f(Y) = Y - \frac{1}{Y} - 2 \ln Y$. Note that $f(1) = 0$. Therefore, to show $f(Y) > 0$, it is sufficient to show that $f' > 0$ when $Y > 1$:

\[
f(Y) = Y - \frac{1}{Y} - 2 \ln Y
\quad \therefore \quad
f' = 1 + \frac{1}{Y^2} - \frac{2}{Y} = \frac{(Y - 1)^2}{Y^2} > 0.
\]

Finally, for Proposition 1 to hold requires $P^M > P^{(ii)}$. The mid-point exceeds the geometric mean in (6), since

\[
\begin{aligned}
\frac{A_L + A_H}{2} - \exp\left( \frac{\ln A_H + \ln A_L}{2} \right) &> 0 \\
\therefore \quad \ln(A_H + A_L) - \ln 2 - \frac{\ln A_H + \ln A_L}{2} &> 0 \\
\frac{1}{2} \ln\left[ \frac{(A_H + A_L)^2}{4 A_H A_L} \right] &> 0 \\
(A_H - A_L)^2 &> 0.
\end{aligned}
\]

For $P^{(ii)}$ as given in (7), the condition $(A_L + A_H)/2 > (A_H - A_L)/(\ln A_H - \ln A_L)$ is equivalent, with $X = A_H / A_L > 1$, to $g(X) = \ln X - 2(X - 1)/(X + 1) > 0$. This holds because $g(1) = 0$ and $g'(X) = (X - 1)^2 / [X (X + 1)^2] > 0$ for $X > 1$. □

References

Armstrong, S., Collopy, F., 1992. Error measures for generalising about forecasting methods: empirical comparisons. International Journal of Forecasting 8, 69–80.
Elliot, G., Timmermann, A., 2008. Economic forecasting. Journal of Economic Literature 46 (1), 3–56.
Engelberg, J., Manski, C., Williams, J., 2009. Comparing the point predictions and subjective probability distributions of professional forecasters. Journal of Business and Economic Statistics 27 (1), 30–41.
Goodwin, P., Lawton, R., 1999. On the asymmetry of the symmetric MAPE. International Journal of Forecasting 15, 405–408.
Gu, Z., Wu, J., 2003. Earnings skewness and analyst forecast bias. Journal of Accounting and Economics 35, 5–29.
Lim, T., 2001. Rationality and analysts' forecast bias. Journal of Finance 56 (1), 369–385.
Makridakis, S., 1993. Accuracy measures: theoretical and practical concerns. International Journal of Forecasting 9, 527–529.
Zarnowitz, V., Lambros, L., 1987. Consensus and uncertainty in economic prediction. Journal of Political Economy 95 (3), 591–621.
