
Cass Business School

Faculty of Finance
MSc Energy, Trade and Finance
MSc Shipping, Trade and Finance

Academic year 2014-2015

Advanced Quantitative Methods

Lecturer: Dr. Amir H. Alizadeh

Contents

1 MULTIPLE REGRESSION MODEL
  1.1 Introduction
  1.2 Estimation of Multiple Regression Model
  1.3 Coefficient of Determination, R-squared
  1.4 Joint Significance of Coefficients and F Test
  1.5 Dummy Variables
    1.5.1 Intercept dummies
    1.5.2 Slope, irregular and event dummies
    1.5.3 Seasonal dummies: Construction, Estimation and Interpretation
  1.6 Appendix 1.A. Matrix Representation of the Multiple Regression Model

2 DIAGNOSTIC TESTS OF REGRESSION MODELS
  2.1 Introduction
  2.2 Heteroscedasticity
    2.2.1 The effect of heteroscedasticity on OLS estimates of CLRM
    2.2.2 Detecting heteroscedasticity
    2.2.3 What to do in the presence of heteroscedasticity
  2.3 Serial Correlation
    2.3.1 The effect of serial correlation on regression estimates
    2.3.2 How to detect serial correlation
    2.3.3 Correction for serial correlation
  2.4 Normality
    2.4.1 Jarque-Bera test

3 FORECASTING TIME SERIES
  3.1 Introduction
  3.2 Forecasting Models
  3.3 Extrapolation or Simple Time Series Methods
    3.3.1 Random series
    3.3.2 Random walk series
    3.3.3 Series with linear trend
    3.3.4 Exponential trend model
    3.3.5 Autocorrelated series
  3.4 Moving Average (MA) Models
    3.4.1 Exponentially Weighted Moving Average (EWMA) model
    3.4.2 Single smoothing
    3.4.3 Exponential smoothing with trend (Holt model)
  3.5 Seasonal Time Series
    3.5.1 Holt-Winters model for seasonal time series
    3.5.2 Regression analysis of seasonality
    3.5.3 Forecasting with regression models

4 ECONOMETRIC FORECASTS & FORECAST EVALUATION
  4.1 Ex Post vs. Ex Ante Forecasts
    4.1.1 Unconditional vs. conditional forecasts
  4.2 Unconditional Forecasting
    4.2.1 Forecast error
  4.3 Evaluating Forecasts
    4.3.1 Multi-step-ahead forecasts
    4.3.2 Static versus dynamic forecasts
    4.3.3 Decomposition of Theil's U

5 LINEAR & NONLINEAR PROGRAMMING
  5.1 Introduction
  5.2 Mathematical Inequalities
  5.3 Graphical Presentation of Inequalities
  5.4 The Problem of Linear Programming
  5.5 The Graphical Solution of the Linear Programming Problem
  5.6 Numerical Solution of the Linear Programming Problem
    5.6.1 Existence of multiple solutions
  5.7 Opportunity Cost and Shadow Price
  5.8 Financial Application of Linear Programming
  5.9 Nonlinear Programming (NLP)
  5.10 Portfolio Optimization

6 SIMULATION ANALYSIS
  6.1 Introduction
  6.2 Spreadsheet Simulation
  6.3 Simulation with @Risk
  6.4 Applications of Simulation in Finance
    6.4.1 Capital budgeting & NPV analysis
    6.4.2 Simulating stock prices
    6.4.3 Pricing an option using simulation

7 VALUE AT RISK ESTIMATION AND ANALYSIS
  7.1 Introduction
  7.2 What Exactly is VaR
  7.3 Measuring Financial Risk
  7.4 Volatilities: Daily vs. Yearly
  7.5 Simple VaR Calculation Using Volatilities
  7.6 Two-Asset Portfolio
  7.7 Multiple-Asset Portfolios
  7.8 Expected Tail Loss (ETL)
  7.9 VaR Estimation Methodologies
    7.9.1 Parametric VaR estimation

Reading List
1- Data Analysis and Decision Making with Microsoft Excel, with InfoTrac, Decision Tools and Statistic Tools Suite, by S. Christian Albright, Wayne L. Winston, and Christopher J. Zappe (2006)

2- Basic Econometrics, by Damodar N. Gujarati & Dawn C. Porter (2009)

3- Simulation Modelling Using @Risk, by Wayne L. Winston (2000)

4- Quantitative Methods for Business and Management, by Frank Dewhurst (2006)

CHAPTER 1

Multiple Regression Model

1 Multiple Regression Model


1.1 Introduction
The two-variable regressions considered so far in QM may be sufficient for explaining certain random variables, but they are not an appropriate model in the many cases where the dependent variable is related to several explanatory or independent variables. For example, car sales, Yi, might be directly related to people's disposable income, X1i, but they might also be inversely related to the cost of borrowing, X2i. Therefore, we need to consider two explanatory variables in this case, which results in the following multiple regression model,

$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \varepsilon_i$,   $\varepsilon_i \sim iid(0, \sigma^2)$

where, once again, $Y_i$ is the dependent variable, $X_{1i}$ and $X_{2i}$ are independent variables, $\beta_0$ is the intercept term, and $\beta_1$ and $\beta_2$ are the regression coefficients on $X_{1i}$ and $X_{2i}$, respectively.
Now the underlying assumption for the validity of the OLS for multiple regression are the
same as two-variable regression; that is,
1- The relationship between Y and Xs is linear,
2- The error term has a zero mean,
3- The error term has a constant variance,
4- Errors corresponding to different observations are independent,
5- The error terms are normally distributed,
6- There is no correlation between error terms and independent variables,
7- Finally, there should not be any correlation between the independent variables (no
multicollinearity).
The last assumption is in addition to those for bivariate regression.

1.2 Estimation of Multiple Regression Model


In order to estimate the parameters of the three-variable regression mentioned above, consider the Error Sum of Squares (ESS) once more,

$ESS = \sum (Y_i - \hat{Y}_i)^2$

and we know that $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$, therefore,

$ESS = \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i})^2$

However, we know that $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2$, therefore we can write

$ESS = \sum \left(Y_i - (\bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2) - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i}\right)^2$

rearranging the above equation we can write

$ESS = \sum \left[(Y_i - \bar{Y}) - \hat{\beta}_1 (X_{1i} - \bar{X}_1) - \hat{\beta}_2 (X_{2i} - \bar{X}_2)\right]^2$

and using the notation for deviations of variables from their means, $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$, we can write

$ESS = \sum (y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i})^2$

Thus minimising the above ESS with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ means

FOC:

$\dfrac{\partial ESS}{\partial \hat{\beta}_1} = -2\sum x_{1i}(y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i}) = 0$

$\dfrac{\partial ESS}{\partial \hat{\beta}_2} = -2\sum x_{2i}(y_i - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i}) = 0$

and when solved with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ yields

$\hat{\beta}_1 = \dfrac{(\sum x_{1i} y_i)(\sum x_{2i}^2) - (\sum x_{2i} y_i)(\sum x_{1i} x_{2i})}{(\sum x_{1i}^2)(\sum x_{2i}^2) - (\sum x_{1i} x_{2i})^2}$

$\hat{\beta}_2 = \dfrac{(\sum x_{2i} y_i)(\sum x_{1i}^2) - (\sum x_{1i} y_i)(\sum x_{1i} x_{2i})}{(\sum x_{1i}^2)(\sum x_{2i}^2) - (\sum x_{1i} x_{2i})^2}$

SOC: We will look at this later when we use matrix notation.

In the above notation, $\hat{\beta}_1$ measures the change in $Y_i$ associated with a unit change in $X_{1i}$ when $X_{2i}$ is held constant, and $\hat{\beta}_2$ measures the change in $Y_i$ associated with a unit change in $X_{2i}$ when $X_{1i}$ is held constant.
See Appendix for matrix representation of multiple regression.
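As a small illustration of the estimation formulas above (not part of the original notes), the following Python sketch computes the two slope estimates from the deviation-from-means formulas and cross-checks them against the matrix solution; the data are simulated purely for demonstration.

import numpy as np

# Hypothetical data: Y depends on two regressors X1 and X2 (values are made up)
rng = np.random.default_rng(0)
n = 200
X1 = rng.normal(10, 2, n)
X2 = rng.normal(5, 1, n)
Y = 1.0 + 0.5 * X1 - 0.8 * X2 + rng.normal(0, 1, n)

# Deviations from means
y = Y - Y.mean()
x1 = X1 - X1.mean()
x2 = X2 - X2.mean()

# Slope estimates from the FOC solutions given in the text
den = (x1 @ x1) * (x2 @ x2) - (x1 @ x2) ** 2
b1 = ((x1 @ y) * (x2 @ x2) - (x2 @ y) * (x1 @ x2)) / den
b2 = ((x2 @ y) * (x1 @ x1) - (x1 @ y) * (x1 @ x2)) / den
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()
print(b0, b1, b2)

# Cross-check with the matrix formula beta = (X'X)^(-1) X'Y
X = np.column_stack([np.ones(n), X1, X2])
print(np.linalg.solve(X.T @ X, X.T @ Y))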

1.3 Coefficient of determination, R-squared


As mentioned earlier, the coefficient of determination, or R-squared, measures the proportion of the variation in the dependent variable which can be explained by the independent variables. In other words, the R-squared tells us how good the fit of the regression line is. That is why it is sometimes described as a measure of the goodness of fit of the regression.
Let us now derive the formula for the R-squared of a regression using matrix notation. Assuming that Y has a zero mean, we can write Y in terms of X and the error terms ($\varepsilon$) in the following form

$Y = X\beta + \varepsilon$

Then the total variation in Y can be found as

$Y'Y = (X\hat{\beta} + \hat{\varepsilon})'(X\hat{\beta} + \hat{\varepsilon}) = \hat{\beta}'X'X\hat{\beta} + \hat{\beta}'X'\hat{\varepsilon} + \hat{\varepsilon}'X\hat{\beta} + \hat{\varepsilon}'\hat{\varepsilon}$

since it is assumed that there is no correlation between X and $\hat{\varepsilon}$, that is $X'\hat{\varepsilon} = 0$ and $\hat{\varepsilon}'X = 0$, we can write

$Y'Y = \hat{\beta}'X'X\hat{\beta} + \hat{\varepsilon}'\hat{\varepsilon}$

which is in fact
Total sum of squares = Regression sum of squares + Error sum of squares
TSS = RSS + ESS
Therefore the proportion of the variation in Y explained by X (remember X now is a vector of one, two or more variables) can be found as

$R^2 = 1 - \dfrac{ESS}{TSS} = 1 - \dfrac{\hat{\varepsilon}'\hat{\varepsilon}}{Y'Y} = \dfrac{\hat{\beta}'X'X\hat{\beta}}{Y'Y}$

However, if Y is not a zero-mean variable, the formula for $R^2$ should be modified to

$R^2 = 1 - \dfrac{ESS}{TSS} = \dfrac{\hat{\beta}'X'X\hat{\beta} - n\bar{Y}^2}{y'y}$

where $y'y = Y'Y - n\bar{Y}^2$ and $y_i = Y_i - \bar{Y}$

There is, however, a problem with the above formula for $R^2$: it increases as the number of regressors increases, regardless of whether the additional explanatory variables have in fact any explanatory power. Therefore, the $R^2$ should be adjusted for the number of variables included on the right-hand side of the regression. This is done by calculating the R-bar-squared ($\bar{R}^2$), which is $R^2$ adjusted for the number of regressors (degrees of freedom), as follows

$\bar{R}^2 = 1 - \dfrac{(\hat{\varepsilon}'\hat{\varepsilon})/(n-k)}{(y'y)/(n-1)} = 1 - \dfrac{\hat{\varepsilon}'\hat{\varepsilon}}{y'y}\cdot\dfrac{(n-1)}{(n-k)}$

We can note that:
when k = 1, then $\bar{R}^2 = R^2$
when k > 1, then $\bar{R}^2 < R^2$
and while $0 \le R^2 \le 1$, $\bar{R}^2$ can be negative.
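To make the two formulas concrete, here is a small Python sketch (not from the notes) computing $R^2$ and $\bar{R}^2$ for a regression that includes an intercept; the function name and inputs are illustrative.

import numpy as np

def r_squared(Y, X):
    """R-squared and adjusted R-squared for an OLS regression of Y on X,
    where X already includes a column of ones (a sketch of the formulas above)."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ Y)
    resid = Y - X @ beta
    ess = resid @ resid                    # error (residual) sum of squares
    tss = ((Y - Y.mean()) ** 2).sum()      # total sum of squares about the mean
    r2 = 1 - ess / tss
    r2_adj = 1 - (ess / (n - k)) / (tss / (n - 1))
    return r2, r2_adj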

1.4 Joint Significance of coefficients and F test


In most estimated econometric models it is important to test the joint significance of two or more variables. For example, in the following multivariable regression one needs to test whether the coefficients of the explanatory variables are jointly significant.

$y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \ldots + \beta_k X_{ki} + \varepsilon_i$

this means the null of $\beta_2 = \beta_3 = \ldots = \beta_k = 0$ against the alternative that at least one or more of these coefficients is significantly different from zero. The appropriate test statistic that we can use for this purpose is the following F test with k-1 and n-k degrees of freedom

$F_{k-1,\,n-k} = \dfrac{RSS/(k-1)}{ESS/(n-k)} = \dfrac{RSS}{ESS}\cdot\dfrac{n-k}{k-1} = \dfrac{R^2}{1-R^2}\cdot\dfrac{n-k}{k-1}$

Note that ESS is the error sum of squares and RSS is the regression sum of squares.
Furthermore, if the hypothesis involves testing only certain parameters of the regression and not all of them at the same time, we need to estimate a restricted regression with a reduced number of variables in the following form

$y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \ldots + \beta_{k-p} X_{k-p,i} + \varepsilon_i$

this means that the null hypothesis is $\beta_{k-p+1} = \ldots = \beta_k = 0$ against the alternative that one or more of these parameters are significant. In order to test the above hypothesis we use an F test which is based on the Error Sum of Squares of the restricted and unrestricted regressions, $ESS_R$ and $ESS_{UR}$, respectively. The F test is constructed in the following form

$F = \dfrac{(ESS_R - ESS_{UR})/p}{ESS_{UR}/(n-k)} \sim F_{(p,\,n-k)}$

or

$F = \dfrac{(R^2_{UR} - R^2_R)/p}{(1 - R^2_{UR})/(n-k)} \sim F_{(p,\,n-k)}$

if the F statistic is greater than $F_{crit}$ then the null of $\beta_{k-p+1} = \ldots = \beta_k = 0$ is rejected in favour of the alternative.

Let us now consider the following example where we perform F test on some of variables.

Example: Interest rate determination (Pindyck and Rubinfeld 1997)


This example investigates the relationship between the US three-month T-bill rate, the Federal Reserve Board index of industrial production, and the money supply:
IR = 3-month annualised US T-bill rate (%)
IP = Federal Reserve Board index of industrial production (1982=100)
M2 = Nominal money supply ($bn)
PW = Producer price index for all commodities (1982=100)
We use the percentage changes in M2 and PW in the model as

$GM2_t = (M2_t - M2_{t-1})/M2_{t-1}$  and  $GPW_t = (PW_t - PW_{t-1})/PW_{t-1}$

The theory suggests that

$IR_t = f(IP_t,\, GM2_t,\, GPW_{t-1})$

thus, the econometric model would be

$IR_t = \beta_1 + \beta_2 IP_t + \beta_3 GM2_t + \beta_4 GPW_{t-1} + \varepsilon_t$,   $\varepsilon_t \sim N(0, \sigma^2)$

Estimating the above econometric model results in


Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08
Included observations: 428

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            1.214078      0.551692     2.200644      0.0283
IP           0.048353      0.005503     8.786636      0.0000
GM2          140.3261      36.03850     3.893784      0.0001
GPW(-1)      104.5884      17.44218     5.996295      0.0000

R-squared            0.216361    Mean dependent var      6.145764
Adjusted R-squared   0.210816    S.D. dependent var      2.792815
S.E. of regression   2.481026    Akaike info criterion   4.664523
Sum squared resid    2609.927    Schwarz criterion       4.702459
Log likelihood      -994.2079    F-statistic             39.02177
Durbin-Watson stat   0.183733    Prob(F-statistic)       0.000000

And the results for the restricted model:

Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08
Included observations: 428

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            2.816457      0.445743     6.318565      0.0000
IP           0.042703      0.005482     7.789104      0.0000

R-squared            0.124664    Mean dependent var      6.145764
Adjusted R-squared   0.122609    S.D. dependent var      2.792815
S.E. of regression   2.616006    Akaike info criterion   4.765836
Sum squared resid    2915.326    Schwarz criterion       4.784804
Log likelihood      -1017.889    F-statistic             60.67014
Durbin-Watson stat   0.043288    Prob(F-statistic)       0.000000

Constructing the F test

$F = \dfrac{(ESS_R - ESS_{UR})/p}{ESS_{UR}/(n-k)} = \dfrac{(2915.326 - 2609.927)/2}{2609.927/(428-4)} = 24.80$

$F^{crit}_{(2,\,428-4),\,5\%} = 3.017$

which means that the two variables which were restricted (excluded) from the model have explanatory power over the dependent variable (interest rates).
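The same calculation can be reproduced with a few lines of Python (an illustrative sketch using the figures from the example above; the variable names are not from the notes):

from scipy import stats

# Restricted vs. unrestricted F test using the ESS values reported above
ess_r, ess_ur = 2915.326, 2609.927   # error sums of squares
n, k, p = 428, 4, 2                  # observations, parameters in unrestricted model, restrictions

F = ((ess_r - ess_ur) / p) / (ess_ur / (n - k))
F_crit = stats.f.ppf(0.95, p, n - k)
print(F, F_crit)   # roughly 24.8 vs. 3.02, so the restrictions are rejected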

1.5 Dummy variables


Sometimes economic variables may change fundamentally during certain periods or, in cross-sectional analysis, across certain characteristics. For example, the income of individuals might differ across regions, by gender, or by level of education. As another example, in time series models, consider a change in government policy for determining the short-term interest rate at a certain date. Another example is seasonal effects in time series data, which are regular changes in a variable due to weather or calendar effects. Changes in policies, tastes, habits, etc. may change the behaviour of some economic variable(s) completely.
This means that when modelling such variables these factors should be taken into account. Therefore, certain variables should be constructed which take distinct values according to the change in the nature or the behaviour of the dependent variable. These variables are known as dummy variables. Dummy variables are thus used as explanatory variables in regression models to pick up distinct changes in the nature or the behaviour of the dependent variable. There are different types of dummy variable, such as intercept dummies, slope dummies, seasonal dummies and irregular dummies. These are used in regression models depending on what a researcher needs to detect.
For example, when modelling salaries in universities in the UK, if a researcher is interested in allowing for a difference between the salary levels of male and female workers, an intercept dummy variable should be included in the regression. Similarly, if the purpose is to investigate seasonal changes in shipping freight rates, oil trade, electricity prices (or consumption), and many other instances, seasonal dummies should be incorporated in the regression model to take the seasonal fluctuations in those variables into account.

[Figure: Seaborne crude oil imports of Japan, EU-4 and USA]

[Figure: Freight rates for two major routes of Aframax and MR tankers]

1.5.1 Intercept dummies


These are binary variables, which take values of zero and one. To understand how these variables are constructed and used, consider the following example.

Example: Alizadeh and Talley (2011)


Alizadeh & Talley (2011) examine the microeconomic determinants of freight rates in the dry bulk market. They argue that the difference between a single fixture rate (FRi,t) and the benchmark index rate (FRB,t) should be a function of the route over which the vessel operates, RT, the laycan period of the fixture, LC, the size of the vessel, SZ, the age of the ship, AG, and the volatility in the market, VOL. Therefore, the hypothesized freight (charter) rate regression model is:
$dfr_{i,t} = \beta_0 + \beta_1 LC_{i,t} + \beta_2 SZ_i + \beta_3 AG_i + \beta_4 AG_i^2 + \beta_5 VOL_t + \sum_{j=1}^{K}\gamma_j RT_{i,j} + v_i$;   $v_i \sim iid(0, \sigma_v^2)$

where $dfr_{i,t}$ is the difference between the log of the fixture rate for contract i at time t, $fr_{i,t}$, and the log of the Baltic benchmark freight rate (Baltic Average 4TC Rates) at time t, $\ln(B4TC_t)$. They use dummy variables to distinguish between fixtures on different routes. For instance, 9 binary dummy variables are used to distinguish 10 panamax routes. The table below presents the number of fixtures and statistics for the laycan period in each route over the sample period (January 2003 to July 2009, 9076 fixtures).

[Figure: Plot of individual Panamax fixtures (FRATE) and the Baltic 4TC rate (BPI_4TC)]

Number of fixtures and laycan period statistics by Panamax route:

Route  Panamax route                  Fixtures     %      Laycan period (days)
                                                          Mean  Med  Min  Max   SD
1      Trans Atlantic Round Voyage      1397    15.4%      3.7  3.0  0.0  40.0  4.4
2      Continent to Far East            1033    11.4%      4.1  3.0  0.0  38.0  4.5
3      Trans Pacific Round Voyage       3174    35.0%      4.2  3.0  0.0  61.0  4.2
4      Far East to Continent             503     5.5%      4.8  4.0  0.0  34.0  4.5
5      Mediterranean to Far East         104     1.1%      5.5  5.0  0.0  27.0  5.2
6      PG-Indian Ocean to Far East       865     9.5%      5.7  4.0  0.0  32.0  5.5
7      Far East to PG-Indian Ocean       376     4.1%      3.7  3.0  0.0  30.0  3.9
8      Continent to PG-Indian Ocean      218     2.4%      3.3  3.0  0.0  17.0  3.6
9      PG-Indian Ocean to Continent      208     2.3%      5.2  4.0  0.0  30.0  5.5
10     Other routes                     1198    13.2%      4.8  3.0  0.0  47.0  5.3
       Total Fixtures                   9076               4.4  3.0  0.0  61.0  4.6

The estimation results clearly show that all the variables considered, except freight market volatility, are significant in the determination of freight rates. Also, there are significant differences in freight rate differentials across routes. For instance, front-haul routes (e.g. Continent or Med to Far East) are at a premium to back-haul routes (e.g. Far East to the Continent or Med).

$dfr_{i,t} = \beta_0 + \beta_1 LC_{i,t} + \beta_2 SZ_i + \beta_3 AG_i + \beta_4 AG_i^2 + \beta_5 VOL_t + \sum_{j=1}^{K}\gamma_j RT_{i,j} + v_i$;   $v_i \sim iid(0, \sigma_v^2)$

                                         Coeff      P-val
β0  Constant                            -0.4435     0.000
β1  Laycan (LCi,t)                       0.0035     0.000
β2  Size (SZi)                           0.0060     0.000
β3  Age (AGi)                            0.0066     0.000
β4  Age squared (AG²i)                  -0.0006     0.000
β5  Volatility (VOLt)                    0.0020     0.878
Route dummies:
γ1  Transatlantic Round Voyage           0.0368     0.000
γ2  Continent to Far East                0.1494     0.000
γ3  Trans Pacific Round Voyage          -0.0910     0.000
γ4  Far East to Continent               -0.1709     0.000
γ5  Mediterranean to Far East            0.2474     0.000
γ6  PG-Indian Ocean to Far East          0.0809     0.000
γ7  Far East to PG-Indian Ocean         -0.0757     0.000
γ8  Continent to PG-Indian Ocean         0.1900     0.000
γ9  PG-Indian Ocean to Continent        -0.1504     0.000
R-bar-squared                            0.346
BG test                                111.44       0.000
White test                             801.53       0.000
JB test                                1.4x10^4     0.000

Example (Pindyck and Rubinfeld 1997)


To investigate whether female workers are relatively underpaid compared to male workers, a cross-sectional study is conducted using a survey from the US Bureau of the Census. The following variables are used in a multiple regression equation to model workers' wages:
W = worker's wage in $ per hour
Sex = 1 if the person is female and 0 if male
ED = years of education
Age = age of the worker
Nonwh = 1 if the person is non-Hispanic and non-white, 0 otherwise
Hisp = 1 if the person is Hispanic, 0 otherwise


OBS    WAGE        SEX   ED   AGE   NONWH   HISP
1      8.999779     0    10   43      0      0
2      5.499735     0    12   38      0      0
3      3.799996     1    12   22      0      0
4      10.50026     1    12   47      0      0
5      14.99925     0    12   58      0      0
6      8.999779     1    16   49      0      0
7      9.569682     1    12   23      0      0
8      14.99925     0    14   42      0      0
9      11.00005     0     8   56      0      0
10     4.99981      1    12   32      0      0
11     24.97562     0    17   41      0      1
...
198    4.99981      1    12   46      0      0
199    14.99925     0    12   60      0      0
200    5.550012     1    11   62      0      0
201    8.999779     0    16   29      1      0
202    24.97562     0    17   54      0      0
203    8.490093     1    12   42      0      0
204    4.99981      1    14   37      0      1
205    22.20017     0    12   44      0      0
206    9.239613     0    16   27      1      0

[Figure: Plot of male and female wages across the observations]

These variables are then used in a multiple regression in the following form

$W_i = \beta_1 + \beta_2 Sex_i + \beta_3 ED_i + \beta_4 Age_i + \beta_5 Nonwh_i + \beta_6 Hisp_i + \varepsilon_i$

Estimating the above regression yields the following results


Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -6.409050     1.895795     -3.380667     0.0009
SEX          -2.761399     0.598422     -4.614464     0.0000
ED            0.992564     0.116158      8.544971     0.0000
AGE           0.116709     0.025227      4.626442     0.0000
NONWH        -1.060823     0.986848     -1.074961     0.2837
HISP          0.238682     1.069428      0.223187     0.8236

R-squared            0.367537    Mean dependent var      9.596389
Adjusted R-squared   0.351725    S.D. dependent var      5.238212
S.E. of regression   4.217573    Akaike info criterion   5.745090
Sum squared resid    3557.584    Schwarz criterion       5.842019
Log likelihood      -585.7443    F-statistic             23.24480
Durbin-Watson stat   1.776918    Prob(F-statistic)       0.000000

The results indicate that sex, education and age are significant variables in the determination of the wage level, while race and ethnic background are not. Furthermore, the coefficient of the sex variable (-2.761) indicates that, on average, the wage level for female workers is $2.76/hour lower than for their male counterparts. Similarly, the coefficient of the education variable (0.993) indicates that, on average, the wage increases by $0.99/hour for every additional year of education. Furthermore, the age coefficient indicates that the average wage increases by $0.12/hour for every additional year of age.
It is, however, difficult to justify the last finding (age) economically, as beyond some point increasing age may be associated with declining performance and salaries. Therefore this relationship might not be completely linear. In order to test whether there is a nonlinear relationship between age and the level of workers' salaries, we can modify the model to the following, where we add the square of the age of workers to the model.

$W_i = \beta_1 + \beta_2 Sex_i + \beta_3 ED_i + \beta_4 Age_i + \beta_5 Nonwh_i + \beta_6 Hisp_i + \beta_7 Age_i^2 + \varepsilon_i$

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -14.79336     3.220386     -4.593660     0.0000
SEX          -2.641083     0.586421     -4.503730     0.0000
ED            0.923222     0.115660      7.982170     0.0000
AGE           0.623939     0.161202      3.870530     0.0001
NONWH        -1.177989     0.965748     -1.219768     0.2240
HISP          0.297989     1.045969      0.284893     0.7760
AGE^2        -0.006308     0.001981     -3.184043     0.0017

R-squared            0.398196    Mean dependent var      9.596389
Adjusted R-squared   0.380051    S.D. dependent var      5.238212
S.E. of regression   4.124402    Akaike info criterion   5.705109
Sum squared resid    3385.128    Schwarz criterion       5.818192
Log likelihood      -580.6262    F-statistic             21.94541
Durbin-Watson stat   1.753024    Prob(F-statistic)       0.000000

The estimated coefficients of Age and Age² suggest that there is a nonlinear relationship between age and workers' wages in the US. The signs of the coefficients indicate that an increase in age results in an increase in salary up to a certain age, after which each additional year of age reduces the hourly wage on average through the negative Age² term (-0.0063 $/hour per squared year).
Furthermore, comparing the R-bar-squared of the second regression (0.38) with that of the first one (0.35) indicates that including Age² in the regression has increased the explanatory power of the regression.
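As a small worked illustration (not in the original notes), the turning point implied by the estimated Age and Age² coefficients can be computed as

$\dfrac{\partial W}{\partial Age} = \hat{\beta}_4 + 2\hat{\beta}_7 Age = 0 \;\Rightarrow\; Age^{*} = -\dfrac{\hat{\beta}_4}{2\hat{\beta}_7} = -\dfrac{0.623939}{2\times(-0.006308)} \approx 49.5 \text{ years}$

so wages are estimated to rise with age until roughly the late forties and to decline thereafter.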
1.5.2 Slope, irregular and event dummies
In some economic cases, the impact of one explanatory variable on the dependent variable may change after some time due to changes in policy or other factors, which we may or may not know. What we do know is the date at which the influence of the explanatory variable on the dependent variable changed. In order to allow for such a change in the model, we need to let the slope coefficient (the coefficient of the explanatory variable) change after a certain time period.
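For instance (an illustrative specification, not taken from the notes), with a known break date $T_0$, a dummy $D_t$ equal to 0 before $T_0$ and 1 from $T_0$ onwards can be interacted with the regressor:

$Y_t = \beta_0 + \beta_1 X_t + \beta_2 (D_t \times X_t) + \varepsilon_t$

so that the slope is $\beta_1$ before the break and $\beta_1 + \beta_2$ after it; a significant $\hat{\beta}_2$ indicates that the impact of $X_t$ on $Y_t$ has changed.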


1.5.3 Seasonal dummies: Construction, Estimation and Interpretation


Seasonal dummies are used frequently in time series econometrics to take into account the seasonal behaviour of variables. It is important to identify seasonality for two main reasons. First, ignoring the seasonal behaviour of variables which behave seasonally when estimating economic models may result in invalid regression models, biased estimates, and misleading inferences. Second, the seasonal behaviour of certain variables is important for policy and decision-making processes in many situations; for example, when modelling sales, demand for airline tickets, or fuel and electricity prices.

1.5.3.1 Zero and one seasonal dummies


Zero-one seasonal dummies are S dummies (where S is the periodicity of the data) taking values of zero and one, of the following form (for example, S = 4):
Period    D1   D2   D3   D4
1990Q1     1    0    0    0
1990Q2     0    1    0    0
1990Q3     0    0    1    0
1990Q4     0    0    0    1
1991Q1     1    0    0    0
1991Q2     0    1    0    0
1991Q3     0    0    1    0
1991Q4     0    0    0    1
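A quick way to build such dummies in practice is sketched below (an illustration using pandas, not part of the original notes):

import pandas as pd

# Quarterly dates and zero/one dummies, one per quarter
dates = pd.period_range("1990Q1", "1991Q4", freq="Q")
quarters = pd.Series(dates.quarter, index=dates)
dummies = pd.get_dummies(quarters, prefix="D").astype(int)
print(dummies)   # columns D_1..D_4 reproduce the table above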

They can be used in regressions to let the intercept of the regression take different values at
different time periods. Therefore, significance of the coefficient of a dummy variable means
that the level of the series tends to change at that point in time. Zero and one seasonal
dummies enter in a regression equation in two ways,
A) Estimation with a constant term;
$Y_t = \alpha + \gamma_2 D_2 + \gamma_3 D_3 + \ldots + \gamma_{12} D_{12} + \varepsilon_t$   (1)

If a constant term is included in the regression, one of the dummy variables must be dropped in order to avoid the problem of perfect collinearity. In equation (1), the constant term indicates the level of the dependent variable in the season for which the dummy variable is dropped (in this case D1), while the significance of each coefficient shows the change in the level with respect to the base month (January in this case) due to seasonal factors in that month. For example, in February the dependent variable would be equal to the constant plus $\gamma_2$.
Equation (1) can be estimated by OLS. The base month can be chosen, e.g. Jan. or Dec.,
according to the form of interpretation required by the researcher. The major disadvantage of
this type of seasonal specification is that the model does not tell us anything about the


changes in January or the changes for each month (Jan to Dec.) with respect to the overall
mean of the dependent variable.
Example: Seasonality in shipping freight rates
To see how seasonality in time series data is detected, consider the following example on
shipping freight rates. In this case we collected monthly freight rates for panamax vessels for
the period January 1980 to October 1999. We also found the logarithmic changes in monthly
freight rates using the following formula

$\Delta \ln pmx_t = \ln(pmx_t) - \ln(pmx_{t-1})$

and using a series of zero and one dummies we run the following regression

$\Delta \ln pmx_t = \gamma_1 + \gamma_2 D_{2,t} + \gamma_3 D_{3,t} + \ldots + \gamma_{12} D_{12,t} + \varepsilon_t$

It can be seen that the mean change in the log of the panamax freight rate (growth rate) over the sample period, including the January effect, was -0.0586, while there have been significant seasonal increases in freight rates during February, March, September, October and November. Furthermore, the adjusted R-squared suggests that around 12% of the variation in freight rates can be attributed to seasonal factors.

Dependent Variable: D1LPMX
Method: Least Squares
Sample(adjusted): 1980:02 1999:10
Included observations: 237 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -0.058652     0.034769     -1.686898     0.0930
S2            0.094271     0.048552      1.941649     0.0534
S3            0.154036     0.048552      3.172590     0.0017
S4            0.042431     0.048552      0.873930     0.3831
S5            0.071848     0.048552      1.479803     0.1403
S6           -0.042651     0.048552     -0.878450     0.3806
S7           -0.055968     0.048552     -1.152735     0.2502
S8            0.038163     0.048552      0.786029     0.4327
S9            0.086492     0.048552      1.781416     0.0762
S10           0.159315     0.048552      3.281314     0.0012
S11           0.094613     0.049171      1.924169     0.0556
S12           0.035060     0.049171      0.713024     0.4766

R-squared            0.163509    Mean dependent var     -0.002016
Adjusted R-squared   0.122614    S.D. dependent var      0.161798
S.E. of regression   0.151554    Akaike info criterion  -0.886438
Sum squared resid    5.167967    Schwarz criterion      -0.710840
Log likelihood       117.0429    F-statistic             3.998249
Durbin-Watson stat   2.399570    Prob(F-statistic)       0.000024

If we try to exclude insignificant variables one by one, starting from the most insignificant
dummy, we finally end up with the following model


Dependent Variable: D1LPMX
Method: Least Squares
Sample(adjusted): 1980:02 1999:10
Included observations: 237 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -0.000512     0.012093     -0.042361     0.9662
S3            0.095897     0.035975      2.665678     0.0082
S6           -0.100790     0.035975     -2.801698     0.0055
S7           -0.114107     0.035975     -3.171879     0.0017
S10           0.101176     0.035975      2.812414     0.0053

R-squared            0.137857    Mean dependent var     -0.002016
Adjusted R-squared   0.122992    S.D. dependent var      0.161798
S.E. of regression   0.151522    Akaike info criterion  -0.915304
Sum squared resid    5.326449    Schwarz criterion      -0.842139
Log likelihood       113.4636    F-statistic             9.274215
Durbin-Watson stat   2.403237    Prob(F-statistic)       0.000001


1.6 Appendix 1.A. Matrix Representation of the Multiple Regression Model
Let us now present the regression equation using matrix notation, as it makes some calculations much easier than the ordinary notation. We know that if there are n observations we can write the regression equation as

$Y_1 = \beta_1 + \beta_2 X_{11} + \beta_3 X_{21} + \ldots + \beta_k X_{k1} + \varepsilon_1$
$Y_2 = \beta_1 + \beta_2 X_{12} + \beta_3 X_{22} + \ldots + \beta_k X_{k2} + \varepsilon_2$
$\vdots$
$Y_n = \beta_1 + \beta_2 X_{1n} + \beta_3 X_{2n} + \ldots + \beta_k X_{kn} + \varepsilon_n$

This means that we can write our parameters and variables in the following matrix form

$Y = X\beta + \varepsilon$

where

$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$,  $X = \begin{bmatrix} 1 & X_{11} & X_{21} & \cdots & X_{k1} \\ 1 & X_{12} & X_{22} & \cdots & X_{k2} \\ \vdots & & & & \vdots \\ 1 & X_{1n} & X_{2n} & \cdots & X_{kn} \end{bmatrix}$

it can be seen that:
Y is an (n x 1) vector of the dependent variable
X is an (n x k) matrix of independent variables
$\beta$ is a (k x 1) vector of regression parameters
$\varepsilon$ is an (n x 1) vector of error terms
Furthermore, the expected value of $\varepsilon$ is zero, $E(\varepsilon) = 0$, while the variance of $\varepsilon$ can be written as

$E(\varepsilon\varepsilon') = \begin{bmatrix} E(\varepsilon_1^2) & E(\varepsilon_1\varepsilon_2) & \cdots & E(\varepsilon_1\varepsilon_n) \\ E(\varepsilon_2\varepsilon_1) & E(\varepsilon_2^2) & \cdots & E(\varepsilon_2\varepsilon_n) \\ \vdots & & \ddots & \vdots \\ E(\varepsilon_n\varepsilon_1) & E(\varepsilon_n\varepsilon_2) & \cdots & E(\varepsilon_n^2) \end{bmatrix}$

which is in fact

$Var(\varepsilon) = E(\varepsilon\varepsilon') = \begin{bmatrix} Var(\varepsilon_1) & Cov(\varepsilon_1,\varepsilon_2) & \cdots & Cov(\varepsilon_1,\varepsilon_n) \\ Cov(\varepsilon_2,\varepsilon_1) & Var(\varepsilon_2) & \cdots & Cov(\varepsilon_2,\varepsilon_n) \\ \vdots & & \ddots & \vdots \\ Cov(\varepsilon_n,\varepsilon_1) & Cov(\varepsilon_n,\varepsilon_2) & \cdots & Var(\varepsilon_n) \end{bmatrix}$

The above result follows because, according to the underlying assumptions of the OLS, $Var(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$ and $Cov(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$.
Let us now look at how the parameters can be estimated using matrix notation. Once again
consider that we are minimising the ESS.
$ESS = \sum_{i=1}^{n}\hat{\varepsilon}_i^2 = \hat{\varepsilon}'\hat{\varepsilon}$

where $\hat{\varepsilon} = Y - \hat{Y}$ and $\hat{Y} = X\hat{\beta}$, therefore we have

$\hat{\varepsilon}'\hat{\varepsilon} = (Y - \hat{Y})'(Y - \hat{Y}) = (Y - X\hat{\beta})'(Y - X\hat{\beta}) = Y'Y - \hat{\beta}'X'Y - Y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} = Y'Y - 2\hat{\beta}'X'Y + \hat{\beta}'X'X\hat{\beta}$

the FOC yields

$\dfrac{\partial ESS}{\partial \hat{\beta}} = -2X'Y + 2X'X\hat{\beta} = 0 \;\Rightarrow\; \hat{\beta} = (X'X)^{-1}(X'Y)$

The cross-product matrix, X'X, will have an inverse if and only if the matrix X is of full rank; that is, there is no exact linear relationship between any two columns of the matrix. This condition is known as the non-collinearity condition (as opposed to perfect collinearity or multicollinearity). If there is collinearity between the explanatory variables in the matrix X, the matrix is of reduced rank and does not have an inverse. Furthermore, X'X should be positive definite for the SOC to hold.
SOC:

$\dfrac{\partial^2 ESS}{\partial\hat{\beta}\,\partial\hat{\beta}'} = 2X'X$

$X'X = \begin{bmatrix} n & \sum X_{1i} & \sum X_{2i} & \cdots & \sum X_{ki} \\ \sum X_{1i} & \sum X_{1i}^2 & \sum X_{1i}X_{2i} & \cdots & \sum X_{1i}X_{ki} \\ \sum X_{2i} & \sum X_{2i}X_{1i} & \sum X_{2i}^2 & \cdots & \sum X_{2i}X_{ki} \\ \vdots & & & \ddots & \vdots \\ \sum X_{ki} & \sum X_{ki}X_{1i} & \sum X_{ki}X_{2i} & \cdots & \sum X_{ki}^2 \end{bmatrix}$   (2.1)

The above cross-product matrix is positive definite in the sense that its leading principal minors are all positive:

$H_1 = n > 0$,  $H_2 = n\sum X_{1i}^2 - (\sum X_{1i})^2 > 0$,  and so on until  $H_k = |X'X| > 0$.

Therefore, the condition for a minimum of the ESS with respect to the regression parameters holds.
From the above we can see that

$X'\hat{\varepsilon} = X'(Y - X\hat{\beta}) = X'Y - X'X\hat{\beta}$

and since $\hat{\beta} = (X'X)^{-1}(X'Y)$, we can write

$X'\hat{\varepsilon} = X'(Y - X\hat{\beta}) = X'Y - X'X(X'X)^{-1}(X'Y) = X'Y - X'Y = 0$

which satisfies the underlying assumption of the OLS.

CHAPTER 2

Diagnostic Tests of Regression Models


2 Diagnostic Tests of Regression Models


2.1 Introduction
Empirical research, which is considered to be an interactive process, begins with a
specification of the economic relationship to be estimated and continues with selection of an
appropriate specification for this relationship. The model selection process usually involves
several issues. For example, the variables to be included in the model, the functional form
connecting these variables, and if the data are time series, the dynamic structure of the
relationship between the variables.
Inevitably, there is uncertainty regarding the appropriateness of the initial specification of the
econometric model. Therefore, once the model is estimated, different diagnostic tests should
be performed in order to evaluate the quality of model specification along a number of
dimensions. The results of the diagnostic process in turn influence the chosen specification, and the process is repeated until the model with the best performance in terms of diagnostics is found.
In what follows we discuss an extensive menu of specification test statistics that are used in econometric modelling. Nowadays many of these tests have become an integral part of most econometric packages.

2.2 Heteroscedasticity
One of the main underlying assumptions of the classical linear regression model (CLRM) is the assumption of homoscedasticity, which requires the disturbance terms appearing in the population regression to have a constant variance of the form

$E(\varepsilon_i^2) = \sigma^2$

as opposed to heteroscedasticity, of the form

$E(\varepsilon_i^2) = \sigma_i^2$

where giving a subscript i to sigma implies that the variance is no longer constant.


Consider the example of savings and income, which are assumed to be related in the following form

$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$

A plot of savings against income shows that as income increases, savings on average also increase. However, it might also be the case that as income increases, not only does the average level of savings increase but the variance of savings increases too. This can be seen in graphical form in the figure below, where the density of the disturbances is more concentrated for low income levels than for high income levels.

[Figure: Density of savings around the regression line $\beta_1 + \beta_2 X$ plotted against income, showing greater dispersion of savings at higher income levels]
2.2.1 The effect of heteroscedasticity on OLS estimates of CLRM
We know that when CLRM is estimated, one of the objectives is to perform hypothesis tests
on coefficients of the regression in order to, e.g. investigate the validity of an economic
theory. We also know that the standard errors of the estimated coefficients of the following
linear regression model

$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$

are calculated using the following formulas

$SE(\hat{\beta}_1) = \sqrt{Var(\hat{\beta}_1)} = \sqrt{\dfrac{\sigma^2 \sum X_{1i}^2}{n\sum(X_{1i}-\bar{X})^2}}$   and   $SE(\hat{\beta}_2) = \sqrt{Var(\hat{\beta}_2)} = \sqrt{\dfrac{\sigma^2}{\sum(X_{1i}-\bar{X})^2}}$

where $\sum(X_{1i}-\bar{X})^2 = \sum X_{1i}^2 - (\sum X_{1i})^2/n$.

Note that we use an estimate of $\sigma^2$, $\hat{\sigma}^2 = \sum\hat{\varepsilon}_i^2/(n-2)$, instead of $\sigma^2$ itself to calculate the standard errors. Now, if heteroscedasticity is present, then $\hat{\sigma}^2$ will be a biased estimator of $\sigma_i^2$, and as a result the confidence intervals used in hypothesis tests are no longer valid. In other words, conclusions drawn from hypothesis tests on regression coefficients in the presence of heteroscedasticity are not valid.
In fact, a Monte Carlo study by Davidson and MacKinnon (1993), based on a bivariate regression model, shows that the standard errors of the coefficients when heteroscedasticity is ignored are wider than those corrected for heteroscedasticity. Their model is based on the following regression

$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$,   $\varepsilon_i \sim N(0, X_i^{\alpha})$

which shows that the variance of the error terms is not constant and depends on the values of $X_i$. The power $\alpha$ means that the relationship between the variance of the error terms and the independent variable might be nonlinear. Davidson and MacKinnon assume that $\beta_1 = \beta_2 = 0$ and run a series of simulations with different values of $\alpha$ to show that the standard error of the estimated coefficients depends on the presence of heteroscedasticity. They report the following results.

                 Standard Error of β1              Standard Error of β2
Value of α     OLS      OLSHET     GLS           OLS      OLSHET     GLS
0.5            0.164    0.134      0.110         0.285    0.277      0.243
1.0            0.142    0.101      0.048         0.246    0.247      0.173
2.0            0.116    0.074      0.0073        0.200    0.220      0.109
3.0            0.100    0.064      0.0013        0.173    0.206      0.056
4.0            0.089    0.059      0.0003        0.154    0.195      0.017

It can be seen clearly that OLS consistently overestimates the standard errors of the estimates compared to the Generalised Least Squares (GLS) method. Even when the OLS estimates of the standard errors are corrected for heteroscedasticity, the result is not as good as the GLS. Therefore, it is better to use GLS in the presence of heteroscedasticity; when such a method cannot be applied, heteroscedasticity-corrected standard errors should be used and reported for hypothesis testing to ensure valid conclusions. We will discuss how standard errors can be corrected for heteroscedasticity and how the GLS method can be used for estimating models with heteroscedastic errors after we explain the tests for detecting heteroscedasticity.
2.2.2 Detecting Heteroscedasticity
In order to correct standard error for the effects of heteroscedasticity in error terms, first we
need to investigate whether error terms show signs of heteroscedasticity. There are a number
of methods proposed in the literature over the years by econometricians. In what follows we
discuss a few of these tests and show how each one can be performed using examples in
Eviews.
Graphical method
Sometimes heteroscedasticity in error terms can easily be detected if they are plotted against
another variable. If error terms show a particular pattern, then heteroscedasticity might be

24

present. The shape of the diagram may also suggest that what form of heteroscedasticity is
present in error terms.
However, what we need is a formal statistical test which can be formulated and examined using a known distribution and critical values.
heteroscedasticity are based on finding a relationship between the variance of error terms and
an explanatory variable.
Goldfeld and Quandt test
Goldfeld and Quandt (1972) suggest that in a linear regression variance of error terms might
be related to the explanatory variable; that is

$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$

where

$\sigma_i^2 = \sigma^2 X_i^2$

Therefore, if the above is true, then large (small) variances are associated with large (small) values of X. Goldfeld and Quandt (1972) therefore split the sample into two and investigate the equality of the variances of the error terms in the two subsamples, $\sigma_1^2$ and $\sigma_2^2$. This can be written as

$H_0: \sigma_1^2 = \sigma_2^2$   against   $H_1: \sigma_1^2 \neq \sigma_2^2$
Hence they suggest the following steps to investigate the validity of the above relationship:
1- Order all observations according to the values of $X_i$, starting with the lowest value of $X_i$,
2- Choose c central observations; omitting these observations, divide the remaining sample into two subsamples with $(n-c)/2$ observations each,
3- Run two separate OLS regressions on the two subsamples, and compute the ESS for each regression, ESS1 and ESS2. We know that each of these ESS has $(n-c)/2 - k$ degrees of freedom, where k is the number of estimated parameters (e.g. for the bivariate regression k = 2),
4- Compute the ratio of the ESSs as

$\lambda = \dfrac{ESS_1/df}{ESS_2/df} \sim F_{(df,\,df)}$   (note that the larger ESS, here taken to be ESS1, is placed in the numerator)

5- Compare the observed F with $F_{crit}$ to test the hypothesis of whether heteroscedasticity is present or not.
One should note that this test largely depends on the sample size and the way c, the central
observation(s) is chosen.
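A minimal sketch of these steps in Python is given below (illustrative only; the function name, data and default choices are assumptions, not part of the original notes):

import numpy as np
from scipy import stats

def goldfeld_quandt(y, x, drop_middle=4):
    """Goldfeld-Quandt test for a bivariate regression of y on x."""
    order = np.argsort(x)                      # step 1: sort by the regressor
    y, x = y[order], x[order]
    n = len(y)
    m = (n - drop_middle) // 2                 # step 2: drop central observations

    def ess(ys, xs):                           # step 3: ESS of a bivariate OLS fit
        X = np.column_stack([np.ones(len(xs)), xs])
        resid = ys - X @ np.linalg.lstsq(X, ys, rcond=None)[0]
        return resid @ resid

    ess1, ess2 = ess(y[:m], x[:m]), ess(y[-m:], x[-m:])
    df = m - 2                                 # k = 2 parameters in each sub-regression
    F = max(ess1, ess2) / min(ess1, ess2)      # step 4: ratio of ESSs (larger on top)
    p_value = 1 - stats.f.cdf(F, df, df)       # step 5: compare with the F distribution
    return F, p_value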


Example: consumption and income (Gujarati 11.3)


The following data are collected to investigate the relationship between consumption and income across 30 families. The first half of the table presents the original data, while the second half presents the data sorted by the values of income (inc_r).
Original data                          Data sorted by income
con   inc     con   inc                con_r  inc_r    con_r  inc_r
55    80      115   180                55     80       115    180
65    100     140   225                70     85       130    185
70    85      120   200                75     90       135    190
80    110     145   240                65     100      120    200
79    120     130   185                74     105      140    205
84    115     152   220                80     110      144    210
98    130     144   210                84     115      152    220
95    140     175   245                79     120      140    225
90    125     180   260                90     125      137    230
75    90      135   190                98     130      145    240
74    105     140   205                95     140      175    245
110   160     178   265                108    145      189    250
113   150     191   270                113    150      180    260
125   165     137   230                110    160      178    265
108   145     189   250                125    165      191    270

Next we drop 4 middle observations and perform two separate regressions on the remaining
sub-samples.

$Cons_i = \beta_1 + \beta_2 Inc_i + \varepsilon_i$

Dependent Variable: CONSUMPTIONR
Sample: 1 13
Included observations: 13

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            3.409429      8.704924     0.391667      0.7028
INCOMER      0.696774      0.074366     9.369531      0.0000

R-squared            0.888651    Mean dependent var      83.53846
Adjusted R-squared   0.878528    S.D. dependent var      16.80087
S.E. of regression   5.855582    Akaike info criterion   6.513306
Sum squared resid    377.1663    Schwarz criterion       6.600221
Log likelihood      -40.33649    F-statistic             87.78810
Durbin-Watson stat   2.123530    Prob(F-statistic)       0.000001

Dependent Variable: CONSUMPTIONR
Sample: 18 30
Included observations: 13

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -28.02717     30.64214     -0.914661     0.3800
INCOMER       0.794137     0.131582      6.035307     0.0001

R-squared            0.768054    Mean dependent var      155.8462
Adjusted R-squared   0.746969    S.D. dependent var      23.49768
S.E. of regression   11.81986    Akaike info criterion   7.918077
Sum squared resid    1536.800    Schwarz criterion       8.004993
Log likelihood      -49.46750    F-statistic             36.42493
Durbin-Watson stat   1.476579    Prob(F-statistic)       0.000085

The F statistic can be constructed as

$\lambda = \dfrac{1536.8/11}{377.17/11} = 4.07 \sim F_{(11,11)}$,   $F^{crit}_{(11,11),\,5\%} = 2.8179$

Therefore, it can be concluded that there is significant heteroscedasticity in the above model based on the Goldfeld-Quandt test.
White test
An alternative test for heteroscedasticity is proposed by White (1980). The main advantage of this test is that, unlike the Breusch-Pagan-Godfrey test, it does not depend on the normality of the error terms, and unlike the Goldfeld-Quandt test it does not require manipulating the sample. However, the principle of this test is very similar to that of the Breusch-Pagan-Godfrey test. The steps which need to be taken in order to perform the White test for heteroscedasticity on the following regression are:

$Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \varepsilon_i$

1- Estimate the model using OLS and save the residuals, $\hat{\varepsilon}_i$,
2- Then regress the squared residuals on a set of variables using the following auxiliary regression

$\hat{\varepsilon}_i^2 = \alpha_1 + \alpha_2 X_{2,i} + \alpha_3 X_{3,i} + \alpha_4 X_{2,i}^2 + \alpha_5 X_{3,i}^2 + \alpha_6 X_{2,i}X_{3,i} + v_i$

we use different combinations of the $X_{k,i}$ in order to detect any form of relationship between the variance of the residuals and the explanatory variables. Note that higher powers of $X_{k,i}$ can also be included in the auxiliary regression.
3- It is not difficult to see that, if heteroscedasticity is present, it should be detected by the auxiliary regression and at least one or more of the coefficients of this regression should be significant. On the other hand, if there is no heteroscedasticity in the residuals, then $H_0: \alpha_2 = \ldots = \alpha_6 = 0$ is true.
4- In order to test the above hypothesis, White argues that $H_0$ can be tested against the alternative using the product of the number of observations (n) and the $R^2$ of the auxiliary regression as

$n \cdot R^2 \;\overset{asy}{\sim}\; \chi^2_{df}$

where df is the number of degrees of freedom, which in the above example is 5.
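The auxiliary-regression approach can be sketched in Python as follows (an illustration under the two-regressor setup above; the function name and inputs are assumptions, not from the notes):

import numpy as np
from scipy import stats

def white_test(resid, X2, X3):
    """White's heteroscedasticity test: regress squared residuals on levels,
    squares and the cross-product, then use n*R-squared of that regression."""
    n = len(resid)
    Z = np.column_stack([np.ones(n), X2, X3, X2**2, X3**2, X2 * X3])
    u2 = resid**2
    gamma = np.linalg.lstsq(Z, u2, rcond=None)[0]
    fitted = Z @ gamma
    r2 = 1 - ((u2 - fitted) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
    stat = n * r2                      # n * R-squared of the auxiliary regression
    df = Z.shape[1] - 1                # 5 slope terms in the auxiliary regression
    p_value = 1 - stats.chi2.cdf(stat, df)
    return stat, p_value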


Example: We run the regression of consumption on income once again

$Cons_i = \beta_1 + \beta_2 Inc_i + \varepsilon_i$

Dependent Variable: CONSUMPTION
Method: Least Squares
Sample: 1 30
Included observations: 30

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            9.290307      5.231386     1.775879      0.0866
INCOME       0.637785      0.028617     22.28718      0.0000

R-squared            0.946638    Mean dependent var      119.7333
Adjusted R-squared   0.944732    S.D. dependent var      39.06134
S.E. of regression   9.182968    Akaike info criterion   7.336918
Sum squared resid    2361.153    Schwarz criterion       7.430332
Log likelihood      -108.0538    F-statistic             496.7183
Durbin-Watson stat   1.702261    Prob(F-statistic)       0.000000

Once residuals are obtained, we regress the squared residuals on income and squared income
as follows
$\hat{\varepsilon}_i^2 = \alpha_1 + \alpha_2 Inc_i + \alpha_3 Inc_i^2 + v_i$

which yields the following results
Dependent Variable: RESID_MAINSQR
Method: Least Squares
Sample: 1 30
Included observations: 30

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -12.29621     191.7731     -0.064119     0.9493
INCOME        0.197385     2.368760      0.083329     0.9342
INCOMESQR     0.001700     0.006707      0.253503     0.8018

R-squared            0.177697    Mean dependent var      78.70511
Adjusted R-squared   0.116785    S.D. dependent var      112.5823
S.E. of regression   105.8043    Akaike info criterion   12.25570
Sum squared resid    302252.7    Schwarz criterion       12.39582
Log likelihood      -180.8355    F-statistic             2.917301
Durbin-Watson stat   0.791307    Prob(F-statistic)       0.071274

And the White test statistic would be

$White = n \cdot R^2 \;\overset{asy}{\sim}\; \chi^2_{df}$

$White = 30 \times 0.1777 = 5.331$

The 5% critical value for this test, which is distributed as chi-square with 2 df (two variables in the auxiliary regression), is 5.99. Therefore, based on the White test, we cannot reject the null of homoscedasticity at the 5% significance level; that is, there is no significant evidence of heteroscedasticity in this specification.
ARCH test
Sometimes the squared residuals do not show a relationship with any variable; however, they may depend on their own past values. In other words, the squared residuals might be autocorrelated. This is another form of heteroscedasticity, which is called Autoregressive Conditional Heteroscedasticity (ARCH). ARCH effects are quite common in time series analysis or regressions where variables are measured over time.


The first test for ARCH was proposed by Engle (1982), who suggests testing the squared residuals for any serial dependence or autocorrelation using the following regression

$\hat{\varepsilon}_t^2 = \alpha_0 + \alpha_1 \hat{\varepsilon}_{t-1}^2 + \ldots + \alpha_k \hat{\varepsilon}_{t-k}^2 + v_t$

The existence of ARCH can be tested through the significance of the coefficients of the lagged squared residuals using an F test or a Wald test. Therefore, the null of no ARCH is $\alpha_1 = \ldots = \alpha_k = 0$, and the alternative of the existence of ARCH is that at least one or more of these coefficients are significant.
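A compact Python sketch of this test is given below (an LM-style version based on n*R-squared of the auxiliary regression; the function name and defaults are illustrative assumptions):

import numpy as np
from scipy import stats

def arch_lm_test(resid, lags=3):
    """Engle's ARCH test: regress squared residuals on their own lags and
    use n*R-squared of that auxiliary regression (asymptotically chi-squared)."""
    u2 = resid**2
    Z = np.column_stack([u2[lags - i - 1: len(u2) - i - 1] for i in range(lags)])
    Z = np.column_stack([np.ones(len(Z)), Z])
    y = u2[lags:]
    gamma = np.linalg.lstsq(Z, y, rcond=None)[0]
    fitted = Z @ gamma
    r2 = 1 - ((y - fitted) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    stat = len(y) * r2
    return stat, 1 - stats.chi2.cdf(stat, lags)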

Example:
We run the regression of consumption on income once again, but this time we use proper time subscripts, as income and consumption are measured over time

$Cons_t = \beta_1 + \beta_2 Inc_t + \varepsilon_t$

Dependent Variable: CONSUMPTION
Method: Least Squares
Sample: 1 30
Included observations: 30

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            9.290307      5.231386     1.775879      0.0866
INCOME       0.637785      0.028617     22.28718      0.0000

R-squared            0.946638    Mean dependent var      119.7333
Adjusted R-squared   0.944732    S.D. dependent var      39.06134
S.E. of regression   9.182968    Akaike info criterion   7.336918
Sum squared resid    2361.153    Schwarz criterion       7.430332
Log likelihood      -108.0538    F-statistic             496.7183
Durbin-Watson stat   1.702261    Prob(F-statistic)       0.000000

As usual we save residuals and square them for any tests of heteroscedasticity, and run the
following autoregression.
$\hat{\varepsilon}_t^2 = \alpha_0 + \alpha_1 \hat{\varepsilon}_{t-1}^2 + \ldots + \alpha_k \hat{\varepsilon}_{t-k}^2 + v_t$

the result is
Dependent Variable: RESID_MAINSQR
Method: Least Squares
Date: 06/03/02 Time: 16:48
Sample(adjusted): 4 30
Included observations: 27 after adjusting endpoints
Convergence achieved after 3 iterations

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            82.75053      32.33636     2.559055      0.0175
AR(1)        0.931161      0.211672     4.399066      0.0002
AR(2)        -0.436685     0.305868     -1.427689     0.1668
AR(3)        -0.066562     0.271402     -0.245254     0.8084

R-squared            0.492564    Mean dependent var      82.42946
Adjusted R-squared   0.426377    S.D. dependent var      118.1802
S.E. of regression   89.50718    Akaike info criterion   11.96247
Sum squared resid    184265.3    Schwarz criterion       12.15444
Log likelihood      -157.4933    F-statistic             7.441975
Durbin-Watson stat   1.978088    Prob(F-statistic)       0.001179

The value of the F test and its significance shown in the table suggest that there is an ARCH effect in the consumption-income model. However, the coefficients of the auxiliary regression and their significance levels suggest that the ARCH effect is of order one or two.
The ARCH test can be performed directly from the EViews regression menu via View/Residual Tests/ARCH LM Test; you will notice that the result is the same.
2.2.3 What to do in the presence of heteroscedasticity
So far we have explained the effects of heteroscedasticity on estimated parameters and the different tests that can be used to detect it. But what can we do about it? In what follows we discuss a few methods that can be used as remedial measures to overcome the problems associated with models where heteroscedasticity is present.
The method of Generalised Least Squares
The methods of Generalised Least Squares (GLS) and Weighted Least Squares (WLS) are used to estimate a regression when the error terms are heteroscedastic. In this approach, we divide the variables in the regression by the independent variable which causes the heteroscedasticity. For instance, in the following model

$Y_i = \beta_1 + \beta_2 X_{2,i} + \ldots + \beta_k X_{k,i} + \varepsilon_i$,   $\varepsilon_i \sim NID(0, \sigma_i^2)$

let us assume that

$\sigma_i^2 = var(\varepsilon_i) = C X_{2,i}^2$

where C is a constant which relates the variance to the independent variable $X_{2,i}^2$. Using a procedure similar to the one followed in the case of known variance, we redefine the variables as

$Y_i^* = \dfrac{Y_i}{X_{2,i}}$,  $X_{2,i}^* = \dfrac{X_{2,i}}{X_{2,i}} = 1$,  $X_{k,i}^* = \dfrac{X_{k,i}}{X_{2,i}}$,  and  $\varepsilon_i^* = v_i = \dfrac{\varepsilon_i}{X_{2,i}}$

therefore, the transformed regression will be

$\dfrac{Y_i}{X_{2,i}} = \beta_1\dfrac{1}{X_{2,i}} + \beta_2 + \ldots + \beta_k\dfrac{X_{k,i}}{X_{2,i}} + \dfrac{\varepsilon_i}{X_{2,i}}$,   $\dfrac{\varepsilon_i}{X_{2,i}} \sim NID(0, C)$

It can be seen that the parameters of this equation should be interpreted with care when we need to explain the relationships of the original regression, as

$Y_i^* = \beta_1\dfrac{1}{X_{2,i}} + \beta_2 X_{2,i}^* + \ldots + \beta_k X_{k,i}^* + \varepsilon_i^*$,   $\varepsilon_i^* \sim NID(0, C)$

It can be seen that in the above regression $X_{2,i}^*$ is constant (equal to one) and $\beta_1$ reflects the impact of $(1/X_{2,i})$ on $Y_i^*$.
Note that when $\sigma_i^2$ is known to the researcher, i.e. the form of the heteroscedasticity is clear, the researcher should take heteroscedasticity into account when estimating the model. This can easily be done using a method known as Weighted Least Squares (WLS), in which we divide the regression through by the known variance, $\sigma_i^2$.
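A minimal Python sketch of the transformation described above, assuming $var(\varepsilon_i) = C X_i^2$ in a bivariate model (function name and inputs are illustrative, not from the notes):

import numpy as np

def wls_by_regressor(y, x):
    """WLS sketch for var(eps_i) = C * x_i^2: divide the bivariate regression
    y = b1 + b2*x + eps through by x, so the transformed errors have constant variance."""
    y_star = y / x                                      # transformed dependent variable
    Z = np.column_stack([1.0 / x, np.ones(len(x))])     # regressors: 1/x and a constant
    coef = np.linalg.lstsq(Z, y_star, rcond=None)[0]
    b1, b2 = coef[0], coef[1]       # intercept and slope of the ORIGINAL model
    return b1, b2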
Example: Housing expenditure (Pindyck and Rubinfeld 6.1)
In this cross section study the relationship between housing expenditures and annual incomes
of four groups of families is investigated.
Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20
Included observations: 20

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            0.890000      0.204312     4.356086      0.0004
INCOME       0.237200      0.014921     15.89724      0.0000

R-squared            0.933511    Mean dependent var      3.855000
Adjusted R-squared   0.929817    S.D. dependent var      1.408050
S.E. of regression   0.373021    Akaike info criterion   0.960274
Sum squared resid    2.504600    Schwarz criterion       1.059847
Log likelihood      -7.602738    F-statistic             252.7223
Durbin-Watson stat   1.363966    Prob(F-statistic)       0.000000

And when we test for heteroscedasticity using the EViews menu we get

White Heteroskedasticity Test:
F-statistic      5.979575   Probability   0.010805
Obs*R-squared    8.259324   Probability   0.016088

This means the residuals are heteroscedastic. Now, let us assume that the squared residuals are
related to the income level; we can then weight the variables to obtain the new variables and
rerun the regression in the following form

Dependent Variable: EXPENDITURE/INCOME
Method: Least Squares
Sample: 1 20, Included observations: 20
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.249487      0.011723     21.28124      0.0000
1/INCOME    0.752923      0.098255     7.662934      0.0000
R-squared           0.765382   Mean dependent var    0.327917
Adjusted R-squared  0.752348   S.D. dependent var    0.051375
S.E. of regression  0.025567   Akaike info criterion -4.400404
Sum squared resid   0.011766   Schwarz criterion     -4.300831
Log likelihood      46.00404   F-statistic           58.72056
Durbin-Watson stat  1.493042   Prob(F-statistic)     0.000000

In the above regression the intercept, C = 0.249, is in fact the coefficient corresponding to income
in the original model, and it is quite similar to the original estimate of 0.237. Also note that you can
implement the correction directly in EViews by choosing the WLS option in the regression menu;
this will give you the following result
Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20, Included observations: 20
Weighting series: INCOME
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.131742      0.440283     2.570486      0.0193
INCOME      0.221935      0.025634     8.657760      0.0000
Weighted Statistics
R-squared           0.974215   Mean dependent var    4.448000
Adjusted R-squared  0.972782   S.D. dependent var    3.158963
S.E. of regression  0.521160   Akaike info criterion 1.629122
Sum squared resid   4.888948   Schwarz criterion     1.728695
Log likelihood      -14.29122  F-statistic           680.0714
Durbin-Watson stat  1.217059   Prob(F-statistic)     0.000000
Unweighted Statistics
R-squared           0.928268   Mean dependent var    3.855000
Adjusted R-squared  0.924283   S.D. dependent var    1.408050
S.E. of regression  0.387450   Sum squared resid     2.702117
Durbin-Watson stat  1.143174

White Correction for Heteroscedasticity

White (1980) has derived a heteroscedasticity-consistent covariance matrix estimator which
provides correct estimates of the coefficient covariances in the presence of heteroscedasticity
of unknown form. This is quite important, as it is not always the case that the form of
heteroscedasticity is known. The White covariance matrix is given by

Σ_W = [T/(T−k)] (X'X)⁻¹ ( Σ_{t=1}^{T} e_t² x_t x_t' ) (X'X)⁻¹

where T is the number of observations, k is the number of regressors, and e_t is the least squares
residual.
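For readers working outside EViews, a minimal Python sketch of the same correction is given below, using statsmodels' heteroscedasticity-robust covariance option (here the HC1 variant of the White estimator). The data are hypothetical and only mimic the expenditure-income example.

import numpy as np
import statsmodels.api as sm

# Hypothetical expenditure/income data with heteroscedastic errors
rng = np.random.default_rng(1)
income = rng.uniform(5, 20, 40)
expenditure = 0.9 + 0.24 * income + rng.normal(0, 0.02 * income)

X = sm.add_constant(income)
fit_ols   = sm.OLS(expenditure, X).fit()                 # conventional standard errors
fit_white = sm.OLS(expenditure, X).fit(cov_type='HC1')   # White-type robust standard errors

print(fit_ols.bse)     # usual OLS standard errors
print(fit_white.bse)   # White-corrected standard errors (coefficients are unchanged)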
The expenditure and income model above is re-estimated, but this time we use the
White heteroscedasticity-corrected covariance to get the following result

Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20, Included observations: 20
White Heteroskedasticity-Consistent Standard Errors & Covariance
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.890000      0.157499     5.650847      0.0000
INCOME      0.237200      0.016710     14.19495      0.0000
R-squared           0.933511   Mean dependent var    3.855000
Adjusted R-squared  0.929817   S.D. dependent var    1.408050
S.E. of regression  0.373021   Akaike info criterion 0.960274
Sum squared resid   2.504600   Schwarz criterion     1.059847
Log likelihood      -7.602738  F-statistic           252.7223
Durbin-Watson stat  1.363966   Prob(F-statistic)     0.000000

2.3 Serial correlation

Another important underlying assumption for the OLS estimates of the CLRM to be BLUE was
the assumption of no serial correlation (autocorrelation) in the residuals. In the following
sections we study the effects of serial correlation on OLS estimates, different tests for
detecting serial correlation, and remedies for such effects in regression residuals.
2.3.1 The effect of serial correlation on regression estimates
Serial correlation often occurs in time series models, where consecutive residuals are usually
found to be dependent. The effect may also occur in cross-sectional models where the collected
observations are somehow ordered by size or geography. Serial correlation can be positive or
negative. A positive serial correlation implies that an error in prediction today has a positive
effect on the prediction error tomorrow. This mainly occurs in time series analysis either because
there are some errors in the measurement of variables or because of a high degree of
correlation between consecutive observations.
Serial correlation does not affect the unbiasedness and consistency of OLS estimators,
but it affects their efficiency. This problem is more serious when lags of the dependent
variable are also present as explanatory variables.
In the presence of (positive) serial correlation the estimated standard errors of the coefficients will
be smaller than what they should be, i.e. there is a downward bias in the estimates of the SEs. This
consequently leads to invalid inferences and hypothesis tests: the downward bias will cause the
null to be rejected more often than it should be.

2.3.2 How to detect serial correlation

Several tests have been proposed in the literature to investigate the presence of serial
correlation in regression residuals. Some of them are widely used, and many econometric
packages nowadays report some of these tests as standard regression output. In the following
sections we examine a few of them.
Graphical Method
The simplest way to detect serial correlation in a regression model is to plot the residuals and
see whether a consistent pattern exists; for example, positive residuals followed by a
series of positive residuals, and negative residuals followed by negative residuals. Such a pattern
is against the assumption of randomness of the residuals (being white noise). This can be seen in
the following line, which is in fact the sign of the residual series.
(- - - - - -) (+ + + + + +) (- - - -) (+ + + + + + + + +)
In graphical form, one can use the scatter plot of current residuals against lagged ones;
if any relationship is observed between the two, the error terms might be serially
correlated. Alternatively, one can plot the residuals against time and see whether a
consistent pattern exists.
[Figure: residuals with serial correlation (R3 residuals, 1960-1995) and residuals with no serial correlation (RAACB, Dec 1999-Feb 2001).]

Durbin Watson test

Perhaps the most important, and the first, test proposed in the literature to examine the
existence of serial correlation is that of Durbin and Watson (1951). This test is for first order
serial correlation and is therefore not a valid test when a lagged dependent variable is used
as an explanatory variable.
Let us consider the following regression

Y_t = β1 + β2 X_2,t + ... + βk X_k,t + ε_t,   ε_t ~ (0, σ²)

where the residuals are correlated in the following form

ε_t = ρ ε_{t−1} + v_t

where v_t is a white noise error term. If serial correlation of the first order is present then ρ ≠ 0,
and if it is not, ρ = 0.

The Durbin-Watson test involves calculation of the test statistic based on the residuals from
the OLS regression, and it is defined as

DW = Σ_{t=2}^{T} (e_t − e_{t−1})² / Σ_{t=1}^{T} e_t²

The DW statistic is approximately

DW ≈ 2(1 − ρ̂)

The DW statistic lies between 0 and 4:
when ρ = 1 or close to 1, then DW = 0 or close to 0;
when ρ = 0 or close to 0, then DW = 2 or close to 2;
when ρ = −1 or close to −1, then DW = 4 or close to 4.

A DW value of 2 means no serial correlation. Now, if the successive values of e_t are close to
each other, the DW statistic will be smaller than 2, indicating the presence of positive serial
correlation. Similarly, if there is negative serial correlation, DW will be larger than 2.
For the exact interpretation of the DW test statistic, one needs critical values between 0 and 4
to perform the test. Therefore, two limits are usually given with the test statistic, which depend
on the number of explanatory variables as well as the number of observations; these are called
dl and du.
For example, when investigating the possibility of positive serial correlation, a value of DW
below dl means that the null of no serial correlation should be rejected, and if DW is greater
than du then the null of no serial correlation should not be rejected. When DW is between dl
and du the result is inconclusive.
In the case of negative serial correlation the decision is taken by comparing DW with the
base of 4. Therefore, when 4−dl < DW < 4, reject the null of no serial correlation, and when
2 < DW < 4−du, we cannot reject the null of no serial correlation.

In summary
Value of DW               Result and decision
4−dl < DW < 4             Reject the null of no serial correlation (negative)
4−du < DW < 4−dl          Inconclusive result
2 < DW < 4−du             Accept the null of no serial correlation
du < DW < 2               Accept the null of no serial correlation
dl < DW < du              Inconclusive result
0 < DW < dl               Reject the null of no serial correlation (positive)
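The statistic itself is simple to compute from a vector of residuals. The short Python sketch below computes DW by its definition and checks it against the ready-made routine in statsmodels; the residual series used here is simulated, purely for illustration.

import numpy as np
from statsmodels.stats.stattools import durbin_watson

# Hypothetical positively autocorrelated residual series
rng = np.random.default_rng(2)
e = np.zeros(200)
for t in range(1, 200):
    e[t] = 0.9 * e[t - 1] + rng.normal()

dw_manual = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # definition of the DW statistic
print(dw_manual, durbin_watson(e))                     # both well below 2, as 2*(1 - rho_hat) suggests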


Example: Interest rate determination (Pindyck and Rubinfeld)

This example investigates the relationship between the US three-month T-bill rate and the Federal
Reserve Board index of industrial production, as well as the money supply:
IR = 3-month annualised US T-bill rate (%)
IP = Federal Reserve Board Index of industrial production (1982=100)
M2 = Nominal Money Supply ($bn)
PW = Producer Price Index for all commodities (1982=100)

We use the percentage changes of M2 and PW in the model:

GM2_t = (M2_t − M2_{t−1}) / M2_t   and   GPW_t = (PW_t − PW_{t−1}) / PW_t

The theory suggests that

IR_t = f(IP_t, GM2_t, GPW_{t−1})

thus, the econometric model would be

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)

estimating the above econometric model results in

Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.551692     2.200644      0.0283
IP          0.048353      0.005503     8.786636      0.0000
GM2         140.3261      36.03850     3.893784      0.0001
GPW(-1)     104.5884      17.44218     5.996295      0.0000
R-squared           0.216361   Mean dependent var    6.145764
Adjusted R-squared  0.210816   S.D. dependent var    2.792815
S.E. of regression  2.481026   Akaike info criterion 4.664523
Sum squared resid   2609.927   Schwarz criterion     4.702459
Log likelihood      -994.2079  F-statistic           39.02177
Durbin-Watson stat  0.183733   Prob(F-statistic)     0.000000

It can be seen that the value of the DW statistic is quite low (0.183733). This suggests that the error
terms are positively correlated. We can see this graphically by plotting the residuals against
time: when residuals are positive they tend to remain positive, and when
they are negative they tend to remain negative.

[Figure: R3 residuals plotted against time, 1960-1995, showing long runs of positive and negative residuals.]

This means that the estimates should be corrected for the presence of serial correlation,
otherwise not only inferences would be invalid but also there might be some problems with
using this model for forecasting.
Ljung-Box test for serial correlation (autocorrelation)
Ljung and Box (1978) propose a test for autocorrelation, which is is defined as
k2

LB

T (T

2)
k 1

2
p

In large sample LB statistics follows the chi-squared distribution with p degrees of freedom.
Sometimes the LB test is called as LB-Q test statistic rather than LB and it should not be
confused with Box-Pierce Q statistic. LB test can be performed on residuals of the regression
directly in EViews.
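Outside EViews, a hedged Python sketch of the same test is shown below using statsmodels' Ljung-Box routine; the residual series is simulated here only to make the example self-contained, and the exact column names of the returned table may differ slightly across statsmodels versions.

import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Hypothetical autocorrelated residual series
rng = np.random.default_rng(3)
resid = np.zeros(300)
for t in range(1, 300):
    resid[t] = 0.7 * resid[t - 1] + rng.normal()

# Q statistics and p-values for lags 1..12
lb = acorr_ljungbox(resid, lags=12, return_df=True)
print(lb)   # small p-values indicate rejection of the null of no autocorrelation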

Example: Interest rate determination (Pindyck and Rubinfeld)

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)

estimating the above econometric model results in

Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.551692     2.200644      0.0283
IP          0.048353      0.005503     8.786636      0.0000
GM2         140.3261      36.03850     3.893784      0.0001
GPW(-1)     104.5884      17.44218     5.996295      0.0000
R-squared           0.216361   Mean dependent var    6.145764
Adjusted R-squared  0.210816   S.D. dependent var    2.792815
S.E. of regression  2.481026   Akaike info criterion 4.664523
Sum squared resid   2609.927   Schwarz criterion     4.702459
Log likelihood      -994.2079  F-statistic           39.02177
Durbin-Watson stat  0.183733   Prob(F-statistic)     0.000000

The LB-Q test can be performed directly in EViews from the regression menu under residual tests
(View/Residual Tests/Correlogram Q-statistics) to get

Sample: 1960:01 1995:08, Included observations: 428
Lag    AC      PAC      Q-Stat    Prob
1      0.907   0.907    354.32    0.000
2      0.856   0.188    670.53    0.000
3      0.818   0.100    959.96    0.000
4      0.773   -0.013   1219.2    0.000
5      0.760   0.168    1470.8    0.000
6      0.732   -0.017   1704.5    0.000
7      0.712   0.048    1925.9    0.000
8      0.713   0.134    2148.5    0.000
9      0.700   0.022    2363.7    0.000
10     0.661   -0.164   2556.0    0.000
11     0.637   0.016    2735.1    0.000
12     0.617   0.047    2903.7    0.000

For example, the LB-Q statistic at lag 2 can be reproduced as

LB-Q(2) = 428(428 + 2) [ (0.907)²/427 + (0.856)²/426 ] = 184040 (0.0019266 + 0.00172) = 670.53

which exceeds the 5% critical value of χ²₂ = 5.99, so the null of no autocorrelation up to lag 2 is rejected.

2.3.3 Correction for serial correlation

So far we have learned about the effects of serial correlation on regression estimates and how
to detect serial correlation in regression residuals. But how do we take this effect into account
when we run regressions? In other words, what are the remedial measures for serial
correlation? In the following sections we discuss a few methods proposed in the literature to
correct regression results in the presence of serial correlation.
Newey-West correction for serial correlation
The White covariance matrix described earlier assumes that the residuals of the estimated
equation are serially uncorrelated:

Σ_W = [T/(T−k)] (X'X)⁻¹ ( Σ_{t=1}^{T} e_t² x_t x_t' ) (X'X)⁻¹

Newey and West (1987) have proposed a more general covariance matrix estimator that is
consistent in the presence of both heteroskedasticity and autocorrelation of unknown form.
The Newey-West estimator is given by

Σ_NW = (X'X)⁻¹ Ω (X'X)⁻¹

where

Ω = [T/(T−k)] { Σ_{t=1}^{T} e_t² x_t x_t'  +  Σ_{v=1}^{p} [1 − v/(p+1)] Σ_{t=v+1}^{T} ( x_t e_t e_{t−v} x'_{t−v} + x_{t−v} e_{t−v} e_t x_t' ) }

and p is the lag truncation parameter. In most econometric packages nowadays the Newey-West correction for serial correlation
and heteroscedasticity can be applied automatically. The variance-covariance matrix estimated using
the Newey-West method is called the heteroscedasticity/autocorrelation consistent covariance matrix
(HAC).
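For reference, a minimal Python sketch of a HAC (Newey-West) correction is shown below using statsmodels; the regressors and the autocorrelated error term are simulated, and the lag truncation of 5 simply mirrors the EViews output that follows.

import numpy as np
import statsmodels.api as sm

# Hypothetical regressors and an AR(1) error term
rng = np.random.default_rng(4)
T = 428
x = rng.normal(size=(T, 2))
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.9 * u[t - 1] + rng.normal()
y = 1.0 + x @ np.array([0.5, -0.3]) + u

X = sm.add_constant(x)
fit_ols = sm.OLS(y, X).fit()
fit_hac = sm.OLS(y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 5})   # Newey-West, lag truncation 5

print(fit_ols.bse)   # conventional standard errors
print(fit_hac.bse)   # HAC standard errors, typically larger when errors are positively autocorrelated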

Example: Interest rate determination (Pindyck and Rubinfeld)

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)

estimating the above econometric model results in

Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.551692     2.200644      0.0283
IP          0.048353      0.005503     8.786636      0.0000
GM2         140.3261      36.03850     3.893784      0.0001
GPW(-1)     104.5884      17.44218     5.996295      0.0000
R-squared           0.216361   Mean dependent var    6.145764
Adjusted R-squared  0.210816   S.D. dependent var    2.792815
S.E. of regression  2.481026   Akaike info criterion 4.664523
Sum squared resid   2609.927   Schwarz criterion     4.702459
Log likelihood      -994.2079  F-statistic           39.02177
Durbin-Watson stat  0.183733   Prob(F-statistic)     0.000000

Correcting the covariance matrix of the coefficients using the consistent covariance estimate in
EViews we get

Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428
Newey-West HAC Standard Errors & Covariance (lag truncation=5)
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.705989     1.719684      0.0862
IP          0.048353      0.009516     5.081429      0.0000
GM2         140.3261      56.52930     2.482361      0.0134
GPW(-1)     104.5884      28.62085     3.654274      0.0003
R-squared           0.216361   Mean dependent var    6.145764
Adjusted R-squared  0.210816   S.D. dependent var    2.792815
S.E. of regression  2.481026   Akaike info criterion 4.664523
Sum squared resid   2609.927   Schwarz criterion     4.702459
Log likelihood      -994.2079  F-statistic           39.02177
Durbin-Watson stat  0.183733   Prob(F-statistic)     0.000000

It can be seen that while the coefficients remain the same as before, the SEs are now corrected for
serial correlation using Newey-West, with a lag truncation of 5 as indicated in the output.

Variable    Coefficient   OLS Std. Error   Newey-West Std. Error
C           1.214078      0.551692         0.705989
IP          0.048353      0.005503         0.009516
GM2         140.3261      36.03850         56.52930
GPW(-1)     104.5884      17.44218         28.62085

The Newey-West corrected standard errors are larger than the uncorrected standard errors
in every case, as expected. This means that if the model is not corrected for the presence of
serial correlation, the reported standard errors would be smaller than the true standard errors and any
hypothesis test based on them would be invalid. This also applies to hypothesis tests such
as the F test and the Wald test, where the variance-covariance matrix of the coefficients is used to construct
the test.

2.4 Normality
The assumption of normality is needed for inference in regression models. Under the
assumption of normality the ML estimators are equivalent to the OLS estimators. It is therefore
important that this assumption is tested. There are different tests proposed in the
literature for normality of residuals. Among these are the Jarque-Bera, Shapiro-Wilk,
and Kolmogorov-Smirnov tests. These tests are all based on measuring the departure of the
residuals from normality using the 3rd and 4th moments of the residuals.
2.4.1 Jarque Bera test
To test the normality of a variable (the error terms) we can use the test proposed by Jarque and Bera
(1982). Jarque-Bera is a test statistic for testing whether the series is normally distributed.
The test statistic measures the difference of the skewness and kurtosis of the series from those
of the normal distribution. The statistic is computed as:

JB = [(n − k)/6] [ SK² + (KU − 3)²/4 ]

where SK is the skewness, KU is the kurtosis, n is the number of observations, and k represents the number of estimated
coefficients used to create the series.
Under the null hypothesis of a normal distribution, the Jarque-Bera statistic is distributed as a
chi-squared distribution with 2 degrees of freedom. The reported probability is the
probability that a Jarque-Bera statistic exceeds (in absolute value) the observed value under
the null; a small probability value leads to the rejection of the null hypothesis of a normal
distribution.
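The statistic is easy to compute directly from the formula above. The Python sketch below does so for a simulated fat-tailed series (an assumption made purely for illustration), so the null of normality should be rejected.

import numpy as np
from scipy import stats

# Hypothetical residual series with fat tails
rng = np.random.default_rng(5)
resid = rng.standard_t(df=5, size=500)

n, k = len(resid), 0                       # k = number of estimated coefficients, if any
sk = stats.skew(resid)
ku = stats.kurtosis(resid, fisher=False)   # raw kurtosis; a normal distribution has KU = 3

jb = (n - k) / 6.0 * (sk**2 + (ku - 3.0)**2 / 4.0)
p_value = 1 - stats.chi2.cdf(jb, df=2)
print(jb, p_value)                          # small p-value => reject normality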


CHAPTER 3

Forecasting Time Series


3 Forecasting Time Series


3.1 Introduction
Almost any decision-making application depends on some form of forecast about some
quantity or event in the future. For example, when an institution plans to invest in stocks,
bonds or other financial instruments, it essentially attempts to forecast the future movements
in those instruments and choose the most suitable one(s) in order to maximise profit at the
end of the investment horizon. Another example is when a company attempts to set an
ordering or production schedule for a particular product it sells. The company has to have
some idea about the future demand for this product in order to set an appropriate
ordering or production schedule. Other examples include policy decisions, asset allocation,
investments, etc.
One of the objectives of estimating econometric and time series models is forecasting. A
forecast is a quantitative estimate of the likelihood of future events or the outcome of a
process, based on current and past information. The information embedded in a model
is based on the history of the process or events.
There are generally two types of forecasts, a point forecast, which predicts a single value in
each forecast period, and an interval forecast, which is a range of values with certain
assigned probability.

3.2 Forecasting Models

1- Judgmental forecasting
2- Extrapolation or simple time series methods
3- Econometric or causal methods
4- Combination forecasting

3.3 Extrapolation or Simple Time Series Methods

We start with time series models. These are simple quantitative extrapolation
methods that use past data of a time series, and sometimes a time trend, to forecast future values
of the variable. The idea is that the variable follows some pattern, and we try to determine the
pattern and project it into the future. There are many extrapolation methods and here we focus on
only a few important and widely used ones.
As mentioned, in time series forecasting we try to determine the existing pattern in historical
data. This can be done in a number of ways. First, we can plot the series and investigate the
graphical pattern of the series. Second, we can use statistical analysis such as correlation
coefficients to examine the dependence of the variable on its past values. We can also look at
whether there are any seasonal patterns in the time series, model such seasonal movements,
and then forecast. Let us look at some time series.

3.3.1 Random series

A random variable is a variable whose value changes randomly (stochastically), perhaps
around a constant mean, μ, with constant variance. The values of a random variable are
independent of each other. Also, note that the constant mean can be zero.

Y_t = μ + ε_t,   ε_t ~ iid(0, σ²)

where ε_t is an identically and independently distributed error term with zero mean and constant variance,
which is also called white noise.

3.3.2 Random Walk series

A random walk series is a series in which

Y_t = Y_{t−1} + ε_t,   ε_t ~ iid(0, σ²)

By the same token, a random walk with drift is

Y_t = μ + Y_{t−1} + ε_t,   ε_t ~ iid(0, σ²)

As an example consider the time series of Dow Jones index over the period January 1988 to
March 1992 on a monthly basis.
The time series plot of the variable shows that it seems to wander around a
constant upward trend. However, this is not enough to show that the Dow Jones Index follows a
random walk process. We must assess the autocorrelation function of this variable to gain
more insight into whether the Dow Jones is a RW process. It can be seen that the coefficients of
autocorrelation are all close to one (0.912, 0.8161 and so on) and significant. This can be
regarded as an indication that this variable is close to being a RW process. Furthermore,
looking at the correlogram of the series (the plot of the coefficients of autocorrelation) gives a
graphical assessment of the nature of the autocorrelation of the series. In this case, it can be seen
that the coefficients of autocorrelation die out very slowly, meaning that the series has a long
memory.
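The contrast between a random walk and its first difference can also be seen numerically. The short sketch below, with simulated data standing in for the Dow index, compares the sample autocorrelations of a random walk with drift and of its first difference.

import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(6)
eps = rng.normal(size=300)
rw = np.cumsum(0.5 + eps)        # random walk with drift: Y_t = 0.5 + Y_{t-1} + eps_t
diff = np.diff(rw)               # first difference: drift plus white noise

print(acf(rw, nlags=5))          # autocorrelations close to one, dying out very slowly
print(acf(diff, nlags=5))        # autocorrelations close to zero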

[Figure: time series chart of the Dow Jones index, Jan 1988 to Mar 1992, and its autocorrelations for lags 1-12 (significant values highlighted).]

On the other hand, when we study the first differences of the Dow Jones index over the sample period,
we find that the series wiggles around a constant mean, which indicates that it is highly mean reverting.
Also, looking at the autocorrelation function and correlogram of the returns, we note that the series is not
correlated; that is, the observations are independent of each other. This in turn suggests that the return on the Dow
Jones index is a random variable, while the Dow Jones index level is a random walk process.

[Figure: time series chart of the first difference of the Dow Jones index (Dow_Diff1) and its autocorrelations for lags 1-12 (significant values highlighted).]

3.3.3 Series with Linear Trend


A series with linear trend means that the variable under consideration changes its value by
certain amount every period. This change can be positive or negative. Negative trend means
that the series is declining by a constant amount every period. For example, consider the
quarterly sales data for Reebok from the first quarter 1986 through the second quarter 1996.
The following figure illustrates the Reebok sales.
[Figure: time series chart of Reebok quarterly sales, observations 1-42.]

It can be seen that the series is increasing around a constant (linear) trend. Therefore, one can
use a constant trend to model the series. This results in the following model:

Y_t = α + βt + ε_t,   ε_t ~ iid(0, σ²)

The estimation result is

Y_t = 244.8154 + 16.5304 t
      (8.6206)   (14.3665)

where the figures in parentheses are t-statistics.

Results of simple regression for Sales

Summary measures
Multiple R     0.9152
R-Square       0.8377
StErr of Est   90.3844

ANOVA table
Source        df    SS             MS             F          p-value
Explained     1     1686121.5545   1686121.5545   206.3964   0.0000
Unexplained   40    326773.3761    8169.3344

Regression coefficients
              Coefficient   Std Err   t-value    p-value
Constant      244.8154      28.3989   8.6206     0.0000
Time          16.5304       1.1506    14.3665    0.0000

The fitted values of the linear trend model can be seen in the following figure.

[Figure: Reebok sales and fitted values from the linear trend model.]

Now if we need to forecast quarterly sales for Reebok, we simply project the trend values
into the future and hope that they give a good indication of future Reebok sales.
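A minimal sketch of this procedure in Python is shown below. The sales figures are simulated to resemble the Reebok example, so the numbers are illustrative only; the model fitted and projected is exactly the linear trend above.

import numpy as np
import statsmodels.api as sm

# Hypothetical quarterly sales with a linear trend (42 observations)
rng = np.random.default_rng(7)
t = np.arange(1, 43)
sales = 245 + 16.5 * t + rng.normal(0, 90, size=t.size)

X = sm.add_constant(t)
fit = sm.OLS(sales, X).fit()
print(fit.params)                          # intercept and trend coefficient

# Project the trend four quarters beyond the sample
t_future = np.arange(43, 47)
forecast = fit.params[0] + fit.params[1] * t_future
print(forecast)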

[Figure: Reebok sales, fitted values, and projected (forecast) trend values.]

3.3.4 Exponential Trend Model

In some series the trend might not be linear; therefore, the linear trend model may not be able
to capture or explain the trend in the series. Examples of series with exponential trend or
decay are stock indices, price indices, GDP, etc. In the simplest form, these series may be
modelled using exponential trend models:

Y_t = α β^t ε_t,   ln ε_t ~ iid(0, σ²)

If we take the log of both sides,

ln Y_t = ln α + t ln β + ln ε_t

This model suggests that the value of the variable Y_t increases by a certain percentage every
period. As an example, consider the quarterly sales data for the computer chip manufacturer
Intel for the period 1986 to 1996.

[Figure: time series chart of Intel quarterly sales, 1986-1996.]

3.3.5 Autocorrelated series

Y_t = α + ρ1 Y_{t−1} + ε_t

where |ρ1| < 1. Note that if ρ1 = 1, then we have a random walk series.
Similarly, the order of autocorrelation may be higher than one. In this case the series can be
written as

Y_t = α + ρ1 Y_{t−1} + ρ2 Y_{t−2} + ... + ρp Y_{t−p} + ε_t   or   Y_t = α + Σ_{i=1}^{p} ρi Y_{t−i} + ε_t

As an example, consider the monthly stereo sales (in $000) between January 1995 and
December 1998. Using StatPro we produce the autocorrelation function and correlogram for
the series.

[Figure: autocorrelations of monthly stereo sales for lags 1-12 (significant values highlighted).]

Since the autocorrelation analysis shows that stereo sales in consecutive periods are dependent on
each other, we can regress current sales, Y_t, on a constant, α, and one-period lagged sales, Y_{t−1},
to obtain an autoregressive model for sales in the following form

Y_t = α + ρ1 Y_{t−1} + ε_t

Estimating the above model using StatPro or EViews gives the following output.

It can be seen that the coefficient on one-period lagged sales is estimated as 0.3495 and it is
significant. Therefore, we can write the time series model for stereo sales as

Y_t = 117.8574 + 0.3495 Y_{t−1}
      (4.6533)   (2.5606)

In the next step we can use the above autoregressive model, known as an AR(1) model, to
forecast future stereo sales recursively in the following form

Y^f_{t+1} = 117.8574 + 0.3495 Y_t
Y^f_{t+2} = 117.8574 + 0.3495 Y^f_{t+1}
...
Y^f_{t+n} = 117.8574 + 0.3495 Y^f_{t+n−1}

We should note that in order to forecast Y_{t+n} we need to know the value of observation t+n−1,
that is Y_{t+n−1}. This means that, in order to forecast Y_{t+n}, first we need to forecast Y_{t+n−1} and
then obtain the forecast for Y_{t+n}. Therefore, for forecasting Y_{t+n} we need all the values of
Y_{t+1} to Y_{t+n−1}. If we forecast Y_{t+n} using the forecast values of Y_{t+1} to Y_{t+n−1}, then we
have a dynamic forecast for Y_{t+n}. Alternatively, we can forecast sales one period at a
time, wait for the actual sales value to be realised, and then forecast one period ahead again.
This method is called a static forecast. We will talk more about this later.
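The dynamic (recursive) forecast described above is only a few lines of code. The sketch below implements it with the AR(1) coefficients quoted above; the starting sales value is hypothetical.

# Dynamic forecasts from the estimated AR(1) model Y_t = 117.8574 + 0.3495*Y_{t-1}
alpha, rho = 117.8574, 0.3495

def ar1_dynamic_forecast(last_obs, horizon):
    """Recursive AR(1) forecasts, each step feeding on the previous forecast."""
    forecasts = []
    y = last_obs
    for _ in range(horizon):
        y = alpha + rho * y          # one-step-ahead forecast from the latest value
        forecasts.append(y)
    return forecasts

print(ar1_dynamic_forecast(last_obs=180.0, horizon=6))   # hypothetical last observed sales value

Note how the forecasts converge towards the unconditional mean 117.8574/(1 − 0.3495) as the horizon grows, which is a general feature of stationary AR(1) forecasts.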

[Figure: actual monthly stereo sales, Jan 1995-Dec 1998, and AR(1) forecasts extended through 1999.]

3.4 Moving Average (MA) models

This is a widely used and simple alternative model for forecasting time series, based on the
principle of averaging past values as a prediction of future values. To implement this
methodology, first we choose a span, which is the number of lagged periods used in averaging.
Therefore, assuming a span of p periods, the forecast of Y for time t+1, Y^f_{t+1}, is determined as
the average of Y_{t−p+1} to Y_t in the following form:

Y^f_{t+1} = (Y_{t−p+1} + ... + Y_t) / p

Similarly, to forecast Y_{t+2},

Y^f_{t+2} = (Y_{t−p+2} + ... + Y_{t+1}) / p

and in general

Y^f_{t+k} = (Y_{t−p+k} + ... + Y_{t+k−1}) / p

Now, if we only have observations up to time t, the forecast of Y_{t+2} should be based on the
forecast value of Y_{t+1}, Y^f_{t+1}. Therefore,

Y^f_{t+2} = (Y_{t−p+2} + ... + Y^f_{t+1}) / p
Y^f_{t+3} = (Y_{t−p+3} + ... + Y^f_{t+1} + Y^f_{t+2}) / p

and so on.

Since this method of forecasting takes the average of the observations over a period, by
construction it makes the resulting series smoother than the original series; in fact, this is why
the method is called smoothing. Also note that as you increase the span, the series becomes
smoother. Therefore, one should note that if the fluctuations in the series are not purely random
but are part of the series, applying a large span reduces the predictive power. Similarly, if
the series is not very volatile and changes are mainly random and not part of the series, then a
larger span can be used.
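A small sketch of the procedure is given below, assuming hypothetical index values; once the actual observations run out, the forecasts are fed back into the averaging window exactly as in the formulas above.

import numpy as np

def moving_average_forecast(y, span, horizon):
    """Forecast by repeatedly averaging the last `span` values,
    feeding forecasts back in once actual observations run out."""
    history = list(y)
    forecasts = []
    for _ in range(horizon):
        f = np.mean(history[-span:])
        forecasts.append(f)
        history.append(f)
    return forecasts

# Hypothetical index levels; compare a short and a long span
y = [2000, 2050, 2030, 2100, 2150, 2120, 2180, 2210]
print(moving_average_forecast(y, span=3, horizon=4))
print(moving_average_forecast(y, span=6, horizon=4))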
Let us look at such forecasts for the Dow index using spans of 3 and 12 periods. The following
figures illustrate the actual, fitted and forecast values of the Dow index. It can be noticed that the
MA(3) method follows the original series more closely than MA(12). In fact, MA(12) over-smooths the series.
[Figures: MA(3) and MA(12) forecasts for the Dow index, Jan 1988-Sep 1993, plotted against the actual series.]

3.4.1 Exponential Weighted Moving Average (EWMA) model

A major problem which arises when using simple smoothing techniques is that equal
weights are given to each observation in the span period. This may not be appropriate for
many time series, because a time series might depend more on the recent past
values than on values further in the past. Taking this characteristic of time series into
account, we need to allocate different weights, perhaps exponentially declining weights,
when we calculate our moving averages. There are different methods of applying EWMA
(exponential smoothing).

3.4.2 Single Smoothing

The simplest one is known as single smoothing (single parameter smoothing). This method
involves choosing a single parameter (α) in the following exponential smoothing model

Y^f_{t+1} = α Y_t + α(1−α) Y_{t−1} + α(1−α)² Y_{t−2} + α(1−α)³ Y_{t−3} + ...

where Y^f_{t+1} is the forecast of Y for the next period based on the information available at time t; that
is, the historical values of Y (Y_t, Y_{t−1}, Y_{t−2}, ...). Also note that 0 < α < 1.
It is not difficult to show that the above model can be written more simply as

Y^f_{t+1} = α Y_t + (1−α) Y^f_t

If α is chosen to be close to 1 then the model resembles a RW model; that is, a
stronger weight is given to the very recent values of Y, while small values of α imply that the
weights decay very slowly. Let us see how exponential smoothing can be done in EViews.
Consider the Coca-Cola quarterly sales series over the period 1986:1 to 1996:2.

[Figure: time series chart of Coca-Cola quarterly sales, observations 1-42.]
Applying the simple exponential smoothing model to this data set, using an estimation period of
1986:1 to 1994:2, choosing the first option (single smoothing) and allowing the program to choose the best
estimate of α, we obtain a new series called SALESSM, which is the fitted series.
Plotting the actual and fitted series gives the following graph.

[Figure: actual Coca-Cola sales (SALES) and single-smoothing fitted/forecast series (SALESSM), 1986-1999.]

Note that for periods after the last observation, i.e. forecasts beyond the end of the sample, the
procedure yields the same forecast for all future periods.

3.4.3 Exponential Smoothing with Trend (Holt Model)

This procedure is used when the time series is believed to have a constant trend. Therefore, we
use two parameters, α and β. The former is the same as before, used for exponential
weighting, while the second parameter defines the trend, in the following form:

L_t = α Y_t + (1−α)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1−β) T_{t−1}
Y^f_{t+k} = L_t + k T_t

Let us see how this works in Excel on the Coca-Cola data, and then we perform the same model
in EViews. Note that the formulas in columns G, H and I are simply the above set of formulas.
The only issue is the initial values for L1 and T1. The former can simply be Y1, while the
latter can be estimated using Solver to minimise one of the criteria given: MAE,
RMSE or MAPE.

The graph of actual and fitted values is presented below.


[Figure: actual Coca-Cola sales and Holt-model fitted values computed in Excel.]

Now, performing the same Holt procedure (ES with trend) in EViews yields the following
plot of actual and fitted sales values for Coca-Cola.

[Figure: actual (SALES) and fitted (SALESSM) Coca-Cola sales from the Holt procedure in EViews, 1986-1996.]

Once the parameters of the Holt procedure are estimated, it is not difficult to perform a
recursive forecast for the period 1996Q3 to 1999Q4 in Excel or EViews.

[Figure: actual Coca-Cola sales (SALES) and Holt fitted/forecast values (SALESSM) extended to 1999Q4.]

3.5 Seasonal Time Series

As we have seen earlier, time series sometimes show a regular pattern over the calendar. This
type of regularity in a time series is known as seasonality. Seasonality might be detected in data
of any frequency, e.g. semi-annual, quarterly or monthly data, or even in daily data when we
consider days of the week, or intra-day data when we consider hours of the day as regular
periods.
Seasonality in a time series can be modelled using seasonal dummy variables. These dummy
variable models can then be used on their own, or within some other more detailed econometric
models, to produce forecasts for the series.

For example, consider the following data, which represent the quarterly sales of Coca-Cola
between Q1 1986 and Q2 1996. A plot of the sales values indicates a distinct regular
movement in the series over the sample period. The graph also shows that there is an upward
trend in the sales values.

[Figure: time series chart of Coca-Cola quarterly sales, observations 1-42.]

There are two methods that can be used to forecast seasonal series. These are:
1- The Exponential Smoothing Method of Holt-Winters
2- The Seasonal Dummy Regression Model
3.5.1 Holt-Winters Model for seasonal time series
The Holt-Winters model for seasonal time series is a little more complex than the Holt model, in that
it uses three smoothing parameters, α, β and γ, rather than two. Also, seasonality can be
treated in the Holt-Winters method as additive or multiplicative.
The difference between the two treatments is the way seasonal changes are applied to the
mean of the series. Suppose we are dealing with a monthly seasonal series, the mean of the
series (or the mean for the base season) is 100, and the analysis shows that the seasonal change in
March is +30 and in June is −20. In the additive treatment the changes are added to the mean
of the series (or the mean of the base season); therefore, the figures for March and June will
be 130 and 80, respectively. On the other hand, if the treatment is multiplicative, then
assuming that the seasonal factors for March and June are 1.3 and 0.8, the level of the series
for March and June will be 100×1.3 = 130 and 100×0.8 = 80, respectively.
The Additive Holt-Winters Exponential Smoothing Method uses the following set of
equations to forecast a seasonal series:

L_t = α(Y_t − S_{t−M}) + (1−α)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1−β) T_{t−1}
S_t = γ(Y_t − L_t) + (1−γ) S_{t−M}
Y^f_{t+k} = L_t + k T_t + S_{t+k−M}

where L_t is the permanent component (level), T_t is the trend component, S_t is the additive
seasonal component, and M refers to the number of seasons (quarterly = 4, monthly = 12, etc.).

The Multiplicative Holt-Winters Exponential Smoothing Method uses the following set of
equations to forecast a seasonal series:

L_t = α(Y_t / S_{t−M}) + (1−α)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1−β) T_{t−1}
S_t = γ(Y_t / L_t) + (1−γ) S_{t−M}
Y^f_{t+k} = (L_t + k T_t) S_{t+k−M}

Let us see how this method works in EViews; note that you can also perform this easily using
StatPro. We use the same data series, i.e. Coca-Cola sales.
Using α = 0.2, β = 0.2 and γ = 0.2 as smoothing parameters we get the following results. Notice
that these parameters can be changed to optimise some criterion such as the RMSE. The program
can do this automatically, but it usually minimises these criteria in such a way that one or two
of the parameters become zero.
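Outside EViews, a hedged sketch of the same exercise can be run in Python with statsmodels' Holt-Winters implementation. The quarterly data below are simulated to mimic a trending, additively seasonal sales series; here the smoothing parameters are chosen by the routine rather than fixed at 0.2, and the exact names of the fitted parameters may vary slightly across statsmodels versions.

import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical quarterly sales with trend and additive seasonality (42 quarters)
rng = np.random.default_rng(8)
t = np.arange(42)
seasonal = np.tile([-350.0, 330.0, 210.0, -190.0], 11)[:42]
y = 1500 + 75 * t + seasonal + rng.normal(0, 150, 42)

model = ExponentialSmoothing(y, trend='add', seasonal='add', seasonal_periods=4,
                             initialization_method='estimated')
fit = model.fit()              # smoothing parameters estimated by the routine
print(fit.params)              # fitted level/trend/seasonal smoothing parameters
print(fit.forecast(8))         # forecasts for the next eight quarters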
Sample: 1986:1 1996:2, Included observations: 42
Method: Holt-Winters Additive Seasonal
Original Series: SALES
Forecast Series: SALESSM_AD
Parameters:  Alpha   0.2000
             Beta    0.2000
             Gamma   0.2000
Sum of Squared Residuals   1947573.
Root Mean Squared Error    215.3388
End of Period Levels:  Mean        4841.597
                       Trend       112.4040
                       Seasonals:  1995:3    211.0728
                                   1995:4   -190.6499
                                   1996:1   -354.7695
                                   1996:2    334.3466

We can see from the result that the seasonal components are roughly Q1 = −354, Q2 = 334, Q3 = 211 and
Q4 = −190. We can also plot the actual and fitted values of the Holt-Winters model with
additive seasonal components.

Holt-Winters ES with additive seasonal components for Coca-Cola sales
[Figure: actual and fitted values, additive Holt-Winters model.]

Sample: 1986:1 1996:2, Included observations: 42
Method: Holt-Winters Multiplicative Seasonal
Original Series: SALES
Forecast Series: SALESSM_MU
Parameters:  Alpha   0.2000
             Beta    0.2000
             Gamma   0.2000
Sum of Squared Residuals   1539869.
Root Mean Squared Error    191.4773
End of Period Levels:  Mean        4837.382
                       Trend       111.8119
                       Seasonals:  1995:3    1.063009
                                   1995:4    0.950468
                                   1996:1    0.887070
                                   1996:2    1.099454

Holt-Winters ES with multiplicative seasonal components for Coca-Cola sales
[Figure: actual and fitted values, multiplicative Holt-Winters model.]

3.5.2 Regression Analysis of Seasonality

In order to model a seasonal time series using regression analysis, as mentioned before, we
use seasonal dummies. These are zero-one dummies which take the value of one for a
particular season (period) and zero otherwise. Therefore, first we construct the seasonal
dummies (S_1,t, S_2,t, S_3,t and S_4,t for quarterly observations, or S_1,t to S_12,t for monthly
observations) and then run the following regression model, which also includes a linear trend
since the sales series trends upwards:

Y_t = β1 + γt + β2 S_2,t + β3 S_3,t + β4 S_4,t + ε_t,   ε_t ~ iid(0, σ²)

Note that we use the dummies S_2,t, S_3,t and S_4,t and drop S_1,t; this is to avoid perfect
multicollinearity with the constant. For monthly series, we would use the dummies S_2,t to S_12,t.
Dependent Variable: SALES
Method: Least Squares
Sample: 1986:1 1996:2, Included observations: 42
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1169.499      193.6863     6.038108      0.0000
TREND       72.97756      7.590032     9.614922      0.0000
S2          639.1542      77.45190     8.252272      0.0000
S3          505.0406      100.2065     5.039997      0.0000
S4          172.9101      73.16934     2.363149      0.0235
R-squared           0.916479   Mean dependent var    2994.353
Adjusted R-squared  0.907450   S.D. dependent var    977.9309
S.E. of regression  297.5065   Akaike info criterion 14.34009
Sum squared resid   3274874.   Schwarz criterion     14.54696
Log likelihood      -296.1419  F-statistic           101.5009
Durbin-Watson stat  0.440827   Prob(F-statistic)     0.000000

Estimating the above regression shows that the coefficients of all dummy variables as well as the
trend variable are significant. Therefore, we can calculate fitted values and plot them along with
the actual values. The graph shows that the fit is quite good, and this is confirmed by the
adjusted R-squared of the model, which is 90.7%.
Actual, fitted and residual values of the Seasonal Regression Model for Coca-Cola sales
[Figure.]

3.5.3 Forecasting with regression model

Having obtained the estimates of the coefficients of the trend and dummy variables, the next step is
to use this model to produce forecasts of the time series, Y_t. This can be done recursively, as shown in the sketch below, using equations of the form

Y^f_{t+1} = b1 + g(t+1) + b2 S_2,t+1 + b3 S_3,t+1 + b4 S_4,t+1
Y^f_{t+2} = b1 + g(t+2) + b2 S_2,t+2 + b3 S_3,t+2 + b4 S_4,t+2

and so on, where in each forecast period only the dummy for the corresponding quarter equals one.
EViews does this automatically and produces the graph of the forecasts as well as a series of
statistics which we will discuss later.
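A hedged Python sketch of the whole exercise (dummy construction, estimation and forecasting) is given below; the data are simulated to resemble the Coca-Cola example and the coefficient values are therefore only illustrative.

import numpy as np
import statsmodels.api as sm

# Hypothetical quarterly series with trend and seasonality (42 quarters)
rng = np.random.default_rng(9)
trend = np.arange(1, 43)
quarter = (np.arange(42) % 4) + 1
y = 1170 + 73 * trend + np.select([quarter == 2, quarter == 3, quarter == 4],
                                  [640, 505, 173], default=0) + rng.normal(0, 300, 42)

# Design matrix: constant, trend and dummies for quarters 2-4 (quarter 1 is the base)
X = np.column_stack([np.ones(42), trend,
                     (quarter == 2).astype(float),
                     (quarter == 3).astype(float),
                     (quarter == 4).astype(float)])
fit = sm.OLS(y, X).fit()

# Forecast the next four quarters from the estimated coefficients
t_future = np.arange(43, 47)
q_future = ((t_future - 1) % 4) + 1
X_future = np.column_stack([np.ones(4), t_future,
                            (q_future == 2).astype(float),
                            (q_future == 3).astype(float),
                            (q_future == 4).astype(float)])
print(X_future @ fit.params)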
Actual, fitted and forecast values of Seasonal Regression Model for Coca-Cola sales


CHAPTER 4

Econometric Forecasts & Forecast Evaluation

4 Econometric Forecasts & Forecast Evaluation

4.1 Ex post vs. Ex ante forecast

Depending on the information used, forecasts can also be distinguished into ex-post and ex-ante
forecasts. Both predict the value of the dependent variable(s) beyond the estimation period. In
an ex-post forecast, observations on both endogenous and exogenous variables are known with
certainty, and the performance of the forecasting model can be checked using the actual (realised)
values of the dependent variable(s). In the case of an ex-ante forecast, the values of the exogenous
explanatory variables may or may not be known with certainty.

[Diagram: timeline showing the estimation period, followed by the ex-post forecast period (for which actual data are already available) and then the ex-ante forecast period extending to T.]

4.1.1 Unconditional vs. Conditional Forecasts

In an unconditional forecast, the values of all explanatory variables in the forecasting equation are
known with certainty over the forecasting period. Therefore, an ex-post forecast is an
unconditional forecast, and ex-ante forecasts may be unconditional provided the values of the
explanatory variables are known with certainty over the ex-ante forecast period.
In the case of a conditional forecast, the values of one or more explanatory variables are not
known with certainty over the forecasting period. Therefore, estimates, guesses or forecasts
of these variables must be used to produce forecasts of the dependent variable(s) over the
forecasting period.

4.2 Unconditional forecasting

To produce unconditional forecasts, all the values of the explanatory variables should be
known over the forecasting period. Therefore, one way to perform unconditional forecasts
when out-of-sample data are not available is to shorten the estimation period by
the length of the forecasting period and perform an ex-post forecast.
Sometimes models include explanatory variables lagged several periods, which can be used to
produce unconditional forecasts beyond the end of the sample, as the values of the
explanatory variables are known.
In some cases, models might only include trend, dummy and seasonal dummy variables.
These are also known explanatory variables; therefore, forecasts produced using these types
of models are unconditional too.
4.2.1 Forecast error
Errors associated with econometric forecasts can come from a combination of four distinct
sources:
1- the random error term in the model (the unexplained part of the variation in the dependent variable);
2- the process of estimation of the regression parameters, since even correctly estimated parameters are random variables;
3- in the case of a conditional forecast, the error due to the estimation or prediction of the explanatory variable(s);
4- errors induced by model specification (e.g. estimating a linear instead of a nonlinear model).
Let us consider the following model for unconditional forecasting

Y_t = α + β X_t + ε_t,   ε_t ~ NID(0, σ²),   t = 1, 2, ..., T

The forecasting problem can therefore be posed as: given a known value for X_{T+1}, what
would be the best forecast that can be made for Y_{T+1}?
This can be answered if the values of α and β are known, and if this is the case the
appropriate forecast for Y_{T+1} can be written as

Y^f_{T+1} = E_T(Y_{T+1}) = α + β X_{T+1}

Therefore the error of this forecasting exercise, i.e. the forecast error, can be written as

e_{T+1} = Y_{T+1} − Y^f_{T+1}

The forecast error from the above model has two important and desirable properties:
1- The mean of the forecast error is zero, E(e_{T+1}) = E(Y_{T+1} − Y^f_{T+1}) = E(ε_{T+1}) = 0; this
means that the forecast is unbiased.
2- The variance of the forecast error, defined as σ_f² = E[(e_{T+1})²] = E[(ε_{T+1})²] = σ², has
the minimum variance among all possible forecasts that are based on linear models.

Furthermore, since the forecast errors are normally distributed with mean zero and variance σ_f²,
the significance of the forecast values can be assessed using

(Y_{T+1} − Y^f_{T+1}) / σ_f ~ N(0, 1)

Therefore a confidence interval around the point forecast can be constructed as

Prob[ Y^f_{T+1} − z_{α/2} σ_f  ≤  Y_{T+1}  ≤  Y^f_{T+1} + z_{α/2} σ_f ] = 1 − α

or, equivalently, as Y^f_{T+1} ± z_{α/2} σ_f.

4.3 Evaluating forecasts

The question after estimating a model and making forecasts is how to evaluate these
forecasts. One method is to use the forecast error variance and construct a confidence
interval for the forecasts, as mentioned above.
An alternative method is to produce several forecasts over a period and compare the forecast
values with the realised values. The objective here is to see how well the forecast values track the
actual values of the dependent variable.
A number of methods have been proposed for such comparisons. Among these are the Mean
Absolute Error (MAE), Root Mean Absolute Error (RMAE), Mean Square Error (MSE) and
Root Mean Square Error (RMSE).

Let us consider the following model

Y_t = α + β X_t + ε_t,   ε_t ~ NID(0, σ²),   t = 1, 2, ..., 200

We have 200 observations in total. We use an ex-post forecast; that is,

1- we estimate the model over 180 observations and forecast the value Y^f_181;
2- we increase the sample to 181 observations, estimate the model again, and this time forecast Y^f_182;
3- repeating the same procedure, we end up with 20 one-step-ahead forecasts, Y^f_181, Y^f_182, ..., Y^f_200;
4- these forecast values can then be put next to the actual values.
Forecast value   Actual value   Forecast error        Absolute error         Squared error
Y^f_181          Y^a_181        Y^a_181 − Y^f_181     |Y^a_181 − Y^f_181|    (Y^a_181 − Y^f_181)²
Y^f_182          Y^a_182        Y^a_182 − Y^f_182     |Y^a_182 − Y^f_182|    (Y^a_182 − Y^f_182)²
...              ...            ...                   ...                    ...
Y^f_199          Y^a_199        Y^a_199 − Y^f_199     |Y^a_199 − Y^f_199|    (Y^a_199 − Y^f_199)²
Y^f_200          Y^a_200        Y^a_200 − Y^f_200     |Y^a_200 − Y^f_200|    (Y^a_200 − Y^f_200)²

MAE = (1/20) Σ_{i=181}^{200} |Y^a_i − Y^f_i|          MSE = (1/20) Σ_{i=181}^{200} (Y^a_i − Y^f_i)²

RMAE = sqrt[ (1/20) Σ_{i=181}^{200} |Y^a_i − Y^f_i| ]   RMSE = sqrt[ (1/20) Σ_{i=181}^{200} (Y^a_i − Y^f_i)² ]

5- Therefore, in general, the RMAE and RMSE for evaluating M forecasts can be
found using the following formulas

RMAE = sqrt[ (1/M) Σ_{i=1}^{M} |Y^a_i − Y^f_i| ]        RMSE = sqrt[ (1/M) Σ_{i=1}^{M} (Y^a_i − Y^f_i)² ]

These two statistics can be used to compare the forecasting performance of different models
proposed to forecast a variable. Obviously, the model with the smaller RMSE (RMAE) is
preferred in terms of forecasting performance.
Another method suggested in the literature for evaluating the performance of forecasts is
Theil's inequality coefficient, or Theil's U statistic, which is defined as

U = sqrt[ (1/M) Σ_{i=1}^{M} (Y^a_i − Y^f_i)² ] / { sqrt[ (1/M) Σ_{i=1}^{M} (Y^f_i)² ] + sqrt[ (1/M) Σ_{i=1}^{M} (Y^a_i)² ] }

Although the numerator of Theil's U is the same as the RMSE, the denominator is chosen
in such a way as to confine the statistic between 0 and 1:
1- when Theil's U = 0, Y^a_i = Y^f_i for all i; that is, there is a perfect relation
(fit) between the actual and forecast values;
2- when Theil's U = 1, the predictions are in no way close to the actual values.
Therefore, Theil's U is in fact a standardised RMSE and measures the RMSE in relative
terms.
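The evaluation statistics above are straightforward to compute by hand. The sketch below, using hypothetical actual and forecast values, returns the MAE, RMSE and Theil's U exactly as defined in the formulas above.

import numpy as np

def forecast_metrics(actual, forecast):
    """MAE, RMSE and Theil's U for a set of forecasts and realised values."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    err = actual - forecast
    mae  = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    theil_u = rmse / (np.sqrt(np.mean(forecast ** 2)) + np.sqrt(np.mean(actual ** 2)))
    return mae, rmse, theil_u

# Hypothetical actual and forecast values over 20 periods
rng = np.random.default_rng(10)
actual = 6 + rng.normal(0, 2.5, 20)
forecast = actual + rng.normal(0, 1.0, 20)
print(forecast_metrics(actual, forecast))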


Example: Forecasting returns on Wheat futures

In this exercise we estimate an AR(1) model for the returns on wheat futures contracts over the
period 1/17/1979 to 4/05/2000, then perform one-step-ahead forecasts over the period
4/12/2000 to 4/04/2001 and evaluate the forecasts using the RMSE and Theil's U statistics.

Dependent Variable: DLWHTF
Method: Least Squares
Sample (adjusted): 1/17/1979 4/05/2000
Included observations: 1108 after adjusting endpoints
Convergence achieved after 2 iterations
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.000118     0.000806     -0.146926     0.8832
AR(1)       -0.055441     0.030030     -1.846211     0.0651
R-squared           0.003072   Mean dependent var    -0.000118
Adjusted R-squared  0.002171   S.D. dependent var    0.028348
S.E. of regression  0.028318   Akaike info criterion -4.288863
Sum squared resid   0.886887   Schwarz criterion     -4.279819
Log likelihood      2378.030   F-statistic           3.408496
Durbin-Watson stat  2.000738   Prob(F-statistic)     0.065129
Inverted AR Roots   -.06

The figure below contains the forecast statistics and the plot of the forecasts along with the
±2 standard error band.

Forecast: DLWHTFF   Actual: DLWHTF
Sample: 4/12/2000 4/04/2001, Included observations: 52
Root Mean Squared Error        0.028400
Mean Absolute Error            0.022786
Mean Abs. Percent Error        101.3350
Theil Inequality Coefficient   0.946384
  Bias Proportion              0.005541
  Variance Proportion          0.888654
  Covariance Proportion        0.105806

[Figure: one-step-ahead forecasts of wheat futures returns (DLWHTFF) with ±2 S.E. bands, 4/12/2000-4/04/2001.]

Now consider an alternative model, an ARMA(1,1), which is estimated over exactly the same
estimation period and used to forecast the returns on wheat futures over exactly the same
forecast period.

Dependent Variable: DLWHTF
Method: Least Squares
Sample (adjusted): 1/17/1979 4/05/2000
Included observations: 1108 after adjusting endpoints
Convergence achieved after 13 iterations, Backcast: 1/10/1979
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.000120     0.000800     -0.149741     0.8810
AR(1)       0.065583      0.528506     0.124092      0.9013
MA(1)       -0.121601     0.525721     -0.231303     0.8171
R-squared           0.003154   Mean dependent var    -0.000118
Adjusted R-squared  0.001349   S.D. dependent var    0.028348
S.E. of regression  0.028329   Akaike info criterion -4.287139
Sum squared resid   0.886814   Schwarz criterion     -4.273574
Log likelihood      2378.075   F-statistic           1.747880
Durbin-Watson stat  1.999428   Prob(F-statistic)     0.174624
Inverted AR Roots   .07
Inverted MA Roots   .12

And the forecast statistics are

Forecast: DLWHTFF   Actual: DLWHTF
Sample: 4/12/2000 4/04/2001, Included observations: 52
Root Mean Squared Error        0.028382
Mean Absolute Error            0.022796
Mean Abs. Percent Error        101.4894
Theil Inequality Coefficient   0.945287
  Bias Proportion              0.005628
  Variance Proportion          0.888882
  Covariance Proportion        0.105490

[Figure: ARMA(1,1) one-step-ahead forecasts of wheat futures returns with ±2 S.E. bands, 4/12/2000-4/04/2001.]

Summary of the two sets of forecast statistics:

                               ARMA(1,1)    AR(1)
Root Mean Squared Error        0.028382     0.028411
Mean Absolute Error            0.022796     0.022653
Mean Abs. Percent Error        101.4894     97.62276
Theil Inequality Coefficient   0.945287     0.992609

Based on the above results (the RMSE and Theil's U statistics), it can be concluded that the
ARMA(1,1) model produces better forecasts than the AR(1) model, as the values of both statistics are
lower for the ARMA(1,1) model.


4.3.1 Multi-step ahead forecasts

It is sometimes useful to have forecasts more than one period ahead, i.e. forecasts of the
dependent variable 2, 3, or more periods ahead. These are called multi-step ahead
forecasts.
To evaluate these types of forecasts, we use exactly the same methodology as for one-step
ahead forecasts: the estimation period is rolled forward one observation at a time, and at each
step a 2-step ahead forecast is produced.

[Diagram: rolling estimation periods, each followed by a 2-step ahead forecast.]

We then use all the 2-step ahead forecasts and the corresponding actual values to compute the
forecast errors, MAE, RMSE and Theil's U statistic, just as in the case of one-step ahead
forecasts. These statistics are then compared with those of alternative (competing) models.
Three-, four-, and more-step ahead forecasts are produced and examined in a similar way.
4.3.2 Static versus dynamic forecast
Forecasts can also be classified into static and dynamic forecasts.
In a static forecasting exercise, just as in the one-step ahead forecasts above, the model
is re-estimated each time an observation is added to the estimation period, and only one-step
ahead forecasts are produced.
In a dynamic forecasting exercise, once the model is estimated over the estimation period, 1,
2, 3, ..., m step ahead forecasts are calculated. This means that we use forecast values as if they
were observed values and forecast further into the future.

[Diagram: estimation period up to T, followed by 1-step, 2-step, 3-step, ..., m-step ahead forecasts produced from the same estimated model.]
Example: Interest rate model (Pindyck and Rubinfeld 8.2)

In this example, we estimate the interest rate model over the period January 1960 to August
1995. We then use the model to forecast the interest rate over the period January 1995
to February 1996.
Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.551692     2.200644      0.0283
IP          0.048353      0.005503     8.786636      0.0000
GM2         140.3261      36.03850     3.893784      0.0001
GPW(-1)     104.5884      17.44218     5.996295      0.0000
R-squared           0.216361   Mean dependent var    6.145764
Adjusted R-squared  0.210816   S.D. dependent var    2.792815
S.E. of regression  2.481026   Akaike info criterion 4.664523
Sum squared resid   2609.927   Schwarz criterion     4.702459
Log likelihood      -994.2079  F-statistic           39.02177
Durbin-Watson stat  0.183733   Prob(F-statistic)     0.000000

Forecasting with the model (using a dynamic forecast) over the period January 1995 to
February 1996, i.e. 14 forecast points from 1 month ahead to 14 months ahead, gives

Forecast: R3F   Actual: R3
Sample: 1995:01 1996:02, Included observations: 14
Root Mean Squared Error        2.504092
Mean Absolute Error            2.470437
Mean Abs. Percent Error        45.88168
Theil Inequality Coefficient   0.187605
  Bias Proportion              0.973300
  Variance Proportion          0.000043
  Covariance Proportion        0.026657

[Figures: dynamic forecasts of R3 (R3F) with ±2 S.E. bands over 1995:01-1996:02, and actual versus forecast interest rates over 1994:07-1996:01.]

4.3.3 Decomposition of Theil's U

Theil's U,

U = sqrt[ (1/M) Σ_{i=1}^{M} (Y^a_i − Y^f_i)² ] / { sqrt[ (1/M) Σ_{i=1}^{M} (Y^f_i)² ] + sqrt[ (1/M) Σ_{i=1}^{M} (Y^a_i)² ] }

can be decomposed into three components, using the fact that

(1/M) Σ_{m=1}^{M} (Y^f_m − Y^a_m)² = (Ȳ^f − Ȳ^a)² + (σ_f − σ_a)² + 2(1 − ρ) σ_f σ_a

where Ȳ^f, Ȳ^a, σ_f and σ_a are the means and standard deviations of the forecast and actual
values over the forecasting period, respectively, and ρ is the correlation coefficient between
the two series, ρ = [1/(σ_f σ_a M)] Σ_m (Y^f_m − Ȳ^f)(Y^a_m − Ȳ^a). Therefore, three proportions of
Theil's inequality can be defined as

The bias proportion          U^B = (Ȳ^f − Ȳ^a)² / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]
The variance proportion      U^S = (σ_f − σ_a)² / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]
The covariance proportion    U^C = 2(1 − ρ) σ_f σ_a / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]

The bias proportion U^B is an indication of systematic error, since it measures the extent to which
the average values of the forecast and actual series deviate from each other. The smaller this
number, the better the forecast.
The variance proportion U^S indicates the ability of the model to replicate the degree of
variability in the dependent variable for which the forecasts are produced. If U^S is found to be
large, it means that the actual values fluctuated considerably more than the forecast values,
or vice versa. In such situations the model should be revised.
The covariance proportion U^C measures the unsystematic error; i.e., it represents the
remaining error after deviations from the average values have been accounted for. Since there are
always differences between actual and forecast values, and these series are never perfectly
correlated, this component is the least problematic.

As an illustration, the wheat futures forecast statistics shown earlier decompose as follows:

Forecast: DLWHTFF   Actual: DLWHTF
Sample: 4/12/2000 4/04/2001, Included observations: 52
Root Mean Squared Error        0.028400
Mean Absolute Error            0.022786
Mean Abs. Percent Error        101.3350
Theil Inequality Coefficient   0.946384
  Bias Proportion              0.005541
  Variance Proportion          0.888654
  Covariance Proportion        0.105806

CHAPTER 5

Linear & Non Linear Programming


5 Linear & Non Linear Programming


5.1 Introduction
Linear programming is one of the most successful quantitative methods for decision making
and has been applied to many different areas of business, such as production management,
logistics and distribution, marketing, and financial analysis. Problems studied and solved
using linear programming include production scheduling, financial and resource planning,
capital budgeting, transportation, etc.
Linear programming describes graphical and mathematical procedures that are used to
optimise the allocation of scarce or limited resources to competing products and activities. In a
linear programming problem, typically the objective is either to maximise benefits while
resources are limited, or to minimise cost subject to certain criteria.
Nowadays, many computer programs include solvers for linear (and nonlinear)
programming as part of their applications and routines. For example, programs such as The
Management Scientist, LINDO and Excel Solver are widely used in industry for solving
linear programming problems, along with some custom-designed programs for solving
optimisation problems.
In what follows we will see how linear programming can be used for the optimisation of
resources using simple examples; however, before getting into the problem, we need to
introduce a few mathematical definitions.

5.2 Mathematical Inequalities


Before considering linear programming problems it is necessary to learn how to sketch mathematical inequalities.
A linear mathematical inequality defines the relationship between two or more variables in the form of an inequality. For example,

dy + ex ≤ 12

We use the following mathematical signs to indicate the relationship between variables:

a < b    a is less than b
a > b    a is greater than b
a ≥ b    a is greater than or equal to b
a ≤ b    a is less than or equal to b

5.3 Graphical presentation of inequalities


Linear mathematical inequalities can also be plotted and sketched on the Cartesian plane. For example, the simple inequality y ≤ x can be shown by first plotting the line y = x, and then excluding the area in which the inequality does not hold.
[Figure: the line y = x on the Cartesian plane, dividing the plane into the region y ≥ x above the line and the region y ≤ x below it; the shaded area is the side excluded by the inequality.]
Note that the shaded area represents the set of points that do not satisfy the inequality.
In general, to sketch a mathematical inequality such as

dy + ex < f,   dy + ex > f,   dy + ex ≤ f,   dy + ex ≥ f

first plot the line representing dy + ex = f, then find which side of the line (area) has to be excluded. For this purpose, an arbitrary point can be chosen on either side of the line to check whether its values satisfy the inequality. The side on which the inequality is not satisfied should then be excluded.
Example 2: Sketch the linear mathematical inequality y + 2x ≥ 8.
[Figure: the line y + 2x = 8, with the region y + 2x ≥ 8 above the line and the region y + 2x < 8 below it.]

It is not difficult to extend the above to sketch two or more linear mathematical inequalities simultaneously to obtain the feasible region. The feasible region is defined as the area in which every point satisfies all the inequalities simultaneously.


Example 3: Find the feasible area for the following set of mathematical inequalities:

x ≥ 0,   y ≥ 0,   y + x ≤ 10

[Figure: the feasible region defined by x ≥ 0, y ≥ 0 and y + x ≤ 10, i.e. the triangle bounded by the two axes and the line y + x = 10.]

5.4 The problem of linear programming


In general, linear programming problems consist of three parts:
1- A linear objective function
The objective function is a mathematical statement of what the researcher (management)
wants to achieve. This could be the maximisation of profit, minimisation of cost or some
other measurable objective. The linearity condition implies that the parameters of the
objective function are constant and the function can be represented by straight lines on a
graph.
2- A set of linear structural constraints
The structural constraints are the physical limitations on the objective function. They could
be constraints in terms of budgets, labour, raw materials, markets, social acceptability, legal
requirements or contracts. Again the linearity condition means that all these constraints have
fixed coefficients and can be represented by straight lines on a graph.
3- A set of non-negativity constraints
Finally, the non-negativity constraints limit the solution to positive (meaningful) results and
answers only. For example, prices, budgets, labour and raw materials cannot be negative.

5.5 The graphical solution of the Linear Programming problem


In order to see how the linear programming problem is graphically optimised, consider the
following example.
Maximise    2y + x

subject to the following constraints,

y + x ≤ 5,   y − x ≤ 2,   x ≥ 0,   y ≥ 0.

First use the constraints to sketch the individual feasible regions and hence the final feasible area.

[Figure: the feasible region bounded by the lines y = x + 2, y = −x + 5, x = 0 and y = 0, together with the objective function 2y + x drawn at several arbitrary levels (parallel lines of slope −0.5).]

Assuming an arbitrary value for the objective function, draw the objective function (note that
this is not the maximised function or value). Then shift the objective function up or down to
find the point at which the function is in the feasible area and has its maximum value, with
respect to other points within the feasible area.

5.6 Numerical solution of the Linear Programming problem


Alternatively and more appropriately, the LP problem can be solved using a numerical
solution. To do so, you must find all intersection points between inequalities and evaluate the
objective function at each point. Depending on whether the function should be maximised or
minimised, the appropriate intersection point, which is still within the feasible area, should be
selected.
The intersection points between the constraints in the above LP problem that lie within the feasible area are O(0,0), A(0,2), B(1.5, 3.5) and C(5,0).

Evaluating the objective function at each point will result in

1- O(0,0):       2y + x = 0
2- A(0,2):       2y + x = 4
3- B(1.5, 3.5):  2y + x = 8.5
4- C(5,0):       2y + x = 5

Note that the objective function attains its maximum value within the feasible area at point B.
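The same LP can also be solved with an off-the-shelf numerical solver. The sketch below (not part of the original notes) uses Python's scipy.optimize.linprog, which minimises by convention, so 2y + x is maximised by minimising its negative; it should reproduce the solution at point B.

from scipy.optimize import linprog

# Decision variables: z = [x, y].  Maximise 2y + x by minimising -(x + 2y).
c = [-1.0, -2.0]
A_ub = [[1.0, 1.0],                    # x + y  <= 5
        [-1.0, 1.0]]                   # -x + y <= 2   (i.e. y - x <= 2)
b_ub = [5.0, 2.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)],
              method="highs")
print(res.x, -res.fun)                 # expected: x = 1.5, y = 3.5, objective 8.5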

Example: Maximise 3x + 2y, subject to x + 4y ≥ 8, x + y ≥ 5, 2x + y ≥ 6, x ≥ 0 and y ≥ 0.


1- Draw the feasible area

[Figure: the feasible area for the constraints above, i.e. the unbounded region lying above the lines x + 4y = 8, x + y = 5 and 2x + y = 6 in the first quadrant.]

2- Try to shift the objective function to find the maximum value.

It can be seen that the objective function can increase indefinitely as it moves to the right-hand side. In such cases we say that the feasible area is unbounded; the problem therefore has no finite optimal solution.

Example: Find the maximum value of the objective function 3y + x subject to the following constraints: 5x + 3y ≤ 90, y − 2x ≤ 15, x ≥ 0, y ≥ 0.
1- Draw the constraints and specify the feasible region


[Figure: the feasible region bounded by y − 2x ≤ 15, 3y + 5x ≤ 90 and the non-negativity constraints, together with the objective function 3y + x.]

2- Draw the objective function and move it around the feasible area to find the maximum value that it can take within the feasible region.
3- Find the points of intersection between the constraints. These are O(0,0), A(0,15), B(4.09, 23.18) and C(18,0).

Evaluating the objective function at each point will result in

1- O(0,0):          x + 3y = 0
2- A(0,15):         x + 3y = 45
3- B(4.09, 23.18):  x + 3y = 73.63
4- C(18,0):         x + 3y = 18

Note that the objective function attains its maximum value within the feasible area at point B.

Example: An air transport company operates two types of aircrafts, A4030 and B6015. The
A4030 has a carrying capacity of 40 passengers and 30 tons of cargo, whereas the B6015 can
carry 60 passengers and 15 tons of cargo. The company has a contract to carry 480
passengers and 180 tons of cargo each day. If each A4030 flight costs 500 and each B6015
flight costs 600, what choice of aircraft would minimise the transportation cost subject to
fulfilment of the contract?
The objective function is

z=500*X + 600*Y

Where X is the number of A4030 flights and Y is the number of B6015 flights per day. The
constraints are
40X + 60Y ≥ 480      (number of passengers)
30X + 15Y ≥ 180      (amount of cargo)
X ≥ 0 and Y ≥ 0      (non-negativity conditions)


The passenger constraint implies that the combinations of flights that carry exactly the minimum number of passengers satisfy

40X + 60Y = 480,   e.g. A: (X = 0, Y = 8)  or  B: (X = 12, Y = 0)

Similarly, the cargo constraint implies that the combinations of flights that carry exactly the minimum amount of cargo satisfy

30X + 15Y = 180,   e.g. C: (X = 0, Y = 12)  or  D: (X = 6, Y = 0)

However, the above combinations do not necessarily lead to the minimum transportation cost. Therefore, an LP solution is required to optimise the cost.

Drawing the feasible area helps to identify the possible solutions to the problem. Thus, we draw the feasible area first.

[Figure: the feasible region in the plane of (No of A flights, No of B flights), bounded by the passenger constraint 40X + 60Y ≥ 480, the cargo constraint 30X + 15Y ≥ 180 and the non-negativity conditions.]

Now let us evaluate the cost at the point of intersection between the two linear constraints, which lies within the feasible area.
E: (XE=3, YE = 6)
The total transportation cost is

3*500 + 6*600 = 5100

This combination of scheduling ensures that the terms of contracts are fulfilled and cost is
minimised.
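For comparison, the same cost-minimisation problem can be set up in a numerical solver. A minimal Python sketch using scipy.optimize.linprog is given below; the greater-than constraints are converted to less-than constraints by multiplying through by -1, and the solution should match point E.

from scipy.optimize import linprog

c = [500.0, 600.0]                     # cost per A4030 and B6015 flight
A_ub = [[-40.0, -60.0],                # 40X + 60Y >= 480  ->  -40X - 60Y <= -480
        [-30.0, -15.0]]                # 30X + 15Y >= 180  ->  -30X - 15Y <= -180
b_ub = [-480.0, -180.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)                  # expected: X = 3, Y = 6, cost = 5100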

5.6.1 Existence of multiple solutions


There might be cases where more than one solution can be obtained for an LP problem. In other words, there might be cases where a number of solutions satisfy the set of constraints as well as the objective function simultaneously. Consider the above example when the cost of each flight for A4030 aircraft is 600 and the cost of each flight for B6015 aircraft is 900. The objective function would be

z = 600X + 900Y

which is parallel to the line drawn for the passenger constraint. This means that operating any combination of aircraft that lies on the passenger-constraint line will be an optimum solution.

5.7 Opportunity cost and Shadow price


Similar to the multivariate constrained optimisation covered earlier in QM, in LP one can obtain a sensitivity measure for the optimum solution by relaxing the constraints.
Consider the following maximisation problem. The aim is to maximise the production function for two goods X and Y:

z = 2X + 3Y
subject to
X + 2Y ≤ 40        (labour constraint)
6X + 5Y ≤ 150      (material constraint)
X ≥ 0 and Y ≥ 0

In this example production is constrained by 40 hours of labour and 150 litres of moulding material. It is possible to analyse how additional resources would influence the profitability of production; that is, by how much profit increases if the available labour hours (moulding material) are increased by one hour (one litre). The increase in profit resulting from one additional unit of a resource is referred to as the opportunity cost or the shadow price.
In order to determine the shadow price in an LP problem, we first determine the optimum production level which maximises revenue (z) subject to the original constraints:

z = 2X + 3Y
subject to
X + 2Y ≤ 40        (labour constraint)
6X + 5Y ≤ 150      (material constraint)
X ≥ 0 and Y ≥ 0    (non-negativity constraints)

Using Excel's Solver we can determine the optimum production level as

X = 14.2857  and  Y = 12.8571

The above production level results in the following revenue level:

z = 2 × 14.2857 + 3 × 12.8571 = 67.14


Next we examine the case where a one-unit increase is allowed in the labour constraint, and the new revenue is derived using the same procedure as before:

z = 2X + 3Y
subject to
X + 2Y ≤ 41        (labour constraint relaxed by one unit)
6X + 5Y ≤ 150      (material constraint)
X ≥ 0 and Y ≥ 0

The results indicate that the value of the objective function increases to

z = 2 × 13.57 + 3 × 13.71 = 68.28

Therefore the increase in profit as a result of one additional labour hour is

SP = 68.28 − 67.14 = 1.14
To find the shadow price of moulding material, we repeat the procedure by increasing the available moulding material by one unit and keeping everything else constant:

z = 2X + 3Y
subject to
X + 2Y ≤ 40        (labour constraint)
6X + 5Y ≤ 151      (material constraint relaxed by one unit)
X ≥ 0 and Y ≥ 0

The results indicate that revenue increases to

z = 2 × 14.57 + 3 × 12.71 = 67.28

Therefore the increase in revenue as a result of one additional litre of moulding material is

SP = 67.28 − 67.14 = 0.14
If the labour cost per hour is below 1.14, then it is profitable to increase the amount of available labour hours for this production line. Similarly, if the cost of moulding material is below 0.14 per litre, then profit increases further by providing more moulding material for the production line.
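The re-solving procedure described above is easy to automate. The following Python sketch (illustrative, using scipy.optimize.linprog) recomputes the optimum after relaxing each constraint by one unit and takes the difference in the objective as the shadow price.

from scipy.optimize import linprog

def max_revenue(labour=40.0, material=150.0):
    """Maximise z = 2X + 3Y subject to X + 2Y <= labour and 6X + 5Y <= material."""
    res = linprog([-2.0, -3.0],
                  A_ub=[[1.0, 2.0], [6.0, 5.0]],
                  b_ub=[labour, material],
                  bounds=[(0, None)] * 2, method="highs")
    return -res.fun

base = max_revenue()                               # about 67.14
sp_labour = max_revenue(labour=41.0) - base        # about 1.14
sp_material = max_revenue(material=151.0) - base   # about 0.14
print(round(base, 2), round(sp_labour, 2), round(sp_material, 2))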

5.8 Financial Application of Linear Programming


In finance, linear programming has been used to solve problems involving capital budgeting, make-or-buy decisions, asset allocation, portfolio selection and financial planning. For example, a portfolio manager may need to select specific investments, e.g. stocks and bonds, from a variety of investment alternatives. The same applies to managers of mutual funds, credit unions, banks and insurance companies. The objective function in a portfolio selection problem is usually to maximise the expected return and/or minimise the risk of the portfolio. The constraints are usually restrictions on the type of investment instruments due to state laws, company policy, maximum permissible risk, etc. These problems are generally addressed using mathematical programming techniques, one of which (perhaps the simplest) is linear programming.

Consider a portfolio manager who is looking for investment instruments in which to allocate the sum of $100,000 cash, which she has just been given. The senior financial advisor of her company has recommended investing in biotech, the car manufacturing industry or government bonds. This recommendation is based on the analysts' results, in which five investment opportunities have been identified and their annual rates of return projected. These are reported in the following table.

Investment opportunities for the portfolio manager

    Investment                          Projected rate of return (%)
1   John Smith Pharmaceutical (JSP)     7.3
2   Fosters Pharmaceutical (FP)         10.3
3   Stella Atoir Cars (SAS)             6.4
4   Carling Cars (CS)                   7.5
5   Government bonds (GB)               4.5

The guidelines for investment at the Bud fund management company are as follows:
1- Investment in each industry should not exceed $50,000 at any time
2- Investment in Government Bonds should be at least 25% of the investment in the car manufacturing industry
3- Investment in the high-risk company (Fosters Pharmaceutical) should not be more than 60% of the investment in that sector (biotech)

What should be the portfolio selection of the portfolio manager to maximise the projected
return subject to imposed investment constraints?
In this case a linear programming model can help the portfolio manager to solve the problem
and come up with the optimum solution; that is, achieving the highest return for such
investment opportunities.
Assuming
A = $ invested in JSP,
B = $ invested in FP,
C = $ invested in SAS,
D = $ invested in CS,
E = $ invested in GB,

and using the projected rates of return, the objective function that needs to be maximised is

max 0.073A + 0.103B + 0.064C + 0.075D + 0.045E


The constraints are

A + B + C + D + E = $100,000
A + B ≤ $50,000
C + D ≤ $50,000
E ≥ 0.25(C + D),   or   E − 0.25C − 0.25D ≥ 0
B ≤ 0.6(A + B),    or   −0.60A + 0.40B ≤ 0

Furthermore, there are non-negativity restrictions, which should also be considered. What are those constraints?
Using Excel Solver the following results are obtained as a solution to the problem. You need to look at the Excel manual to learn how to use the Solver; at the moment this is outside the scope of this class.
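Any LP solver can be used instead of Excel Solver. The sketch below (not part of the original notes) formulates the same problem in Python with scipy.optimize.linprog; the variable order A, B, C, D, E and the constraint rearrangements follow the formulation above, and the solver should return the same allocation as the Solver reports that follow.

from scipy.optimize import linprog

rates = [0.073, 0.103, 0.064, 0.075, 0.045]        # JSP, FP, SAS, CS, GB
c = [-r for r in rates]                            # maximise return -> minimise negative

A_ub = [[1, 1, 0, 0, 0],                           # A + B <= 50,000
        [0, 0, 1, 1, 0],                           # C + D <= 50,000
        [0, 0, 0.25, 0.25, -1],                    # E >= 0.25(C + D)
        [-0.6, 0.4, 0, 0, 0]]                      # B <= 0.6(A + B)
b_ub = [50_000, 50_000, 0, 0]
A_eq = [[1, 1, 1, 1, 1]]                           # all funds invested
b_eq = [100_000]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 5, method="highs")
print(res.x, -res.fun)     # expected: 20,000 / 30,000 / 0 / 40,000 / 10,000, return 8,000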


Microsoft Excel 8.0e Answer Report
Worksheet: [Book1]Sheet1

Target Cell (Max)
Cell     Name        Original Value    Final Value
$B$7     maximise    8000              8000

Adjustable Cells
Cell     Name                      Original Value    Final Value
$B$13    Pound invested in JSP     20000             20000
$B$14    Pound invested in FP      30000             30000
$B$15    Pound invested in SAS     0                 0
$B$16    Pound invested in CS      40000             40000
$B$17    Pound invested in GB      10000             10000

Constraints
Cell     Name                      Cell Value    Formula           Status        Slack
$B$21    Fund available =          100000        $B$21=$C$21       Binding       0
$B$22    Investment on A + B =     50000         $B$22<=$C$22      Binding       0
$B$23    Investment on C + D =     40000         $B$23<=$C$23      Not Binding   10000
$B$24    Gov Bond Constraint       0             $B$24>=$C$24      Binding       0
$B$25    Investment Constraint     0             $B$25<=$C$25      Binding       0

Microsoft Excel 8.0e Sensitivity Report
Worksheet: [Book1]Sheet1

Adjustable Cells
Cell     Name                      Final Value   Reduced Cost   Objective Coefficient   Allowable Increase   Allowable Decrease
$B$13    Pound invested in JSP     20000         0              0.073                   0.03                 0.055
$B$14    Pound invested in FP      30000         0              0.103                   1E+30                0.03
$B$15    Pound invested in SAS     0             0              0.06399996              0.011000166          1E+30
$B$16    Pound invested in CS      40000         0              0.075                   0.0275               0.011000149
$B$17    Pound invested in GB      10000         0              0.045                   0.03                 37796.2825

Constraints
Cell     Name                      Final Value   Shadow Price   Constraint R.H. Side   Allowable Increase   Allowable Decrease
$B$21    Fund available =          100000        0.069          100000                 12500                50000
$B$22    Investment on A + B =     50000         0.022          50000                  50000                12500
$B$23    Investment on C + D =     40000         0              50000                  1E+30                10000
$B$24    Gov Bond Constraint       0             -0.024         0                      50000                12500
$B$25    Investment Constraint     0             0.03           0                      20000                30000

Microsoft Excel 8.0e Limits Report
Worksheet: [Book1]Sheet1

Target
Cell     Name        Value
$B$7     maximise    8000

Adjustable
Cell     Name                      Value    Lower Limit   Target Result   Upper Limit   Target Result
$B$13    Pound invested in JSP     20000    20000         8000            20000         8000
$B$14    Pound invested in FP      30000    30000         8000            30000         8000
$B$15    Pound invested in SAS     0        0             8000            0             8000
$B$16    Pound invested in CS      40000    40000         8000            40000         8000
$B$17    Pound invested in GB      10000    10000         8000            10000         8000

5.9 Nonlinear Programming, (NLP)


There are many instances where the objective function, or one or more of the constraints, is a nonlinear function of the decision variables. In such cases we need to use a method called nonlinear programming to optimise our variables.

When an objective function is nonlinear, as we saw earlier in QM, the function may have
more than one turning point. The number of turning points depends on the degree of the
objective function. Also we noticed that the turning points might be local maxima (minima),
or global maximum (minimum).

Example 1: Florida Power and Light (FPL) faces demand during both peak-load and off-peak-load times. FPL must determine the price per kilowatt-hour (kWh) to charge during both peak and off-peak periods. The daily demand for power during each period (in kWh) is related to price as follows:

Dp = 60 − 0.5Pp + 0.1P0
D0 = 40 − P0 + 0.1Pp

Here, Dp and Pp are demand and price during peak times, whereas D0 and P0 are demand and price during off-peak periods. Note that because of the signs of the price coefficients, an increase in the peak-period price results in a decrease in demand during the peak period but an increase in demand during the off-peak period. Similarly, an increase in the off-peak price decreases off-peak demand while peak demand increases. This means that off-peak and peak-time electricity consumption are substitutes.
The cost for FPL of maintaining 1 kWh of capacity during a day is $10, and the company wants to determine a pricing strategy and a capacity level that maximise its daily profit.
Due to the relationship between the demand and price variables it is not at all obvious what FPL should do. The price decision determines demand, and larger demand requires larger capacity, which costs money. In addition, revenue is price times demand, so it is not clear whether the price should be low or high in each period to increase revenue.
In order to tackle the problem, let us see what variables we have:
Peak and off-peak prices
Peak and off-peak demands
Peak and off-peak revenues
Capacity
Cost of capacity
Total profit

Revenue = (Peak Price)(Peak Demand) + (Off-peak Price)(Off-peak Demand)

Substituting for peak and off-peak demand gives

Revenue = Pp(60 − 0.5Pp + 0.1P0) + P0(40 − P0 + 0.1Pp)

Expanding the revenue equation we get a nonlinear model for revenue:

Revenue = 60Pp − 0.5Pp² + 0.2PpP0 + 40P0 − P0²

The above model can be optimised using Excel Solver in the following way. Enter the inputs of the problem in a spreadsheet as shown, then write the model in the variable cells as follows:

Cells B14 to D14:    enter initial values (guess)
Cell B18:            =B7+SUMPRODUCT(B14:C14,C7:D7)
Cell B19:            =B8+SUMPRODUCT(B14:C14,C8:D8)
Cells D27 and D28:   =D14
Cell B22:            =B18*B14+B19*C14
Cell B23:            =B10*D14
Cell B24:            =B22-B23

Make the Solver ready.

The objective cell (B24) is the target cell for Excel to maximise. The optimisation takes place with respect to the decision variables, which are the peak price, the off-peak price and capacity. These variables are related nonlinearly through the model. The optimisation is then performed subject to the set of constraints given in the constraint box.

The result shows that the maximum profit is obtained when prices are set at $70.31/kWh and $26.53/kWh for the peak and off-peak periods, respectively. This implies a capacity of 27.5. The maximum profit is therefore $2202.30.
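For readers without Solver, the same nonlinear problem can be sketched in Python with scipy.optimize.minimize. The formulation below is an assumption consistent with the text: capacity must cover demand in each period and demands are kept non-negative; the reported optimum (about $70.31, $26.53 and a capacity of 27.5) should be recovered approximately.

from scipy.optimize import minimize

def profit(v):
    pp, p0, cap = v
    dp = 60 - 0.5 * pp + 0.1 * p0          # peak demand
    d0 = 40 - p0 + 0.1 * pp                # off-peak demand
    return pp * dp + p0 * d0 - 10 * cap    # revenue minus daily capacity cost

cons = [  # capacity covers demand in each period; demands stay non-negative
    {"type": "ineq", "fun": lambda v: v[2] - (60 - 0.5 * v[0] + 0.1 * v[1])},
    {"type": "ineq", "fun": lambda v: v[2] - (40 - v[1] + 0.1 * v[0])},
    {"type": "ineq", "fun": lambda v: 60 - 0.5 * v[0] + 0.1 * v[1]},
    {"type": "ineq", "fun": lambda v: 40 - v[1] + 0.1 * v[0]},
]

res = minimize(lambda v: -profit(v), x0=[50.0, 30.0, 40.0],
               bounds=[(0, None)] * 3, constraints=cons, method="SLSQP")
print(res.x, -res.fun)   # roughly Pp = 70.3, P0 = 26.5, capacity = 27.5, profit = 2202.3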

5.10 Portfolio Optimization


As we have seen in the finance literature, portfolio diversification can reduce the risk of a portfolio significantly. However, to achieve a minimum-variance portfolio it is important to know the optimum weights of the assets in the portfolio. This has resulted in asset allocation models that are used to determine the percentage of assets to invest in stocks, bonds, commodities and other instruments.
In the following example we use a nonlinear programming model to find the optimum weights of the assets in a portfolio that has a given expected return but minimum variance.
Example 2: The investment company RB Flury can invest in three stocks. From past data, the means, SDs and correlations of annual returns on these three stocks have been estimated as shown in the table below.
            Mean    SD      Correlation
Stock 1     0.14    0.20    Stocks 1 and 2:  0.6
Stock 2     0.11    0.15    Stocks 1 and 3:  0.4
Stock 3     0.10    0.08    Stocks 2 and 3:  0.7

The company wants to invest in a minimum-variance portfolio which has an expected return of at least 0.12.
The summary information is:
X1, X2 and X3 are the weights invested in stock 1, stock 2 and stock 3,
Total weight = 1,
Expected return on portfolio ≥ 12%.

We can construct the following spreadsheet for the NLP problem. Next we set up the model by filling in the appropriate cells:

Cells B16 to D16:   enter initial values (guess)
Cell E16:           =SUM(B16:D16)
Cell B20:           =SUMPRODUCT(B16:D16,B5:D5)
Cells B24 to D24:   =B16*B6 , =C16*C6 and =D16*D6
Cell B26:           =SUMPRODUCT(B24:D24,B24:D24)+2*(B24*C24*C10+B24*D24*D10+C24*D24*D11)
Cell B28:           =SQRT(B26)

Select the target cell and the variable cells and set up the constraints in the Solver. Once everything is in place, optimise the model.


This will result in the following weights for the portfolio:

Target cell
Cell     Name                         Original Value   Final Value
$B$26    Portfolio variance           0.0148           0.0148

Adjustable cells
Cell     Name                         Original Value   Final Value
$B$16    Fractions to invest Stock 1  0.5              0.5
$C$16    Fractions to invest Stock 2  0                0
$D$16    Fractions to invest Stock 3  0.5              0.5

Constraints
Cell     Name                         Cell Value   Formula         Status        Slack
$B$20    Actual (expected return)     0.12         $B$20>=$D$20    Binding       0.00
$E$16    Fractions to invest Total    1.0          $E$16=$G$16     Binding       0
$B$16    Fractions to invest Stock 1  0.5          $B$16>=0        Not Binding   0.5
$C$16    Fractions to invest Stock 2  0            $C$16>=0        Binding       0
$D$16    Fractions to invest Stock 3  0.5          $D$16>=0        Not Binding   0.5
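A quick way to check the Solver result is to solve the same NLP in Python. The sketch below (not part of the original notes) builds the covariance matrix from the SDs and correlations in the table and minimises the portfolio variance subject to full investment and a minimum expected return of 12%; it should recover weights close to (0.5, 0, 0.5).

import numpy as np
from scipy.optimize import minimize

mu = np.array([0.14, 0.11, 0.10])          # expected returns
sd = np.array([0.20, 0.15, 0.08])          # standard deviations
corr = np.array([[1.0, 0.6, 0.4],
                 [0.6, 1.0, 0.7],
                 [0.4, 0.7, 1.0]])
cov = np.outer(sd, sd) * corr              # covariance matrix

variance = lambda w: w @ cov @ w
cons = [{"type": "eq",   "fun": lambda w: w.sum() - 1.0},        # fully invested
        {"type": "ineq", "fun": lambda w: w @ mu - 0.12}]        # return >= 12%

res = minimize(variance, x0=np.ones(3) / 3, bounds=[(0, 1)] * 3,
               constraints=cons, method="SLSQP")
print(res.x.round(3), round(res.fun, 4))   # roughly w = [0.5, 0, 0.5], variance = 0.0148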

CHAPTER 6

Simulation Analysis


6 Simulation Analysis
6.1 Introduction
Simulation can be defined as a computer model which imitates real-life situations, taking into account the uncertainty involved in one or more factors. When we run a simulation, we let these factors take random values, and we monitor and record the outcome of the model each time. The collection of all outcomes can then be used to produce the distribution of the outcomes for further analysis as well as decision making.
Simulation models are widely used in operations research, engineering, finance and asset pricing, among other areas. In particular, in finance the simulation technique is used in option pricing, Value at Risk (VaR) analysis, capital budgeting, etc.
There are several steps involved in setting up a simulation:
1) Identify the inputs and outputs of the model
2) Identify the model - the relationship between inputs and outputs
3) Identify the process of the inputs - determine which distributions they follow
4) Set up the model - use a spreadsheet or any other statistical software to relate the input and output variables
5) Simulate the model - run the model several hundred times, letting the inputs take different values according to their specified distributions
6) Record the outcome of the outputs each time
7) Interpret the results
As noted, the simulation process requires some input variables to be generated randomly and fed to the model. To randomly generate a variable, we need to use the distribution that we think the variable follows and draw randomly from this distribution. This can be done easily in Excel using the RAND() command.

The random number that Excel generates takes values between 0 and 1. This random number
can be used for drawing observations from a specified distribution.


6.2 Spreadsheet Simulation

To see how to generate a random variable, let us look at the following example.
We consider the random variable Y, which, we know from historical observation, follows the probability distribution below.
Variable Y     0      1      2      3      4      5      6      7      8      9      10
Probability    0.05   0.06   0.075  0.10   0.125  0.18   0.125  0.10   0.075  0.06   0.05
Cum. prob.     0.05   0.11   0.185  0.285  0.41   0.59   0.715  0.815  0.89   0.95   1.00

The figure below shows the pdf of random variable Y.

[Figure: pdf of the random variable Y over the values 0 to 10.]

We can use the pdf of the variable to generate (simulate) randomly drawn number from the
distribution that Y follows.
1- Construct a probability table in Excel which includes the cumulative probabilities
2- Construct a column which counts the number of draws (column A)
3- Construct a column which generates random numbers (column B) using RAND()
4- Construct a column which yields simulated random draws of Y using the VLOOKUP command. Simply write in cell C3:
   =VLOOKUP(B3,$G$3:$H$13,2)

5- Copy and paste cells A3, B3 and C3 down to row 1003. This will give you 1,000 random numbers drawn from the distribution that Y follows.
If we plot the histogram of these 1,000 draws we obtain the figure below, which more or less resembles the original pdf of Y. Of course it is not identical to the pdf of Y, but if you increase the number of simulations the shape of the generated distribution will get closer and closer to the original pdf.

[Figure: histogram of 1,000 simulated draws of Y, closely resembling the original pdf.]
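Outside Excel, the same random draws can be generated in a few lines of Python with numpy; the sketch below (not part of the original notes) assumes Y takes the values 0 to 10 with the probabilities in the table above.

import numpy as np

rng = np.random.default_rng(1)

values = np.arange(11)                                   # Y takes the values 0, 1, ..., 10
probs = np.array([0.05, 0.06, 0.075, 0.10, 0.125, 0.18,
                  0.125, 0.10, 0.075, 0.06, 0.05])

draws = rng.choice(values, size=1000, p=probs)           # 1,000 simulated draws of Y

# Empirical frequencies converge to the original pdf as the number of draws grows.
freq = np.bincount(draws, minlength=11) / draws.size
print(np.column_stack([values, probs, freq.round(3)]))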

A simulation example: Albright et al. (1999), Example 16.1

In August, Walton Bookstore must decide how many of next year's nature calendars to order. Each calendar costs the bookstore $7.50 and is sold for $10. After February 1 all unsold calendars are returned to the publisher for a refund of $2.50 per calendar. Walton believes that the number of calendars it can sell by February 1 follows the probability distribution shown in the following table. Walton wants to maximise the expected profit from calendar sales.

Calendars Demanded   100    150    200    250    300
Probability          0.3    0.2    0.3    0.15   0.05

To find out the best order quantity for Walton Bookstore based on simulation analysis we
construct a simulation spread sheet and perform the following steps

1- Use the given information, identify the input (fixed and variables) and fill appropriate
cells. These are
Cost Data and Demand distribution
2- Enter the decision variable. In this case the Order Quantity


3- Start constructing the simulation model:
i)    Enter the number of replications (cells A19 to A68)
ii)   Generate random numbers in column B (cells B19 to B68)
iii)  Use the VLOOKUP function and the probability table to generate random demand
iv)   Calculate revenue using the demand and the available stock (order quantity)
v)    Calculate cost using the order quantity and unit cost
vi)   Calculate the refund using the difference between demand and order quantity and the refund price
vii)  Calculate the final profit
viii) Finally, summarise the result of the simulation by calculating the mean, SD, Min and Max of the series generated for profit. Calculate these in cells A12 to A15.
Note that each time you perform an operation in the spreadsheet, the values of the random numbers generated in column B change; this changes the whole result in the spreadsheet and you end up looking at a new simulation. One way to stop these numbers from changing is to copy them and paste them back onto the same cells using Paste Special (values). This is not recommended, however, as you will lose all the formulas in those cells.

Another important part of the output is the table of average profits and the plot of average profit against order quantity. This output shows the sensitivity of profit with respect to changes in the order quantity. For example, according to our simulation result, an order quantity of 150 yields the maximum mean profit of $277.5. However, you should note that each time you run the simulation you may get a different result, and it is the average over many such simulations that you should look at.
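The spreadsheet simulation can also be reproduced in Python. The following sketch (not part of the original notes) simulates profit for each candidate order quantity using the cost data and demand distribution above; with a large number of replications it typically points to an order of 150 calendars as the most profitable.

import numpy as np

rng = np.random.default_rng(2)

unit_cost, unit_price, unit_refund = 7.5, 10.0, 2.5
demand_values = np.array([100, 150, 200, 250, 300])
demand_probs = np.array([0.30, 0.20, 0.30, 0.15, 0.05])

def simulate_profit(order_qty, n_reps=10_000):
    demand = rng.choice(demand_values, size=n_reps, p=demand_probs)
    revenue = unit_price * np.minimum(demand, order_qty)   # can only sell what was ordered
    cost = unit_cost * order_qty
    refund = unit_refund * np.maximum(order_qty - demand, 0)
    return revenue - cost + refund

for q in demand_values:                       # try each candidate order quantity
    profit = simulate_profit(q)
    print(q, round(profit.mean(), 1), round(profit.std(), 1))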


6.3 Simulation with @Risk


@Risk is an Excel add-in program, used extensively in Albright et al., which is very handy for simulating simple problems. The program has three main features which make it very useful when it is installed and run with Excel:
1- @Risk provides several functions, such as RISKNORMAL, RISKDISCRETE, RISKLOGN, etc., which make it easier to generate random observations from continuous and discrete pdf's.
2- A set of cells in the spreadsheet can be specified as output cells and, when the simulation is run, the program records all the statistics (MIN, MAX, SD, etc.) automatically. It also generates graphs such as histograms and summary plots.
3- Furthermore, @Risk has a function which allows sensitivity analysis to be performed on certain input variables. This function is RISKSIMTABLE({vector of values}).

Let us do the Walton Bookstore example again.

Walton Bookstore must decide how many of next year's nature calendars to order. Each calendar costs the bookstore $7.50 and is sold for $10. After February 1 all unsold calendars are returned to the publisher for a refund of $2.50 per calendar. Walton believes that the number of calendars it can sell by February 1 follows the probability distribution shown in the following table.
Calendars Demanded   100    150    200    250    300
Probability          0.3    0.2    0.3    0.15   0.05

Now let us assume that the demand for calendars follow a Normal distribution with mean 175
and SD 60. Walton wants to maximise the expected profit from calendar sales. Using @Risk,
we can simulate Walton's profits for order quantities of {100, 150, 200, 250 and 300}.
This is done in the following way.
1- First open the @Risk application,

2- Enter the input cells. These are the values for Unit Cost (cell B4), Unit Price(cell B5),
Unit Refund (cell B6), Mean and SD of the distribution for demand (cells E5 and E6), and
possible order quantities (cells G4 to G8).
3- Next construct the decision variable, order quantity, in cell B9. Just write
=RiskSimTable(G4:G9) in that cell. This tells the program to run the simulation for
different order quantities.
4- Next construct the Demand Variable in cells B14. You just need to write in that cell
=RiskNormal(B5,B6).
This tells the program that the random variable, demand, should be drawn from a normal
distribution with mean 175 and SD 60.
5- Revenue in cell C14 can be written as

=B5*MIN(B14,B9)

6- Cost in cell D14 is

=B4*B9

7- Refund is calculated as

=B6*MAX(B9-B14,0)

8- Finally, Profit is the difference between revenue and cost plus any refund
=C14-D14+E14

We are now set to run the simulation, but before that we need to do one more thing: we need to specify the simulation settings. This means telling the program how many simulations to run and how many iterations to use for each simulation. This can be done by clicking on the simulation settings icon.
[Screenshot: the @Risk toolbar in Excel, with icons to select a distribution for an input variable, define a function, test which distribution fits the data, test the correlation of data, add an output cell, and list inputs and outputs.]

This box has several pages, but the page that we are dealing with now is the Iterations page. We set #Simulations to 5, as we have 5 different simulations, one for each order quantity. We can also change #Iterations as required (e.g. 10,000 in this case). This means that for each simulation we have 10,000 iterations.
[Screenshot: the @Risk icons used to define the simulation settings and outputs and to start running the simulation.]

Now we need to add the output cell(s) to @Risk. This is done by selecting the output (resultant) cell(s) and clicking on the "add output" icon. In this case select cell F14 (profit) and click on the icon.
Finally, once everything is set, click on the "start simulation" button and watch the simulation proceed.
Once the simulation is completed, you switch to the @Risk results window. The results window has two parts, an upper window and a lower window. The upper window shows the simulation details (e.g. simulation numbers, #iterations, #inputs, #outputs, runtime, sampling type, etc.). The second part of the upper window presents the results for the simulated cell(s). In this case, cell F14 is the output cell, and the summary results show the Min, Max and Mean of profit for the different simulations (order quantities from 100 to 300).


The lower part of the results window contains the full simulation statistics for each simulation, including Min, Max, SD, variance, skewness, kurtosis and percentiles. The output is presented for each variable in the model.

We can summarise the results by copying and pasting them into Excel and preparing them for tabular or graphical reports and analysis.
Possible order quantity    100       150       200       250       300
Simulation                 Sim 1     Sim 2     Sim 3     Sim 4     Sim 5
Mean profit                227.23    273.86    211.36    39.73     -190.55
SD of profit               $89.85    $198.38   $323.70   $409.64   $442.59

[Figure: bar chart of mean profit for each simulation (Sim 1 to Sim 5).]

6.4 Applications of Simulation in Finance


Now that we have seen how useful @Risk is for simulation, we look at some more examples in finance. The first example involves simulating stock prices (or any asset prices or returns). Simulating asset prices is important for several reasons. First, and perhaps one of the most useful applications, simulated asset prices can be used to price derivative securities such as options. Second, simulation of asset prices can be used in portfolio management, asset allocation and Value at Risk analysis.
6.4.1 Capital Budgeting & NPV Analysis
When evaluating projects we use the net present value model to assess whether the cash inflows of the project are greater than the cash outflows, both adjusted for time value using an appropriate discount rate. However, in real life not only are the cash flows of a project uncertain; many other inputs that we use in the NPV model might also fluctuate during the project's life. Therefore, one way to incorporate such uncertainty in the NPV calculation is to use Monte Carlo simulation.
Example: Shipping investment
As an example, consider a shipping investment project where the investor would like to purchase a 5-year-old handysize tanker with a market value of $20m. The investor expects to operate the vessel under a 1-year TC of $15,000/day and to renew the TC every year. The operating cost of the vessel is projected at $7,000/day with an annual inflation rate of 3%. The projected resale value of the vessel after 10 years is assumed to be $10m. The investor has a required rate of return of 8% (i.e. the cost of capital for this project). The vessel is expected to have 15 off-hire days a year. Given the information above we can calculate the NPV of this project.
NPV & Simulation Analysis (HANDYSIZE)

Discount Rate        8%       pa
Freight Revenue      15000    $/day
Opex                 7000     $/day
Initial Investment   20       $m
Inflation            3%       pa
Resale Value         10       $m

Year              2015    2016    2017    2018    2019    2020    2021    2022    2023    2024    2025
Period            0       1       2       3       4       5       6       7       8       9       10
Annual Cost       20.00   2.555   2.632   2.711   2.792   2.876   2.962   3.051   3.142   3.237   3.334
Annual Revenue            5.250   5.250   5.250   5.250   5.250   5.250   5.250   5.250   5.250   15.250
Net Cash Flow     -20.00  2.695   2.618   2.539   2.458   2.374   2.288   2.199   2.108   2.013   11.916
Disct Cash Flow   -20.00  2.488   2.231   1.998   1.785   1.592   1.416   1.256   1.111   0.980   5.354

NPV Handysize = 0.21

At first sight the project seems fine, as it has a positive NPV; however, as we mentioned, there are many factors in this evaluation which are uncertain and could change as the project gets underway. Using the MC simulation technique we can relax the assumption that these factors are fixed and obtain the distribution of the NPV as well as the probability that the NPV could be negative.

For example, we can assume that


inflation ~lognormal(3%,0.3%)
discount rate ~lognormal(8%,0.5%)
freight rate ~ lognormal(15000,2000)
resale price~ lognormal(10,3)

Running 10,000 simulations we can obtain the distribution of the NPV of this project. It reveals that, while the mean of the NPV is still about 0.28, there is a significant probability that this project ends up in a loss (roughly a 50% probability of the NPV being negative)!
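A Monte Carlo version of this NPV calculation can be sketched in Python as follows (not part of the original notes). For simplicity the uncertain inputs are drawn from normal distributions with the means and standard deviations stated above (the notes assume lognormals), the cash-flow layout follows the table, and the function and variable names are illustrative.

import numpy as np

rng = np.random.default_rng(3)
n_sims = 10_000

def npv(freight, opex0, infl, r, resale, price=20.0, years=10, days=350):
    """NPV ($m) of buying the vessel, trading her for `years` years and reselling."""
    cf = -price
    for t in range(1, years + 1):
        revenue = freight * days / 1e6                    # annual hire income, $m
        cost = opex0 * 365 * (1 + infl) ** (t - 1) / 1e6  # opex grows with inflation
        extra = resale if t == years else 0.0             # resale value in the final year
        cf += (revenue - cost + extra) / (1 + r) ** t
    return cf

# Uncertain inputs (normal approximations to the assumed lognormals)
freight = rng.normal(15_000, 2_000, n_sims)
infl = rng.normal(0.03, 0.003, n_sims)
disc = rng.normal(0.08, 0.005, n_sims)
resale = rng.normal(10.0, 3.0, n_sims)

npvs = np.array([npv(f, 7_000, i, d, s) for f, i, d, s in zip(freight, infl, disc, resale)])
print(npvs.mean().round(2), (npvs < 0).mean().round(2))   # mean NPV and P(NPV < 0)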


6.4.2 Simulating Stock Prices


Let us first see how we can simulate the price of a stock. There is a large body of literature on the behaviour of asset prices, and in particular stock prices. One of the most widely agreed models is that stock prices follow a lognormal distribution; that is, the log of the stock price follows a normal distribution. This leads to the following model for the asset price at time t, Pt, given the price at time 0, P0:

$$P_t = P_0 \exp\left[(\mu - 0.5\sigma^2)t + \sigma Z \sqrt{t}\right]$$

where
μ is the mean percentage growth rate of the stock,
σ is the standard deviation of the growth rate,
Z is a random variable which follows a standard normal distribution, N(0,1).
Assuming growth rate and standard deviation are annualised and expressed in decimals, e.g.
0.06 growth and 0.1 standard deviation per year.

If we have necessary information; i.e. average growth rate (mean return) and SD of a stock
we can simulate the value of the stock using the above formula. For example, consider a
stock's price today is 10 and has a growth of 6% and SD of 10%. Using a simple simulation
technique, we can construct the distribution of stock price in 25 days time (1 month).

The value of the cell for day 0 (cell D8) is the current stock price. The value of the stock for the next period (day 1) in cell D9 is given by the following formula:
=D8*EXP((($C$4-0.5*$C$5^2)*(C9/250))+$C$5*RiskNormal(0,1)*SQRT(1/250))

The values for days 2 to 25 in cells D10 to D33 are just copies of the previous cell. Next we need to set up the simulation parameters as 1,000 iterations and one simulation, then select the output cells (cells D9 to D33) and finally run the simulation.
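The same price paths can be generated outside @Risk. The Python sketch below (not part of the original notes) applies the formula above with daily steps (dt = 1/250) and 1,000 paths; the mean and SD of the day-25 prices should be close to the figures reported in the table further below.

import numpy as np

rng = np.random.default_rng(4)

p0, mu, sigma = 10.0, 0.06, 0.10          # current price, annual growth and SD
days, dt = 25, 1 / 250                    # 25 trading days, daily time step
n_paths = 1000

# P_t = P_{t-1} * exp[(mu - 0.5*sigma^2)*dt + sigma*sqrt(dt)*Z],  Z ~ N(0,1)
z = rng.standard_normal((n_paths, days))
steps = np.exp((mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z)
paths = p0 * np.cumprod(steps, axis=1)

final = paths[:, -1]                      # distribution of the price on day 25
print(final.mean().round(3), final.std().round(3))   # roughly 10.06 and 0.31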

Running the simulation will create 1,000 price series, 10 of which are shown in the following figure. It can be seen that the stock price fluctuates from day to day, but there is a slight upward drift in the series. The important output is the distribution of the final prices that we obtain; the mean of this distribution should be approximately 10 × (1 + 0.06/12) ≈ 10.05.
[Figure: ten of the simulated 25-day price paths (Sim 1 to Sim 10), starting at 10 and drifting slightly upwards.]

@Risk can produce histograms of the distribution of prices for each day of the simulation period. For example, the following histogram shows the distribution of prices on the last day of the simulation.

Furthermore, you can obtain the summary statistics and percentiles of the distribution of prices for each day of the simulation. For example, the table shows that on the last day of the simulation period the stock price has a mean of 10.060 and an SD of 0.312. The percentiles show the cut-off points for the percentage of observations below that price. For example, the 5% percentile for the last day (day 25) has a value of 9.5551. This means that 5% of the observations produced by the simulation exercise fall below 9.5551; it is obtained simply by sorting the observations in ascending order and finding the value of observation 50.
Similarly, the 95% percentile value of 10.5857 for day 25 indicates that 95% of the observations fall below this value. Furthermore, using the percentiles you can build a confidence interval and say that we are 90% confident that the stock price on day 25 will be between 9.5551 and 10.5857.
Name             Price Day1   Price Day2   ...   Price Day23   Price Day24   Price Day25
Cell             D9           D10          ...   D31           D32           D33
Minimum          9.78598      9.69641      ...   8.95781       8.96105       8.93480
Maximum          10.2037      10.2606      ...   11.1546       11.2049       11.2021
Mean             10.0024      10.0048      ...   10.0554       10.0578       10.0601
Std Deviation    6.32E-02     8.88E-02     ...   0.30571       0.31121       0.31235
Variance         4.00E-03     7.89E-03     ...   9.35E-02      9.69E-02      0.09756
Skewness         5.33E-03     -1E-03       ...   0.19571       0.23107       0.21612
Kurtosis         2.97661      3.01824      ...   3.0793        3.12689       3.16104
Mode             10.003       10.0211      ...   10.1687       10.3318       10.0028
5% Perc          9.89819      9.85729      ...   9.55501       9.55577       9.55507
10% Perc         9.92142      9.89087      ...   9.66773       9.66436       9.66254
15% Perc         9.93682      9.90928      ...   9.72741       9.72537       9.72567
...
85% Perc         10.0679      10.0934      ...   10.3786       10.3730       10.3774
90% Perc         10.0835      10.1161      ...   10.4583       10.4676       10.4606
95% Perc         10.1065      10.1547      ...   10.5444       10.5772       10.5857

The other useful graph that @Risk can produce is a summary of the distributions over the whole simulation period, that is, day 1 to day 25 (cells D9 to D33). This is shown below for this example.

It can be seen that as the simulation horizon increases, the variance of the distribution of the price increases. This is also evident from the table of results.

6.4.3 Pricing an Option Using Simulation


Now that we can simulate stock prices, let us see how we can use this to price an option. The simplest example is pricing a European option, which can only be exercised at maturity. As with many other option pricing techniques, this method of pricing options is based on the no-arbitrage theory.
Suppose that the current price of a stock is $50, and you would like to buy a call option on this stock with an exercise price of $56 and 3 months to maturity. The payoff of this option would be (not considering the premium):
If the stock price at maturity, T = 0.25, exceeds the exercise price, then payoff = PT − 56
If the stock price at maturity, T = 0.25, is less than the exercise price, then payoff = 0
Therefore, the payoff is Max(0, PT − PX). However, we want to know what the fair price of this option should be. There are a number of option pricing models proposed in the literature, mainly based on the no-arbitrage assumption. One model, proposed by Cox, Ross and Rubinstein (1979), uses the expected discounted present value of the cash flows from an option


on a stock which has the same SD as the underlying and a growth rate equal to the risk-free rate (interest rate).
Example:
The current price of a non-dividend-paying stock is P0 = $13.50. Assuming the annual growth rate and standard deviation of this stock are μ = 12% and σ = 20%, and the risk-free rate is r = 7%, calculate the value of a European call option on this stock with strike price X = $14 and 3 months to maturity.
According to the Cox et al. (1979) model, we need to find the mean of the cash flow from this option and discount it to the present time, t = 0, while assuming the stock price grows at the risk-free rate. To find the mean of the cash flow, we simulate the stock price several times over the next 3 months using a growth rate equal to the risk-free rate and calculate the cash flow as the difference between the stock value at time 0.25 and the strike price. That is:
If the stock price at maturity, T = 0.25, exceeds the exercise price, then payoff = PT − $14
If the stock price at maturity, T = 0.25, is less than the exercise price, then payoff = 0
Then we find the mean of these cash flows and discount it using r = 7%. The value obtained is the call option price.

Input cells:
Current stock price     $13.50
Exercise price          $14.00
Mean annual return      12%
SD of annual return     20%
Risk-free rate          7%
Maturity                0.25

Output cells:
Stock price in 3 months (risk-free growth rate):
  =C6*EXP((C10-0.5*C9^2)*C11+C9*RiskNormal(0,1)*SQRT(C11))       13.6698
Option cash flow at maturity:      =MAX(0,C16-C7)                $0.00
DPV of the cash flow:              =C17*EXP(-C10*C11)            $0.00
Option price:                      =C18                          $0.00

We can then run the simulation by choosing cell C18 to be the output cell in @Risk. The simulation exercise results in a mean discounted present value of the cash flow of about $0.42. This is the call option price. You can check this against the Black-Scholes pricing formula too.
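The whole exercise can be replicated in a few lines of Python (not part of the original notes). The sketch below simulates the terminal stock price under risk-neutral growth, averages the discounted payoffs and should give a price of roughly $0.42, consistent with the Black-Scholes value.

import numpy as np

rng = np.random.default_rng(5)

s0, strike = 13.50, 14.00
sigma, r, T = 0.20, 0.07, 0.25
n_sims = 100_000

# Risk-neutral simulation: the stock grows at the risk-free rate, not at mu = 12%.
z = rng.standard_normal(n_sims)
s_T = s0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)

payoff = np.maximum(s_T - strike, 0.0)          # European call payoff at maturity
price = np.exp(-r * T) * payoff.mean()          # discounted expected payoff
print(round(price, 3))                          # roughly 0.42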


CHAPTER 7

Value at Risk Estimation and Analysis


7 Value at Risk Estimation and Analysis


7.1 Introduction
The Value at Risk (VaR) methodology was developed during the 1990s by JP Morgan as a simple tool to measure and report the aggregate risk of the company to senior managers. As the story goes, the then chairman of JP Morgan, Dennis Weatherstone, asked his staff to produce a one-page report on a daily basis (at 4:15 pm) about the risk and potential losses over the next 24 hours. This led the JP Morgan staff to develop a methodology which measures risk across the firm's different operations and produces a single aggregate risk measure. The measure used was known as VaR, the maximum likely loss over the next trading day, and it was estimated from a system based on standard portfolio theory, using estimates of the standard deviations of, and the correlations between, the returns to different traded instruments.
After JP Morgan's 1993 conference, in which RiskMetrics was presented, many other institutions attempted to set up similar systems. JP Morgan made the RiskMetrics methodology and data publicly available, and nowadays many companies and firms routinely use the model to assess their market risk exposure.

7.2 What exactly is VaR?

The term VaR can be used in several different ways, depending on the particular context.
1- VaR concept
In its most literal sense, VaR refers to a particular amount of money: the maximum amount we are likely to lose over some period, at a specific confidence level.
2- VaR procedure
A VaR estimation procedure is a numerical, statistical or mathematical procedure that produces VaR figures; that is, it produces the VaR number (the amount of money that we are likely to lose in the next period, given the level of confidence).
3- VaR methodology
This is a set of procedures that can be used to produce VaR figures, but which can also be used to estimate other risks, such as credit at risk and cash flow at risk, in the same way as value at risk.

7.3 Measuring Financial Risk


The financial risk in the portfolio theory framework is mainly the dispersion of returns or
cash flows around the mean, which can be measured statistically using the second moment.
Assuming returns or cash flows are normally distributed, we can write the probability density
function (pdf) of asset returns as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where x, the asset return (cash flow), is defined over the interval −∞ < x < +∞, μ is the mean return (cash flow) and σ² is the variance of the returns (cash flows). A normal pdf with zero mean and variance 1, N(0,1), is shown in the figure below.

[Figure: the standard normal pdf, N(0,1).]
The pdf of the cash flow gives a complete representation of all possible random outcomes: it tells us about each possible cash flow and its likelihood. Knowing the distribution of cash flows, one can answer questions about the likelihood of possible future outcomes.
Therefore, in order to report the likelihood of a loss, we need some information to begin with. First, we need the confidence level (1 − α). Second, we need the investment or project horizon (holding period) N. Third, we need the distribution of the cash flow of the investment, project or portfolio at time N.
Once we have all this information, VaR can be expressed as: "we are (1 − α) sure that we will not lose more than V dollars in the next N days". Therefore, the VaR is a function of two parameters, α and N. For example, suppose the distribution of our cash flow after N = 1 day is given as the N(0,1) below.

[Figure: the standard normal pdf, N(0,1), with the lower 5% tail cut off at −1.645.]

Assuming a confidence level of 95%, we can say that "we are 95% confident that we will not lose more than $1.645 by the next trading day". This means that VaR = $1.645.
Similarly, suppose the confidence level is chosen as (1 − α) = 99% and N = 10, and we find that the distribution of the cash flow after 10 trading days is going to be N(2,2), as shown below.

[Figure: the pdf of the 10-day cash flow, N(2, 2).]

Then we know that the SD of the cash flow will be √2 = 1.414, while Zα = 2.326. Therefore, the VaR will be

VaR = $1.414 × 2.326 = $3.289

But we always report the VaR as a positive value, although it is a loss. Therefore, we can say
that "we are 99% confident that we will not lose more than $3.289 in ten days" or "our
value at risk in ten days, assuming 99% CL is $3.289".
Note that if our distribution is symmetric, there is no difference if we use the right tail or left
tail of the pdf to calculate and report the VaR. However, if the distribution is not normal or if
it is asymmetric, then we need to calculate the VaR using the left tail and report the figure as
a positive value.

7.4 Volatilities: daily vs. yearly


As you have noticed, we use the variance or SD of the sample to find the VaR. In practice, we use asset returns or prices to calculate volatilities, but the estimate that we get depends on the frequency of the observations used. For example, if we use daily observations we obtain σ_daily, and if we use yearly observations we obtain σ_yearly. Assuming that there are 252 trading days in a year, we can write

σ_yearly = σ_daily × √252

Similarly, if we assume there are 25 trading days in one month, we can write

σ_monthly = σ_daily × √25

Therefore, the volatility of the stock over 10 days would be approximately

σ_10-day = σ_daily × √10

7.5 Simple VaR calculation using volatilities


Let us assume that we have a portfolio of $10m of IBM shares and that the daily volatility of IBM shares is SD = 2% (about 32% per year). This means that a one-SD change in one day is about $200,000. Assuming that successive daily changes in IBM shares are independent, the SD of the position over 10 days would be

SD_10d = SD_1d × √10 = $200,000 × √10 = $632,456

Now, if we want to calculate the VaR of this portfolio over a 10-day horizon, given that CL = 99% and returns are normally distributed with mean zero, we simply write

VaR = Zα × SD_10d = 2.33 × $632,456 = $1,473,621

where Zα ≈ 2.33 is the 1% critical value of the standard normal distribution.

Note that although the assumption of normality and a zero mean is not exactly true, it serves the purpose in practice, because average daily returns are typically small compared to their SD.
Let us now consider another single-asset portfolio, consisting of $5m of AT&T shares with a daily SD of 1% (approximately 16% per year). Using N = 10 and CL = 99%, the 10-day SD is

SD_10d = SD_1d × √10 = $50,000 × √10 = $158,114

and the VaR is

VaR = Zα × SD_10d = 2.33 × $158,114 = $368,405

7.6 Two asset portfolio


Now consider a portfolio consisting of both the $10m of IBM and the $5m of AT&T shares. We know that SD_IBM = 2% and SD_AT&T = 1%, while the correlation between the two stocks is ρ = 0.7. It is not difficult to show that the SD of the portfolio, SD_P, can be written as

SD_P = √(SD²_IBM + SD²_AT&T + 2ρ·SD_IBM·SD_AT&T)

Note that the SDs in this formula are dollar values; therefore we did not include the weights.
To find the 10-day VaR of the portfolio at the 99% CL, we calculate the SD of the portfolio and, using the appropriate Zα, obtain the VaR. Thus

SD_P = √[(632,456)² + (158,114)² + 2(0.7)(632,456)(158,114)] = $751,665

Therefore, our 10-day SD for the portfolio is $751,665. Having obtained the SD of the portfolio, we proceed to calculate the VaR using α = 1%:

VaR = Zα × SD_10d = 2.33 × $751,665 = $1,751,379

From this value at risk calculation you can see directly the effect of diversification, as the VaR of the portfolio is smaller than the sum of the VaRs of holding each stock individually:

($1,473,621 + $368,405) − $1,751,379 = $90,647
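The calculations in this and the previous section can be verified with a short Python sketch (not part of the original notes); note that the notes round the 1% critical value Z to 2.33, so figures computed with the exact 2.326 differ slightly.

import numpy as np
from scipy.stats import norm

# Dollar 10-day SDs of the two positions and their correlation
sd_ibm = 200_000 * np.sqrt(10)          # about $632,456
sd_att = 50_000 * np.sqrt(10)           # about $158,114
rho = 0.7

sd_port = np.sqrt(sd_ibm ** 2 + sd_att ** 2 + 2 * rho * sd_ibm * sd_att)

z = norm.ppf(0.99)                      # about 2.326 (the notes round this to 2.33)
var_ibm, var_att, var_port = z * sd_ibm, z * sd_att, z * sd_port
print(round(sd_port), round(var_port), round(var_ibm + var_att - var_port))
# roughly a $751,665 portfolio SD, a $1.75m portfolio VaR and a $90k diversification benefit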

7.7 Multiple asset portfolios


The above results can be extended to multiple-asset portfolios as long as the change in the value of the portfolio is linearly related to the changes in the values of the underlying market variables (assets). This linearity assumption holds for portfolios which do not contain positions in derivatives.
If the change in the value of the portfolio is linearly related to the changes in the market variables, then we can write

$$\Delta P = \sum_{i=1}^{n} \omega_i \,\Delta x_i$$

and

$$\sigma_p^2 = \sum_{i=1}^{n} \omega_i^2 \sigma_i^2 + 2\sum_{i=1}^{n}\sum_{j>i} \rho_{ij}\,\omega_i \omega_j \sigma_i \sigma_j$$

where ΔP is the change in the value of the portfolio, Δx_i is the change in market variable (asset) i, ω_i is the weight of asset i in the portfolio, σ_i is the SD of asset i, and ρ_ij is the correlation between market variables i and j.

Alternatively, for multi-asset portfolios, one can estimate the VaR of the individual assets and construct a vector of VaRs, V. Then, estimating the correlations between the assets and constructing the correlation matrix C, the VaR of the portfolio is

$$VaR_p = (V C V')^{1/2}$$

where

V = [VaR₁  VaR₂  ....  VaRₙ]

and

$$C = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & 1 & \cdots & \rho_{2n} \\ \vdots & & \ddots & \vdots \\ \rho_{n1} & \rho_{n2} & \cdots & 1 \end{pmatrix}$$

Note that in this case you do not need to use weights, as the dollar values of the VaR for each asset in the portfolio take the weights into account automatically.
As an example, consider a portfolio of four TD3 FFA positions of 5,000t each (1, 2, 3 and 4 months to maturity) on 17 March 2003, with the following 1% 10-day individual VaRs and correlation matrix:
                      1M        2M        3M        4M
Current Price (WS)    112.5     88.0      67.5      65
Volatility (SD)       113.6%    86.7%     74.4%     61.4%
1% 10-day VaR ($)     39,941    23,866    15,709    12,606

Correlation matrix:
1.000   0.724   0.593   0.426
0.724   1.000   0.826   0.667
0.593   0.826   1.000   0.774
0.426   0.667   0.774   1.000

Therefore the 1% 10-day VaR of the portfolio will be

VaR_10d = (V C V')^{1/2} = $80,314.8
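The matrix formula is straightforward to apply in Python; the sketch below (not part of the original notes) uses the individual VaRs and the correlation matrix from the table and should reproduce the portfolio VaR of roughly $80,315.

import numpy as np

V = np.array([39_941, 23_866, 15_709, 12_606], dtype=float)   # individual 1% 10-day VaRs
C = np.array([[1.000, 0.724, 0.593, 0.426],
              [0.724, 1.000, 0.826, 0.667],
              [0.593, 0.826, 1.000, 0.774],
              [0.426, 0.667, 0.774, 1.000]])

var_portfolio = np.sqrt(V @ C @ V)       # (V C V')^(1/2)
print(round(var_portfolio, 1))           # roughly $80,315, as in the example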

The relationship between VaR, CL and holding period

We mentioned that the VaR depends on two important factors, the confidence level (CL) and the holding period (HP). It is not difficult to see that the VaR increases as the CL increases; however, this relationship is nonlinear, as shown in Figure A. The relationship between VaR and HP is also positive and nonlinear. This is because, as the HP increases, the volatility of the underlying asset increases, which results in a higher SD in the VaR calculation. This is shown in Figure B.


[Figure A: VaR against confidence level. Figure B: VaR against holding period.]

Figure C shows the relationship between VaR, CL and HP in one diagram. It can be seen that
the VaR surface changes as the CL and HP change.
[Figure C: VaR surface against confidence level (0.90 to 0.99) and holding period (1 to 16 days).]

7.8 Expected Tail Loss (ETL)


One problem with the VaR is that it only reports a particular value as the expected loss at a certain probability. This is not the full picture, as we may suffer losses greater than the VaR if things go severely wrong. For example, consider the following cash flow distribution. The reported VaR is 1.645; that is, there is a 5% chance that we lose more than 1.645 in N days. But this does not tell us that there is also, say, a 1% chance that we lose more than 2.326 in N days.


Therefore, a better way of expressing the risk in terms of the probability of loss is to report the Expected Tail Loss (ETL). The ETL is the average of all the losses that we may incur, given that things go wrong with probability α over N days.

[Figure: left tail of the N(0,1) cash-flow distribution, with VaR = 1.645 and ETL = 2.061 marked.]

In order to calculate the ETL, we must average the losses over the tail beyond the VaR, i.e. over confidence levels from 95% upwards. This is done by slicing the tail into 10, 100 or more slices, finding the loss value (VaR) in each case and taking the average of these VaRs. Assuming 10 slices, the ETL value is given in the following table.
Tail VaRs      Tail VaR values
0.95           1.645
0.955          1.695
0.96           1.751
0.965          1.812
0.97           1.881
0.975          1.960
0.98           2.054
0.985          2.170
0.99           2.326
0.995          2.576

Average of tail VaRs   1.987

It can be seen that the accuracy of the ETL depends on the number of slices used in
calculating the average. The following table shows the relationship between number of
slices and ETL.
Number of tail slices    ETL
N = 10                   1.9870
N = 50                   2.0432
N = 100                  2.0521
N = 500                  2.0602
N = 1000                 2.0614
N = 5,000                2.0624
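The slicing procedure is easy to reproduce; the Python sketch below (not part of the original notes) averages the tail VaRs of a standard normal cash flow over an increasing number of slices and converges towards the figures in the table.

import numpy as np
from scipy.stats import norm

def etl_by_slicing(alpha=0.05, n_slices=10):
    """Average the tail VaRs of an N(0,1) cash flow over n_slices equally spaced
    confidence levels between (1 - alpha) and 1."""
    cls = 1 - alpha + (alpha / n_slices) * np.arange(n_slices)   # 0.95, 0.955, ... for n = 10
    return norm.ppf(cls).mean()

for n in (10, 50, 100, 500, 1000, 5000):
    print(n, round(etl_by_slicing(n_slices=n), 4))
# Converges towards the closed-form expected tail loss, phi(1.645)/0.05 = 2.0627.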


Although in practice the ETL is often calculated in the way described above, this is not the most accurate way of calculating it. This is because the VaR changes with the probability of loss; therefore, one should consider a weighted average ETL, calculated using the α's as weights. Consider the same example.

[Figure: left tail of the N(0,1) cash-flow distribution, with VaR = 1.645 and ETL = 2.061 marked.]

This is shown in the following table, where the last column gives the weighted VaRs and the weighted average ETL of 1.844.
(1) Confidence level   (2) VaR     (3) α     (4) Weight = α/0.275   (5) VaR x (4)
0.950                  1.645       0.050     0.182                  0.299
0.955                  1.695       0.045     0.164                  0.277
0.960                  1.751       0.040     0.145                  0.255
0.965                  1.812       0.035     0.127                  0.231
0.970                  1.881       0.030     0.109                  0.205
0.975                  1.960       0.025     0.091                  0.178
0.980                  2.054       0.020     0.073                  0.149
0.985                  2.170       0.015     0.055                  0.118
0.990                  2.326       0.010     0.036                  0.085
0.995                  2.576       0.005     0.018                  0.047
Sum                    Ave=1.987   0.275     1.000                  1.844
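In other words, the weighted average ETL is simply the probability-weighted average of the tail VaRs (a restatement of the table above):

$$ ETL_w = \frac{\sum_i \alpha_i \, VaR_i}{\sum_i \alpha_i} = \frac{0.050 \times 1.645 + 0.045 \times 1.695 + \cdots + 0.005 \times 2.576}{0.275} = 1.844 $$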

7.9 VaR Estimation Methodologies


The accuracy of the VaR estimate depends heavily on the method used for estimating the
volatility of the underlying asset, the behaviour of the variable, as well as the assumptions on
the distributional properties of the price changes (returns). Factors such as volatility
clustering, leptokurtosis, fat-tails and skewness are stylised facts of freight rates which can
affect the accuracy of VaR estimates and performance of VaR procedures. Therefore, a
number of alternative methodologies have been proposed to incorporate deviations from
normality as well as the time varying volatility of returns, when estimating VaR. The


following sections are devoted to discussing some of these methods, which are broadly classified into parametric and non-parametric approaches to VaR estimation.
7.9.1 Parametric VaR Estimation
The parametric approach in estimating the VaR explicitly assumes that returns follow a
defined parametric distribution, such as the Normal, Student-t or Generalised Error
Distribution, among others. Based on this approach, parametric models are used to estimate
the unconditional and conditional distribution of returns, which is then used to calculate VaR.
These methods are usually preferred and used frequently in estimating VaR because they are
simple to apply and produce relatively accurate VaR estimates (see, for example, Jorion, 1995 and 2002).
7.9.1.1 Sample Variance & Covariance Method
A simple and straightforward method of calculating VaR is to use the historical constant
variance and covariance between the return series, and find the difference between the mean
and the α% percentile of the distribution of the asset or the portfolio; that is, $VaR_{t+1} = Z_\alpha \sigma_t$.
Based on this method, forecasts of variances of returns are usually generated using a rolling
window of a specified size, e.g. 1,000 data points. The Variance-Covariance method is a
simple and fast method for estimating the VaR, but is believed to be efficient only in the short
term. The main disadvantage of this method is that it does not take into account the dynamics
of volatility of the underlying asset as it applies equal weights to past observations in the
variance calculation.
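A minimal sketch of this method in Python, assuming a hypothetical daily return series, a 1,000-observation rolling window and a 5% confidence level:

```python
import numpy as np

def rolling_window_var(returns: np.ndarray, window: int = 1000,
                       z: float = 1.645, position_value: float = 1.0) -> np.ndarray:
    """One-step-ahead VaR using the equal-weighted sample volatility over a rolling window."""
    var = np.full(len(returns), np.nan)
    for t in range(window, len(returns)):
        sigma = returns[t - window:t].std(ddof=1)   # historical volatility over the window
        var[t] = z * sigma * position_value         # dollar VaR for the next period
    return var

rng = np.random.default_rng(0)
r = rng.normal(0.0, 0.02, 2000)                     # hypothetical daily returns
print(rolling_window_var(r, position_value=1_000_000)[-1])   # 5% one-day VaR of a $1m position
```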
7.9.1.2 Exponential Weighted Averages variance or RiskMetrics Method
RiskMetrics uses a weighted average of the estimated volatility and the most recent price change at any point in time to estimate future volatility and VaR.¹ The weighting factor λ, which determines the decay in the importance of past observations, could be estimated from historical data; however, it is usually set as a constant between 0.9 and 0.98. J.P. Morgan RiskMetrics, for instance, uses a weighting multiplier of 0.97, which is argued to be the optimal decay factor in variance calculation. Thus the RiskMetrics exponentially weighted average variance estimator can be obtained using the following equation.
$$ \sigma_{t+1}^2 = \lambda \sigma_t^2 + (1 - \lambda) r_t^2 \qquad (7.1) $$

It is obvious that the higher the decay factor, the longer the memory assigned to past
observations. In the case of a portfolio of assets, the covariance and correlation of two assets, say
X and Y, can be estimated, respectively, as:
¹ For more details on this approach the reader is referred to J.P. Morgan's RiskMetrics Technical Manual (1995) as well as Chapter 3 of this book.


$$ \sigma_{XY,t+1} = \lambda \sigma_{XY,t} + (1 - \lambda) r_t^X r_t^Y \qquad (7.2) $$

$$ \rho_{XY,t+1} = \frac{\sigma_{XY,t+1}}{\sigma_{X,t+1} \, \sigma_{Y,t+1}} \qquad (7.3) $$

Again, once the variance and covariance are calculated, the α% VaR can be estimated assuming an appropriate parametric distribution.
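A minimal sketch of the recursions (7.1) to (7.3) in Python, with a decay factor λ = 0.97 and two hypothetical return series:

```python
import numpy as np

def ewma_variance_covariance(rx: np.ndarray, ry: np.ndarray, lam: float = 0.97):
    """Exponentially weighted variances, covariance and correlation (RiskMetrics style)."""
    var_x, var_y = np.var(rx), np.var(ry)            # starting values
    cov_xy = np.cov(rx, ry)[0, 1]
    for t in range(len(rx)):
        var_x  = lam * var_x  + (1 - lam) * rx[t] ** 2        # equation (7.1)
        var_y  = lam * var_y  + (1 - lam) * ry[t] ** 2
        cov_xy = lam * cov_xy + (1 - lam) * rx[t] * ry[t]     # equation (7.2)
    corr = cov_xy / np.sqrt(var_x * var_y)                    # equation (7.3)
    return var_x, var_y, cov_xy, corr

rng = np.random.default_rng(1)
rx, ry = rng.normal(0, 0.02, 500), rng.normal(0, 0.015, 500)  # hypothetical returns
var_x, var_y, cov_xy, corr = ewma_variance_covariance(rx, ry)
print("5% one-day VaR of asset X per $1 invested:", 1.645 * np.sqrt(var_x))
```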

7.9.1.3 GARCH Models and VaR Estimation


A relatively more advanced parametric method for VaR estimation is to use the volatility
input estimated through GARCH-type models. In the simple GARCH (1,1) specification the
variance is a function of the most recent squared error term and the previous period's variance. Hence, the conditional variance σ²_t can be formulated as:
$$ \sigma_t^2 = \omega_0 + \omega_1 \varepsilon_{t-1}^2 + \omega_2 \sigma_{t-1}^2 \qquad (7.4) $$

where ω = (ω_0, ω_1, ω_2) is a vector of parameters to be estimated, subject to the non-negativity constraints ω_0 > 0 and ω_1, ω_2 ≥ 0, in order for the variance to remain positive. GARCH
models can also be extended in a multivariate framework to model the variance/covariance
matrix of a portfolio of assets (see chapter 3 for more details on multivariate GARCH
models).
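A minimal sketch of a GARCH(1,1)-based VaR in Python; the parameter values ω_0, ω_1 and ω_2 below are illustrative assumptions rather than estimates (in practice they would be obtained by maximum likelihood):

```python
import numpy as np

def garch_var(returns: np.ndarray, omega0: float = 1e-6, omega1: float = 0.08,
              omega2: float = 0.90, z: float = 1.645, position_value: float = 1.0) -> float:
    """One-step-ahead VaR from the GARCH(1,1) variance recursion of equation (7.4)."""
    sigma2 = returns.var()                                    # initialise with the sample variance
    for r in returns:
        sigma2 = omega0 + omega1 * r ** 2 + omega2 * sigma2   # equation (7.4)
    return z * np.sqrt(sigma2) * position_value

rng = np.random.default_rng(2)
r = rng.normal(0.0, 0.02, 1000)                               # hypothetical daily returns
print(garch_var(r, position_value=1_000_000))                 # 5% one-day VaR of a $1m position
```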
7.9.1.4 Monte Carlo Simulation and VaR Estimation
This method of estimating the VaR of an asset or a portfolio is based on the assumption that
prices follow a certain stochastic process (Geometric Brownian Motion, Jump Diffusion,
Mean Reversion, Mean Reversion Jump Diffusion etc.), or a multivariate process in the case
of portfolios. Once the stochastic mathematical process for the underlying asset is
determined, it can be used to generate many possible paths for the evolution of the asset price
through Monte Carlo Simulation. Simulating the stochastic processes of the underlying
assets will yield the distribution of the portfolio value at a given point in the future, and the VaR of the portfolio can be estimated as the difference between the expected value (mean) of the distribution of the portfolio and the α% lower percentile of the distribution.
The advantage of this method is that it allows for certain properties of the underlying asset
price, such as seasonality and mean reversion, to be considered and incorporated in the
simulation exercise. This is quite important because such dynamics in asset price have direct
impact on the accuracy of the estimated VaR. For example, if the price of an asset is mean-reverting and current prices are above their long-run mean, then there is a higher likelihood that prices will drop (Figure 7.1, Price Path 1). On the other hand, when current prices are below their long-run mean, there is a higher likelihood of a price increase (Figure 7.1, Price Path 2).
Figure 7.1: Mean Reversion and VaR Estimation
[Line chart of price against time showing two simulated paths: Price Path 1 starts above the long-run mean and tends to drift down towards it, while Price Path 2 starts below the long-run mean and tends to drift up towards it.]

Another advantage of the Monte Carlo simulation is that due to its flexibility it can be used to
estimate VaR of portfolios containing short-dated as well as nonlinear instruments such as
options and option portfolios. In addition, sensitivity analysis can be performed in a simple
way by changing market parameters. For instance, by changing variances and correlations of
a portfolio, we can assess the sensitivity of the portfolio and examine the effect of different
volatility regimes on VaR estimates. However, simulation techniques are highly dependent
on the accuracy and quality of the processes chosen for the behaviour of the underlying asset
prices, and their complexity and computational burden increase with the number of
assets in the portfolio.
The steps in MCS for VaR calculation include:
Step 1: Specify the dynamics of the underlying processes.
Step 2: Generate N sample paths by sampling changes in asset prices over the
required horizon. A minimum of 10,000 simulations is typically necessary
for satisfactorily accurate estimates.
Step 3: Evaluate the portfolio at the end of the horizon for each generated sample path.
The N outcomes constitute the distribution of the portfolio values at the
required time horizon.
Step 4: Finally, VaR can be estimated as the difference between the mean of the
simulated distribution and the lower α% percentile of the ordered simulated
outcomes at the point in time for which the VaR is considered; for instance,
see Figure 7.2 for the 1% VaR.
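A minimal sketch of Steps 1 to 4 for a single asset assumed to follow a Geometric Brownian Motion (all parameter values are illustrative assumptions):

```python
import numpy as np

def monte_carlo_var(S0: float = 100.0, mu: float = 0.0, sigma: float = 0.02,
                    horizon: int = 10, n_paths: int = 10_000, alpha: float = 0.01) -> float:
    """Monte Carlo VaR of a single long position under Geometric Brownian Motion."""
    rng = np.random.default_rng(42)
    # Steps 1-2: simulate n_paths log-return paths over the horizon (daily steps)
    z = rng.standard_normal((n_paths, horizon))
    log_returns = (mu - 0.5 * sigma ** 2) + sigma * z
    # Step 3: value of the position at the end of the horizon for each path
    terminal_values = S0 * np.exp(log_returns.sum(axis=1))
    # Step 4: VaR = mean of the simulated distribution minus its alpha-percentile
    return terminal_values.mean() - np.percentile(terminal_values, 100 * alpha)

print(f"1% 10-day VaR per unit of the asset: {monte_carlo_var():.2f}")
```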


Figure 7.2: Calculating VaR from the simulated distribution


[Histogram of simulated portfolio values at the horizon, with the mean and the 1% lower tail marked; the 1% VaR is the distance between the mean and the 1% percentile of the distribution.]

Example 1: Estimating VaR using Monte Carlo Simulation

Consider portfolio A, consisting of two long FFA contracts on two different hypothetical shipping routes (Route 1 and Route 2), and portfolio B, which consists of a long position in route 1 and a short position in route 2. The current rate for route 1 is $30/mt for 54,000mt of cargo, with a daily volatility of 2.52%, while the current rate for route 2 is $10/mt for 150,000mt of cargo, with a daily volatility of 1.89%. The long-run mean freight rates for routes 1 and 2 are $25/mt and $8/mt, respectively, while the estimated historical correlation between the two routes is 0.8. In addition, it is assumed that freight rates on both routes are mean reverting, with mean reversion rates of 0.33 and 0.4 for routes one and two, respectively.² The estimated one-day 5% VaRs of the individual routes and the two portfolios using the variance-covariance method would be:
Route 1: $VaR_{1d}^{5\%} = 1.645 \times (30 \times 54{,}000 \times 0.0252) = \$67{,}143$

Route 2: $VaR_{1d}^{5\%} = 1.645 \times (10 \times 150{,}000 \times 0.0189) = \$46{,}627$

and therefore

² A mean reverting process is defined as a process in which prices tend to revert back to a long-run mean. A discrete version of a bivariate mean-reverting process can be written as

$$ \Delta s_{1,t} = \kappa_1 (\theta_1 - s_{1,t-1}) \Delta t + \sigma_1 \varepsilon_{1,t} \sqrt{\Delta t} $$
$$ \Delta s_{2,t} = \kappa_2 (\theta_2 - s_{2,t-1}) \Delta t + \sigma_2 \varepsilon_{2,t} \sqrt{\Delta t}, \qquad \varepsilon_t \sim N(0, \Sigma) $$

where $s_{1,t}$ and $s_{2,t}$ are the log asset prices and $\Delta s_{1,t}$ and $\Delta s_{2,t}$ are the log price changes at time t; $\theta_1$ and $\theta_2$ are the log price levels to which the prices of assets 1 and 2 revert over the long run, respectively; $\kappa_1$ and $\kappa_2$ are the coefficients of mean reversion, measuring the speed at which prices revert to their mean; $\sigma_1$ and $\sigma_2$ are the standard deviations of prices; and $\varepsilon_t$ is a (2x1) vector of stochastic terms which follows a bivariate normal distribution with zero mean and variance-covariance matrix

$$ \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{1,2} \\ \sigma_{1,2} & \sigma_2^2 \end{bmatrix} $$


Portfolio A: $VaR_{1d}^{5\%} = \left( \begin{bmatrix} 67{,}143 & 46{,}627 \end{bmatrix} \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix} \begin{bmatrix} 67{,}143 \\ 46{,}627 \end{bmatrix} \right)^{1/2} = \$108{,}127$

Portfolio B: $VaR_{1d}^{5\%} = \left( \begin{bmatrix} 67{,}143 & 46{,}627 \end{bmatrix} \begin{bmatrix} 1 & -0.8 \\ -0.8 & 1 \end{bmatrix} \begin{bmatrix} 67{,}143 \\ 46{,}627 \end{bmatrix} \right)^{1/2} = \$40{,}905$

where for portfolio B the correlation enters with a negative sign because the position in route 2 is short.

Using Monte Carlo simulation, we estimate the 5% VaR for each route as well as for portfolios A and B over different time horizons; namely, 1, 10, 20 and 40 days. Panel A of Table 7.1 presents the parameters of the underlying routes, whereas Panel B presents the estimated VaRs for both the individual routes and the two portfolios. It can be seen that the VaRs estimated for portfolios A and B through MCS are slightly lower than those estimated using the simple variance-covariance method, because MCS incorporates the assumed mean reversion of the freight rate processes. Although the difference in VaRs is relatively small for short horizons, it increases for longer VaR horizons, consistent with the fact that the impact of mean reversion increases as we
consider longer periods; for instance, the estimated 5% 40-day VaR for portfolio A using
MCS is $601,538, while the estimated 5% 40-day VaR for portfolio A using the variance-covariance method is $683,854 (= $108,127 × √40).
Table 7.1: VaR calculation using Monte Carlo Simulation
Panel A: Assumptions

Route   Maturity   Rate ($/t)   Contract Size (tonnes)   Daily Vol   Mean Reversion Rate   Long Run Mean
1       60-days    30           54,000                   2.52%       0.33                  $25/t
2       60-days    10           150,000                  1.89%       0.40                  $8/t
Correlation between routes: 0.8

Current values:
Route 1: $1.620m (= $30 x 54,000t)       Portfolio A: $3.120m (long route 1 + long route 2)
Route 2: $1.500m (= $10 x 150,000t)      Portfolio B: $0.120m (long route 1 + short route 2)

Panel B: Estimated 5% VaR using MC Simulation

               Route 1      Route 2      Portfolio A    Portfolio B
1-day VaR      $66,121      $44,863      $105,428       $41,106
10-day VaR     $201,509     $139,890     $333,136       $125,630
20-day VaR     $279,375     $190,127     $451,969       $175,951
40-day VaR     $372,858     $251,517     $601,538       $245,969
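A minimal sketch of the simulation behind Table 7.1, using the bivariate mean-reverting process of the footnote with the Panel A parameters. The discretisation, the time units of the mean-reversion rates and the random seed are assumptions, so the simulated figures will be of the same order as, but not identical to, those in Panel B:

```python
import numpy as np

# Parameters from Panel A of Table 7.1
S0    = np.array([30.0, 10.0])          # current rates ($/t)
theta = np.log([25.0, 8.0])             # long-run mean log price levels
kappa = np.array([0.33, 0.40])          # mean reversion coefficients
sigma = np.array([0.0252, 0.0189])      # daily volatilities
size  = np.array([54_000, 150_000])     # contract sizes (tonnes)
rho, horizon, n_paths, alpha = 0.8, 10, 10_000, 0.05

rng  = np.random.default_rng(7)
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))   # to generate correlated shocks

logS = np.tile(np.log(S0), (n_paths, 1))
for _ in range(horizon):                                # daily steps of the mean-reverting process
    eps = rng.standard_normal((n_paths, 2)) @ chol.T
    logS += kappa * (theta - logS) + sigma * eps

prices = np.exp(logS)
values_A = (prices * size) @ np.array([1.0,  1.0])      # long route 1 + long route 2
values_B = (prices * size) @ np.array([1.0, -1.0])      # long route 1 + short route 2
for name, v in (("Portfolio A", values_A), ("Portfolio B", values_B)):
    print(name, "5% 10-day VaR:", round(v.mean() - np.percentile(v, 100 * alpha)))
```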
