
Introduction to Regression

Outline

* Forecasting
* Regression overview
* Relationships between variables
* Model building
* Least Squares estimation
* Inference
* Dynamics
* Prediction

Regression Overview

Regression is a useful tool for
* Quantification of business processes
* Policy analysis

However, regression analysis requires
* Model building
* More data collection

Regression-based forecasting requires
* Prior forecasts of explanatory series
* Sources of uncertainty & error

Relationships Between Variables

Economic variables often do not exist in isolation; there exist
relationships between them
* Sales = f(Advertising)
* Consumption = f(Income)
* Price inflation = f(Interest rates)

Changes in one variable may precede or cause changes in the other

Relationships Between Variables

[Figure: scatter plot of UK unemployment ('000 workforce) against
annual price inflation (0% to 35%).]
* Is there a relationship between unemployment and inflation?
* How might it be described?

Relationships Between Variables

[Figure: time series of real UK GDP, deseasonalised (£M, 1995 prices),
1955 to 2005.]
* Which is the best trend to describe this data?

Model Building

* Choice of variables
* Choice of functional form
* Estimation and interpretation of parameters
* Diagnostic checking
* Respecification of model (if required)
* Forecasting

Model Building

To describe relationships between variables we must
* Choose relevant variables
* Distinguish between variables which are
  * caused - dependent variables
  * causes - explanatory variables (regressors)
* Write a function describing the relationship
  * dependent var. = f(explanatory vars.)

Linear Regression

Suppose we have a dependent variable, y, which we wish to explain in
terms of another variable, x

As a first approximation we might assume that they are related linearly

    y = β0 + β1x

If we specify or estimate the parameters, β0 & β1, we have specified
(estimated) the relationship

Linear Regression

[Figure: the line y = 3 + 2x plotted for x = 0 to 11.]

A linear relationship
* When x increases/decreases by one unit, y increases/decreases by
  2 units (the slope)
* When x = 0, y = 3 (the intercept)

Linear Regression

Unfortunately, exact relationships are extremely rare

Deviations from the relationship may be due to
* Unforeseen (random) events
* Data errors
* Model errors

Linear Regression

The model is amended to reflect the impact of these random departures
from the relationship as follows

    E(y) = β0 + β1x
    y = β0 + β1x + ε

where ε represents the sum of random elements in the model
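The error model can be illustrated by simulation; a minimal sketch using the line y = 3 + 2x from the earlier figure, with an assumed normally distributed error term:

```python
import random

def simulate_linear(n, b0, b1, sd, seed=0):
    """Draw n observations from y = b0 + b1*x + eps, with eps ~ N(0, sd^2)."""
    rng = random.Random(seed)
    xs = list(range(n))
    ys = [b0 + b1 * x + rng.gauss(0, sd) for x in xs]
    return xs, ys

# Each simulated y deviates from the exact line 3 + 2x only by its
# random error, mirroring y = β0 + β1x + ε.
xs, ys = simulate_linear(50, 3, 2, 1)
```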

Linear Regression

[Figure: scatter of y = number of heads against x = number of coins
tossed (x = 15 to 45), with the line E(y) = β0 + β1x drawn through it.]

Linear Regression

Because of the random factors (errors) in the relationship we cannot
determine exactly the true values of β0 & β1

We will denote the estimated relationship

    ŷ = b0 + b1x

How should we choose b0 & b1 to give an estimated relationship that
best describes the data?

Linear Regression

[Figure: the same coin-tossing scatter, with a fitted line and the
residuals e = y − (b0 + b1x) marked.]

Which values of b0 & b1 define a line that best fits the data?

Least Squares Regression

The best fitted line is defined as the one which minimises the sum of
squared residuals (SSR)

This is the principle of Ordinary Least Squares

Least Squares Regression

Given n pairs of data, (y1, x1) ... (yn, xn), we estimate the
parameters of the linear model

    yi = β0 + β1xi + εi

using the least squares formulae

    b1 = Σ(yi − ȳ)(xi − x̄) / Σ(xi − x̄)²
    b0 = ȳ − b1x̄

Least Squares Regression

       x     y   (x − x̄)  (y − ȳ)  (x − x̄)²  (y − ȳ)²  (y − ȳ)(x − x̄)
      20     8     -10      -8.2      100      67.24          82
      25    10      -5      -6.2       25      38.44          31
      30    15       0      -1.2        0       1.44           0
      35    24       5       7.8       25      60.84          39
      40    24      10       7.8      100      60.84          78
Mean: 30  16.2            Sums:       250     228.8         230

    b1 = 230 / 250 = 0.92        b0 = 16.2 − 0.92 × 30 = −11.4
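The least squares formulae can be applied directly to the table's data; a short sketch reproducing b1 = 0.92 and b0 = −11.4:

```python
xs = [20, 25, 30, 35, 40]   # number of coins tossed
ys = [8, 10, 15, 24, 24]    # number of heads

n = len(xs)
xbar = sum(xs) / n          # 30
ybar = sum(ys) / n          # 16.2

# b1 = sum((y - ybar)(x - xbar)) / sum((x - xbar)^2)
sxy = sum((y - ybar) * (x - xbar) for x, y in zip(xs, ys))  # 230
sxx = sum((x - xbar) ** 2 for x in xs)                      # 250

b1 = sxy / sxx          # 0.92
b0 = ybar - b1 * xbar   # -11.4
```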

Least Squares Regression

We may calculate the SSR

       x     y      ŷ    e = y − ŷ      e²
      20     8    7.0        1.0      1.00
      25    10   11.6       -1.6      2.56
      30    15   16.2       -1.2      1.44
      35    24   20.8        3.2     10.24
      40    24   25.4       -1.4      1.96
                            SSR =    17.20

Least Squares Regression

[Figure: the coin-tossing scatter with the fitted line
ŷi = −11.4 + 0.92xi drawn through it.]

The estimated relationship is not necessarily equal to the actual
average relationship E(y) = β0 + β1x
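The fitted values, residuals and SSR follow directly from the estimates; this sketch reproduces the table's SSR of 17.2:

```python
xs = [20, 25, 30, 35, 40]
ys = [8, 10, 15, 24, 24]
b0, b1 = -11.4, 0.92   # least squares estimates from the worked example

fitted = [b0 + b1 * x for x in xs]               # 7.0, 11.6, 16.2, 20.8, 25.4
residuals = [y - f for y, f in zip(ys, fitted)]  # 1.0, -1.6, -1.2, 3.2, -1.4
ssr = sum(e ** 2 for e in residuals)             # 17.2
```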

Least Squares Regression

The general linear model is

    yi = β0 + β1xi + β2wi + ... + βK−1zi + εi ;  i = 1, ..., n

where
* xi, wi, ..., zi represent each of K − 1 regressors
* we have K parameters in total to estimate with n observations on
  each variable
* β0 is an intercept term
* each of β1, ..., βK−1 is a slope term

Least Squares Assumptions

In the general linear model
* The error has zero mean, E(εi) = 0
* The error has a constant variance, var(εi) = σ²
* The error in any one observation is independent of the errors in all
  other observations
* The regressors are non-random (or at least independent of the errors)
* The regressors are not linearly related to each other
* The error εi has a normal distribution

Least Squares Regression

Because the parameters are estimated using random data, they are
themselves random

Given our assumptions, Least Squares estimates are BLU
* Best (minimum variance) of all Linear Unbiased estimators

Least Squares Regression

Given the Least Squares Assumptions, we can describe the random nature
of our estimates

Let bk be the estimate of βk in

    yi = β0 + β1xi + β2wi + ... + βK−1zi + εi ;  i = 1, ..., n

Then

    bk ~ N(βk, var(bk))

The (estimated) standard deviation of bk is called the standard error

Least Squares Regression

R² tells us the proportion of the variation in y explained by the
regression; R² lies in the range [0, 1]

    R² = 1 − SSR / Σ(yi − ȳ)²

An estimator of var(ε) = σ² is

    s² = SSR / (n − K)

Inference

* Specification analysis
* Regression coefficients
  * Confidence intervals
  * Hypothesis testing
* ANOVA analysis
* General restrictions
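For the coin-tossing example (SSR = 17.2, Σ(yi − ȳ)² = 228.8, n = 5, K = 2), R² and s² can be computed directly:

```python
ys = [8, 10, 15, 24, 24]
ybar = sum(ys) / len(ys)
tss = sum((y - ybar) ** 2 for y in ys)  # total sum of squares = 228.8

ssr = 17.2          # residual sum of squares from the fitted line
n, K = 5, 2         # 5 observations, 2 parameters (b0 and b1)

r2 = 1 - ssr / tss  # ~0.925: the line explains ~92.5% of the variation
s2 = ssr / (n - K)  # ~5.73: estimate of the error variance sigma^2
```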

Specification Analysis

Over-specification
* Including irrelevant regressors in our model will reduce estimation,
  forecasting and inference efficiency

Under-specification
* Excluding relevant regressors from our model will induce biased
  estimation, inference and forecasts

Specification Analysis

Signs of over-specification
* Insignificant regressors (i.e. estimated coefficients not
  significantly different from zero)

Signs of under-specification
* The signs of coefficients contradict economic theory
* Low R²
* Patterns in residuals

Specification Analysis

R² will always increase when regressors are added to a model, even if
the extra regressors are irrelevant

Adjusted R² allows for the loss in efficiency of including extra
regressors

    Adjusted R² = 1 − [SSR / (n − K)] / [Σ(yi − ȳ)² / (n − 1)]

Inference on Coefficients

Let bk be the estimate of βk in

    yi = β0 + β1xi + β2wi + ... + βK−1zi + εi ;  i = 1, ..., n

with standard error denoted se(bk)
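Adjusted R² for the same coins example; note it is lower than the plain R² of about 0.925 because it charges for the parameters used:

```python
ssr, tss = 17.2, 228.8  # residual and total sums of squares (coins example)
n, K = 5, 2             # observations and estimated parameters

adj_r2 = 1 - (ssr / (n - K)) / (tss / (n - 1))  # ~0.900
```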

Confidence Intervals

A (1 − α) confidence interval for βk is given by

    Pr(bk − c·se(bk) < βk < bk + c·se(bk)) = 1 − α

where c is the α/2 critical value from t(n−K) tables

e.g. for a 95% confidence interval with n − K = 22, c = 2.074, since
Pr(t22 > 2.074) = 2.5%

Hypothesis Testing

A size α test of the null hypothesis

    H0: βk = r  against  HA: βk ≠ r

is carried out by calculating the test statistic

    u = (bk − r) / se(bk)
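Both the interval and the test statistic are simple arithmetic once c is read from the tables; a sketch using the slide's c = 2.074 (n − K = 22) and hypothetical values bk = 0.5, se(bk) = 0.2:

```python
bk, se = 0.5, 0.2   # hypothetical coefficient estimate and standard error
c = 2.074           # t(22) critical value for a 95% interval (from tables)

lower = bk - c * se  # 0.0852
upper = bk + c * se  # 0.9148

# Test H0: beta_k = 0 against HA: beta_k != 0
u = (bk - 0) / se    # 2.5, which exceeds c, so reject H0 at the 5% level
```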

Hypothesis Testing

* If u ∈ R0 = {u: |u| < c} we accept H0
* If u ∈ RA = {u: |u| > c} we accept HA with (1 − α) confidence
* c is the α/2 value from t(n−K) tables

ANOVA Analysis

ANOVA analysis allows us to test the total explanatory power of the
model
* H0: the dependent variable is not significantly related to any of
  the regressors, i.e. βk = 0 for all k = 1, ..., K−1
* HA: the dependent variable is significantly related to at least one
  regressor, i.e. βk ≠ 0 for at least one k = 1, ..., K−1

ANOVA Analysis

The test statistic may be calculated

    u = [(n − K) / (K − 1)] · R² / (1 − R²)
      = [(n − K) / (K − 1)] · Regression Sum of Squares / SSR

* If u ∈ R0 = {u: u < c} we accept H0
* If u ∈ RA = {u: u > c} we accept HA with (1 − α) confidence
* c is the critical value from F(K−1, n−K) tables

General Restrictions

Testing some of the coefficients
* Estimate the model with all regressors included
  * Denote the Residual Sum of Squares SSR1 and the number of
    coefficients K1
* Re-estimate the model excluding the regressors you want to test for
  significance
  * Denote the Residual Sum of Squares SSR2 and the number of
    coefficients K2
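For the simple coins regression (R² ≈ 0.925, n = 5, K = 2) the ANOVA statistic works out as follows, and the two forms of the formula agree:

```python
ssr, tss = 17.2, 228.8  # sums of squares from the coins example
n, K = 5, 2

r2 = 1 - ssr / tss
u = (n - K) / (K - 1) * r2 / (1 - r2)     # ~36.9

# Equivalent form via the regression sum of squares
reg_ss = tss - ssr
u_alt = (n - K) / (K - 1) * reg_ss / ssr  # same value
```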

General Restrictions

The test statistic may be calculated using

    u = [(n − K1) / (K1 − K2)] · (SSR2 − SSR1) / SSR1

* If u ∈ R0 = {u: u < c} we accept H0
* If u ∈ RA = {u: u > c} we accept HA with (1 − α) confidence
* c is the critical value from F(K1 − K2, n − K1) tables

Dynamics

In time series analysis, we allow for variables leading or lagging
each other using dynamics
* Let yt denote UK Real GDP measured at time t
* Let rt denote UK interest rates measured at time t
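The restriction test is again simple arithmetic; a sketch with hypothetical sums of squares (full model SSR1 = 100 with K1 = 4 coefficients, restricted model SSR2 = 120 with K2 = 2, n = 30):

```python
def restriction_f(n, k1, k2, ssr1, ssr2):
    """F statistic for testing the K1 - K2 excluded regressors."""
    return (n - k1) / (k1 - k2) * (ssr2 - ssr1) / ssr1

# Hypothetical values for illustration
u = restriction_f(n=30, k1=4, k2=2, ssr1=100.0, ssr2=120.0)  # 2.6
# Compare u against the critical value from F(K1 - K2, n - K1) tables.
```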

Dynamics

It is argued by the Bank of England that a change in r will not impact
on y for 18 months. If the data is collected quarterly, this is six
quarters

Thus, instead of regressing yt on rt we should regress it on rt−6

Dynamics

Lags are easily created in our data set by shifting the observations
through time

    t       UK GDP (£M) = yt    yt−1     yt−2     yt−3
    Mar-90  132951
    Jun-90  132552              132951
    Sep-90  139408              132552   132951
    Dec-90  146207              139408   132552   132951
    Mar-91  136716              146207   139408   132552
    Jun-91  139950              136716   146207   139408
    Sep-91  146830              139950   136716   146207
    Dec-91  152178              146830   139950   136716
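The lag table amounts to shifting the series through time; a minimal sketch using the GDP figures from the table:

```python
gdp = [132951, 132552, 139408, 146207, 136716, 139950, 146830, 152178]

def lag(series, k):
    """Shift a series back k periods; the first k values are missing."""
    return [None] * k + series[:-k]

y_lag1 = lag(gdp, 1)
y_lag2 = lag(gdp, 2)
# e.g. in the Dec-90 row (index 3), y_t = 146207 while y_{t-1} = 139408
```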

Specification Analysis

Stepwise Forward Regression
* Add variables to the model one at a time in order of increasing R²
  until the additional variable fails to increase adjusted R²

Stepwise Backward Regression
* Start with all potential regressors in the model. Delete variables
  one at a time in order of increasing significance until the next
  deletion fails to increase adjusted R²

Specification Analysis

Note that neither of these methods is guaranteed to give us the
optimal forecasting model

In fact they may well indicate different models
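Forward stepwise selection can be sketched as a greedy loop. Here the fit criterion is abstracted into a `score` callable (e.g. the adjusted R² of the model fitted with a given subset of regressors); that callable is an assumption for illustration, not part of the slides:

```python
def forward_select(candidates, score):
    """Greedy forward stepwise selection: repeatedly add the candidate
    regressor that most improves the score; stop when no addition helps."""
    chosen = []
    best = score(chosen)
    remaining = list(candidates)
    while remaining:
        trial_best, pick = max((score(chosen + [c]), c) for c in remaining)
        if trial_best <= best:
            break  # the next addition fails to improve the criterion
        chosen.append(pick)
        remaining.remove(pick)
        best = trial_best
    return chosen
```

Backward stepwise is the mirror image: start from all candidates and greedily delete regressors while the criterion keeps improving.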

Prediction

* The best in-sample fit does not guarantee the optimal forecasting
  performance
* As a general rule of thumb, simpler models with fewer regressors
  tend to outperform models with more regressors
* Keep it simple (but not too simple)
