ECON2206/ECON3209
Slides02
Motivation
Example 1. Ceteris paribus effect of fertiliser on soybean yield
$yield = \beta_0 + \beta_1 ferti + u$.
Example 2. Ceteris paribus effect of education on wage
$wage = \beta_0 + \beta_1 educ + u$.
In general,
$y = \beta_0 + \beta_1 x + u$,
where u represents factors other than x that affect y.
We are interested in
explaining y in terms of x,
how y responds to changes in x,
holding other factors fixed.
$$y = \beta_0 + \beta_1 x + u$$
$$y + \Delta y = \beta_0 + \beta_1 (x + \Delta x) + u + \Delta u$$
Subtracting the first equation from the second gives $\Delta y = \beta_1 \Delta x + \Delta u$. Holding other factors fixed ($\Delta u = 0$), $\Delta y = \beta_1 \Delta x$: the slope $\beta_1$ is the ceteris paribus effect of $x$ on $y$.
[Figure: the population regression line $E(y|x) = \beta_0 + \beta_1 x$ with the conditional distributions of $y$ (equivalently, of $u$) at $x = x_1, x_2, x_3$; e.g. $E(y|x = x_3) = \beta_0 + \beta_1 x_3$.]
Observations on (x, y)
A random sample is a set of independent observations on $(x, y)$, i.e., $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$.
At observation level, the model may be written as
$$y_i = \beta_0 + \beta_1 x_i + u_i, \quad i = 1, 2, \ldots, n,$$
where i is the observation index.
Collectively,
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix},$$
or, in matrix notation, $Y = XB + U$.
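As a concrete illustration (not from the slides), a minimal numpy sketch of the matrix form; the values of $B$ and the error distribution are made up:

```python
import numpy as np

# A tiny simulated instance of Y = X B + U (values of B are made up).
rng = np.random.default_rng(0)
n = 5
B = np.array([1.0, 0.5])              # hypothetical (beta_0, beta_1)

x = rng.uniform(0, 10, size=n)        # n observations on x
X = np.column_stack([np.ones(n), x])  # design matrix: column of ones, then x
U = rng.normal(0, 1, size=n)          # error terms u_1, ..., u_n
Y = X @ B + U                         # one row per observation

print(X.shape, Y.shape)               # (5, 2) (5,)
```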
OLS estimation
Over the sample $i = 1, 2, \ldots, n$, the sum of squared residuals is
$$SSR(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2,$$
which at the estimates equals $\sum_{i=1}^{n} \hat u_i^2$.
$(\hat\beta_0, \hat\beta_1)$ = the minimiser of SSR: choose $(\beta_0, \beta_1)$ to minimise SSR.
The first order conditions lead to
$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0, \qquad \text{(mean residual = 0)}$$
$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)\, x_i = 0. \qquad \text{(sample covariance of residual and } x \text{ = 0)}$$
Solving the first order conditions gives
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x,$$
where
$$\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i,$$
provided that $\sum_{i=1}^{n} (x_i - \bar x)^2 \neq 0$ (the $x_i$ are not all the same value).
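A minimal numpy sketch of these formulas on simulated data; the true coefficients (1.0, 0.5) and all variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)  # true (beta_0, beta_1) = (1.0, 0.5)

xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta0_hat = ybar - beta1_hat * xbar

print(beta0_hat, beta1_hat)                   # close to 1.0 and 0.5
```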
Under the zero conditional mean (ZCM) assumption $E(u|x) = 0$, the population regression function (PRF) is $E(y|x) = \beta_0 + \beta_1 x$, and $y = E(y|x) + u$.
At the observation level, $y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat u_i$.
[Figure: scatter of the data points $(x_i, y_i)$ with the sample regression line (SRF) $\hat\beta_0 + \hat\beta_1 x$ and the population regression line (PRF) $\beta_0 + \beta_1 x$; the residual $\hat u_i$ is the vertical distance from $(x_i, y_i)$ to the SRF.]
OLS example
[Figure: scatter of wage against educ with the fitted regression line $\widehat{wage} = -0.90 + 0.54\, educ$.]
Interpretation
Slope 0.54: each additional year of schooling increases the predicted wage by $0.54 (e.g. for educ = 12, the fitted wage is -0.90 + 0.54 × 12 = 5.58).
Intercept -0.90: the fitted wage of a person with educ = 0? Not meaningful here: the SRF does poorly at low levels of education.
Properties of OLS
The first order conditions:
$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0, \qquad \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)\, x_i = 0$$
imply that
the sum of the residuals is zero,
the sample covariance of x and the residuals is zero,
the mean point $(\bar x, \bar y)$ is always on the SRF (or OLS regression line); all three are checked numerically in the sketch below.
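A sketch on simulated data (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar
u_hat = y - b0 - b1 * x          # OLS residuals

print(u_hat.sum())               # ~0: residuals sum to zero
print(np.sum(u_hat * x))         # ~0: sample covariance of x and residuals is zero
print(b0 + b1 * xbar - ybar)     # ~0: (xbar, ybar) lies on the SRF
```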
Sums of squares
Each $y_i$ may be decomposed into $y_i = \hat y_i + \hat u_i$.
Measure variations from $\bar y$:
Total sum of squares (total variation in $y_i$): $SST = \sum_{i=1}^{n} (y_i - \bar y)^2$,
Explained sum of squares: $SSE = \sum_{i=1}^{n} (\hat y_i - \bar y)^2$,
Residual sum of squares: $SSR = \sum_{i=1}^{n} \hat u_i^2$.
Goodness of fit: $R^2 = SSE/SST = 1 - SSR/SST$ is the fraction of the sample variation in $y$ explained by $x$. Be cautious with $R^2$ when evaluating regression models.
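A sketch verifying, on simulated data, the decomposition $SST = SSE + SSR$ and the two equivalent expressions for $R^2$ (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar
y_hat = b0 + b1 * x              # fitted values
u_hat = y - y_hat                # residuals

SST = np.sum((y - ybar) ** 2)
SSE = np.sum((y_hat - ybar) ** 2)
SSR = np.sum(u_hat ** 2)

print(SST - (SSE + SSR))         # ~0: SST = SSE + SSR
print(SSE / SST, 1 - SSR / SST)  # two equivalent expressions for R^2
```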
[Figure: scatter of lwage against educ with the fitted regression line; $R^2 = 0.186$.]
OLS estimators
A random sample consists of independent draws from the same population and is therefore random; the OLS estimators, as functions of the random sample, are random variables.
A data set is a realisation of the random sample.
Substituting $y_i = \beta_0 + \beta_1 x_i + u_i$ into the OLS formula gives
$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)(u_i - \bar u)}{\sum_{i=1}^{n} (x_i - \bar x)^2},$$
so the sampling variation in $\hat\beta_1$ comes from the errors $u_i$; under the ZCM assumption (with random sampling), $E(\hat\beta_1) = \beta_1$, i.e. OLS is unbiased.
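A Monte Carlo sketch of what this decomposition implies: when the errors satisfy the ZCM assumption, $\hat\beta_1$ centres on $\beta_1$ across repeated samples (the true values and sample sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1 = 1.0, 0.5              # illustrative true parameters
n, reps = 50, 5000

b1_draws = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, size=n)
    u = rng.normal(0, 1, size=n)     # errors drawn with E(u|x) = 0 by construction
    y = beta0 + beta1 * x + u
    xbar = x.mean()
    b1_draws[r] = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)

print(b1_draws.mean())               # ~0.5: beta1_hat centres on beta1 (unbiasedness)
```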
Theorem 2.2
Under SLR1 to SLR5, the variances of $(\hat\beta_0, \hat\beta_1)$ are:
$$Var(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad Var(\hat\beta_0) = \frac{\sigma^2\, n^{-1} \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}.$$
Estimation of $\sigma^2$
As the residuals approximate the errors $u_i$, the estimator of $\sigma^2$ is
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} \hat u_i^2}{n - 2} = \frac{SSR}{n - 2},$$
where the 2 in $n - 2$ is the number of estimated coefficients.
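A sketch combining $\hat\sigma^2$ with the Theorem 2.2 formulas to form standard errors on simulated data; "Root MSE" is Stata's label for the standard error of the regression:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n)

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar
u_hat = y - b0 - b1 * x

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)            # SSR / (n - 2)
Sxx = np.sum((x - xbar) ** 2)
se_b1 = np.sqrt(sigma2_hat / Sxx)                    # from Var(beta1_hat)
se_b0 = np.sqrt(sigma2_hat * np.mean(x ** 2) / Sxx)  # from Var(beta0_hat)

print(np.sqrt(sigma2_hat))   # standard error of the regression (Stata's Root MSE)
print(se_b0, se_b1)
```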
OLS in STATA
[Figure: Stata output from the regress command, annotated with the standard error of regression (Root MSE) and the SSR.]
Summary
What is a simple regression model?
What is the ZCM assumption? Why is it crucial for
model interpretation and OLS being unbiased?
What is the OLS estimation principle?
What are PRF, SRF, error term and residual?
How is R-squared related to SSR?
Can a simple linear regression model describe a nonlinear relationship between x and y?
What are Assumptions SLR1 to SLR5? Why do we
need to understand them?
What are the statistical properties of OLS estimators?
How do you run OLS in STATA? regress y x