OLS Proofs


Alexandros Theloudis
Department of Economics, University College London, Gower Street, WC1E 6BT London

November 7, 2015

1 Deriving the OLS estimators

We are interested in estimating the relationship between $x$ and $y$ in the population. We assume that there is a linear association between the two (Assumption SLR.1):

$$y = \beta_0 + \beta_1 x + u \tag{1}$$

where $\beta_0$ and $\beta_1$ are the parameters of interest and $u$ is an error term. The population is not directly observable as it is assumed to be infinite; however, we can still learn a lot about the aforementioned relation by hinging on random samples from the population. A random sample can be denoted as $\{(x_i, y_i) : i = 1, \ldots, n\}$; this notation implies that we have $n$ observations $(x_i, y_i)$, where the $(x_i, y_i)$ are independently drawn from the same aforementioned population (we say that the $(x_i, y_i)$ are i.i.d. -- independent and identically distributed). This is Assumption SLR.2.

The error terms are assumed to have zero conditional mean: conditional on $x$, the expected value of $u$ in the population is 0. This can be written as $E(u|x) = 0$; it implies that no $x$ conveys any information about $u$ (Assumption SLR.4).
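To fix ideas, here is a minimal simulation sketch of such a population model. All parameter values and distributions below are hypothetical choices, not part of the notes; any setup with $E[u|x] = 0$ would do.

import numpy as np

rng = np.random.default_rng(0)
n = 500
beta0, beta1 = 2.0, 0.5                      # hypothetical population parameters

x = rng.normal(loc=10.0, scale=3.0, size=n)  # i.i.d. draws (SLR.2)
u = rng.normal(loc=0.0, scale=1.0, size=n)   # u drawn independently of x, so E[u|x] = 0 (SLR.4)
y = beta0 + beta1 * x + u                    # SLR.1: linear population model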

The question is: how can we get $\beta_0$ and $\beta_1$? Well, we can't, unless we have an infinite amount of information. However, given a random sample of observations, the OLS estimates $\hat\beta_0$ and $\hat\beta_1$ are BLUE: Best Linear Unbiased Estimates of the parameters of interest $\beta_0$ and $\beta_1$. How do we get $\hat\beta_0$ and $\hat\beta_1$?

There are two equivalent ways to obtain $\hat\beta_0$ and $\hat\beta_1$ given the 3 assumptions above. The first way is the Method of Moments; the second is the minimization of the Sum of Squared Residuals.

$E[u|x] = 0$ implies that $E[u] = 0$ and $E[xu] = 0$ (from the Law of Iterated Expectations: $E(u) = E[E(u|x)] = E[0] = 0$; also $E(xu) = E[E(xu|x)] = E[x\,E(u|x)] = E[x \cdot 0] = 0$). From (1) we can write:

$$E[u] = E[y - \beta_0 - \beta_1 x] = 0$$
$$E[xu] = E[x(y - \beta_0 - \beta_1 x)] = 0$$

The sample analogues of these two population moment conditions are:

$$\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$
$$\frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$

where the summation over i implies summation over all sample observations.

Take the first sample moment condition and solve for $\hat\beta_0$:

$$\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$
$$\frac{1}{n}\sum_{i=1}^{n} y_i - \hat\beta_1 \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\, n \hat\beta_0$$
$$\frac{1}{n}\sum_{i=1}^{n} y_i - \hat\beta_1 \frac{1}{n}\sum_{i=1}^{n} x_i = \hat\beta_0$$
$$\bar{y} - \hat\beta_1 \bar{x} = \hat\beta_0 \tag{2}$$

Above we make use of the formula for calculating a sample average: for any generic variable $z$: $\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i$.

Now take the second sample moment condition and substitute (2) for $\hat\beta_0$:

$$\frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$
$$\stackrel{(2)}{\Rightarrow} \frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - (\bar{y} - \hat\beta_1 \bar{x}) - \hat\beta_1 x_i\right) = 0$$
$$\sum_{i=1}^{n}\left(x_i y_i - x_i \bar{y} + \hat\beta_1 x_i \bar{x} - \hat\beta_1 x_i^2\right) = 0$$
$$\sum_{i=1}^{n}\left[x_i (y_i - \bar{y}) - \hat\beta_1 x_i (x_i - \bar{x})\right] = 0$$
$$\hat\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} x_i (y_i - \bar{y})$$
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} \tag{3}$$

Expression (3) is the OLS estimator $\hat\beta_1$ if $\sum_{i=1}^{n} x_i (x_i - \bar{x}) \neq 0$. This is guaranteed by Assumption SLR.3. Substituting (3) into (2) we can get the OLS estimator $\hat\beta_0$. Having obtained $\hat\beta_0$ and $\hat\beta_1$ we can then get:

Predicted/fitted values: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$
Residuals: $\hat{u}_i = y_i - \hat{y}_i$
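As a quick numerical sketch, equations (2)-(3) and the fitted values and residuals can be computed directly with NumPy. The simulated sample below is hypothetical; the final line checks that the two sample moment conditions hold at the estimates (up to rounding).

import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(10.0, 3.0, n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n)

xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum(x * (y - ybar)) / np.sum(x * (x - xbar))  # equation (3)
beta0_hat = ybar - beta1_hat * xbar                          # equation (2)

y_fit = beta0_hat + beta1_hat * x    # predicted/fitted values
u_hat = y - y_fit                    # residuals

# The sample moment conditions hold by construction (up to rounding):
print(np.isclose(u_hat.sum(), 0.0), np.isclose((x * u_hat).sum(), 0.0))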

An alternative (but equivalent) way to obtain the OLS estimators is by minimizing the sum of squared residuals (SSR hereafter). What really is a residual $\hat{u}_i$? Think about observation $i$ with $(x_i, y_i)$; for $x_i$ the aforementioned OLS regression line predicts a $y$-value of $\hat{y}_i$. How far is $\hat{y}_i$ from the actual $y_i$? This information is given by $\hat{u}_i$! In other words, $\hat{u}_i$ is the vertical distance between $y_i$ and $\hat{y}_i$; it can be either positive or negative depending on whether the actual point lies above or below the OLS regression line.

If we want our OLS regression line to fit the data well, then we must minimize the distances $\hat{u}_i$, $\forall i$. How do we do that? One way would be to minimize the sum of residuals $\sum_{i=1}^{n} \hat{u}_i$. But that sum is by definition equal to 0. Instead we can minimize the SSR: the smaller this sum is, the closer the OLS regression line is to our sample observations.

Analytically:

$$\min \sum_{i=1}^{n} \hat{u}_i^2 \;\equiv\; \min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \;\equiv\; \min \sum_{i=1}^{n} \left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right)^2$$

We need to find $\hat\beta_0$ and $\hat\beta_1$ so that the above SSR is minimized. Assuming that the above function is well behaved, we will derive the first order conditions with respect to $\hat\beta_0$ and $\hat\beta_1$ and set them equal to 0. For convenience we should open up the above expression:

$$\sum_{i=1}^{n} \left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right)^2 = \sum_{i=1}^{n} \left[y_i^2 + (\hat\beta_0 + \hat\beta_1 x_i)^2 - 2 y_i (\hat\beta_0 + \hat\beta_1 x_i)\right]$$

{with respect to $\hat\beta_0$}:

$$2 \sum_{i=1}^{n} \left(\hat\beta_0 + \hat\beta_1 x_i\right) - 2 \sum_{i=1}^{n} y_i = 0$$
$$n \hat\beta_0 + \hat\beta_1 n \bar{x} = n \bar{y}$$
$$\bar{y} - \hat\beta_1 \bar{x} = \hat\beta_0$$

Notice that we have now actually reached equation (2) above.

{with respect to $\hat\beta_1$}:

$$2 \sum_{i=1}^{n} \left(\hat\beta_0 + \hat\beta_1 x_i\right) x_i - 2 \sum_{i=1}^{n} y_i x_i = 0$$
$$\stackrel{(2)}{\Rightarrow} \sum_{i=1}^{n} \left(\bar{y} - \hat\beta_1 \bar{x} + \hat\beta_1 x_i\right) x_i = \sum_{i=1}^{n} y_i x_i$$
$$\sum_{i=1}^{n} \bar{y} x_i + \hat\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} y_i x_i$$
$$\hat\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} x_i (y_i - \bar{y})$$
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}$$

Now notice that this is actually the equation we found for $\hat\beta_1$ in (3). It should now be obvious that the two ways of obtaining the OLS estimators are equivalent. As before, to obtain an equation for $\hat\beta_0$, one only needs to replace $\hat\beta_1$ in (2) with the expression in (3).
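A numerical check of this equivalence, assuming NumPy and SciPy are available (data-generating values hypothetical): minimizing the SSR with a generic optimizer recovers the method-of-moments formulas.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(10.0, 3.0, 500)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 500)

ssr = lambda b: np.sum((y - b[0] - b[1] * x) ** 2)   # sum of squared residuals
b_num = minimize(ssr, x0=np.zeros(2)).x              # numerical minimizer of the SSR

xbar, ybar = x.mean(), y.mean()
b1 = np.sum(x * (y - ybar)) / np.sum(x * (x - xbar)) # equation (3)
b0 = ybar - b1 * xbar                                # equation (2)

print(np.allclose(b_num, [b0, b1], atol=1e-4))       # the two solutions agree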

2 Unbiasedness

Every time we draw a new random sample from the population, the estimates $\hat\beta_0$ and $\hat\beta_1$ will be different. The question we are asking now is: does the expected value of these estimates equal the unknown true values of $\beta_0$ and $\beta_1$ in the population or not? The answer is yes. It will turn out that $E[\hat\beta_0] = \beta_0$ and $E[\hat\beta_1] = \beta_1$ and thus we will say that the OLS estimates are unbiased.

2.1 Proof for $\hat\beta_1$

Following the lecture notes, we will set $s_x^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$. From (3) we have:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}$$

Notice that:

$$\sum_{i=1}^{n} \bar{x}(x_i - \bar{x}) = \sum_{i=1}^{n} x_i \bar{x} - \sum_{i=1}^{n} \bar{x}\bar{x} = n\bar{x}\bar{x} - n\bar{x}\bar{x} = 0$$

Similarly we can show that $\sum_{i=1}^{n} \bar{y}(x_i - \bar{x}) = 0$ and $\sum_{i=1}^{n} \bar{x}(y_i - \bar{y}) = 0$. Working on (3) we get:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y}) - \sum_{i=1}^{n} \bar{x}(y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x}) - \sum_{i=1}^{n} \bar{x}(x_i - \bar{x})}$$
$$= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})}$$
$$= \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$= \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{s_x^2}$$

Substituting in from (1) for $y_i$:

$$\hat\beta_1 \stackrel{(1)}{=} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{s_x^2}$$
$$= \beta_0 \underbrace{\frac{\sum_{i=1}^{n} (x_i - \bar{x})}{s_x^2}}_{=0} + \beta_1 \underbrace{\frac{\sum_{i=1}^{n} (x_i - \bar{x}) x_i}{s_x^2}}_{=1} + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2}$$
$$= \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2}$$

Now let's take expectations on both sides conditional on the data $x_1, x_2, \ldots, x_n$ we have available:

$$E[\hat\beta_1 | x_1, \ldots, x_n] = E\left[\beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$
$$= E[\beta_1 | x_1, \ldots, x_n] + E\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$
$$= \beta_1 + E\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$
$$= \beta_1 + s_x^{-2} \sum_{i=1}^{n} (x_i - \bar{x})\, E[u_i | x_1, \ldots, x_n]$$

(because conditional on $x_1, \ldots, x_n$ the expected value of any function of the $x_i$ is the function itself)

$$= \beta_1 + s_x^{-2} \sum_{i=1}^{n} (x_i - \bar{x})\, E[u_i | x_i]$$

(because under random sampling, SLR.2, $u_i$ is independent of the other observations' regressors)

$$= \beta_1$$

(because of SLR.4, $E[u_i | x_i] = 0$). We are not done though. We have to prove that the unconditional expectation of $\hat\beta_1$ is equal to $\beta_1$. By the Law of Iterated Expectations:

$$E\left[\hat\beta_1\right] = E\left[E[\hat\beta_1 | x_1, \ldots, x_n]\right] = E[\beta_1] = \beta_1 \tag{4}$$

2.2 Proof for $\hat\beta_0$

From (2) we have that

$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$$

Notice that from (1) $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}$, so that (2) now becomes:

$$\hat\beta_0 = \beta_0 + \beta_1 \bar{x} + \bar{u} - \hat\beta_1 \bar{x} = \beta_0 + (\beta_1 - \hat\beta_1)\bar{x} + \bar{u}$$

Now let's take expectations on both sides conditional on the data $x_1, x_2, \ldots, x_n$ we have available:

$$E[\hat\beta_0 | x_1, \ldots, x_n] = E\left[\beta_0 + (\beta_1 - \hat\beta_1)\bar{x} + \bar{u} \,\big|\, x_1, \ldots, x_n\right]$$
$$= E[\beta_0 | x_1, \ldots, x_n] + E\left[(\beta_1 - \hat\beta_1)\bar{x} \,\big|\, x_1, \ldots, x_n\right] + E[\bar{u} | x_1, \ldots, x_n]$$
$$= \beta_0 + \bar{x}\, E\left[(\beta_1 - \hat\beta_1) \,\big|\, x_1, \ldots, x_n\right] + E[\bar{u} | x_1, \ldots, x_n]$$

(because conditional on the data, $\bar{x}$ is non-random)

$$= \beta_0 + E\left[\frac{1}{n}\sum_{i=1}^{n} u_i \,\Big|\, x_1, \ldots, x_n\right]$$

(because it has already been proved that $E[\hat\beta_1 | x_1, \ldots, x_n] = \beta_1$)

$$= \beta_0 + \frac{1}{n}\sum_{i=1}^{n} E[u_i | x_i]$$
$$E[\hat\beta_0 | x_1, \ldots, x_n] = \beta_0 \tag{5}$$

(from SLR.4). As before, by the Law of Iterated Expectations one can show that $E[\hat\beta_0] = \beta_0$.
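A Monte Carlo sketch of these results (all simulation settings are hypothetical choices): averaging the OLS estimates over many independent random samples should come out close to the true $(\beta_0, \beta_1)$.

import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, n, reps = 2.0, 0.5, 100, 20000

draws = np.empty((reps, 2))
for r in range(reps):
    x = rng.normal(10.0, 3.0, n)
    y = beta0 + beta1 * x + rng.normal(0.0, 1.0, n)
    # beta1_hat in the form derived in Section 2.1: sum (x_i - xbar) y_i / s_x^2
    b1 = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
    draws[r] = [y.mean() - b1 * x.mean(), b1]   # (beta0_hat, beta1_hat)

print(draws.mean(axis=0))   # close to [2.0, 0.5]: the estimators are unbiased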

3 Variance of the OLS estimators

Here we impose an additional assumption:

$$Var[u_i | x_i] = \sigma^2, \;\forall i$$

This is Assumption SLR.5 according to the lecture notes. Recall from before that:

$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2}$$

We are now interested in the conditional variance $Var\left[\hat\beta_1 | x_1, \ldots, x_n\right]$:

$$Var\left[\hat\beta_1 | x_1, \ldots, x_n\right] = Var\left[\beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$
$$= Var[\beta_1 | x_1, \ldots, x_n] + Var\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right] + 2\, Cov\left[\beta_1, \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$

(because by the properties of the variance, $Var(a + b) = Var(a) + Var(b) + 2Cov(a, b)$)

$$= Var\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} \,\Big|\, x_1, \ldots, x_n\right]$$

(because $\beta_1$ is just a number, so it doesn't vary or covary with any random variable)

$$= \left(\frac{1}{s_x^2}\right)^2 \sum_{i=1}^{n} (x_i - \bar{x})^2\, Var[u_i | x_1, \ldots, x_n]$$

(because the $u_i$ are independent across observations under random sampling, and by the properties of the variance $Var(c \cdot z) = c^2 Var(z)$ where $c$ is a constant and $z$ is a random variable)

$$= \left(\frac{1}{s_x^2}\right)^2 \sum_{i=1}^{n} (x_i - \bar{x})^2\, Var[u_i | x_i]$$
$$= \left(\frac{1}{s_x^2}\right)^2 \sum_{i=1}^{n} (x_i - \bar{x})^2\, \sigma^2$$
$$Var\left[\hat\beta_1 | x_1, \ldots, x_n\right] = \sigma^2 \frac{1}{s_x^2} \tag{6}$$

as $\sum_{i=1}^{n} (x_i - \bar{x})^2 = s_x^2$. The derivation of the conditional variance of $\hat\beta_0$ follows the same logic. Can you derive it?
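A sketch verifying (6) by simulation (settings hypothetical): holding the regressors $x_1, \ldots, x_n$ fixed across replications, so that we effectively condition on them, the sampling variance of $\hat\beta_1$ should match $\sigma^2 / s_x^2$.

import numpy as np

rng = np.random.default_rng(0)
n, sigma, beta0, beta1 = 100, 1.5, 2.0, 0.5
x = rng.normal(10.0, 3.0, n)           # drawn once, then held fixed
sx2 = np.sum((x - x.mean()) ** 2)      # s_x^2

b1_draws = np.empty(50000)
for r in range(b1_draws.size):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)   # SLR.5: Var[u|x] = sigma^2
    b1_draws[r] = np.sum((x - x.mean()) * y) / sx2

print(b1_draws.var(), sigma**2 / sx2)  # the two numbers should be close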

4 Goodness-of-Fit ($R^2$)

How much of the variation in $y$ is explained by variation in $x$? If we are interested in that question, we are after the coefficient of determination $R^2$. $R^2$ gives us a sense of the goodness-of-fit of our regression; i.e. it informs us about what fraction of the variation in $y$ is due to variation in $x$.

What do we mean by saying variation in $y$? The squared distance of $y_i$ from the sample mean $\bar{y}$ informs us about the spread of $y_i$ and is denoted by $\sum_{i=1}^{n} (y_i - \bar{y})^2$. Notice that if we divide this expression by $n - 1$ we get the sample variance of $y_i$. For what follows we will work on $\sum_{i=1}^{n} (y_i - \bar{y})^2$. As $y_i = \hat{u}_i + \hat{y}_i$ we can write:

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{u}_i + \hat{y}_i - \bar{y})^2$$
$$= \sum_{i=1}^{n} \left[\hat{u}_i^2 + (\hat{y}_i - \bar{y})^2 + 2\hat{u}_i(\hat{y}_i - \bar{y})\right]$$
$$= \sum_{i=1}^{n} \hat{u}_i^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + 2\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) \tag{7}$$

The last expression in (7) is 0. To see why, we can use $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ and write:

$$\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) = \sum_{i=1}^{n} \hat{u}_i(\hat\beta_0 + \hat\beta_1 x_i - \bar{y})$$
$$= \sum_{i=1}^{n} \hat{u}_i(\hat\beta_0 + \hat\beta_1 x_i) - \sum_{i=1}^{n} \hat{u}_i \bar{y}$$
$$= \hat\beta_0 \sum_{i=1}^{n} \hat{u}_i + \hat\beta_1 \sum_{i=1}^{n} \hat{u}_i x_i - \bar{y} \sum_{i=1}^{n} \hat{u}_i$$
$$= 0$$

where the last line comes from our sample moment conditions $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i x_i = 0$. Hence: $\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) = 0$. Going back to (7) we now have:

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} \hat{u}_i^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
$$SST = SSR + SSE$$

where:

SST: total sum of squares (total variation in $y$)
SSR: sum of squared residuals (unexplained variation in $y$)
SSE: explained sum of squares (variation in $y$ explained by $x$)

Dividing both sides by SST:

$$1 = \frac{SSR}{SST} + \frac{SSE}{SST}$$
$$\frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

$R^2$ is the fraction of sample variation in $y$ that is explained by $x$.
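A numerical sketch of the decomposition and of the two equal expressions for $R^2$ (data-generating values hypothetical):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 3.0, 500)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 500)

b1 = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_fit = b0 + b1 * x
u_hat = y - y_fit

SST = np.sum((y - y.mean()) ** 2)         # total variation in y
SSR = np.sum(u_hat ** 2)                  # unexplained variation
SSE = np.sum((y_fit - y.mean()) ** 2)     # explained variation

print(np.isclose(SST, SSR + SSE))         # the decomposition holds
print(SSE / SST, 1 - SSR / SST)           # two equal expressions for R^2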
