
Copyright 1996

Lawrence C. Marsh

PowerPoint Slides
for
Undergraduate Econometrics
by
Lawrence C. Marsh
To accompany: Undergraduate Econometrics
by R. Carter Hill, William E. Griffiths and George G. Judge
Publisher: John Wiley & Sons, 1997

Chapter 1


The Role of
Econometrics
in Economic Analysis
Copyright 1997 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond
that permitted in Section 117 of the 1976 United States Copyright Act without the express written permission of the
copyright owner is unlawful. Request for further information should be addressed to the Permissions Department,
John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution
or resale. The Publisher assumes no responsibility for errors, omissions, or damages, caused by the use of these
programs or from the use of the information contained herein.


The Role of Econometrics


Using Information:
1. Information from economic theory.
2. Information from economic data.


Understanding Economic Relationships:


federal budget
inflation
money supply
Dow-Jones Stock Index
short-term treasury bills
trade deficit
unemployment
power of labor unions
Federal Reserve Discount Rate
capital gains tax
crime rate
rent control laws


Economic Decisions
To use information effectively:
economic theory + economic data -> economic decisions

Econometrics helps us combine economic theory and economic data.


The Consumption Function


Consumption, c, is some function of income, i :

c = f(i)
For applied econometric analysis
this consumption function must be
specified more precisely.


Demand, qd, for an individual commodity:

qd = f( p, pc, ps, i )

p = own price; pc = price of complements; ps = price of substitutes; i = income

Supply, qs, of an individual commodity:

qs = f( p, pc, pf )

p = own price; pc = price of competitive products; ps = price of substitutes; pf = price of factor inputs


How much?
Listing the variables in an economic relationship is not enough.
For effective policy we must know the amount of change
needed for a policy instrument to bring about the desired
effect:
By how much should the Federal Reserve
raise interest rates to prevent inflation?
By how much can the price of football tickets
be increased and still fill the stadium?


Answering the How Much? question


Need to estimate parameters
that are both:
1. unknown
and
2. unobservable


The Statistical Model


Average or systematic behavior
over many individuals or many firms.
Not a single individual or single firm.
Economists are concerned with the
unemployment rate and not whether
a particular individual gets a job.


The Statistical Model


Actual vs. Predicted Consumption:
Actual = systematic part + random error
Consumption, c, is a function, f, of income, i, with error, e:

c = f(i) + e
Systematic part provides prediction, f(i),
but actual will miss by random error, e.


The Consumption Function


c = f(i) + e
Need to define f(i) in some way.
To make consumption, c,
a linear function of income, i :
$f(i) = \beta_1 + \beta_2 i$
The statistical model then becomes:
$c = \beta_1 + \beta_2 i + e$


The Econometric Model


$y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + e$
Dependent variable, y, is focus of study
(predict or explain changes in dependent variable).
Explanatory variables, X2 and X3, help us explain
observed changes in the dependent variable.


Statistical Models
Controlled (experimental)
vs.
Uncontrolled (observational)
Controlled experiment (pure science) explaining mass, y :
pressure, X2, held constant when varying temperature, X3,
and vice versa.
Uncontrolled experiment (econometrics) explaining consumption, y : price, X2, and income, X3, vary at the same time.


Econometric model:
economic model: economic variables and parameters.
statistical model: sampling process with its parameters.
data: observed values of the variables.


The Practice of Econometrics


Uncertainty regarding an outcome.


Relationships suggested by economic theory.
Assumptions and hypotheses to be specified.
Sampling process including functional form.
Obtaining data for the analysis.
Estimation rule with good statistical properties.
Fit and test model using software package.
Analyze and evaluate implications of the results.
Problems suggest approaches for further research.


Note: the textbook uses the "Skippy" symbol to mark sections with advanced material.

Chapter 2


Some Basic
Probability
Concepts

Random Variable
random variable:
A variable whose value is unknown until it is observed.
The value of a random variable results from an experiment.
The term random variable implies the existence of some
known or unknown probability distribution defined over
the set of all possible values of that variable.
In contrast, an arbitrary variable does not have a
probability distribution associated with its values.


Controlled experiment: values
of explanatory variables are chosen
with great care in accordance with
an appropriate experimental design.
Uncontrolled experiment: values
of explanatory variables consist of
nonexperimental observations over
which the analyst has no control.


Discrete Random Variable


discrete random variable:
A discrete random variable can take only a finite
number of values, that can be counted by using
the positive integers.
Example: Prize money from the following
lottery is a discrete random variable:
first prize: $1,000
second prize: $50
third prize: $5.75
since it has only four possible outcomes
(a finite number; count: 1, 2, 3, 4):
$0.00; $5.75; $50.00; $1,000.00


Continuous Random Variable


continuous random variable:
A continuous random variable can take
any real value (not just whole numbers)
in at least one interval on the real line.
Examples:
Gross national product (GNP)
money supply
interest rates
price of eggs
household income
expenditure on clothing


Dummy Variable

A discrete random variable that is restricted


to two possible values (usually 0 and 1) is
called a dummy variable (also, binary or
indicator variable).
Dummy variables account for qualitative differences:
gender (0=male, 1=female),
race (0=white, 1=nonwhite),
citizenship (0=U.S., 1=not U.S.),
income class (0=poor, 1=rich).


A list of all of the possible values taken


by a discrete random variable along with
their chances of occurring is called a probability
function or probability density function (pdf).
die           x    f(x)
one dot       1    1/6
two dots      2    1/6
three dots    3    1/6
four dots     4    1/6
five dots     5    1/6
six dots      6    1/6


A discrete random variable X


has pdf, f(x), which is the probability
that X takes on the value x.

f(x) = P(X=x)

Therefore,

$0 \le f(x) \le 1$

If X takes on the n values: x1, x2, . . . , xn,


then f(x1) + f(x2)+. . .+f(xn) = 1.


Probability, f(x), for a discrete random


variable, X, can be represented by height:
[Figure: bar chart of f(x), on a vertical axis running from 0 to 0.4, against the number, X, on the Dean's List of three roommates.]


A continuous random variable uses


area under a curve rather than the
height, f(x), to represent probability:
[Figure: density f(x) of per capita income, X, in the United States, with $34,000 and $55,000 marked on the horizontal axis; red area = 0.1324, green area = 0.8676.]

Since a continuous random variable has an
uncountably infinite number of values,
the probability of one occurring is zero.
$P[X = a] = P[a \le X \le a] = 0$
Probability is represented by area.
Height alone has no area.
An interval for X is needed to get
an area under the curve.


The area under a curve is the integral of


the equation that generates the curve:
$P[a < X < b] = \int_a^b f(x)\,dx$

For continuous random variables it is the


integral of f(x), and not f(x) itself, which
defines the area and, therefore, the probability.
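
A minimal numerical sketch of "probability as area" follows. It assumes a standard normal density as the example f(x), which the slides do not specify, and the helper names `std_normal_pdf` and `area_under` are hypothetical. It approximates the integral with the trapezoid rule and shows that a zero-width interval has zero area.

```python
import math

def std_normal_pdf(x):
    """Height of the standard normal density at x (an illustrative choice of f)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def area_under(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * width)
    return total * width

prob_interval = area_under(std_normal_pdf, -1.0, 1.0)  # P[-1 < X < 1]
prob_point = area_under(std_normal_pdf, 1.0, 1.0)      # P[X = 1]: interval of zero width

print(round(prob_interval, 4))  # about 0.6827
print(prob_point)               # 0.0
```

The height f(1) is positive, yet the probability at the single point is zero because the interval has no width.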


Rules of Summation
Rule 1: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \ldots + x_n$

Rule 2: $\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$

Rule 3: $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$

Note that summation is a linear operator


which means it operates term by term.


Rules of Summation (continued)


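Rule 4: $\sum_{i=1}^{n} (a x_i + b y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$

Rule 5: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \ldots + x_n}{n}$

The definition of $\bar{x}$ as given in Rule 5 implies
the following important fact:

$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$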


Rules of Summation (continued)


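Rule 6: $\sum_{i=1}^{n} f(x_i) = f(x_1) + f(x_2) + \ldots + f(x_n)$

Notation: $\sum_{x} f(x_i) = \sum_{i} f(x_i) = \sum_{i=1}^{n} f(x_i)$

Rule 7: $\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) = \sum_{i=1}^{n} [ f(x_i, y_1) + f(x_i, y_2) + \ldots + f(x_i, y_m) ]$

The order of summation does not matter:

$\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} f(x_i, y_j)$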


The Mean of a Random Variable


The mean or arithmetic average of a
random variable is its mathematical
expectation or expected value, EX.


Expected Value


There are two entirely different, but mathematically


equivalent, ways of determining the expected value:

1. Empirically:
The expected value of a random variable, X,
is the average value of the random variable in an
infinite number of repetitions of the experiment.
In other words, draw an infinite number of samples,
and average the values of X that you get.


Expected Value


2. Analytically:
The expected value of a discrete random
variable, X, is determined by weighting all
the possible values of X by the corresponding
probability density function values, f(x), and
summing them up.
In other words:

E[X] = x1f(x1) + x2f(x2) + . . . + xnf(xn)


Empirical vs. Analytical

As sample size goes to infinity, the


empirical and analytical methods
will produce the same value.
In the empirical case when the
sample goes to infinity the values
of X occur with a frequency
equal to the corresponding f(x)
in the analytical expression.


Empirical (sample) mean:


$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
where n is the number of sample observations.
Analytical mean:
$E[X] = \sum_{i=1}^{n} x_i f(x_i)$
where n is the number of possible values of $x_i$.
Notice how the meaning of n changes.


The expected value of X:

$E[X] = \sum_{i=1}^{n} x_i f(x_i)$

The expected value of X-squared:

$E[X^2] = \sum_{i=1}^{n} x_i^2 f(x_i)$

It is important to notice that $f(x_i)$ does not change!

The expected value of X-cubed:

$E[X^3] = \sum_{i=1}^{n} x_i^3 f(x_i)$

$E[X] = 0(.1) + 1(.3) + 2(.3) + 3(.2) + 4(.1)$
$= 1.9$

$E[X^2] = 0^2(.1) + 1^2(.3) + 2^2(.3) + 3^2(.2) + 4^2(.1)$
$= 0 + .3 + 1.2 + 1.8 + 1.6$
$= 4.9$

$E[X^3] = 0^3(.1) + 1^3(.3) + 2^3(.3) + 3^3(.2) + 4^3(.1)$
$= 0 + .3 + 2.4 + 5.4 + 6.4$
$= 14.5$
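
The three expectations on this slide can be checked with a short sketch. The values 0 to 4 and the probabilities .1, .3, .3, .2, .1 are taken from the slide; the helper name `moment` is hypothetical.

```python
# Values of X and their probabilities f(x), as given on the slide.
values = [0, 1, 2, 3, 4]
probs = [0.1, 0.3, 0.3, 0.2, 0.1]

def moment(values, probs, power):
    """E[X**power] for a discrete random variable: sum of x**power * f(x)."""
    return sum((x ** power) * p for x, p in zip(values, probs))

ex1 = moment(values, probs, 1)  # E[X]
ex2 = moment(values, probs, 2)  # E[X^2]
ex3 = moment(values, probs, 3)  # E[X^3]

print(round(ex1, 4), round(ex2, 4), round(ex3, 4))  # 1.9 4.9 14.5
```

Only the power applied to each value changes between the three calls; the probabilities stay the same.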


$E[g(X)] = \sum_{i=1}^{n} g(x_i) f(x_i)$

$g(X) = g_1(X) + g_2(X)$

$E[g(X)] = \sum_{i=1}^{n} [g_1(x_i) + g_2(x_i)] f(x_i)$

$E[g(X)] = \sum_{i=1}^{n} g_1(x_i) f(x_i) + \sum_{i=1}^{n} g_2(x_i) f(x_i)$

$E[g(X)] = E[g_1(X)] + E[g_2(X)]$


Adding and Subtracting


Random Variables


E(X+Y) = E(X) + E(Y)


E(X-Y) = E(X) - E(Y)


Adding a constant to a variable will


add a constant to its expected value:

E(X+a) = E(X) + a
Multiplying by a constant will multiply
its expected value by that constant:

E(bX) = b E(X)

Variance

var(X) = average squared deviations


around the mean of X.

var(X) = expected value of the squared deviations


around the expected value of X.
$\mathrm{var}(X) = E[(X - EX)^2]$


$\mathrm{var}(X) = E[(X - EX)^2]$
$= E[X^2 - 2 X \, EX + (EX)^2]$
$= E(X^2) - 2 \, EX \, EX + E[(EX)^2]$
$= E(X^2) - 2 (EX)^2 + (EX)^2$
$= E(X^2) - (EX)^2$

$\mathrm{var}(X) = E(X^2) - (EX)^2$


variance of a discrete
random variable, X:

$\mathrm{var}(X) = \sum_{i=1}^{n} (x_i - EX)^2 f(x_i)$

standard deviation is square root of variance


calculate the variance for a


discrete random variable, X:
x_i    f(x_i)    x_i - EX            (x_i - EX)^2 f(x_i)
2      .1        2 - 4.3 = -2.3      5.29 (.1) = .529
3      .3        3 - 4.3 = -1.3      1.69 (.3) = .507
4      .1        4 - 4.3 = -.3       .09 (.1) = .009
5      .2        5 - 4.3 = .7        .49 (.2) = .098
6      .3        6 - 4.3 = 1.7       2.89 (.3) = .867

$EX = \sum_{i=1}^{n} x_i f(x_i) = .2 + .9 + .4 + 1.0 + 1.8 = 4.3$

$\mathrm{var}(X) = \sum_{i=1}^{n} (x_i - EX)^2 f(x_i) = .529 + .507 + .009 + .098 + .867 = 2.01$
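
As a check on this table, the sketch below computes the variance two ways: from the definition (probability-weighted squared deviations) and from the shortcut var(X) = E(X^2) - (EX)^2 derived earlier. The values and probabilities come from the slide; the variable names are illustrative.

```python
# Values of X and their probabilities f(x), as given in the variance table.
values = [2, 3, 4, 5, 6]
probs = [0.1, 0.3, 0.1, 0.2, 0.3]

# Mean: sum of x * f(x).
mean_x = sum(x * p for x, p in zip(values, probs))

# Definition: probability-weighted squared deviations around the mean.
var_definition = sum((x - mean_x) ** 2 * p for x, p in zip(values, probs))

# Shortcut: E(X^2) - (EX)^2.
mean_x_squared = sum(x ** 2 * p for x, p in zip(values, probs))
var_shortcut = mean_x_squared - mean_x ** 2

print(round(mean_x, 4), round(var_definition, 4), round(var_shortcut, 4))  # 4.3 2.01 2.01
```

Both routes give 2.01, so the standard deviation is the square root of 2.01, roughly 1.42.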


Z = a + cX
var(Z) = var(a + cX)
$= E[((a + cX) - E(a + cX))^2]$
$= c^2 \mathrm{var}(X)$

$\mathrm{var}(a + cX) = c^2 \mathrm{var}(X)$

Joint pdf

A joint probability density function,


f(x,y), provides the probabilities
associated with the joint occurrence
of all of the possible pairs of X and Y.


Survey of College City, NY

joint pdf f(x,y)
X = vacation homes owned; Y = college grads in household

           Y = 1             Y = 2
X = 0      f(0,1) = .45      f(0,2) = .15
X = 1      f(1,1) = .05      f(1,2) = .35


Calculating the expected value of


functions of two random variables.

$E[g(X,Y)] = \sum_{i} \sum_{j} g(x_i, y_j) f(x_i, y_j)$

$E(XY) = \sum_{i} \sum_{j} x_i y_j f(x_i, y_j)$

E(XY) = (0)(1)(.45)+(0)(2)(.15)+(1)(1)(.05)+(1)(2)(.35)=.75


Marginal pdf

The marginal probability density functions,
f(x) and f(y), for discrete random variables,
can be obtained by summing f(x,y)
over the values of Y to obtain f(x), and
over the values of X to obtain f(y).

$f(x_i) = \sum_{j} f(x_i, y_j)$

$f(y_j) = \sum_{i} f(x_i, y_j)$


marginal pdfs

                        Y = 1             Y = 2             marginal pdf for X:
X = 0                   .45               .15               f(X = 0) = .60
X = 1                   .05               .35               f(X = 1) = .40
marginal pdf for Y:     f(Y = 1) = .50    f(Y = 2) = .50


Conditional pdf


The conditional probability density


functions of X given Y=y , f(x|y),
and of Y given X=x , f(y|x),
are obtained by dividing f(x,y) by f(y)
to get f(x|y) and by f(x) to get f(y|x).

$f(x|y) = \frac{f(x,y)}{f(y)}$

$f(y|x) = \frac{f(x,y)}{f(x)}$


conditional pdfs

           Y = 1    Y = 2    f(x)
X = 0      .45      .15      .60
X = 1      .05      .35      .40
f(y)       .50      .50

f(Y=1|X=0) = .75      f(Y=2|X=0) = .25
f(Y=1|X=1) = .125     f(Y=2|X=1) = .875

f(X=0|Y=1) = .90      f(X=0|Y=2) = .30
f(X=1|Y=1) = .10      f(X=1|Y=2) = .70
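
The marginal and conditional values for the College City survey can be reproduced with a small sketch. The joint probabilities are those given in the slides; storing them in a dictionary keyed by (x, y), and the names `fx`, `fy`, `y_given_x` and `x_given_y`, are illustrative choices.

```python
# Joint pdf f(x, y): x = vacation homes owned, y = college grads in household.
joint = {(0, 1): 0.45, (0, 2): 0.15, (1, 1): 0.05, (1, 2): 0.35}

# Marginal pdfs: sum the joint pdf over the other variable.
fx, fy = {}, {}
for (x, y), p in joint.items():
    fx[x] = fx.get(x, 0.0) + p
    fy[y] = fy.get(y, 0.0) + p

# Conditional pdfs: joint divided by the marginal of the conditioning variable.
y_given_x = {(y, x): p / fx[x] for (x, y), p in joint.items()}  # key is (y, x)
x_given_y = {(x, y): p / fy[y] for (x, y), p in joint.items()}  # key is (x, y)

print({k: round(v, 2) for k, v in fx.items()})  # {0: 0.6, 1: 0.4}
print({k: round(v, 2) for k, v in fy.items()})  # {1: 0.5, 2: 0.5}
print(round(y_given_x[(1, 0)], 3))              # f(Y=1|X=0) = 0.75
print(round(x_given_y[(1, 2)], 3))              # f(X=1|Y=2) = 0.7
```

Each conditional pdf sums to one over the variable that is not being conditioned on, for example .75 + .25 for Y given X = 0.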


Independence

X and Y are independent random


variables if their joint pdf, f(x,y),
is the product of their respective
marginal pdfs, f(x) and f(y) .

f(xi,yj) = f(xi) f(yj)


for independence this must hold for all pairs of i and j


not independent

                        Y = 1                     Y = 2                     marginal pdf for X:
X = 0                   .45 (.50 x .60 = .30)     .15 (.50 x .60 = .30)     f(X = 0) = .60
X = 1                   .05 (.50 x .40 = .20)     .35 (.50 x .40 = .20)     f(X = 1) = .40
marginal pdf for Y:     f(Y = 1) = .50            f(Y = 2) = .50

The calculations in parentheses show the numbers
required to have independence.
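
A short sketch of the independence check: for every (x, y) pair, compare the joint probability with the product of the marginals. The numbers are from the slides; the names `required` and `independent` are illustrative.

```python
import math

# Joint pdf and marginal pdfs from the College City survey slides.
joint = {(0, 1): 0.45, (0, 2): 0.15, (1, 1): 0.05, (1, 2): 0.35}
fx = {0: 0.60, 1: 0.40}
fy = {1: 0.50, 2: 0.50}

# Joint probabilities that independence would require: f(x) * f(y).
required = {(x, y): fx[x] * fy[y] for x in fx for y in fy}

# Independence must hold for all pairs, so one mismatch is enough to reject it.
independent = all(math.isclose(joint[pair], required[pair]) for pair in joint)

print(round(required[(0, 1)], 2), joint[(0, 1)])  # 0.3 versus 0.45
print(independent)                                # False
```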


Covariance

The covariance between two random


variables, X and Y, measures the
linear association between them.

cov(X,Y) = E[(X - EX)(Y-EY)]


Note that variance is a special case of covariance.
$\mathrm{cov}(X,X) = \mathrm{var}(X) = E[(X - EX)^2]$


cov(X,Y) = E [(X - EX)(Y-EY)]
= E [XY - X EY - Y EX + EX EY]
= E(XY) - EX EY - EY EX + EX EY
= E(XY) - 2 EX EY + EX EY
= E(XY) - EX EY
cov(X,Y) = E(XY) - EX EY

           Y = 1    Y = 2    f(x)
X = 0      .45      .15      .60
X = 1      .05      .35      .40
f(y)       .50      .50

EX = 0(.60) + 1(.40) = .40
EY = 1(.50) + 2(.50) = 1.50
EX EY = (.40)(1.50) = .60
E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75

covariance
cov(X,Y) = E(XY) - EX EY
= .75 - (.40)(1.50)
= .75 - .60
= .15


Correlation


The correlation between two random


variables X and Y is their covariance
divided by the square roots of their
respective variances.

$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X) \, \mathrm{var}(Y)}}$

Correlation is a pure number falling between -1 and 1.

           Y = 1    Y = 2    f(x)
X = 0      .45      .15      .60
X = 1      .05      .35      .40
f(y)       .50      .50

cov(X,Y) = .15

$EX = 0(.60) + 1(.40) = .40$
$E(X^2) = 0^2(.60) + 1^2(.40) = .40$
$\mathrm{var}(X) = E(X^2) - (EX)^2 = .40 - (.40)^2 = .24$

$EY = 1.50$
$E(Y^2) = 1^2(.50) + 2^2(.50) = .50 + 2.0 = 2.50$
$\mathrm{var}(Y) = E(Y^2) - (EY)^2 = 2.50 - (1.50)^2 = .25$

correlation
$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X) \, \mathrm{var}(Y)}}$

$\rho(X,Y) = .61$
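
The covariance and correlation for this example can be verified with the sketch below. The joint probabilities are from the slides; the helper `expect`, which takes a function of (x, y), is a hypothetical convenience.

```python
import math

# Joint pdf f(x, y) from the College City survey slides.
joint = {(0, 1): 0.45, (0, 2): 0.15, (1, 1): 0.05, (1, 2): 0.35}

def expect(g):
    """E[g(X, Y)]: sum of g(x, y) * f(x, y) over all pairs."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(lambda x, y: x)        # EX
ey = expect(lambda x, y: y)        # EY
exy = expect(lambda x, y: x * y)   # E(XY)

cov_xy = exy - ex * ey                       # cov(X,Y) = E(XY) - EX EY
var_x = expect(lambda x, y: x * x) - ex ** 2  # var(X) = E(X^2) - (EX)^2
var_y = expect(lambda x, y: y * y) - ey ** 2  # var(Y) = E(Y^2) - (EY)^2
rho = cov_xy / math.sqrt(var_x * var_y)       # correlation

print(round(cov_xy, 4), round(var_x, 4), round(var_y, 4), round(rho, 2))  # 0.15 0.24 0.25 0.61
```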


Zero Covariance & Correlation


Independent random variables
have zero covariance and,
therefore, zero correlation.
The converse is not true.

Since expectation is a linear operator,
it can be applied term by term.

The expected value of the weighted sum


of random variables is the sum of the
expectations of the individual terms.

E[c1X + c2Y] = c1EX + c2EY


In general, for random variables X1, . . . , Xn :

E[c1X1+...+ cnXn] = c1EX1+...+ cnEXn


The variance of a weighted sum of random


variables is the sum of the variances, each times
the square of the weight, plus twice the covariances
of all the random variables times the products of
their weights.

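Weighted sum of random variables:

$\mathrm{var}(c_1 X + c_2 Y) = c_1^2 \mathrm{var}(X) + c_2^2 \mathrm{var}(Y) + 2 c_1 c_2 \mathrm{cov}(X,Y)$

Weighted difference of random variables:

$\mathrm{var}(c_1 X - c_2 Y) = c_1^2 \mathrm{var}(X) + c_2^2 \mathrm{var}(Y) - 2 c_1 c_2 \mathrm{cov}(X,Y)$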
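
A quick numerical check of the weighted-sum rule, reusing the College City joint pdf from the earlier slides (var(X) = .24, var(Y) = .25, cov(X,Y) = .15). The weights c1 = 2 and c2 = 3 are arbitrary illustrative values, not from the slides.

```python
# Joint pdf f(x, y) from the College City survey slides.
joint = {(0, 1): 0.45, (0, 2): 0.15, (1, 1): 0.05, (1, 2): 0.35}
c1, c2 = 2.0, 3.0  # arbitrary weights chosen for illustration

# Direct route: treat W = c1*X + c2*Y as a random variable and use E(W^2) - (EW)^2.
ew = sum((c1 * x + c2 * y) * p for (x, y), p in joint.items())
ew2 = sum((c1 * x + c2 * y) ** 2 * p for (x, y), p in joint.items())
var_direct = ew2 - ew ** 2

# Formula route: c1^2 var(X) + c2^2 var(Y) + 2 c1 c2 cov(X,Y).
var_x, var_y, cov_xy = 0.24, 0.25, 0.15
var_formula = c1 ** 2 * var_x + c2 ** 2 * var_y + 2 * c1 * c2 * cov_xy

print(round(var_direct, 4), round(var_formula, 4))  # 5.01 5.01
```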


The Normal Distribution

$Y \sim N(\mu, \sigma^2)$

$f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y - \mu)^2}{2\sigma^2} \right)$


The Standardized Normal

$Z = (y - \mu)/\sigma$
$Z \sim N(0, 1)$
$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^2}{2} \right)$

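$Y \sim N(\mu, \sigma^2)$

$P[Y > a] = P\left[ \frac{Y - \mu}{\sigma} > \frac{a - \mu}{\sigma} \right] = P\left[ Z > \frac{a - \mu}{\sigma} \right]$

[Figure: normal density f(y) plotted against y, with the point a marked.]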

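$Y \sim N(\mu, \sigma^2)$

$P[a < Y < b] = P\left[ \frac{a - \mu}{\sigma} < \frac{Y - \mu}{\sigma} < \frac{b - \mu}{\sigma} \right] = P\left[ \frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma} \right]$

[Figure: normal density f(y) plotted against y.]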
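
The standardization step can be sketched in code. The standard normal cdf is written with `math.erf` from the Python standard library; the function names and the example values (mean 10, standard deviation 2) are illustrative assumptions, not from the slides.

```python
import math

def std_normal_cdf(z):
    """P[Z < z] for Z ~ N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_greater(a, mean, sd):
    """P[Y > a] for normal Y: standardize a, then use the N(0, 1) cdf."""
    return 1.0 - std_normal_cdf((a - mean) / sd)

def prob_between(a, b, mean, sd):
    """P[a < Y < b] for normal Y: standardize both end points."""
    return std_normal_cdf((b - mean) / sd) - std_normal_cdf((a - mean) / sd)

# Hypothetical example: Y ~ N(10, 2^2).
p_tail = prob_greater(12.0, 10.0, 2.0)         # P[Z > 1]
p_middle = prob_between(8.0, 12.0, 10.0, 2.0)  # P[-1 < Z < 1]

print(round(p_tail, 4), round(p_middle, 4))  # about 0.1587 and 0.6827
```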


Linear combinations of jointly


normally distributed random variables
are themselves normally distributed.
$Y_1 \sim N(\mu_1, \sigma_1^2), \; Y_2 \sim N(\mu_2, \sigma_2^2), \; \ldots, \; Y_n \sim N(\mu_n, \sigma_n^2)$

W = c1Y1 + c2Y2 + . . . + cnYn

W ~ N[ E(W), var(W) ]


Chi-Square

If $Z_1, Z_2, \ldots, Z_m$ denote m independent
N(0,1) random variables, and
$V = Z_1^2 + Z_2^2 + \ldots + Z_m^2$, then $V \sim \chi^2_{(m)}$
V is chi-square with m degrees of freedom.

mean: $E[V] = E[\chi^2_{(m)}] = m$

variance: $\mathrm{var}[V] = \mathrm{var}[\chi^2_{(m)}] = 2m$

Student-t

If $Z \sim N(0,1)$ and $V \sim \chi^2_{(m)}$, and if Z and V
are independent, then

$t = \frac{Z}{\sqrt{V/m}} \sim t_{(m)}$

t is student-t with m degrees of freedom.

mean: $E[t] = E[t_{(m)}] = 0$ (symmetric about zero)

variance: $\mathrm{var}[t] = \mathrm{var}[t_{(m)}] = m / (m - 2)$


F Statistic
If $V_1 \sim \chi^2_{(m_1)}$ and $V_2 \sim \chi^2_{(m_2)}$, and if $V_1$ and $V_2$
are independent, then

$F = \frac{V_1 / m_1}{V_2 / m_2} \sim F_{(m_1, m_2)}$

F is an F statistic with m1 numerator
degrees of freedom and m2 denominator
degrees of freedom.

Chapter 3


The Simple Linear


Regression
Model

Purpose of Regression Analysis


1. Estimate a relationship among economic
variables, such as y = f(x).
2. Forecast or predict the value of one
variable, y, based on the value of
another variable, x.


Weekly Food Expenditures


y = dollars spent each week on food items.
x = consumer's weekly income.
The relationship between x and the expected
value of y, given x, might be linear:
$E(y|x) = \beta_1 + \beta_2 x$


Figure 3.1a Probability Distribution f(y|x=480) of Food Expenditures if given income x=$480.

Figure 3.1b Probability Distributions f(y|x=480) and f(y|x=800) of Food Expenditures if given income x=$480 and x=$800.

$E(y|x) = \beta_1 + \beta_2 x$
slope: $\beta_2 = \frac{\Delta E(y|x)}{\Delta x}$; intercept: $\beta_1$

Figure 3.2 The Economic Model: a linear relationship between average expenditure on food, E(y|x), and income, x.


Homoskedastic Case
Figure 3.3 The probability density function f(y_t) for y_t (expenditure) at two levels of household income, x_t: x1 = 480 and x2 = 800.


Heteroskedastic Case
Figure 3.3+ The variance of y_t (expenditure) increases as household income, x_t, increases; densities f(y_t) are shown at income levels x1, x2 and x3.


Assumptions of the Simple Linear


Regression Model - I
1. The average value of y, given x, is given by
the linear regression:
$E(y) = \beta_1 + \beta_2 x$
2. For each value of x, the values of y are
distributed around their mean with variance:
$\mathrm{var}(y) = \sigma^2$
3. The values of y are uncorrelated, having zero
covariance and thus no linear relationship:
$\mathrm{cov}(y_i, y_j) = 0$
4. The variable x must take at least two different
values, so that $x \ne c$, where c is a constant.


One more assumption that is often used in


practice but is not required for least squares:
5. (optional) The values of y are normally
distributed about their mean for each
value of x:
$y \sim N[(\beta_1 + \beta_2 x), \sigma^2]$


The Error Term


y is a random variable composed of two parts:


I. Systematic component:
This is the mean of y.

$E(y) = \beta_1 + \beta_2 x$

II. Random component:

$e = y - E(y)$
$= y - \beta_1 - \beta_2 x$
This is called the random error.

Together E(y) and e form the model:

$y = \beta_1 + \beta_2 x + e$

Figure 3.5 The relationship among y, e and the true regression line, $E(y) = \beta_1 + \beta_2 x$, showing observations y1 to y4 at x1 to x4 with errors e1 to e4.

Figure 3.7a The relationship among y, $\hat{e}$ and the fitted regression line, $\hat{y} = b_1 + b_2 x$, showing observations y1 to y4 at x1 to x4 with fitted values $\hat{y}_1$ to $\hat{y}_4$ and residuals $\hat{e}_1$ to $\hat{e}_4$.

Figure 3.7b The sum of squared residuals from any other line, $\hat{y}^* = b_1^* + b_2^* x$, will be larger than that from the least squares line, $\hat{y} = b_1 + b_2 x$.

Figure 3.4 Probability density functions f(e) and f(y) for e and y; f(y) is located at $\beta_1 + \beta_2 x$.


The Error Term Assumptions


1. The value of y, for each value of x, is
$y = \beta_1 + \beta_2 x + e$
2. The average value of the random error e is:
$E(e) = 0$
3. The variance of the random error e is:
$\mathrm{var}(e) = \sigma^2 = \mathrm{var}(y)$
4. The covariance between any pair of e's is:
$\mathrm{cov}(e_i, e_j) = \mathrm{cov}(y_i, y_j) = 0$
5. x must take at least two different values so that
$x \ne c$, where c is a constant.
6. (optional) e is normally distributed with mean 0 and $\mathrm{var}(e) = \sigma^2$:
$e \sim N(0, \sigma^2)$


Unobservable Nature
of the Error Term


1. Unspecified factors / explanatory variables,


not in the model, may be in the error term.
2. Approximation error is in the error term if
relationship between y and x is not exactly
a perfectly linear relationship.
3. Strictly unpredictable random behavior that
may be unique to that observation is in the error term.


Population regression values:


$y_t = \beta_1 + \beta_2 x_t + e_t$
Population regression line:
$E(y_t|x_t) = \beta_1 + \beta_2 x_t$
Sample regression values:
$y_t = b_1 + b_2 x_t + \hat{e}_t$
Sample regression line:
$\hat{y}_t = b_1 + b_2 x_t$


$y_t = \beta_1 + \beta_2 x_t + e_t$
$e_t = y_t - \beta_1 - \beta_2 x_t$
Minimize error sum of squared deviations:

$S(\beta_1, \beta_2) = \sum_{t=1}^{T} (y_t - \beta_1 - \beta_2 x_t)^2$    (3.3.4)

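Minimize w.r.t. $\beta_1$ and $\beta_2$:

$S(\beta_1, \beta_2) = \sum_{t=1}^{T} (y_t - \beta_1 - \beta_2 x_t)^2$    (3.3.4)

$\frac{\partial S(\cdot)}{\partial \beta_1} = -2 \sum (y_t - \beta_1 - \beta_2 x_t)$

$\frac{\partial S(\cdot)}{\partial \beta_2} = -2 \sum x_t (y_t - \beta_1 - \beta_2 x_t)$

Set each of these two derivatives equal to zero and
solve these two equations for the two unknowns: $\beta_1$ and $\beta_2$.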

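Minimize w.r.t. $\beta_1$ and $\beta_2$:

$S(\cdot) = \sum_{t=1}^{T} (y_t - \beta_1 - \beta_2 x_t)^2$

[Figure: S(.) plotted against $\beta_i$. The slope $\partial S(\cdot)/\partial \beta_i$ is negative below the minimum, zero at $b_i$, and positive above it.]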


To minimize S(.), you set the two


derivatives equal to zero to get:

$\frac{\partial S(\cdot)}{\partial \beta_1} = -2 \sum (y_t - b_1 - b_2 x_t) = 0$

$\frac{\partial S(\cdot)}{\partial \beta_2} = -2 \sum x_t (y_t - b_1 - b_2 x_t) = 0$

When these two terms are set to zero,
$\beta_1$ and $\beta_2$ become b1 and b2 because they no longer
represent just any value of $\beta_1$ and $\beta_2$ but the special
values that correspond to the minimum of S(.).

$-2 \sum (y_t - b_1 - b_2 x_t) = 0$
$-2 \sum x_t (y_t - b_1 - b_2 x_t) = 0$

$\sum y_t - T b_1 - b_2 \sum x_t = 0$
$\sum x_t y_t - b_1 \sum x_t - b_2 \sum x_t^2 = 0$

$T b_1 + b_2 \sum x_t = \sum y_t$
$b_1 \sum x_t + b_2 \sum x_t^2 = \sum x_t y_t$

$T b_1 + b_2 \sum x_t = \sum y_t$
$b_1 \sum x_t + b_2 \sum x_t^2 = \sum x_t y_t$

Solve for b1 and b2 using definitions of $\bar{x}$ and $\bar{y}$:

$b_2 = \frac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - (\sum x_t)^2}$

$b_1 = \bar{y} - b_2 \bar{x}$
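
A minimal sketch of the least squares solution: b2 from the sums, then b1 from the sample means. The function name `least_squares` and the five data points are made up for illustration; they are not from the slides.

```python
def least_squares(x, y):
    """Return (b1, b2) that solve the two normal equations of simple regression."""
    T = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt * xt for xt in x)
    b2 = (T * sum_xy - sum_x * sum_y) / (T * sum_xx - sum_x ** 2)
    b1 = sum_y / T - b2 * sum_x / T  # b1 = y-bar - b2 * x-bar
    return b1, b2

# Hypothetical data, chosen only to make the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 7.5, 8.0, 10.5, 11.5]

b1, b2 = least_squares(x, y)
residuals = [yt - b1 - b2 * xt for xt, yt in zip(x, y)]

print(round(b1, 4), round(b2, 4))  # 3.7 1.6
print(round(sum(residuals), 10))   # first normal equation: residuals sum to (about) zero
```

The residual check mirrors the first normal equation; the second can be checked the same way by weighting each residual by its x value.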


elasticities

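$\eta = \frac{\text{percentage change in } y}{\text{percentage change in } x} = \frac{\Delta y / y}{\Delta x / x} = \frac{\Delta y}{\Delta x} \cdot \frac{x}{y}$

Using calculus, we can get the elasticity at a point:

$\eta = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \cdot \frac{x}{y} = \frac{\partial y}{\partial x} \cdot \frac{x}{y}$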


applying elasticities
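$E(y) = \beta_1 + \beta_2 x$

$\frac{\partial E(y)}{\partial x} = \beta_2$

$\eta = \frac{\partial E(y)}{\partial x} \cdot \frac{x}{E(y)} = \beta_2 \frac{x}{E(y)}$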


estimating elasticities
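$\hat{\eta} = \frac{\Delta y}{\Delta x} \cdot \frac{\bar{x}}{\bar{y}} = b_2 \frac{\bar{x}}{\bar{y}}$

$\hat{y}_t = b_1 + b_2 x_t = 4 + 1.5 x_t$

$\bar{x}$ = 8 = average number of years of experience
$\bar{y}$ = $10 = average wage rate

$\hat{\eta} = b_2 \frac{\bar{x}}{\bar{y}} = 1.5 \cdot \frac{8}{10} = 1.2$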


Prediction

Estimated regression equation:

$\hat{y}_t = 4 + 1.5 x_t$

$x_t$ = years of experience
$\hat{y}_t$ = predicted wage rate

If $x_t$ = 2 years, then $\hat{y}_t$ = $7.00 per hour.
If $x_t$ = 3 years, then $\hat{y}_t$ = $8.50 per hour.
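
The prediction and elasticity calculations for the wage example can be written as a short sketch. The estimates b1 = 4 and b2 = 1.5 and the sample means (8 years, $10) are from the slides; the function names are hypothetical.

```python
B1, B2 = 4.0, 1.5  # estimated intercept and slope from the slides

def predict_wage(years_experience):
    """Predicted hourly wage from the estimated regression equation."""
    return B1 + B2 * years_experience

def elasticity_at(x_value, y_value):
    """Estimated elasticity b2 * x / y evaluated at a chosen point."""
    return B2 * x_value / y_value

wage_2_years = predict_wage(2)
wage_3_years = predict_wage(3)
elasticity_at_means = elasticity_at(8.0, 10.0)  # evaluated at the sample means

print(wage_2_years, wage_3_years)     # 7.0 8.5
print(round(elasticity_at_means, 4))  # 1.2
```

In a linear model the slope is constant but the elasticity is not, so the point of evaluation has to be stated; the slides use the sample means.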


log-log models
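$\ln(y) = \beta_1 + \beta_2 \ln(x)$

$\frac{\partial \ln(y)}{\partial x} = \beta_2 \frac{\partial \ln(x)}{\partial x}$

$\frac{1}{y} \frac{\partial y}{\partial x} = \beta_2 \frac{1}{x} \frac{\partial x}{\partial x}$

$\frac{1}{y} \frac{\partial y}{\partial x} = \beta_2 \frac{1}{x}$

$\frac{x}{y} \frac{\partial y}{\partial x} = \beta_2$

elasticity of y with respect to x:

$\eta = \frac{x}{y} \frac{\partial y}{\partial x} = \beta_2$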

Chapter 4


Properties of
Least Squares
Estimators

Simple Linear Regression Model

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$
$y_t$ = household weekly food expenditures
$x_t$ = household weekly income
For a given level of $x_t$, the expected
level of food expenditures will be:
$E(y_t|x_t) = \beta_1 + \beta_2 x_t$


Assumptions of the Simple


Linear Regression Model
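1. $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$
2. $E(\varepsilon_t) = 0 \Leftrightarrow E(y_t) = \beta_1 + \beta_2 x_t$
3. $\mathrm{var}(\varepsilon_t) = \sigma^2 = \mathrm{var}(y_t)$
4. $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \mathrm{cov}(y_i, y_j) = 0$
5. $x_t \ne c$ for every observation
6. $\varepsilon_t \sim N(0, \sigma^2) \Leftrightarrow y_t \sim N(\beta_1 + \beta_2 x_t, \sigma^2)$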


The population parameters $\beta_1$ and $\beta_2$
are unknown population constants.
The formulas that produce the
sample estimates b1 and b2 are
called the estimators of $\beta_1$ and $\beta_2$.

When b1 and b2 are used to represent
the formulas rather than specific values,
they are called estimators of $\beta_1$ and $\beta_2$,
which are random variables because
they are different from sample to sample.


Estimators are Random Variables


( estimates are not )
If the least squares estimators b1 and b2
are random variables, then what are
their means, variances, covariances and
probability distributions?
Compare the properties of alternative
estimators to the properties of the
least squares estimators.


The Expected Values of b1 and b2


The least squares formulas (estimators)
in the simple regression case:

$b_2 = \frac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - (\sum x_t)^2}$    (3.3.8a)

$b_1 = \bar{y} - b_2 \bar{x}$    (3.3.8b)

where $\bar{y} = \sum y_t / T$ and $\bar{x} = \sum x_t / T$

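Substitute in $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ to get:

$b_2 = \beta_2 + \frac{T \sum x_t \varepsilon_t - \sum x_t \sum \varepsilon_t}{T \sum x_t^2 - (\sum x_t)^2}$

The mean of b2 is:

$E(b_2) = \beta_2 + \frac{T \sum x_t E(\varepsilon_t) - \sum x_t \sum E(\varepsilon_t)}{T \sum x_t^2 - (\sum x_t)^2}$

Since $E(\varepsilon_t) = 0$, then $E(b_2) = \beta_2$.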
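
A small Monte Carlo sketch can illustrate the unbiasedness result: draw many samples from a known model, estimate the slope in each, and average the estimates. Every number here (true intercept 4, true slope 1.5, error standard deviation 2, twenty fixed x values, 2000 replications, the seed) is an arbitrary assumption for illustration, and the errors are drawn as normal only for convenience.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_INTERCEPT, TRUE_SLOPE, ERROR_SD = 4.0, 1.5, 2.0  # assumed "population" values
x = [float(v) for v in range(1, 21)]                  # fixed x values, not all equal

def slope_estimate(x, y):
    """Least squares slope b2 computed from the sums formula."""
    T = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xt * yt for xt, yt in zip(x, y))
    sum_xx = sum(xt * xt for xt in x)
    return (T * sum_xy - sum_x * sum_y) / (T * sum_xx - sum_x ** 2)

estimates = []
for _ in range(2000):
    # New errors in every replication, so b2 differs from sample to sample.
    y = [TRUE_INTERCEPT + TRUE_SLOPE * xt + random.gauss(0.0, ERROR_SD) for xt in x]
    estimates.append(slope_estimate(x, y))

mean_b2 = sum(estimates) / len(estimates)
print(round(mean_b2, 3))  # close to the true slope of 1.5
```

Individual estimates scatter around 1.5, while their average settles near it, which is what "the distribution of b2 is centered at the true slope" means in practice.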


An Unbiased Estimator

The result $E(b_2) = \beta_2$ means that
the distribution of b2 is centered at $\beta_2$.

Since the distribution of b2
is centered at $\beta_2$, we say that
b2 is an unbiased estimator of $\beta_2$.


Wrong Model Specification

The unbiasedness result on the


previous slide assumes that we
are using the correct model.
If the model is of the wrong form
or is missing important variables,
then $E(\varepsilon_t) \ne 0$, and then $E(b_2) \ne \beta_2$.


Unbiased Estimator of the Intercept


In a similar manner, the estimator b1
of the intercept or constant term can be
shown to be an unbiased estimator of $\beta_1$
when the model is correctly specified.

$E(b_1) = \beta_1$


Equivalent expressions for b2:


$b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}$    (4.2.6)

Expand and multiply top and bottom by T:

$b_2 = \frac{T \sum x_t y_t - \sum x_t \sum y_t}{T \sum x_t^2 - (\sum x_t)^2}$    (3.3.8a)


Variance of b2

Given that both $y_t$ and $\varepsilon_t$ have variance $\sigma^2$,
the variance of the estimator b2 is:

$\mathrm{var}(b_2) = \frac{\sigma^2}{\sum (x_t - \bar{x})^2}$

b2 is a function of the yt values but


var(b2) does not involve yt directly.


Variance of b1
Given $b_1 = \bar{y} - b_2 \bar{x}$,
the variance of the estimator b1 is:

$\mathrm{var}(b_1) = \sigma^2 \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2}$


Covariance of b1 and b2

$\mathrm{cov}(b_1, b_2) = \sigma^2 \frac{-\bar{x}}{\sum (x_t - \bar{x})^2}$

If $\bar{x} = 0$, slope can change without affecting
the variance.
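
The three expressions for var(b1), var(b2) and cov(b1,b2) in simple regression can be evaluated together, as sketched below: the error variance divided by the sum of squared deviations of x gives var(b2), and the other two follow from it. The function name `ls_variances`, the x values and the error variance of 4 are illustrative assumptions.

```python
def ls_variances(x, sigma2):
    """Return (var_b1, var_b2, cov_b1_b2) for the simple regression estimators."""
    T = len(x)
    x_bar = sum(x) / T
    ssd = sum((xt - x_bar) ** 2 for xt in x)  # sum of squared deviations of x
    sum_sq = sum(xt ** 2 for xt in x)         # sum of squared x values
    var_b2 = sigma2 / ssd
    var_b1 = sigma2 * sum_sq / (T * ssd)
    cov_b1_b2 = sigma2 * (-x_bar) / ssd
    return var_b1, var_b2, cov_b1_b2

# Hypothetical inputs: five x values with a positive mean, error variance 4.
var_b1, var_b2, cov_b1_b2 = ls_variances([1.0, 2.0, 3.0, 4.0, 5.0], 4.0)
print(round(var_b1, 4), round(var_b2, 4), round(cov_b1_b2, 4))  # 4.4 0.4 -1.2

# Centering x at zero removes the covariance between intercept and slope.
_, _, cov_centered = ls_variances([-2.0, -1.0, 0.0, 1.0, 2.0], 4.0)
print(cov_centered == 0.0)  # True
```

With a positive sample mean the covariance is negative, and with a sample mean of zero it vanishes.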


What factors determine


variance and covariance ?

1. $\sigma^2$: uncertainty about $y_t$ values implies uncertainty about
b1, b2 and their relationship.
2. The more spread out the $x_t$ values are, the more
confidence we have in b1, b2, etc.
3. The larger the sample size, T, the smaller the
variances and covariances.
4. The variance of b1 is large when the (squared) $x_t$ values
are far from zero (in either direction).
5. Changing the slope, b2, has no effect on the intercept,
b1, when the sample mean is zero. But if the sample
mean is positive, the covariance between b1 and b2
will be negative, and vice versa.

Gauss-Markov Theorem

4.16

Under the first five assumptions of the
simple, linear regression model, the
ordinary least squares estimators b1
and b2 have the smallest variance of
all linear and unbiased estimators of
$\beta_1$ and $\beta_2$. This means that b1 and b2
are the Best Linear Unbiased Estimators
(BLUE) of $\beta_1$ and $\beta_2$.

Implications of Gauss-Markov
1. b1 and b2 are best within the class
of linear and unbiased estimators.
2. "Best" means smallest variance within
the class of linear and unbiased estimators.
3. All of the first five assumptions must
hold to satisfy Gauss-Markov.
4. Gauss-Markov does not require
assumption six: normality.
5. Gauss-Markov is not based on the least
squares principle but on the estimators b1 and b2.

4.17


4.18

G-Markov implications (continued)


6. If we are not satisfied with restricting our
estimation to the class of linear and unbiased
estimators, we should ignore the Gauss-Markov
Theorem and use some nonlinear and/or biased
estimator instead. (Note: a biased or nonlinear
estimator could have smaller variance than
those satisfying Gauss-Markov.)
7. Gauss-Markov applies to the b1 and b2
estimators and not to particular sample values
(estimates) of b1 and b2.

Probability Distribution
of Least Squares Estimators

$b_1 \sim N\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right)$

$b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$

4.19

yt and $\varepsilon_t$ normally distributed

4.20

The least squares estimator of $\beta_2$ can be
expressed as a linear combination of the $y_t$'s:

$b_2 = \sum w_t y_t$   where   $w_t = \frac{x_t - \bar{x}}{\sum (x_t - \bar{x})^2}$

$b_1 = \bar{y} - b_2\bar{x}$

This means that b1 and b2 are normal since
linear combinations of normals are normal.

normally distributed under
The Central Limit Theorem

4.21

If the first five Gauss-Markov assumptions
hold, and sample size, T, is sufficiently large,
then the least squares estimators, b1 and b2,
have a distribution that approximates the
normal distribution with greater accuracy
the larger the value of sample size, T.

Consistency

4.22

We would like our estimators, b1 and b2, to collapse
onto the true population values, $\beta_1$ and $\beta_2$, as
sample size, T, goes to infinity.
One way to achieve this consistency property is for
the variances of b1 and b2 to go to zero as T goes to
infinity.
Since the formulas for the variances of the least
squares estimators b1 and b2 show that their
variances do, in fact, go to zero, b1 and b2 are
consistent estimators of $\beta_1$ and $\beta_2$.

4.23

Estimating the variance
of the error term, $\sigma^2$

$\hat{e}_t = y_t - b_1 - b_2 x_t$

$\hat{\sigma}^2 = \frac{\sum_{t=1}^{T} \hat{e}_t^2}{T - 2}$

$\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$

The Least Squares
Predictor, $\hat{y}_o$

4.24

Given a value of the explanatory
variable, $x_o$, we would like to predict
a value of the dependent variable, $y_o$.
The least squares predictor is:

$\hat{y}_o = b_1 + b_2 x_o$   (4.7.2)

Chapter 5


5.1

Inference
in the Simple
Regression Model

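Assumptions of the Simple
Linear Regression Model
1. $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$
2. $E(\varepsilon_t) = 0 \iff E(y_t) = \beta_1 + \beta_2 x_t$
3. $\mathrm{var}(\varepsilon_t) = \sigma^2 = \mathrm{var}(y_t)$
4. $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \mathrm{cov}(y_i, y_j) = 0$
5. $x_t \neq c$ for every observation
6. $\varepsilon_t \sim N(0, \sigma^2) \iff y_t \sim N(\beta_1 + \beta_2 x_t,\ \sigma^2)$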

5.2

Probability Distribution
of Least Squares
Estimators

$b_1 \sim N\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right)$

$b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$

5.3

Error Variance Estimation

5.4

Unbiased estimator of the error variance:

$\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2}$

Transform to a chi-square distribution:

$\frac{(T - 2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(T-2)}$


We make a correct decision if:

5.5

The null hypothesis is false and we decide to reject it.


The null hypothesis is true and we decide not to reject it.

Our decision is incorrect if:


The null hypothesis is true and we decide to reject it.
This is a type I error.
The null hypothesis is false and we decide not to reject it.
This is a type II error.

$b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$

Create a standardized normal random variable, Z,
by subtracting the mean of b2 and dividing by its
standard deviation:

$Z = \frac{b_2 - \beta_2}{\sqrt{\mathrm{var}(b_2)}} \sim N(0, 1)$

5.6

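Simple Linear Regression

$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$   where $E\varepsilon_t = 0$
$y_t \sim N(\beta_1 + \beta_2 x_t,\ \sigma^2)$
since $E y_t = \beta_1 + \beta_2 x_t$
$\varepsilon_t = y_t - \beta_1 - \beta_2 x_t$
Therefore, $\varepsilon_t \sim N(0, \sigma^2)$.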

5.7

5.8

Create a Chi-Square
$\varepsilon_t \sim N(0, \sigma^2)$ but want $N(0, 1)$.

$(\varepsilon_t/\sigma) \sim N(0, 1)$   Standard Normal.

$(\varepsilon_t/\sigma)^2 \sim \chi^2_{(1)}$   Chi-Square.

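5.9

Sum of Chi-Squares

$\sum_{t=1}^{T} (\varepsilon_t/\sigma)^2 = (\varepsilon_1/\sigma)^2 + (\varepsilon_2/\sigma)^2 + \dots + (\varepsilon_T/\sigma)^2$

$\chi^2_{(1)} + \chi^2_{(1)} + \dots + \chi^2_{(1)} = \chi^2_{(T)}$

Therefore, $\sum_{t=1}^{T} (\varepsilon_t/\sigma)^2 \sim \chi^2_{(T)}$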

Chi-Square degrees of freedom

5.10

Since the errors $\varepsilon_t = y_t - \beta_1 - \beta_2 x_t$
are not observable, we estimate them with
the sample residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$.
Unlike the errors, the sample residuals are
not independent since they use up two degrees
of freedom by using b1 and b2 to estimate $\beta_1$ and $\beta_2$.
We get only T-2 degrees of freedom instead of T.

5.11

Student-t Distribution

$t = \frac{Z}{\sqrt{V/m}} \sim t_{(m)}$

where $Z \sim N(0, 1)$
and $V \sim \chi^2_{(m)}$

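5.12

$t = \frac{Z}{\sqrt{V/(T-2)}} \sim t_{(m)}$, with $m = T - 2$

where $Z = \frac{b_2 - \beta_2}{\sqrt{\mathrm{var}(b_2)}}$
and $\mathrm{var}(b_2) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$

5.13

$t = \frac{Z}{\sqrt{V/(T-2)}}$   with   $V = \frac{(T-2)\hat{\sigma}^2}{\sigma^2}$

$t = \frac{(b_2 - \beta_2) / \sqrt{\mathrm{var}(b_2)}}{\sqrt{\frac{(T-2)\hat{\sigma}^2}{\sigma^2} / (T-2)}}$

5.14

$\mathrm{var}(b_2) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$

notice the cancellations:

$t = \frac{(b_2 - \beta_2) / \sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}}{\sqrt{\frac{(T-2)\hat{\sigma}^2}{\sigma^2} / (T-2)}} = \frac{b_2 - \beta_2}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}}$

5.15

$t = \frac{b_2 - \beta_2}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} = \frac{b_2 - \beta_2}{\sqrt{\widehat{\mathrm{var}}(b_2)}} = \frac{b_2 - \beta_2}{\mathrm{se}(b_2)}$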

5.16

Student's t-statistic

$t = \frac{b_2 - \beta_2}{\mathrm{se}(b_2)} \sim t_{(T-2)}$

t has a Student-t Distribution
with T-2 degrees of freedom.

5.17

Figure 5.1 Student-t Distribution

[Figure: density f(t) with critical values $-t_c$ and $t_c$; the central area is $(1-\alpha)$ and each tail has area $\alpha/2$.]

red area = rejection region for 2-sided test

5.18

probability statements

$P(t < -t_c) = P(t > t_c) = \alpha/2$

$P(-t_c \le t \le t_c) = 1 - \alpha$

$P\left(-t_c \le \frac{b_2 - \beta_2}{\mathrm{se}(b_2)} \le t_c\right) = 1 - \alpha$

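Confidence Intervals
Two-sided $(1-\alpha) \times 100\%$ C.I. for $\beta_1$:
$[\, b_1 - t_{\alpha/2}\,\mathrm{se}(b_1),\ b_1 + t_{\alpha/2}\,\mathrm{se}(b_1) \,]$
Two-sided $(1-\alpha) \times 100\%$ C.I. for $\beta_2$:
$[\, b_2 - t_{\alpha/2}\,\mathrm{se}(b_2),\ b_2 + t_{\alpha/2}\,\mathrm{se}(b_2) \,]$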

5.19


5.20

Student-t vs. Normal Distribution


1. Both are symmetric bell-shaped distributions.
2. Student-t distribution has fatter tails than the normal.
3. Student-t converges to the normal for infinite sample.
4. Student-t conditional on degrees of freedom (df).
5. Normal is a good approximation of Student-t for the first few
decimal places when df > 30 or so.

Hypothesis Tests
1. A null hypothesis, H0.
2. An alternative hypothesis, H1.
3. A test statistic.
4. A rejection region.

5.21


Rejection Rules

5.22

1. Two-Sided Test:
If the value of the test statistic falls in the critical region in either tail of the
t-distribution, then we reject the null hypothesis in favor of the alternative.
2. Left-Tail Test:
If the value of the test statistic falls in the critical region which lies in the
left tail of the t-distribution, then we reject the null hypothesis in favor of
the alternative.
3. Right-Tail Test:
If the value of the test statistic falls in the critical region which lies in the
right tail of the t-distribution, then we reject the null hypothesis in favor of
the alternative.


Format for Hypothesis Testing


1. Determine null and alternative hypotheses.
2. Specify the test statistic and its distribution
as if the null hypothesis were true.
3. Select and determine the rejection region.
4. Calculate the sample value of test statistic.
5. State your conclusion.

5.23


practical vs. statistical


significance in economics

5.24

Practically but not statistically significant:


When sample size is very small, a large average gap between
the salaries of men and women might not be statistically
significant.

Statistically but not practically significant:


When sample size is very large, a small correlation (say, $\rho$ =
0.00000001) between the winning numbers in the PowerBall
Lottery and the Dow-Jones Stock Market Index might be
statistically significant.

Type I and Type II errors

5.25

Type I error:
We make the mistake of rejecting the null
hypothesis when it is true.
$\alpha$ = P(rejecting H0 when it is true).
Type II error:
We make the mistake of failing to reject the null
hypothesis when it is false.
$\beta$ = P(failing to reject H0 when it is false).

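5.26

Prediction Intervals

A $(1-\alpha) \times 100\%$ prediction interval for $y_o$ is:

$\hat{y}_o \pm t_c\,\mathrm{se}(f)$

$f = \hat{y}_o - y_o$

$\mathrm{se}(f) = \sqrt{\widehat{\mathrm{var}}(f)}$

$\widehat{\mathrm{var}}(f) = \hat{\sigma}^2 \left[ 1 + \frac{1}{T} + \frac{(x_o - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right]$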
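
The prediction-interval recipe can be sketched in a few lines. The data set, the new value of x and the critical value below are assumptions made for illustration: the five observations are invented, and the critical value 3.182 is taken as a table lookup for a two-sided 5% test with 3 degrees of freedom rather than computed.

```python
import math

def fit_simple_ols(x, y):
    """Return b1, b2, sigma2_hat, xbar and the sum of squared x deviations."""
    T = len(x)
    xbar, ybar = sum(x) / T, sum(y) / T
    sxx = sum((xt - xbar) ** 2 for xt in x)
    sxy = sum((xt - xbar) * (yt - ybar) for xt, yt in zip(x, y))
    b2 = sxy / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yt - b1 - b2 * xt) ** 2 for xt, yt in zip(x, y))
    return b1, b2, sse / (T - 2), xbar, sxx

def prediction_interval(x, y, x0, t_c):
    """Point prediction, estimated var(f), and the interval y0_hat +/- t_c * se(f)."""
    T = len(x)
    b1, b2, sigma2_hat, xbar, sxx = fit_simple_ols(x, y)
    y0_hat = b1 + b2 * x0
    var_f = sigma2_hat * (1 + 1 / T + (x0 - xbar) ** 2 / sxx)
    se_f = math.sqrt(var_f)
    return y0_hat, var_f, (y0_hat - t_c * se_f, y0_hat + t_c * se_f)

# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
T_C = 3.182   # assumed table value: two-sided 5% critical value, T - 2 = 3 df

y0_hat, var_f, (lower, upper) = prediction_interval(x, y, x0=6.0, t_c=T_C)
print(y0_hat, var_f, lower, upper)
```

The interval widens as the new x value moves away from the sample mean, because the last term of var(f) grows with the squared distance.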

Chapter 6


6.1

The Simple Linear


Regression Model

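Explaining Variation in yt

6.2

Predicting yt without any explanatory variables:

$y_t = \beta_1 + e_t$

$\sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T} (y_t - \beta_1)^2$

$\frac{\partial \sum_{t=1}^{T} e_t^2}{\partial \beta_1} = -2 \sum_{t=1}^{T} (y_t - b_1) = 0$

$\sum_{t=1}^{T} (y_t - b_1) = 0$

$\sum_{t=1}^{T} y_t - T b_1 = 0$

$b_1 = \bar{y}$

Why not $\bar{y}$?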

Explaining Variation in yt

6.3

$y_t = b_1 + b_2 x_t + \hat{e}_t$

Explained variation: $\hat{y}_t = b_1 + b_2 x_t$

Unexplained variation: $\hat{e}_t = y_t - \hat{y}_t = y_t - b_1 - b_2 x_t$

Explaining Variation in yt

6.4

$y_t = \hat{y}_t + \hat{e}_t$

using $\bar{y}$ as baseline (Why not $\bar{y}$?):

$y_t - \bar{y} = \hat{y}_t - \bar{y} + \hat{e}_t$

The cross product term drops out:

$\sum_{t=1}^{T} (y_t - \bar{y})^2 = \sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2 + \sum_{t=1}^{T} \hat{e}_t^2$

SST = SSR + SSE

Total Variation in yt

SST = total sum of squares

SST measures variation of $y_t$ around $\bar{y}$

$\mathrm{SST} = \sum_{t=1}^{T} (y_t - \bar{y})^2$

6.5

Explained Variation in yt

SSR = regression sum of squares

Fitted $\hat{y}_t$ values: $\hat{y}_t = b_1 + b_2 x_t$

SSR measures variation of $\hat{y}_t$ around $\bar{y}$

$\mathrm{SSR} = \sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2$

6.6

Unexplained Variation in yt
SSE = error sum of squares

$\hat{e}_t = y_t - \hat{y}_t = y_t - b_1 - b_2 x_t$

SSE measures variation of $y_t$ around $\hat{y}_t$

$\mathrm{SSE} = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} \hat{e}_t^2$

6.7

Analysis of Variance Table

Table 6.1 Analysis of Variance Table

Source of Variation   DF    Sum of Squares   Mean Square
Explained             1     SSR              SSR/1
Unexplained           T-2   SSE              SSE/(T-2)  [= $\hat{\sigma}^2$]
Total                 T-1   SST

6.8

Coefficient of Determination
What proportion of the variation
in yt is explained?

$0 \le R^2 \le 1$

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}}$

6.9

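Coefficient of Determination
SST = SSR + SSE

Dividing by SST:

$\frac{\mathrm{SST}}{\mathrm{SST}} = \frac{\mathrm{SSR}}{\mathrm{SST}} + \frac{\mathrm{SSE}}{\mathrm{SST}}$

$1 = \frac{\mathrm{SSR}}{\mathrm{SST}} + \frac{\mathrm{SSE}}{\mathrm{SST}}$

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$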

6.10

Coefficient of Determination

6.11

$R^2$ is only a descriptive measure.

$R^2$ does not measure the quality
of the regression model.
Focusing solely on maximizing
$R^2$ is not a good idea.

Correlation Analysis

6.12

Population: $\rho = \frac{\mathrm{cov}(X,Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}$

Sample: $r = \frac{\widehat{\mathrm{cov}}(X,Y)}{\sqrt{\widehat{\mathrm{var}}(X)\,\widehat{\mathrm{var}}(Y)}}$

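Correlation Analysis

6.13

$\widehat{\mathrm{var}}(X) = \sum_{t=1}^{T} (x_t - \bar{x})^2 / (T-1)$

$\widehat{\mathrm{var}}(Y) = \sum_{t=1}^{T} (y_t - \bar{y})^2 / (T-1)$

$\widehat{\mathrm{cov}}(X,Y) = \sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y}) / (T-1)$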

6.14

Correlation Analysis

Sample Correlation Coefficient

$r = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T} (x_t - \bar{x})^2 \, \sum_{t=1}^{T} (y_t - \bar{y})^2}}$

Correlation Analysis and $R^2$

6.15

For simple linear regression analysis:

$r^2 = R^2$

$R^2$ is also the squared correlation
between $y_t$ and $\hat{y}_t$,
measuring goodness of fit.
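
A small numerical check of this identity, using an invented five-point data set (an assumption for illustration only): the squared sample correlation between x and y, the squared correlation between y and the fitted values, and R-squared computed from the sums of squares should all agree.

```python
import math

def corr(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

T = len(x)
xbar, ybar = sum(x) / T, sum(y) / T
b2 = sum((xt - xbar) * (yt - ybar) for xt, yt in zip(x, y)) / sum((xt - xbar) ** 2 for xt in x)
b1 = ybar - b2 * xbar
y_hat = [b1 + b2 * xt for xt in x]

sst = sum((yt - ybar) ** 2 for yt in y)
sse = sum((yt - yh) ** 2 for yt, yh in zip(y, y_hat))
R2 = 1 - sse / sst

r_xy = corr(x, y)
r_y_yhat = corr(y, y_hat)
print(R2, r_xy ** 2, r_y_yhat ** 2)
```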

Regression Computer Output

6.16

Typical computer output of regression estimates:

Table 6.2 Computer Generated Least Squares Results

(1)         (2)         (3)        (4)           (5)
            Parameter   Standard   T for H0:
Variable    Estimate    Error      Parameter=0   Prob>|T|
INTERCEPT   40.7676     22.1387    1.841         0.0734
X           0.1283      0.0305     4.201         0.0002

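Regression Computer Output

$b_1 = 40.7676$      $b_2 = 0.1283$

$\mathrm{se}(b_1) = \sqrt{\widehat{\mathrm{var}}(b_1)} = \sqrt{490.12} = 22.1387$

$\mathrm{se}(b_2) = \sqrt{\widehat{\mathrm{var}}(b_2)} = \sqrt{0.0009326} = 0.0305$

$t = \frac{b_1}{\mathrm{se}(b_1)} = \frac{40.7676}{22.1387} = 1.84$

$t = \frac{b_2}{\mathrm{se}(b_2)} = \frac{0.1283}{0.0305} = 4.20$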

6.17

Regression Computer Output

6.18

Sources of variation in the dependent variable:

Table 6.3 Analysis of Variance Table

Source        DF   Sum of Squares   Mean Square
Explained     1    25221.2229       25221.2229
Unexplained   38   54311.3314       1429.2455
Total         39   79532.5544
R-square: 0.3171

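Regression Computer Output

$\mathrm{SST} = \sum (y_t - \bar{y})^2 = 79532$

$\mathrm{SSR} = \sum (\hat{y}_t - \bar{y})^2 = 25221$

$\mathrm{SSE} = \sum \hat{e}_t^2 = 54311$

$\mathrm{SSE}/(T-2) = \hat{\sigma}^2 = 1429.2455$

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}} = 0.317$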
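
The figures in Table 6.3 can be re-derived from its sums of squares. This short sketch only repeats the arithmetic on the slide; the sample size T = 40 is inferred from the table's total degrees of freedom (T - 1 = 39).

```python
# Sums of squares as printed in Table 6.3.
SSR = 25221.2229
SSE = 54311.3314
SST = 79532.5544
T = 40                      # inferred from total DF = T - 1 = 39

sigma2_hat = SSE / (T - 2)  # mean square of the unexplained variation
r2_from_ssr = SSR / SST
r2_from_sse = 1 - SSE / SST

print(f"SSR + SSE  = {SSR + SSE:.4f}")
print(f"sigma2_hat = {sigma2_hat:.4f}")
print(f"R-squared  = {r2_from_ssr:.4f}")
```

Small differences in the last printed digit relative to the table are rounding in the published output.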

6.19

Reporting Regression Results

6.20

$\hat{y}_t = 40.7676 + 0.1283\,x_t$
(s.e.)   (22.1387)   (0.0305)

$\hat{y}_t = 40.7676 + 0.1283\,x_t$
(t)      (1.84)      (4.20)

Reporting Regression Results

6.21

$R^2 = 0.317$

This $R^2$ value may seem low but it is
typical in studies involving cross-sectional
data analyzed at the individual or micro level.

A considerably higher $R^2$ value would be
expected in studies involving time-series data
analyzed at an aggregate or macro level.

Effects of Scaling the Data

6.22

Changing the scale of x

$y_t = \beta_1 + \beta_2 x_t + e_t$
$y_t = \beta_1 + (c\beta_2)(x_t/c) + e_t$
$y_t = \beta_1 + \beta_2^* x_t^* + e_t$

where $\beta_2^* = c\beta_2$ and $x_t^* = x_t/c$

The estimated coefficient and standard error change,
but the other statistics are unchanged.

Effects of Scaling the Data

6.23

Changing the scale of y

$y_t = \beta_1 + \beta_2 x_t + e_t$
$y_t/c = (\beta_1/c) + (\beta_2/c) x_t + e_t/c$
$y_t^* = \beta_1^* + \beta_2^* x_t + e_t^*$

where $y_t^* = y_t/c$, $e_t^* = e_t/c$, $\beta_1^* = \beta_1/c$ and $\beta_2^* = \beta_2/c$

All statistics are changed except for
the t-statistics and $R^2$ value.

Effects of Scaling the Data

6.24

Changing the scale of x and y

$y_t = \beta_1 + \beta_2 x_t + e_t$
$y_t/c = (\beta_1/c) + (c\beta_2/c)(x_t/c) + e_t/c$
$y_t^* = \beta_1^* + \beta_2 x_t^* + e_t^*$

where $y_t^* = y_t/c$, $e_t^* = e_t/c$, $\beta_1^* = \beta_1/c$ and $x_t^* = x_t/c$

No change in the $R^2$ or the t-statistics or in regression
results for $\beta_2$, but all other stats change.
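
The effect of rescaling x can be seen numerically. The sketch below uses invented data and an arbitrary scale factor c = 100 (both assumptions for illustration): dividing x by c multiplies the slope estimate and its standard error by c, and leaves the t-statistic as it was.

```python
import math

def slope_with_t(x, y):
    """Return (b2, se(b2), t) for a simple regression of y on x, with t testing beta2 = 0."""
    T = len(x)
    xbar, ybar = sum(x) / T, sum(y) / T
    sxx = sum((xt - xbar) ** 2 for xt in x)
    sxy = sum((xt - xbar) * (yt - ybar) for xt, yt in zip(x, y))
    b2 = sxy / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yt - b1 - b2 * xt) ** 2 for xt, yt in zip(x, y))
    se_b2 = math.sqrt(sse / (T - 2) / sxx)
    return b2, se_b2, b2 / se_b2

# Hypothetical data and scale factor (illustration only).
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
c = 100.0

b2, se_b2, t = slope_with_t(x, y)
b2_star, se_star, t_star = slope_with_t([xt / c for xt in x], y)

print(b2, se_b2, t)
print(b2_star, se_star, t_star)
```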


Functional Forms

6.25

The term linear in a simple


regression model does not mean
a linear relationship between
variables, but a model in which
the parameters enter the model
in a linear way.

Linear vs. Nonlinear

6.26

Linear Statistical Models:

$y_t = \beta_1 + \beta_2 x_t + e_t$
$y_t = \beta_1 + \beta_2 \ln(x_t) + e_t$
$\ln(y_t) = \beta_1 + \beta_2 x_t + e_t$
$y_t = \beta_1 + \beta_2 x_t^2 + e_t$

Nonlinear Statistical Models:

$y_t = \beta_1 + \beta_2 x_t^{\beta_3} + e_t$
$y_t^{\beta_3} = \beta_1 + \beta_2 x_t + e_t$
$y_t = \beta_1 + \beta_2 x_t + \exp(\beta_3 x_t) + e_t$

Linear vs. Nonlinear

6.27

[Figure: nonlinear relationship between food expenditure (vertical axis) and income (horizontal axis).]

Useful Functional Forms

Look at each form and its slope and elasticity:

1. Linear
2. Reciprocal
3. Log-Log
4. Log-Linear
5. Linear-Log
6. Log-Inverse

6.28

Useful Functional Forms

6.29

Linear
$y_t = \beta_1 + \beta_2 x_t + e_t$
slope: $\beta_2$
elasticity: $\beta_2 \frac{x_t}{y_t}$

6.30

Reciprocal
$y_t = \beta_1 + \beta_2 \frac{1}{x_t} + e_t$
slope: $-\beta_2 \frac{1}{x_t^2}$
elasticity: $-\beta_2 \frac{1}{x_t y_t}$

6.31

Log-Log
$\ln(y_t) = \beta_1 + \beta_2 \ln(x_t) + e_t$
slope: $\beta_2 \frac{y_t}{x_t}$
elasticity: $\beta_2$

6.32

Log-Linear
$\ln(y_t) = \beta_1 + \beta_2 x_t + e_t$
slope: $\beta_2 y_t$
elasticity: $\beta_2 x_t$

6.33

Linear-Log
$y_t = \beta_1 + \beta_2 \ln(x_t) + e_t$
slope: $\beta_2 \frac{1}{x_t}$
elasticity: $\beta_2 \frac{1}{y_t}$

Log-Inverse
$\ln(y_t) = \beta_1 - \beta_2 \frac{1}{x_t} + e_t$
slope: $\beta_2 \frac{y_t}{x_t^2}$
elasticity: $\beta_2 \frac{1}{x_t}$

6.34
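
The slope and elasticity expressions for the six forms can be cross-checked against a numerical derivative. This is a sketch under stated assumptions: the error term is set to zero so that each form is a deterministic function of x, and the parameter values and evaluation point are arbitrary choices for illustration.

```python
import math

def functional_forms(b1, b2):
    """Map each form to (y as a function of x, slope formula, elasticity formula)."""
    return {
        "linear":      (lambda x: b1 + b2 * x,
                        lambda x, y: b2,
                        lambda x, y: b2 * x / y),
        "reciprocal":  (lambda x: b1 + b2 / x,
                        lambda x, y: -b2 / x ** 2,
                        lambda x, y: -b2 / (x * y)),
        "log-log":     (lambda x: math.exp(b1 + b2 * math.log(x)),
                        lambda x, y: b2 * y / x,
                        lambda x, y: b2),
        "log-linear":  (lambda x: math.exp(b1 + b2 * x),
                        lambda x, y: b2 * y,
                        lambda x, y: b2 * x),
        "linear-log":  (lambda x: b1 + b2 * math.log(x),
                        lambda x, y: b2 / x,
                        lambda x, y: b2 / y),
        "log-inverse": (lambda x: math.exp(b1 - b2 / x),
                        lambda x, y: b2 * y / x ** 2,
                        lambda x, y: b2 / x),
    }

def numeric_slope(f, x, h=1e-6):
    """Central finite-difference approximation to dy/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 2.0                                   # arbitrary evaluation point
checks = {}
for name, (f, slope, elasticity) in functional_forms(b1=1.0, b2=0.5).items():
    y0 = f(x0)
    checks[name] = (slope(x0, y0), numeric_slope(f, x0), elasticity(x0, y0))
    print(f"{name:12s} slope={checks[name][0]:.6f} numeric={checks[name][1]:.6f} "
          f"elasticity={checks[name][2]:.6f}")
```

In the log-log form the elasticity equals the slope parameter at every point, which is why that form is described as constant elasticity.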

Error Term Properties

1. $E(e_t) = 0$
2. $\mathrm{var}(e_t) = \sigma^2$
3. $\mathrm{cov}(e_i, e_j) = 0$
4. $e_t \sim N(0, \sigma^2)$

6.35

Economic Models

1. Demand Models
2. Supply Models
3. Production Functions
4. Cost Functions
5. Phillips Curve

6.36

Economic Models

6.37

1. Demand Models
* quantity demanded ($y^d$) and price (x)
* constant elasticity

$\ln(y_t^d) = \beta_1 + \beta_2 \ln(x_t) + e_t$

6.38

2. Supply Models
* quantity supplied ($y^s$) and price (x)
* constant elasticity

$\ln(y_t^s) = \beta_1 + \beta_2 \ln(x_t) + e_t$

6.39

3. Production Functions
* output (y) and input (x)
* constant elasticity

Cobb-Douglas Production Function:

$\ln(y_t) = \beta_1 + \beta_2 \ln(x_t) + e_t$

6.40

4a. Cost Functions
* total cost (y) and output (x)

$y_t = \beta_1 + \beta_2 x_t^2 + e_t$

6.41

4b. Cost Functions
* average cost (y/x) and output (x)

$(y_t/x_t) = \beta_1/x_t + \beta_2 x_t + e_t/x_t$

Economic Models

6.42

5. Phillips Curve
nonlinear in both variables and parameters

* wage rate ($w_t$) and time (t)
* unemployment rate, $u_t$

$\%\Delta w_t = \frac{w_t - w_{t-1}}{w_{t-1}} = \gamma\alpha + \gamma\eta \frac{1}{u_t}$

Chapter 7


7.1

The Multiple
Regression Model

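Two Explanatory Variables

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$

The $x_t$'s affect $y_t$ separately:

$\frac{\partial y_t}{\partial x_{t2}} = \beta_2$      $\frac{\partial y_t}{\partial x_{t3}} = \beta_3$

But least squares estimation of $\beta_2$
now depends upon both $x_{t2}$ and $x_{t3}$.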

7.2

Correlated Variables
$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$
yt = output

xt2 = capital

xt3 = labor

Always 5 workers per machine.


If number of workers per machine
is never varied, it becomes impossible
to tell if the machines or the workers
are responsible for changes in output.

7.3

The General Model

7.4

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \dots + \beta_K x_{tK} + e_t$

The parameter $\beta_1$ is the intercept (constant) term.
The variable attached to $\beta_1$ is $x_{t1} = 1$.
Usually, the number of explanatory variables
is said to be K-1 (ignoring $x_{t1} = 1$), while the
number of parameters is K (namely $\beta_1, \dots, \beta_K$).

Statistical Properties of et
1. $E(e_t) = 0$
2. $\mathrm{var}(e_t) = \sigma^2$
3. $\mathrm{cov}(e_t, e_s) = 0$ for $t \neq s$
4. $e_t \sim N(0, \sigma^2)$

7.5

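Statistical Properties of yt

7.6

1. $E(y_t) = \beta_1 + \beta_2 x_{t2} + \dots + \beta_K x_{tK}$
2. $\mathrm{var}(y_t) = \mathrm{var}(e_t) = \sigma^2$
3. $\mathrm{cov}(y_t, y_s) = \mathrm{cov}(e_t, e_s) = 0$ for $t \neq s$
4. $y_t \sim N(\beta_1 + \beta_2 x_{t2} + \dots + \beta_K x_{tK},\ \sigma^2)$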

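Assumptions

7.7

1. $y_t = \beta_1 + \beta_2 x_{t2} + \dots + \beta_K x_{tK} + e_t$
2. $E(y_t) = \beta_1 + \beta_2 x_{t2} + \dots + \beta_K x_{tK}$
3. $\mathrm{var}(y_t) = \mathrm{var}(e_t) = \sigma^2$
4. $\mathrm{cov}(y_t, y_s) = \mathrm{cov}(e_t, e_s) = 0$ for $t \neq s$
5. The values of $x_{tk}$ are not random
6. $y_t \sim N(\beta_1 + \beta_2 x_{t2} + \dots + \beta_K x_{tK},\ \sigma^2)$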

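Least Squares Estimation

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$

$S \equiv S(\beta_1, \beta_2, \beta_3) = \sum_{t=1}^{T} (y_t - \beta_1 - \beta_2 x_{t2} - \beta_3 x_{t3})^2$

Define:

$y_t^* = y_t - \bar{y}$
$x_{t2}^* = x_{t2} - \bar{x}_2$
$x_{t3}^* = x_{t3} - \bar{x}_3$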

7.8

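Least Squares Estimators

$b_1 = \bar{y} - b_2\bar{x}_2 - b_3\bar{x}_3$

$b_2 = \frac{(\sum y_t^* x_{t2}^*)(\sum x_{t3}^{*2}) - (\sum y_t^* x_{t3}^*)(\sum x_{t2}^* x_{t3}^*)}{(\sum x_{t2}^{*2})(\sum x_{t3}^{*2}) - (\sum x_{t2}^* x_{t3}^*)^2}$

$b_3 = \frac{(\sum y_t^* x_{t3}^*)(\sum x_{t2}^{*2}) - (\sum y_t^* x_{t2}^*)(\sum x_{t2}^* x_{t3}^*)}{(\sum x_{t2}^{*2})(\sum x_{t3}^{*2}) - (\sum x_{t2}^* x_{t3}^*)^2}$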
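
The deviation-form estimators for two explanatory variables translate directly into code. In this sketch the data are generated without an error term from assumed coefficients (1, 2, 3), purely so that the estimators can be seen to recover them exactly; with real data the estimates would of course differ from the true values.

```python
def two_regressor_ols(y, x2, x3):
    """Least squares estimates (b1, b2, b3) using deviations from sample means."""
    T = len(y)
    ybar, x2bar, x3bar = sum(y) / T, sum(x2) / T, sum(x3) / T
    ys = [v - ybar for v in y]
    x2s = [v - x2bar for v in x2]
    x3s = [v - x3bar for v in x3]

    s_y2 = sum(a * b for a, b in zip(ys, x2s))    # sum of y* x2*
    s_y3 = sum(a * b for a, b in zip(ys, x3s))    # sum of y* x3*
    s_22 = sum(a * a for a in x2s)                # sum of x2* squared
    s_33 = sum(a * a for a in x3s)                # sum of x3* squared
    s_23 = sum(a * b for a, b in zip(x2s, x3s))   # sum of x2* x3*

    denom = s_22 * s_33 - s_23 ** 2               # zero under exact collinearity
    b2 = (s_y2 * s_33 - s_y3 * s_23) / denom
    b3 = (s_y3 * s_22 - s_y2 * s_23) / denom
    b1 = ybar - b2 * x2bar - b3 * x3bar
    return b1, b2, b3

# Hypothetical regressors; y is built without noise from assumed coefficients 1, 2, 3.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x2, x3)]

b1, b2, b3 = two_regressor_ols(y, x2, x3)
print(b1, b2, b3)
```

The common denominator goes to zero when the two regressors are exactly collinear, which is why no estimates can be produced in that case.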

7.9


Dangers of Extrapolation

7.10

Statistical models generally are good only


within the relevant range. This means
that extending them to extreme data values
outside the range of the original data often
leads to poor and sometimes ridiculous results.
If height is normally distributed and the
normal ranges from minus infinity to plus
infinity, pity the man minus three feet tall.

Error Variance Estimation

7.11

Unbiased estimator of the error variance:

$\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - K}$

Transform to a chi-square distribution:

$\frac{(T - K)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(T-K)}$

Gauss-Markov Theorem

7.12

Under the assumptions of the
multiple regression model, the
ordinary least squares estimators
have the smallest variance of
all linear and unbiased estimators.
This means that the least squares
estimators are the Best Linear
Unbiased Estimators (BLUE).

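Variances

7.13

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$

$\mathrm{var}(b_2) = \frac{\sigma^2}{(1 - r_{23}^2) \sum (x_{t2} - \bar{x}_2)^2}$

$\mathrm{var}(b_3) = \frac{\sigma^2}{(1 - r_{23}^2) \sum (x_{t3} - \bar{x}_3)^2}$

where $r_{23} = \frac{\sum (x_{t2} - \bar{x}_2)(x_{t3} - \bar{x}_3)}{\sqrt{\sum (x_{t2} - \bar{x}_2)^2 \, \sum (x_{t3} - \bar{x}_3)^2}}$

When $r_{23} = 0$ these reduce to the simple regression formulas.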
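
A quick way to see what the correlation term does to var(b2) is to hold everything else fixed and vary the correlation. The numbers below (error variance, spread of the regressor, and the correlation values) are hypothetical and chosen only to show the pattern.

```python
def var_b2_two_regressors(sigma2, sxx2, r23):
    """var(b2) = sigma^2 / ((1 - r23^2) * sum of squared deviations of x2)."""
    return sigma2 / ((1 - r23 ** 2) * sxx2)

# Hypothetical inputs (illustration only).
sigma2, sxx2 = 4.0, 10.0

base = var_b2_two_regressors(sigma2, sxx2, r23=0.0)      # simple regression case
inflated = var_b2_two_regressors(sigma2, sxx2, r23=0.9)  # strongly correlated regressors
inflation_factor = inflated / base

for r in (0.0, 0.5, 0.9, 0.99):
    print(r, var_b2_two_regressors(sigma2, sxx2, r))
```

At a correlation of 0.9 the variance is a little over five times the uncorrelated case, since 1/(1 - 0.81) is about 5.26.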

7.14

Variance Decomposition

The variance of an estimator is smaller when:

1. The error variance, $\sigma^2$, is smaller: $\sigma^2 \to 0$.
2. The sample size, T, is larger: $\sum_{t=1}^{T} (x_{t2} - \bar{x}_2)^2$.
3. The variables' values are more spread out: $(x_{t2} - \bar{x}_2)^2$.
4. The correlation is close to zero: $r_{23}^2 \to 0$.

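7.15

Covariances

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$

$\mathrm{cov}(b_2, b_3) = \frac{-r_{23}\,\sigma^2}{(1 - r_{23}^2) \sqrt{\sum (x_{t2} - \bar{x}_2)^2} \sqrt{\sum (x_{t3} - \bar{x}_3)^2}}$

where $r_{23} = \frac{\sum (x_{t2} - \bar{x}_2)(x_{t3} - \bar{x}_3)}{\sqrt{\sum (x_{t2} - \bar{x}_2)^2 \, \sum (x_{t3} - \bar{x}_3)^2}}$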

Covariance Decomposition

7.16

The covariance between any two estimators
is larger in absolute value when:
1. The error variance, $\sigma^2$, is larger.
2. The sample size, T, is smaller.
3. The values of the variables are less spread out.
4. The correlation, $r_{23}$, is high.

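Var-Cov Matrix

7.17

$y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + e_t$
The least squares estimators b1, b2, and b3
have covariance matrix:

$\mathrm{cov}(b_1, b_2, b_3) = \begin{bmatrix} \mathrm{var}(b_1) & \mathrm{cov}(b_1,b_2) & \mathrm{cov}(b_1,b_3) \\ \mathrm{cov}(b_1,b_2) & \mathrm{var}(b_2) & \mathrm{cov}(b_2,b_3) \\ \mathrm{cov}(b_1,b_3) & \mathrm{cov}(b_2,b_3) & \mathrm{var}(b_3) \end{bmatrix}$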

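Normal

7.18

$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \dots + \beta_K x_{Kt} + e_t$

$y_t \sim N\left[(\beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \dots + \beta_K x_{Kt}),\ \sigma^2\right]$

This implies and is implied by: $e_t \sim N(0, \sigma^2)$

Since $b_k$ is a linear function of the $y_t$'s:

$b_k \sim N\left[\beta_k,\ \mathrm{var}(b_k)\right]$

$z = \frac{b_k - \beta_k}{\sqrt{\mathrm{var}(b_k)}} \sim N(0, 1)$   for k = 1,2,...,K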

Student-t

7.19

Since generally the population variance
of $b_k$, $\mathrm{var}(b_k)$, is unknown, we estimate
it with $\widehat{\mathrm{var}}(b_k)$, which uses $\hat{\sigma}^2$ instead of $\sigma^2$.

$t = \frac{b_k - \beta_k}{\sqrt{\widehat{\mathrm{var}}(b_k)}} = \frac{b_k - \beta_k}{\mathrm{se}(b_k)}$

t has a Student-t distribution with df = (T-K).

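Interval Estimation

7.20

$P\left(-t_c \le \frac{b_k - \beta_k}{\mathrm{se}(b_k)} \le t_c\right) = 1 - \alpha$

$t_c$ is the critical value for (T-K) degrees of freedom
such that $P(t \ge t_c) = \alpha/2$.

$P\left(b_k - t_c\,\mathrm{se}(b_k) \le \beta_k \le b_k + t_c\,\mathrm{se}(b_k)\right) = 1 - \alpha$

Interval endpoints:

$[\, b_k - t_c\,\mathrm{se}(b_k),\ b_k + t_c\,\mathrm{se}(b_k) \,]$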

Chapter 8


8.1

Hypothesis Testing
and
Nonsample Information

Chapter 8: Overview
1. Student-t Tests
2. Goodness-of-Fit
3. F-Tests
4. ANOVA Table
5. Nonsample Information
6. Collinearity
7. Prediction

8.2

Student-t Test

8.3

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \beta_4 X_{t4} + e_t$

Student-t tests can be used to test any linear
combination of the regression coefficients:

H0: $\beta_1 = 0$
H0: $\beta_2 + \beta_3 + \beta_4 = 1$
H0: $3\beta_2 - 7\beta_3 = 21$
H0: $\beta_2 - \beta_3 \le 5$

Every such t-test has exactly T-K degrees of freedom,
where K = number of coefficients estimated (including the intercept).

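One Tail Test

8.4

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \beta_4 X_{t4} + e_t$

H0: $\beta_3 \le 0$
H1: $\beta_3 > 0$

$t = \frac{b_3}{\mathrm{se}(b_3)} \sim t_{(T-K)}$

df = T-K = T-4

[Figure: Student-t density with critical value $t_c$.]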

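Two Tail Test

8.5

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \beta_4 X_{t4} + e_t$

H0: $\beta_2 = 0$
H1: $\beta_2 \neq 0$

$t = \frac{b_2}{\mathrm{se}(b_2)} \sim t_{(T-K)}$

df = T-K = T-4

[Figure: Student-t density with critical values $-t_c$ and $t_c$.]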

Goodness-of-Fit

Coefficient of Determination

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{\sum_{t=1}^{T} (\hat{y}_t - \bar{y})^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$

$0 \le R^2 \le 1$

8.6

Adjusted R-Squared

8.7

Adjusted Coefficient of Determination

Original:

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$

Adjusted:

$\bar{R}^2 = 1 - \frac{\mathrm{SSE}/(T-K)}{\mathrm{SST}/(T-1)}$

Computer Output

Table 8.2 Summary of Least Squares Results

Variable      Coefficient   Std Error   t-value   p-value
constant      104.79        6.48        16.17     0.000
price         -6.642        3.191       -2.081    0.042
advertising   2.984         0.167       17.868    0.000

$t = \frac{b_2}{\mathrm{se}(b_2)} = \frac{-6.642}{3.191} = -2.081$

8.8

Reporting Your Results

8.9

Reporting standard errors:

$\hat{y}_t = 104.79 - 6.642\,X_{t2} + 2.984\,X_{t3}$
(s.e.)   (6.48)   (3.191)   (0.167)

Reporting t-statistics:

$\hat{y}_t = 104.79 - 6.642\,X_{t2} + 2.984\,X_{t3}$
(t)      (16.17)  (-2.081)  (17.868)

Single Restriction F-Test

8.10

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \beta_4 X_{t4} + e_t$

H0: $\beta_2 = 0$
H1: $\beta_2 \neq 0$

dfn = J = 1
dfd = T-K = 49

$F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_U)/J}{\mathrm{SSE}_U/(T-K)} = \frac{(1964.758 - 1805.168)/1}{1805.168/(52 - 3)} = 4.33$

By definition this is the t-statistic squared:

$t = -2.081$,   $F = t^2 = 4.33$

Multiple Restriction F-Test

8.11

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \beta_4 X_{t4} + e_t$

H0: $\beta_2 = 0$, $\beta_4 = 0$
H1: H0 not true

dfn = J = 2
dfd = T-K = 49

$F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_U)/J}{\mathrm{SSE}_U/(T-K)}$

First run the restricted regression by dropping
$X_{t2}$ and $X_{t4}$ to get $\mathrm{SSE}_R$.
Next run the unrestricted regression to get $\mathrm{SSE}_U$.

F-Tests

8.12

F-Tests of this type are always right-tailed,
even for left-sided or two-sided hypotheses,
because any deviation from the null will
make the F value bigger (move rightward).

$F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_U)/J}{\mathrm{SSE}_U/(T-K)}$

[Figure: density f(F) with critical value $F_c$.]

F-Test of Entire Equation

8.13

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + e_t$
We ignore $\beta_1$. Why?

H0: $\beta_2 = \beta_3 = 0$
H1: H0 not true

dfn = J = 2
dfd = T-K = 49
$\alpha = 0.05$

$F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_U)/J}{\mathrm{SSE}_U/(T-K)} = \frac{(13581.35 - 1805.168)/2}{1805.168/(52 - 3)} = 159.828$

$F_c = 3.187$

Reject H0!

8.14

ANOVA Table

Table 8.3 Analysis of Variance Table

Source        DF   Sum of Squares   Mean Square   F-Value
Explained     2    11776.18         5888.09       159.828
Unexplained   49   1805.168         36.84
Total         51   13581.35
p-value: 0.0001

$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{11776.18}{13581.35} = 0.867$
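
The F-statistics and goodness-of-fit figures in this chapter follow from a handful of sums of squares, so they can be re-derived in a few lines. The sketch below uses only numbers printed on the slides; the function name `f_statistic` is a label introduced here for illustration.

```python
def f_statistic(sse_r, sse_u, J, T, K):
    """F = ((SSE_R - SSE_U) / J) / (SSE_U / (T - K))."""
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))

T, K = 52, 3
SSE_U = 1805.168      # unrestricted sum of squared errors
SST = 13581.35        # total sum of squares
SSE_R_PRICE = 1964.758  # restricted SSE when the price coefficient is set to zero

# Overall test: the restricted model keeps only the intercept, so SSE_R = SST.
F_overall = f_statistic(SST, SSE_U, J=2, T=T, K=K)

# Single restriction on the price coefficient; compare with the squared t-value.
F_price = f_statistic(SSE_R_PRICE, SSE_U, J=1, T=T, K=K)
t_price_squared = (-2.081) ** 2

r2 = 1 - SSE_U / SST
adj_r2 = 1 - (SSE_U / (T - K)) / (SST / (T - 1))

print(f"F (entire equation) = {F_overall:.3f}")
print(f"F (price only)      = {F_price:.3f}, t squared = {t_price_squared:.3f}")
print(f"R-squared = {r2:.3f}, adjusted R-squared = {adj_r2:.3f}")
```

The single-restriction F and the squared t-value agree to about two decimal places; the remaining gap comes from rounding in the published figures.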

Nonsample Information

8.15

A certain production process is known to be
Cobb-Douglas with constant returns to scale.

$\ln(y_t) = \beta_1 + \beta_2 \ln(X_{t2}) + \beta_3 \ln(X_{t3}) + \beta_4 \ln(X_{t4}) + e_t$
where $\beta_2 + \beta_3 + \beta_4 = 1$, so that $\beta_4 = (1 - \beta_2 - \beta_3)$

$\ln(y_t/X_{t4}) = \beta_1 + \beta_2 \ln(X_{t2}/X_{t4}) + \beta_3 \ln(X_{t3}/X_{t4}) + e_t$
$y_t^* = \beta_1 + \beta_2 X_{t2}^* + \beta_3 X_{t3}^* + e_t$

Run least squares on the transformed model.
Interpret coefficients same as in original model.

Collinear Variables

8.16

The term "independent variable" means
an explanatory variable is independent
of the error term, but not necessarily
independent of other explanatory variables.
Since economists typically have no control
over the implicit experimental design,
explanatory variables tend to move
together which often makes sorting out
their separate influences rather problematic.

Effects of Collinearity

8.17

A high degree of collinearity will produce:

1. no least squares output when collinearity is exact.
2. large standard errors and wide confidence intervals.
3. insignificant t-values even with high $R^2$ and a
significant F-value.
4. estimates sensitive to deletion or addition of a few
observations or insignificant variables.
5. good within-sample (same proportions) but poor
out-of-sample (different proportions) prediction.

Identifying Collinearity

8.18

Evidence of high collinearity includes:

1. a high pairwise correlation between two
explanatory variables.
2. a high R-squared when regressing one
explanatory variable at a time on each of the
remaining explanatory variables.
3. a statistically significant F-value when the
t-values are statistically insignificant.
4. an R-squared that doesn't fall by much when
dropping any of the explanatory variables.

Mitigating Collinearity

8.19

Since high collinearity is not a violation of
any least squares assumption, but rather a
lack of adequate information in the sample:

1. collect more data with better information.
2. impose economic restrictions as appropriate.
3. impose statistical restrictions when justified.
4. if all else fails at least point out that the poor
model performance might be due to the
collinearity problem (or it might not).

Prediction

8.20

$y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + e_t$
Given a set of values for the explanatory
variables, $(1\ X_{02}\ X_{03})$, the best linear
unbiased predictor of y is given by:

$\hat{y}_0 = b_1 + b_2 X_{02} + b_3 X_{03}$

This predictor is unbiased in the sense
that the average value of the forecast
error is zero.

Chapter 9


9.1

Extensions
of the Multiple
Regression Model

Topics for This Chapter

1. Intercept Dummy Variables
2. Slope Dummy Variables
3. Different Intercepts & Slopes
4. Testing Qualitative Effects
5. Are Two Regressions Equal?
6. Interaction Effects
7. Dummy Dependent Variables

9.2

Intercept Dummy Variables

Dummy variables are binary (0,1)

$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$
$y_t$ = speed of car in miles per hour
$X_t$ = age of car in years
$D_t = 1$ if red car, $D_t = 0$ otherwise.
Police: red cars travel faster.

H0: $\beta_3 = 0$
H1: $\beta_3 > 0$

9.3

Copyright 1996

Lawrence C. Marsh

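$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$

red cars: $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$
other cars: $y_t = \beta_1 + \beta_2 X_t + e_t$

[Figure: speed $y_t$ (miles per hour) against age in years $X_t$; the red-car line has intercept $\beta_1 + \beta_3$ and lies above the line for other cars.]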

9.4

Copyright 1996

Lawrence C. Marsh

Slope Dummy Variables

9.5

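$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t X_t + e_t$
Stock portfolio: $D_t = 1$; bond portfolio: $D_t = 0$

stocks: $y_t = \beta_1 + (\beta_2 + \beta_3) X_t + e_t$
bonds: $y_t = \beta_1 + \beta_2 X_t + e_t$

[Figure: value of portfolio $y_t$ against years $X_t$; both lines start at $\beta_1$ = initial investment; the stocks line has slope $\beta_2 + \beta_3$.]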

Copyright 1996

Lawrence C. Marsh

Different Intercepts & Slopes

9.6

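$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$

miracle seed: $D_t = 1$; regular seed: $D_t = 0$

miracle: $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$
regular: $y_t = \beta_1 + \beta_2 X_t + e_t$

[Figure: harvest weight of corn $y_t$ against rainfall $X_t$; the miracle line has intercept $\beta_1 + \beta_3$ and slope $\beta_2 + \beta_4$; the regular line has intercept $\beta_1$ and slope $\beta_2$.]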
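
A small numerical sketch of the model with both an intercept dummy and a slope dummy, assuming simulated data and NumPy (all numbers and names are illustrative). It also checks a standard property: the fully interacted dummy regression reproduces the coefficients of two separate group regressions.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400
X = rng.uniform(0, 10, T)                    # e.g. rainfall (simulated)
D = (rng.random(T) < 0.5).astype(float)      # 1 = miracle seed, 0 = regular seed
y = 2.0 + 1.0 * X + 3.0 * D + 0.5 * D * X + rng.normal(0, 1, T)

# Pooled regression with intercept dummy D and slope dummy D*X
Z = np.column_stack([np.ones(T), X, D, D * X])
b, *_ = np.linalg.lstsq(Z, y, rcond=None)    # b1, b2, b3, b4

def fit_group(mask):
    """Separate simple regression for one group."""
    A = np.column_stack([np.ones(mask.sum()), X[mask]])
    c, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    return c

c_reg = fit_group(D == 0)                    # estimates of (beta1, beta2)
c_mir = fit_group(D == 1)                    # estimates of (beta1+beta3, beta2+beta4)
print(b, c_reg, c_mir)
```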

Copyright 1996

Lawrence C. Marsh

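$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$
For men $D_t = 1$.
For women $D_t = 0$.

Men: $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$
Women: $y_t = \beta_1 + \beta_2 X_t + e_t$

Testing for discrimination in starting wage:

$H_0: \beta_3 = 0$
$H_1: \beta_3 > 0$

[Figure: wage rate $y_t$ against years of experience $X_t$; parallel lines with slope $\beta_2$, intercept $\beta_1 + \beta_3$ for men and $\beta_1$ for women.]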

9.7

Copyright 1996

Lawrence C. Marsh

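$y_t = \beta_1 + \beta_5 X_t + \beta_6 D_t X_t + e_t$

9.8

For men $D_t = 1$.
For women $D_t = 0$.

Men: $y_t = \beta_1 + (\beta_5 + \beta_6) X_t + e_t$
Women: $y_t = \beta_1 + \beta_5 X_t + e_t$

Men and women have the same starting wage, $\beta_1$, but their wage rates increase at different rates (difference = $\beta_6$).

$\beta_6 > 0$ means that men's wage rates are increasing faster than women's wage rates.

[Figure: wage rate $y_t$ against years of experience $X_t$; both lines start at $\beta_1$; the men's line has slope $\beta_5 + \beta_6$ and the women's line has slope $\beta_5$.]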

Copyright 1996

Lawrence C. Marsh

An Ineffective Affirmative Action Plan

9.9

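$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$

Men: $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$
Women: $y_t = \beta_1 + \beta_2 X_t + e_t$

Women are given a higher starting wage, $\beta_1$, while men get the lower starting wage, $\beta_1 + \beta_3$ (note: $\beta_3 < 0$). But men get a faster rate of increase in their wages, $\beta_2 + \beta_4$, which is higher than the rate of increase for women, $\beta_2$ (since $\beta_4 > 0$).

[Figure: wage rate $y_t$ against years of experience $X_t$; the women's line starts higher at $\beta_1$, the men's line starts at $\beta_1 + \beta_3$ and rises with the steeper slope $\beta_2 + \beta_4$.]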

Copyright 1996

Lawrence C. Marsh

Testing Qualitative Effects

9.10

1. Test for differences in intercept.


2. Test for differences in slope.
3. Test for differences in both
intercept and slope.

Copyright 1996

Lawrence C. Marsh

9.11

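men: $D_t = 1$; women: $D_t = 0$

$Y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$

Testing for discrimination in starting wage (intercept):

$H_0: \beta_3 = 0$ vs. $H_1: \beta_3 > 0$

$\dfrac{b_3}{\sqrt{\text{Est. Var}(b_3)}} \sim t_{n-4}$

Testing for discrimination in wage increases (slope):

$H_0: \beta_4 = 0$ vs. $H_1: \beta_4 > 0$

$\dfrac{b_4}{\sqrt{\text{Est. Var}(b_4)}} \sim t_{n-4}$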

Copyright 1996

Lawrence C. Marsh

9.12

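$H_0: \beta_3 = \beta_4 = 0$
$H_1$: otherwise

Testing intercept and slope:

$\dfrac{(SSE_R - SSE_U)/2}{SSE_U/(T-4)} \sim F_{2,\,T-4}$

$SSE_U = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t - b_3 D_t - b_4 D_t X_t)^2$
and
$SSE_R = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t)^2$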

Copyright 1996

Lawrence C. Marsh

Are Two Regressions Equal?

9.13

Variations of the Chow Test

I. Assuming equal variances (pooling):

men: $D_t = 1$; women: $D_t = 0$

$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$

$H_0: \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise

$y_t$ = wage rate
$X_t$ = years of experience

This model assumes equal wage rate variance.

Copyright 1996

Lawrence C. Marsh

9.14
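II. Allowing for unequal variances:
(running three regressions)

Forcing men and women to have the same $\beta_1$, $\beta_2$:
Everyone: $y_t = \beta_1 + \beta_2 X_t + e_t$, giving $SSE_R$

Allowing men and women to be different:
Men only: $y_{tm} = \beta_{1m} + \beta_{2m} X_{tm} + e_{tm}$, giving $SSE_m$
Women only: $y_{tw} = \beta_{1w} + \beta_{2w} X_{tw} + e_{tw}$, giving $SSE_w$

$F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(T-K)}$

where $SSE_U = SSE_m + SSE_w$
$J$ = number of restrictions ($J = 2$)
$K$ = number of unrestricted coefficients ($K = 4$)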
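
The three-regression version of the test can be sketched as follows, assuming simulated wage data and NumPy. The function names `sse` and `chow_F` and all numerical values are hypothetical choices for the example.

```python
import numpy as np

def sse(X, y):
    """Sum of squared least squares residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r)

def chow_F(x, y, d):
    """F statistic for H0: same intercept and slope in both groups (d is 0/1)."""
    T = len(y)
    const = np.ones(T)
    sse_r = sse(np.column_stack([const, x]), y)            # everyone pooled
    m, w = d == 1, d == 0
    sse_u = (sse(np.column_stack([const[m], x[m]]), y[m])   # men only
             + sse(np.column_stack([const[w], x[w]]), y[w]))  # women only
    J, K = 2, 4
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))

rng = np.random.default_rng(2)                # simulated data, illustrative only
T = 200
x = rng.uniform(0, 20, T)                     # years of experience
d = (rng.random(T) < 0.5).astype(float)       # 1 = men, 0 = women
y = 5.0 + 0.5 * x + 2.0 * d + 0.3 * d * x + rng.normal(0, 1, T)

F = chow_F(x, y, d)
print(F)   # compare with the F(2, T-4) critical value
```

Because the fully interacted dummy regression has the same residuals as the two separate regressions, this F value coincides with the one computed from the pooled dummy-variable model.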

Copyright 1996

Lawrence C. Marsh

9.15

Interaction Variables
1. Interaction Dummies
2. Polynomial Terms
(special case of continuous interaction)
3. Interaction Among Continuous Variables

Copyright 1996

Lawrence C. Marsh

1. Interaction Dummies

9.16

Wage Gap between Men and Women

$y_t$ = wage rate; $X_t$ = experience

For men $M_t = 1$. For women $M_t = 0$.
For black $B_t = 1$. For nonblack $B_t = 0$.

No interaction: wage gap assumed the same:
$y_t = \beta_1 + \beta_2 X_t + \beta_3 M_t + \beta_4 B_t + e_t$

Interaction: wage gap depends on race:
$y_t = \beta_1 + \beta_2 X_t + \beta_3 M_t + \beta_4 B_t + \beta_5 M_t B_t + e_t$

Copyright 1996

Lawrence C. Marsh

9.17

2. Polynomial Terms
Polynomial Regression

$y_t$ = income; $X_t$ = age

Linear in parameters but nonlinear in variables:

$y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$

[Figure: income $y_t$ plotted against age $X_t$, with the age axis marked from 20 to 90.]

People retire at different ages or not at all.

Copyright 1996

Lawrence C. Marsh

Polynomial Regression
$y_t$ = income; $X_t$ = age

$y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$

Rate income is changing as we age:

$\dfrac{\partial y_t}{\partial X_t} = \beta_2 + 2\beta_3 X_t + 3\beta_4 X_t^2$

Slope changes as $X_t$ changes.

9.18

Copyright 1996

Lawrence C. Marsh

3. Continuous Interaction

9.19

Exam grade = f(sleep: $Z_t$, study time: $B_t$)

$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$

Sleep and study time do not act independently.
More study time will be more effective when combined with more sleep and less effective when combined with less sleep.

Copyright 1996

Lawrence C. Marsh

continuous interaction

9.20

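Exam grade = f(sleep: $Z_t$, study time: $B_t$)

$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$

Your studying is more effective with more sleep:

$\dfrac{\partial y_t}{\partial B_t} = \beta_3 + \beta_4 Z_t$

Your mind sorts things out while you sleep (when you have things to sort out):

$\dfrac{\partial y_t}{\partial Z_t} = \beta_2 + \beta_4 B_t$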

Copyright 1996

Lawrence C. Marsh

9.21

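Exam grade = f(sleep: $Z_t$, study time: $B_t$)

If $Z_t + B_t = 24$ hours, then $B_t = (24 - Z_t)$

$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$
$y_t = \beta_1 + \beta_2 Z_t + \beta_3 (24 - Z_t) + \beta_4 Z_t (24 - Z_t) + e_t$
$y_t = (\beta_1 + 24\beta_3) + (\beta_2 - \beta_3 + 24\beta_4) Z_t - \beta_4 Z_t^2 + e_t$
$y_t = \delta_1 + \delta_2 Z_t + \delta_3 Z_t^2 + e_t$

Sleep needed to maximize your exam grade:

$\dfrac{\partial y_t}{\partial Z_t} = \delta_2 + 2\delta_3 Z_t = 0 \quad\Rightarrow\quad Z_t = \dfrac{-\delta_2}{2\delta_3}$

where $\delta_2 > 0$ and $\delta_3 < 0$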

Copyright 1996

Lawrence C. Marsh

Dummy Dependent Variables

9.22

1. Linear Probability Model


2. Probit Model
3. Logit Model

Copyright 1996

Lawrence C. Marsh

Linear Probability Model


$y_i = 1$ if quits job; $y_i = 0$ if does not quit

$y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4} + e_i$

$X_{i2}$ = total hours of work each week
$X_{i3}$ = weekly paycheck
$X_{i4}$ = hourly pay ($X_{i3}$ divided by $X_{i2}$)

9.23

Copyright 1996

Lawrence C. Marsh

Linear Probability Model

9.24

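$y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4} + e_i$

Read predicted values of $y_i$ off the regression line:

$\hat{y}_i = b_1 + b_2 X_{i2} + b_3 X_{i3} + b_4 X_{i4}$

[Figure: fitted line $\hat{y}_i$ against total hours of work each week $X_{i2}$, with the observations lying at $y_i = 1$ and $y_i = 0$.]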

Copyright 1996

Lawrence C. Marsh

Linear Probability Model

Problems with Linear Probability Model:


1. Probability estimates are sometimes
less than zero or greater than one.
2. Heteroskedasticity is present in that
the model generates a nonconstant
error variance.
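
Both problems can be seen in a small simulation. This is a sketch assuming NumPy and an invented data-generating process (a logistic quit probability in weekly hours); none of the numbers come from the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
hours = rng.uniform(10, 70, n)                    # hypothetical X_i2
p_true = 1 / (1 + np.exp(-(hours - 40) / 5))      # assumed true quit probability
y = (rng.random(n) < p_true).astype(float)        # 1 = quits, 0 = does not quit

# Linear probability model fitted by least squares
X = np.column_stack([np.ones(n), hours])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

# Problem 1: share of fitted "probabilities" outside [0, 1]
outside = np.mean((y_hat < 0) | (y_hat > 1))

# Problem 2: for a 0/1 outcome var(e_i) = p_i (1 - p_i), which varies with X_i
p_clip = np.clip(y_hat, 0, 1)
var_hat = p_clip * (1 - p_clip)
print(outside, var_hat.min(), var_hat.max())
```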

9.25

Copyright 1996

Lawrence C. Marsh

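Probit Model

9.26

latent variable, $z_i$:

$z_i = \beta_1 + \beta_2 X_{i2} + \dots$

Normal probability density function:

$f(z_i) = \dfrac{1}{\sqrt{2\pi}}\, e^{-0.5 z_i^2}$

Normal cumulative probability function:

$F(z_i) = P[Z \le z_i] = \displaystyle\int_{-\infty}^{z_i} \dfrac{1}{\sqrt{2\pi}}\, e^{-0.5 u^2}\, du$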

Copyright 1996

Lawrence C. Marsh

9.27

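Probit Model

Since $z_i = \beta_1 + \beta_2 X_{i2} + \dots$, we can substitute in to get

$p_i = P[Z \le \beta_1 + \beta_2 X_{i2}] = F(\beta_1 + \beta_2 X_{i2})$

[Figure: S-shaped curve of $p_i$ against total hours of work each week $X_{i2}$, lying between the observations at $y_i = 0$ and $y_i = 1$.]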

Copyright 1996

Lawrence C. Marsh

9.28

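Logit Model

$p_i$ is the probability of quitting the job.

Define $p_i$:

$p_i = \dfrac{1}{1 + e^{-(\beta_1 + \beta_2 X_{i2} + \dots)}}$

For $\beta_2 > 0$, $p_i$ will approach 1 as $X_{i2} \to +\infty$.
For $\beta_2 > 0$, $p_i$ will approach 0 as $X_{i2} \to -\infty$.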

Copyright 1996

Lawrence C. Marsh

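Logit Model

9.29

$p_i$ is the probability of quitting the job.

$p_i = \dfrac{1}{1 + e^{-(\beta_1 + \beta_2 X_{i2} + \dots)}}$

[Figure: S-shaped curve of $p_i$ against total hours of work each week $X_{i2}$, lying between the observations at $y_i = 0$ and $y_i = 1$.]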

Copyright 1996

Lawrence C. Marsh

Maximum Likelihood

9.30

Maximum likelihood estimation (MLE)


is used to estimate Probit and Logit functions.
The small sample properties of MLE
are not known, but in large samples
MLE is normally distributed, and it is
consistent and asymptotically efficient.
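
One common way to compute the logit maximum likelihood estimates is Newton-Raphson on the log-likelihood. The sketch below assumes NumPy, simulated data, and a hypothetical helper `logit_mle`; regression packages implement the same idea with more safeguards.

```python
import numpy as np

def logit_mle(X, y, iters=25):
    """Newton-Raphson iterations for the logit log-likelihood."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ b))               # p_i at current b
        grad = X.T @ (y - p)                        # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X   # minus the Hessian
        b = b + np.linalg.solve(hess, grad)
    return b

rng = np.random.default_rng(5)                      # simulated data, illustrative
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p = 1 / (1 + np.exp(-(-0.5 + 1.0 * x)))             # assumed true beta = (-0.5, 1.0)
y = (rng.random(n) < p).astype(float)

b_hat = logit_mle(X, y)
p_hat = 1 / (1 + np.exp(-X @ b_hat))
print(b_hat)
```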

Chapter 10

Copyright 1996

Lawrence C. Marsh

10.1

Heteroskedasticity

Copyright 1996

Lawrence C. Marsh

10.2

The Nature of Heteroskedasticity


Heteroskedasticity is a systematic pattern in the errors
where the variances of the errors are not constant.
Ordinary least squares assumes that all observations are
equally reliable.
For efficiency (accurate estimation/prediction)
reweight observations to ensure equal error variance.

Copyright 1996

Lawrence C. Marsh

Regression Model

10.3

$y_t = \beta_1 + \beta_2 x_t + e_t$

zero mean: $E(e_t) = 0$
homoskedasticity: $\mathrm{var}(e_t) = \sigma^2$
nonautocorrelation: $\mathrm{cov}(e_t, e_s) = 0, \; t \ne s$
heteroskedasticity: $\mathrm{var}(e_t) = \sigma_t^2$

Copyright 1996

Lawrence C. Marsh

Homoskedastic pattern of errors


[Figure: scatter plot of consumption $y_t$ against income $x_t$.]

10.4

Copyright 1996

Lawrence C. Marsh

10.5

The Homoskedastic Case


[Figure: densities $f(y_t)$ of consumption $y_t$ drawn at income levels $x_1, x_2, x_3, x_4$ along the income axis $x_t$.]

Copyright 1996

Lawrence C. Marsh

Heteroskedastic pattern of errors

[Figure: scatter plot of consumption $y_t$ against income $x_t$.]

10.6

Copyright 1996

Lawrence C. Marsh

The Heteroskedastic Case


10.7

[Figure: densities $f(y_t)$ of consumption $y_t$ drawn at income levels $x_1, x_2, x_3$ along the income axis $x_t$, from poor people to rich people.]

Copyright 1996

Lawrence C. Marsh

Properties of Least Squares

10.8

1. Least squares still linear and unbiased.
2. Least squares not efficient.
3. Usual formulas give incorrect standard errors for least squares.
4. Confidence intervals and hypothesis tests based on usual standard errors are wrong.

Copyright 1996

Lawrence C. Marsh

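$y_t = \beta_1 + \beta_2 x_t + e_t$

heteroskedasticity: $\mathrm{var}(e_t) = \sigma_t^2$

incorrect formula for least squares variance:

$\mathrm{var}(b_2) = \dfrac{\sigma^2}{\sum (x_t - \bar{x})^2}$

correct formula for least squares variance:

$\mathrm{var}(b_2) = \dfrac{\sum \sigma_t^2 (x_t - \bar{x})^2}{\left[\sum (x_t - \bar{x})^2\right]^2}$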

10.9

Copyright 1996

Lawrence C. Marsh

Hal White's Standard Errors

10.10

White's estimator of the least squares variance:

$\text{est.var}(b_2) = \dfrac{\sum \hat{e}_t^2 (x_t - \bar{x})^2}{\left[\sum (x_t - \bar{x})^2\right]^2}$

In large samples White's standard error (square root of estimated variance) is a correct / accurate / consistent measure.
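
A minimal sketch of White's estimator for the slope in a simple regression, assuming NumPy and simulated data whose error variance is proportional to $x_t$ (an assumption made only for the example). The usual homoskedastic formula is computed alongside for comparison.

```python
import numpy as np

rng = np.random.default_rng(6)
T = 500
x = rng.uniform(1, 10, T)
e = rng.normal(0, 1, T) * np.sqrt(x)          # assumed: var(e_t) = x_t
y = 1.0 + 2.0 * x + e

dx = x - x.mean()
b2 = (dx @ y) / (dx @ dx)                     # least squares slope
b1 = y.mean() - b2 * x.mean()                 # least squares intercept
res = y - b1 - b2 * x                         # residuals e_hat_t

s2 = (res @ res) / (T - 2)
var_usual = s2 / (dx @ dx)                            # usual (homoskedastic) formula
var_white = np.sum(res**2 * dx**2) / (dx @ dx) ** 2   # White's estimator
print(np.sqrt(var_usual), np.sqrt(var_white))         # the two standard errors
```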

Copyright 1996

Lawrence C. Marsh

10.11

Two Types of Heteroskedasticity


1. Proportional Heteroskedasticity.
(continuous function of $x_t$, for example)
2. Partitioned Heteroskedasticity.
(discrete categories/groups)

Copyright 1996

Lawrence C. Marsh

10.12

Proportional Heteroskedasticity
$y_t = \beta_1 + \beta_2 x_t + e_t$

$E(e_t) = 0$
$\mathrm{var}(e_t) = \sigma_t^2$, where $\sigma_t^2 = \sigma^2 x_t$
$\mathrm{cov}(e_t, e_s) = 0, \; t \ne s$

The variance is assumed to be proportional to the value of $x_t$.

Copyright 1996

Lawrence C. Marsh

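std.dev. proportional to $\sqrt{x_t}$

10.13

$y_t = \beta_1 + \beta_2 x_t + e_t$
variance: $\mathrm{var}(e_t) = \sigma_t^2$, where $\sigma_t^2 = \sigma^2 x_t$
standard deviation: $\sigma_t = \sigma \sqrt{x_t}$

To correct for heteroskedasticity divide the model by $\sqrt{x_t}$:

$\dfrac{y_t}{\sqrt{x_t}} = \beta_1 \dfrac{1}{\sqrt{x_t}} + \beta_2 \dfrac{x_t}{\sqrt{x_t}} + \dfrac{e_t}{\sqrt{x_t}}$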

Copyright 1996

Lawrence C. Marsh

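$\dfrac{y_t}{\sqrt{x_t}} = \beta_1 \dfrac{1}{\sqrt{x_t}} + \beta_2 \dfrac{x_t}{\sqrt{x_t}} + \dfrac{e_t}{\sqrt{x_t}}$

10.14

$y_t^* = \beta_1 x_{t1}^* + \beta_2 x_{t2}^* + e_t^*$

$\mathrm{var}(e_t^*) = \mathrm{var}\!\left(\dfrac{e_t}{\sqrt{x_t}}\right) = \dfrac{1}{x_t}\,\mathrm{var}(e_t) = \dfrac{1}{x_t}\,\sigma^2 x_t$

$\mathrm{var}(e_t^*) = \sigma^2$

$e_t$ is heteroskedastic, but $e_t^*$ is homoskedastic.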

Copyright 1996

Lawrence C. Marsh

Generalized Least Squares

10.15

These steps describe weighted least squares:

1. Decide which variable is proportional to the heteroskedasticity ($x_t$ in previous example).
2. Divide all terms in the original model by the square root of that variable (divide by $\sqrt{x_t}$).
3. Run least squares on the transformed model which has new $y_t^*$, $x_{t1}^*$ and $x_{t2}^*$ variables but no intercept.
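
The three steps above can be sketched in a few lines, assuming NumPy and simulated data with error variance proportional to $x_t$ (all values illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)
T = 300
x = rng.uniform(1, 10, T)
y = 1.0 + 2.0 * x + rng.normal(0, 1, T) * np.sqrt(x)   # assumed: var(e_t) = x_t

# Steps 1-2: divide every term by sqrt(x_t)
w = np.sqrt(x)
y_star = y / w
x1_star = 1 / w            # transformed "intercept" variable
x2_star = x / w            # equals sqrt(x_t)

# Step 3: least squares on the transformed model, with no added constant column
Xs = np.column_stack([x1_star, x2_star])
b_gls, *_ = np.linalg.lstsq(Xs, y_star, rcond=None)
print(b_gls)               # estimates of beta1 and beta2
```

The coefficient on `x1_star` is still the estimate of the original intercept $\beta_1$, even though the transformed regression itself has no constant term.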

Copyright 1996

Lawrence C. Marsh

Partitioned Heteroskedasticity

10.16

$y_t = \beta_1 + \beta_2 x_t + e_t, \quad t = 1, \dots, 100$

$y_t$ = bushels per acre of corn
$x_t$ = gallons of water per acre (rain or other)

error variance of field corn: $\mathrm{var}(e_t) = \sigma_1^2, \quad t = 1, \dots, 80$
error variance of sweet corn: $\mathrm{var}(e_t) = \sigma_2^2, \quad t = 81, \dots, 100$

Copyright 1996

Lawrence C. Marsh

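Reweighting Each Group's Observations

10.17

field corn: $y_t = \beta_1 + \beta_2 x_t + e_t$, with $\mathrm{var}(e_t) = \sigma_1^2$, $t = 1, \dots, 80$

$\dfrac{y_t}{\sigma_1} = \beta_1 \dfrac{1}{\sigma_1} + \beta_2 \dfrac{x_t}{\sigma_1} + \dfrac{e_t}{\sigma_1}$

sweet corn: $y_t = \beta_1 + \beta_2 x_t + e_t$, with $\mathrm{var}(e_t) = \sigma_2^2$, $t = 81, \dots, 100$

$\dfrac{y_t}{\sigma_2} = \beta_1 \dfrac{1}{\sigma_2} + \beta_2 \dfrac{x_t}{\sigma_2} + \dfrac{e_t}{\sigma_2}$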

Copyright 1996

Lawrence C. Marsh

10.18

Apply Generalized Least Squares


Run least squares separately on data for each group.

$\hat{\sigma}_1^2$ provides an estimator of $\sigma_1^2$ using the 80 observations on field corn.

$\hat{\sigma}_2^2$ provides an estimator of $\sigma_2^2$ using the 20 observations on sweet corn.

Copyright 1996

Lawrence C. Marsh

10.19

Detecting Heteroskedasticity
Determine existence and nature of heteroskedasticity :

1. Residual Plots provide information on the


exact nature of heteroskedasticity (partitioned
or proportional) to aid in correcting for it.
2. Goldfeld-Quandt Test checks for presence
of heteroskedasticity.

Copyright 1996

Lawrence C. Marsh

Residual Plots

10.20

Plot residuals against one variable at a time


after sorting the data by that variable to try
to find a heteroskedastic pattern in the data.
[Figure: residuals $e_t$ plotted around the zero line against $x_t$.]

Copyright 1996

Lawrence C. Marsh

Goldfeld-Quandt Test

10.21

The Goldfeld-Quandt test can be used to detect


heteroskedasticity in either the proportional case
or for comparing two groups in the discrete case.
For proportional heteroskedasticity, it is first necessary
to determine which variable, such as xt, is proportional
to the error variance. Then sort the data from the
largest to smallest values of that variable.

Copyright 1996

Lawrence C. Marsh

10.22
In the proportional case, drop the middle $r$ observations where $r \approx T/6$, then run separate least squares regressions on the first $T_1$ observations and the last $T_2$ observations.

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 > \sigma_2^2$

Goldfeld-Quandt Test Statistic (use F table):

$GQ = \dfrac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} \sim F_{[T_1 - K_1,\; T_2 - K_2]}$

Small values of GQ support $H_0$ while large values support $H_1$.
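
The procedure can be sketched as follows, assuming NumPy, a simple regression with $K_1 = K_2 = 2$, and simulated data whose error variance rises with $x_t$. The helper names are hypothetical, and the resulting GQ value would still need to be compared with an F table.

```python
import numpy as np

def sigma2_hat(x, y):
    """Estimated error variance SSE/(T - K) from a simple regression."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r) / (len(y) - X.shape[1])

def goldfeld_quandt(x, y):
    """Return GQ and its numerator and denominator degrees of freedom."""
    order = np.argsort(-x)              # sort from largest to smallest x
    x, y = x[order], y[order]
    T = len(y)
    r = T // 6                          # middle observations to drop
    T1 = (T - r) // 2
    first, last = slice(0, T1), slice(T1 + r, T)
    T2 = T - T1 - r
    gq = sigma2_hat(x[first], y[first]) / sigma2_hat(x[last], y[last])
    return gq, T1 - 2, T2 - 2

rng = np.random.default_rng(7)          # simulated data, illustrative only
T = 300
x = rng.uniform(1, 20, T)
y = 1.0 + 2.0 * x + rng.normal(0, 1, T) * np.sqrt(x)

gq, df1, df2 = goldfeld_quandt(x, y)
print(gq, df1, df2)
```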

Copyright 1996

Lawrence C. Marsh

More General Model

10.23

Structure of heteroskedasticity could be more complicated:

$\sigma_t^2 = \sigma^2 \exp\{\alpha_1 z_{t1} + \alpha_2 z_{t2}\}$

$z_{t1}$ and $z_{t2}$ are any observable variables upon which we believe the variance could depend.
Note: The function $\exp\{\cdot\}$ ensures that $\sigma_t^2$ is positive.

Copyright 1996

Lawrence C. Marsh

More General Model

10.24

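$\sigma_t^2 = \sigma^2 \exp\{\alpha_1 z_{t1} + \alpha_2 z_{t2}\}$

$\ln \sigma_t^2 = \ln \sigma^2 + \alpha_1 z_{t1} + \alpha_2 z_{t2}$

$\ln \sigma_t^2 = \alpha_0 + \alpha_1 z_{t1} + \alpha_2 z_{t2}$, where $\alpha_0 = \ln \sigma^2$

$H_0: \alpha_1 = 0, \; \alpha_2 = 0$
$H_1: \alpha_1 \ne 0$ and/or $\alpha_2 \ne 0$

Using the least squares residuals, $\hat{e}_t$, estimate

$\ln \hat{e}_t^2 = \alpha_0 + \alpha_1 z_{t1} + \alpha_2 z_{t2} + \nu_t$

and apply the usual F test.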

Chapter 11

Copyright 1996

Lawrence C. Marsh

11.1

Autocorrelation

Copyright 1996

Lawrence C. Marsh

11.2

The Nature of Autocorrelation


For efficiency (accurate estimation/prediction)
all systematic information needs to be incorporated into the regression model.
Autocorrelation is a systematic pattern in the
errors that can be either attracting (positive)
or repelling (negative) autocorrelation.

11.3

[Figure: three plots of the errors $e_t$ around the zero line against time $t$.
Positive autocorrelation: crosses line not enough (attracting).
No autocorrelation: crosses line randomly.
Negative autocorrelation: crosses line too much (repelling).]

Copyright 1996

Lawrence C. Marsh

Regression Model

11.4

$y_t = \beta_1 + \beta_2 x_t + e_t$

zero mean: $E(e_t) = 0$
homoskedasticity: $\mathrm{var}(e_t) = \sigma^2$
nonautocorrelation: $\mathrm{cov}(e_t, e_s) = 0, \; t \ne s$
autocorrelation: $\mathrm{cov}(e_t, e_s) \ne 0, \; t \ne s$

Copyright 1996

Lawrence C. Marsh

Order of Autocorrelation

11.5

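$y_t = \beta_1 + \beta_2 x_t + e_t$

1st Order: $e_t = \rho e_{t-1} + \nu_t$
2nd Order: $e_t = \rho_1 e_{t-1} + \rho_2 e_{t-2} + \nu_t$
3rd Order: $e_t = \rho_1 e_{t-1} + \rho_2 e_{t-2} + \rho_3 e_{t-3} + \nu_t$

We will assume First Order Autocorrelation:

AR(1): $e_t = \rho e_{t-1} + \nu_t$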

Copyright 1996

Lawrence C. Marsh

First Order Autocorrelation

11.6

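$y_t = \beta_1 + \beta_2 x_t + e_t$
$e_t = \rho e_{t-1} + \nu_t$, where $-1 < \rho < 1$

$E(\nu_t) = 0$
$\mathrm{var}(\nu_t) = \sigma_\nu^2$
$\mathrm{cov}(\nu_t, \nu_s) = 0, \; t \ne s$

These assumptions about $\nu_t$ imply the following about $e_t$:

$E(e_t) = 0$
$\mathrm{var}(e_t) = \sigma_e^2 = \dfrac{\sigma_\nu^2}{1 - \rho^2}$
$\mathrm{cov}(e_t, e_{t-k}) = \sigma_e^2 \rho^k$ for $k > 0$
$\mathrm{corr}(e_t, e_{t-k}) = \rho^k$ for $k > 0$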

Copyright 1996

Lawrence C. Marsh

Autocorrelation creates some Problems for Least Squares:

1. The least squares estimator is still linear and unbiased but it is not efficient.
2. The formulas normally used to compute the least squares standard errors are no longer correct and confidence intervals and hypothesis tests using them will be wrong.

11.7

Copyright 1996

Lawrence C. Marsh

Generalized Least Squares


AR(1): $e_t = \rho e_{t-1} + \nu_t$

$y_t = \beta_1 + \beta_2 x_t + e_t$

Substitute in for $e_t$:

$y_t = \beta_1 + \beta_2 x_t + \rho e_{t-1} + \nu_t$

Now we need to get rid of $e_{t-1}$.

(continued)

11.8

Copyright 1996

Lawrence C. Marsh

11.9

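$y_t = \beta_1 + \beta_2 x_t + \rho e_{t-1} + \nu_t$

$y_t = \beta_1 + \beta_2 x_t + e_t$
$e_t = y_t - \beta_1 - \beta_2 x_t$

Lag the errors once:

$e_{t-1} = y_{t-1} - \beta_1 - \beta_2 x_{t-1}$

$y_t = \beta_1 + \beta_2 x_t + \rho y_{t-1} - \rho\beta_1 - \rho\beta_2 x_{t-1} + \nu_t$

(continued)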

Copyright 1996

Lawrence C. Marsh

11.10

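$y_t = \beta_1 + \beta_2 x_t + \rho y_{t-1} - \rho\beta_1 - \rho\beta_2 x_{t-1} + \nu_t$

$y_t - \rho y_{t-1} = \beta_1 (1 - \rho) + \beta_2 (x_t - \rho x_{t-1}) + \nu_t$

$y_t^* = \beta_1^* + \beta_2 x_{t2}^* + \nu_t$

where
$y_t^* = y_t - \rho y_{t-1}$
$x_{t2}^* = x_t - \rho x_{t-1}$
$\beta_1^* = \beta_1 (1 - \rho)$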

11.11

$y_t^* = \beta_1^* + \beta_2 x_{t2}^* + \nu_t$, where $\beta_1^* = \beta_1 (1 - \rho)$

Problems estimating this model with least squares:

1. One observation is used up in creating the transformed (lagged) variables leaving only $(T - 1)$ observations for estimating the model.
2. The value of $\rho$ is not known. We must find some way to estimate it.

Copyright 1996

Lawrence C. Marsh

11.12

Recovering the 1st Observation

Dropping the 1st observation and applying least squares


is not the best linear unbiased estimation method.
Efficiency is lost because the variance
of the error associated with the 1st observation
is not equal to that of the other errors.
This is a special case of the heteroskedasticity
problem except that here all errors are assumed
to have equal variance except the 1st error.

Copyright 1996

Lawrence C. Marsh

Recovering the 1st Observation

11.13

The 1st observation should fit the original model as:

$y_1 = \beta_1 + \beta_2 x_1 + e_1$

with error variance: $\mathrm{var}(e_1) = \sigma_e^2 = \sigma_\nu^2 / (1 - \rho^2)$.

We could include this as the 1st observation for our estimation procedure but we must first transform it so that it has the same error variance as the other observations.

Note: The other observations all have error variance $\sigma_\nu^2$.

Copyright 1996

Lawrence C. Marsh

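$y_1 = \beta_1 + \beta_2 x_1 + e_1$

11.14

with error variance: $\mathrm{var}(e_1) = \sigma_e^2 = \sigma_\nu^2 / (1 - \rho^2)$.

The other observations all have error variance $\sigma_\nu^2$.

Given any constant $c$: $\mathrm{var}(c e_1) = c^2\, \mathrm{var}(e_1)$.

If $c = \sqrt{1 - \rho^2}$, then

$\mathrm{var}(\sqrt{1 - \rho^2}\, e_1) = (1 - \rho^2)\, \mathrm{var}(e_1) = (1 - \rho^2)\, \sigma_e^2 = (1 - \rho^2)\, \sigma_\nu^2 / (1 - \rho^2) = \sigma_\nu^2$

The transformation $\nu_1 = \sqrt{1 - \rho^2}\, e_1$ has variance $\sigma_\nu^2$.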

Copyright 1996

Lawrence C. Marsh

11.15

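$y_1 = \beta_1 + \beta_2 x_1 + e_1$

Multiply through by $\sqrt{1 - \rho^2}$ to get:

$\sqrt{1 - \rho^2}\, y_1 = \sqrt{1 - \rho^2}\, \beta_1 + \sqrt{1 - \rho^2}\, \beta_2 x_1 + \sqrt{1 - \rho^2}\, e_1$

The transformed error $\nu_1 = \sqrt{1 - \rho^2}\, e_1$ has variance $\sigma_\nu^2$.

This transformed first observation may now be added to the other $(T - 1)$ observations to obtain the fully restored set of $T$ observations.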
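
The full transformation, including the rescaled first observation, is often called the Prais-Winsten transformation. Below is a minimal sketch assuming NumPy and a known value of $\rho$; the function name `ar1_transform` and the toy data are hypothetical.

```python
import numpy as np

def ar1_transform(y, x, rho):
    """Transform (y, 1, x) for AR(1) errors, keeping all T observations."""
    T = len(y)
    c = np.sqrt(1 - rho**2)
    y_s, x1_s, x2_s = np.empty(T), np.empty(T), np.empty(T)
    # First observation: multiply through by sqrt(1 - rho^2)
    y_s[0], x1_s[0], x2_s[0] = c * y[0], c, c * x[0]
    # Remaining observations: quasi-differences
    y_s[1:] = y[1:] - rho * y[:-1]
    x1_s[1:] = 1 - rho
    x2_s[1:] = x[1:] - rho * x[:-1]
    return y_s, x1_s, x2_s

# Toy check with noise-free data: the transformed regression recovers beta exactly
x = np.linspace(0.0, 5.0, 12)
y = 1.0 + 2.0 * x
y_s, x1_s, x2_s = ar1_transform(y, x, rho=0.6)
b, *_ = np.linalg.lstsq(np.column_stack([x1_s, x2_s]), y_s, rcond=None)
print(b)
```

Regressing `y_s` on `x1_s` and `x2_s` without an extra constant makes the coefficient on `x1_s` an estimate of $\beta_1$ itself rather than of $\beta_1(1-\rho)$.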

Copyright 1996

Lawrence C. Marsh

Estimating Unknown $\rho$ Value

11.16

If we had values for the $e_t$'s, we could estimate:

$e_t = \rho e_{t-1} + \nu_t$

First, use least squares to estimate the model:

$y_t = \beta_1 + \beta_2 x_t + e_t$

The residuals from this estimation are:

$\hat{e}_t = y_t - b_1 - b_2 x_t$

Copyright 1996

Lawrence C. Marsh

$\hat{e}_t = y_t - b_1 - b_2 x_t$

11.17

Next, estimate the following by least squares:

$\hat{e}_t = \rho\, \hat{e}_{t-1} + \hat{\nu}_t$

The least squares solution is:

$\hat{\rho} = \dfrac{\sum_{t=2}^{T} \hat{e}_t \hat{e}_{t-1}}{\sum_{t=2}^{T} \hat{e}_{t-1}^2}$

Copyright 1996

Lawrence C. Marsh

Durbin-Watson Test

11.18

$H_0: \rho = 0$ vs. $H_1: \rho \ne 0$, $\rho > 0$, or $\rho < 0$

The Durbin-Watson test statistic, $d$, is:

$d = \dfrac{\sum_{t=2}^{T} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=1}^{T} \hat{e}_t^2}$

Copyright 1996

Lawrence C. Marsh

Testing for Autocorrelation

11.19

The test statistic, $d$, is approximately related to $\hat{\rho}$ as:

$d \approx 2(1 - \hat{\rho})$

When $\hat{\rho} = 0$, the Durbin-Watson statistic is $d \approx 2$.
When $\hat{\rho} = 1$, the Durbin-Watson statistic is $d \approx 0$.

Tables for critical values for $d$ are not always readily available so it is easier to use the p-value that most computer programs provide for $d$.

Reject $H_0$ if p-value < $\alpha$, the significance level.
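
To see the approximate relation between $d$ and $\hat{\rho}$ numerically, here is a short sketch assuming NumPy and simulated AR(1) errors with $\rho = 0.7$ (an illustrative value). It computes $\hat{\rho}$ and $d$ from least squares residuals using the formulas given earlier.

```python
import numpy as np

rng = np.random.default_rng(8)
T, rho = 500, 0.7
v = rng.normal(0, 1, T)
e = np.zeros(T)
e[0] = v[0] / np.sqrt(1 - rho**2)        # start from the stationary variance
for t in range(1, T):
    e[t] = rho * e[t - 1] + v[t]         # AR(1) errors

x = rng.uniform(0, 10, T)
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ b                          # least squares residuals

rho_hat = (res[1:] @ res[:-1]) / (res[:-1] @ res[:-1])
d = np.sum(np.diff(res) ** 2) / (res @ res)
print(rho_hat, d, 2 * (1 - rho_hat))
```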

Copyright 1996

Lawrence C. Marsh

Prediction with AR(1) Errors

11.20

When errors are autocorrelated, the previous period's error may help us predict next period's error.

The best predictor, $\hat{y}_{T+1}$, for next period is:

$\hat{y}_{T+1} = \hat{\beta}_1 + \hat{\beta}_2 x_{T+1} + \hat{\rho}\, \tilde{e}_T$

where $\hat{\beta}_1$ and $\hat{\beta}_2$ are generalized least squares estimates and $\tilde{e}_T$ is given by:

$\tilde{e}_T = y_T - \hat{\beta}_1 - \hat{\beta}_2 x_T$

Copyright 1996

Lawrence C. Marsh

11.21

For $h$ periods ahead, the best predictor is:

$\hat{y}_{T+h} = \hat{\beta}_1 + \hat{\beta}_2 x_{T+h} + \hat{\rho}^{\,h}\, \tilde{e}_T$

Assuming $|\hat{\rho}| < 1$, the influence of $\hat{\rho}^{\,h}\, \tilde{e}_T$ diminishes the further we go into the future (the larger $h$ becomes).

Chapter 12

Copyright 1996

Lawrence C. Marsh

12.1

Pooling
Time-Series and
Cross-Sectional Data

Copyright 1996

Lawrence C. Marsh

Pooling Time and Cross Sections


$y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it}$
for the $i$th firm in the $t$th time period

If left unrestricted, this model requires different equations for each firm in each time period.

12.2

Copyright 1996

Lawrence C. Marsh

12.3

Seemingly Unrelated Regressions


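SUR models impose the restrictions:

$\beta_{1it} = \beta_{1i}$, $\beta_{2it} = \beta_{2i}$, $\beta_{3it} = \beta_{3i}$

$y_{it} = \beta_{1i} + \beta_{2i} x_{2it} + \beta_{3i} x_{3it} + e_{it}$

Each firm gets its own coefficients: $\beta_{1i}$, $\beta_{2i}$ and $\beta_{3i}$, but those coefficients are constant over time.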

Copyright 1996

Lawrence C. Marsh

Two-Equation SUR Model

12.4

The investment expenditures (INV) of General Electric (G) and Westinghouse (W) may be related to their stock market value (V) and actual capital stock (K) as follows:

$INV_{Gt} = \beta_{1G} + \beta_{2G} V_{Gt} + \beta_{3G} K_{Gt} + e_{Gt}$
$INV_{Wt} = \beta_{1W} + \beta_{2W} V_{Wt} + \beta_{3W} K_{Wt} + e_{Wt}$

$i = G, W$; $t = 1, \dots, 20$

Copyright 1996

Lawrence C. Marsh

12.5

Estimating Separate Equations

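We make the usual error term assumptions:

$E(e_{Gt}) = 0$, $E(e_{Wt}) = 0$
$\mathrm{var}(e_{Gt}) = \sigma_G^2$, $\mathrm{var}(e_{Wt}) = \sigma_W^2$
$\mathrm{cov}(e_{Gt}, e_{Gs}) = 0$, $\mathrm{cov}(e_{Wt}, e_{Ws}) = 0$

For now make the assumption of no correlation between the error terms across equations:

$\mathrm{cov}(e_{Gt}, e_{Wt}) = 0$, $\mathrm{cov}(e_{Gt}, e_{Ws}) = 0$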

Copyright 1996

Lawrence C. Marsh

12.6

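homoskedasticity assumption: $\sigma_G^2 = \sigma_W^2$

Dummy variable model assumes that $\sigma_G^2 = \sigma_W^2$:

$INV_t = \beta_{1G} + \delta_1 D_t + \beta_{2G} V_t + \delta_2 D_t V_t + \beta_{3G} K_t + \delta_3 D_t K_t + e_t$

For Westinghouse observations $D_t = 1$; otherwise $D_t = 0$.

$\beta_{1W} = \beta_{1G} + \delta_1$
$\beta_{2W} = \beta_{2G} + \delta_2$
$\beta_{3W} = \beta_{3G} + \delta_3$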

Copyright 1996

Lawrence C. Marsh

12.7

Problem with OLS on Each Equation


The first assumption of the Gauss-Markov
Theorem concerns the model specification.
If the model is not fully and correctly specified
the Gauss-Markov properties might not hold.
Any correlation of error terms across equations
must be part of model specification.

Copyright 1996

Lawrence C. Marsh

Correlated Error Terms

Any correlation between the


dependent variables of two or
more equations that is not due
to their explanatory variables
is by default due to correlated
error terms.

12.8

Copyright 1996

Lawrence C. Marsh

Which of the following models would be likely to produce positively correlated errors and which would produce negatively correlated errors?

12.9

1. Sales of Pepsi vs. sales of Coke.


(uncontrolled factor: outdoor temperature)
2. Investments in bonds vs. investments in stocks.
(uncontrolled factor: computer/appliance sales)
3. Movie admissions vs. Golf Course admissions.
(uncontrolled factor: weather conditions)
4. Sales of butter vs. sales of bread.
(uncontrolled factor: bagels and cream cheese)

Copyright 1996

Lawrence C. Marsh

Joint Estimation of the Equations


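$INV_{Gt} = \beta_{1G} + \beta_{2G} V_{Gt} + \beta_{3G} K_{Gt} + e_{Gt}$
$INV_{Wt} = \beta_{1W} + \beta_{2W} V_{Wt} + \beta_{3W} K_{Wt} + e_{Wt}$

$\mathrm{cov}(e_{Gt}, e_{Wt}) = \sigma_{GW}$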

12.10

Copyright 1996

Lawrence C. Marsh

12.11

Seemingly Unrelated Regressions


When the error terms of two or more equations
are correlated, efficient estimation requires the use
of a Seemingly Unrelated Regressions (SUR)
type estimator to take the correlation into account.
Be sure to use the Seemingly Unrelated Regressions (SUR)
procedure in your regression software program to estimate
any equations that you believe might have correlated errors.

Copyright 1996

Lawrence C. Marsh

Separate vs. Joint Estimation

12.12

SUR will give exactly the same results as estimating


each equation separately with OLS if either or both
of the following two conditions are true:

1. Every equation has exactly the same set of


explanatory variables with exactly the same
values.
2. There is no correlation between the error
terms of any of the equations.

Copyright 1996

Lawrence C. Marsh

12.13

Test for Correlation

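Test the null hypothesis of zero correlation:

$H_0: \sigma_{GW} = 0$

$r_{GW}^2 = \dfrac{\hat{\sigma}_{GW}^2}{\hat{\sigma}_G^2\, \hat{\sigma}_W^2}$

$\lambda = T\, r_{GW}^2 \sim \chi^2_{(1)}$ (asymptotically)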

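12.14

Start with the residuals $\hat{e}_{Gt}$ and $\hat{e}_{Wt}$ from each equation estimated separately.

$\hat{\sigma}_{GW} = \dfrac{1}{T} \sum \hat{e}_{Gt}\, \hat{e}_{Wt}$

$\hat{\sigma}_G^2 = \dfrac{1}{T} \sum \hat{e}_{Gt}^2$

$\hat{\sigma}_W^2 = \dfrac{1}{T} \sum \hat{e}_{Wt}^2$

$r_{GW}^2 = \dfrac{\hat{\sigma}_{GW}^2}{\hat{\sigma}_G^2\, \hat{\sigma}_W^2}$

$\lambda = T\, r_{GW}^2 \sim \chi^2_{(1)}$ (asymptotically)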
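
The test statistic can be computed directly from the two residual series. The sketch below assumes NumPy and simulated two-equation data with a shared shock in the errors; the firm labels, coefficients, and helper names are illustrative, not the General Electric and Westinghouse data.

```python
import numpy as np

def residuals(X, y):
    """Least squares residuals from regressing y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

def corr_test(eG, eW):
    """lambda = T * r_GW^2, asymptotically chi-square with 1 d.f. under H0."""
    T = len(eG)
    s_GW = (eG @ eW) / T
    s_G2 = (eG @ eG) / T
    s_W2 = (eW @ eW) / T
    r2 = s_GW**2 / (s_G2 * s_W2)
    return T * r2

rng = np.random.default_rng(9)            # simulated data, illustrative only
T = 200
u = rng.normal(size=T)                    # shock common to both equations
xG, xW = rng.normal(size=T), rng.normal(size=T)
yG = 1.0 + 2.0 * xG + u + 0.5 * rng.normal(size=T)
yW = 0.5 + 1.0 * xW + u + 0.5 * rng.normal(size=T)

eG = residuals(np.column_stack([np.ones(T), xG]), yG)
eW = residuals(np.column_stack([np.ones(T), xW]), yW)
lam = corr_test(eG, eW)
print(lam)    # compare with 3.84, the 5% critical value of chi-square(1)
```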

Copyright 1996

Lawrence C. Marsh

Fixed Effects Model

12.15

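$y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it}$

Fixed effects models impose the restrictions:

$\beta_{1it} = \beta_{1i}$, $\beta_{2it} = \beta_2$, $\beta_{3it} = \beta_3$

For each $i$th cross section in the $t$th time period:

$y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}$

Each $i$th cross-section has its own constant intercept $\beta_{1i}$.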

Copyright 1996

Lawrence C. Marsh

The Fixed Effects Model is conveniently represented using dummy variables:

12.16

$D_{1i} = 1$ if North, $D_{1i} = 0$ if not North
$D_{2i} = 1$ if East, $D_{2i} = 0$ if not East
$D_{3i} = 1$ if South, $D_{3i} = 0$ if not South
$D_{4i} = 1$ if West, $D_{4i} = 0$ if not West

$y_{it} = \beta_{11} D_{1i} + \beta_{12} D_{2i} + \beta_{13} D_{3i} + \beta_{14} D_{4i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}$

$y_{it}$ = millions of bushels of corn produced
$x_{2it}$ = price of corn in dollars per bushel
$x_{3it}$ = price of soybeans in dollars per bushel

Each cross-sectional unit gets its own intercept, but each cross-sectional intercept is constant over time.

Copyright 1996

Lawrence C. Marsh

Test for Equality of Fixed Effects

12.17

$H_0: \beta_{11} = \beta_{12} = \beta_{13} = \beta_{14}$
$H_1$: $H_0$ not true

The $H_0$ joint null hypothesis may be tested with the F-statistic:

$F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(NT - K)} \sim F_{J,\,(NT - K)}$

$SSE_R$ is the restricted error sum of squares (one intercept)
$SSE_U$ is the unrestricted error sum of squares (four intercepts)
$N$ is the number of cross-sectional units ($N = 4$)
$K$ is the number of parameters in the model ($K = 6$)
$J$ is the number of restrictions being tested ($J = N - 1 = 3$)
$T$ is the number of time periods

Copyright 1996

Lawrence C. Marsh

Random Effects Model

12.18

$y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}$

$\beta_{1i} = \bar{\beta}_1 + \mu_i$

$\bar{\beta}_1$ is the population mean intercept.
$\mu_i$ is an unobservable random error that accounts for the cross-sectional differences.

Copyright 1996

Lawrence C. Marsh

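Random Intercept Term

12.19

$\beta_{1i} = \bar{\beta}_1 + \mu_i$, where $i = 1, \dots, N$

The $\mu_i$ are independent of one another and of $e_{it}$.

$E(\mu_i) = 0$, $\mathrm{var}(\mu_i) = \sigma_\mu^2$

Consequently,

$E(\beta_{1i}) = \bar{\beta}_1$, $\mathrm{var}(\beta_{1i}) = \sigma_\mu^2$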

Copyright 1996

Lawrence C. Marsh

Random Effects Model

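$y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}$
$y_{it} = (\bar{\beta}_1 + \mu_i) + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it}$
$y_{it} = \bar{\beta}_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + (\mu_i + e_{it})$
$y_{it} = \bar{\beta}_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + \nu_{it}$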

12.20

Copyright 1996

Lawrence C. Marsh

12.21

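$y_{it} = \bar{\beta}_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + \nu_{it}$

$\nu_{it} = (\mu_i + e_{it})$

$\nu_{it}$ has zero mean: $E(\nu_{it}) = 0$
$\nu_{it}$ is homoskedastic: $\mathrm{var}(\nu_{it}) = \sigma_\mu^2 + \sigma_e^2$

The errors from the same firm in different time periods are correlated:

$\mathrm{cov}(\nu_{it}, \nu_{is}) = \sigma_\mu^2, \; t \ne s$

The errors from different firms are always uncorrelated:

$\mathrm{cov}(\nu_{it}, \nu_{js}) = 0, \; i \ne j$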

Chapter 13

Copyright 1996

Lawrence C. Marsh

13.1

Simultaneous
Equations
Models

Copyright 1996

Lawrence C. Marsh

Keynesian Macro Model

Assumptions of Simple Keynesian Model


1. Consumption, c, is function of income, y.
2. Total expenditures = consumption + investment.
3. Investment assumed independent of income.

13.2

Copyright 1996

Lawrence C. Marsh

The Structural Equations

consumption is a function of income:

$c = \beta_1 + \beta_2 y$

income is either consumed or invested:

$y = c + i$

13.3

Copyright 1996

Lawrence C. Marsh

The Statistical Model


The consumption equation:

$c_t = \beta_1 + \beta_2 y_t + e_t$

The income identity:

$y_t = c_t + i_t$

13.4

Copyright 1996

Lawrence C. Marsh

The Simultaneous Nature of Simultaneous Equations

13.5

$c_t = \beta_1 + \beta_2 y_t + e_t$

$y_t = c_t + i_t$

Since $y_t$ contains $e_t$ they are correlated.

Copyright 1996

Lawrence C. Marsh

The Failure of Least Squares


The least squares estimators of parameters in a structural simultaneous equation are biased and inconsistent because of the correlation between the random error and the endogenous variables on the right-hand side of the equation.

13.6

Copyright 1996

Lawrence C. Marsh

13.7

Single vs. Simultaneous Equations


[Diagram comparing the two cases.
Single Equation: variables $y_t$, $e_t$ and $c_t$.
Simultaneous Equations: variables $y_t$, $e_t$, $c_t$ and $i_t$.]

Copyright 1996

Lawrence C. Marsh

Deriving the Reduced Form


$c_t = \beta_1 + \beta_2 y_t + e_t$
$y_t = c_t + i_t$

$c_t = \beta_1 + \beta_2 (c_t + i_t) + e_t$
$(1 - \beta_2)\, c_t = \beta_1 + \beta_2 i_t + e_t$

13.8

Copyright 1996

Lawrence C. Marsh

Deriving the Reduced Form

13.9

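$(1 - \beta_2)\, c_t = \beta_1 + \beta_2 i_t + e_t$

$c_t = \dfrac{\beta_1}{(1 - \beta_2)} + \dfrac{\beta_2}{(1 - \beta_2)}\, i_t + \dfrac{1}{(1 - \beta_2)}\, e_t$

$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$

The Reduced Form Equation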

Copyright 1996

Lawrence C. Marsh

13.10

Reduced Form Equation

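$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$

$\pi_{11} = \dfrac{\beta_1}{(1 - \beta_2)}$, $\quad \pi_{21} = \dfrac{\beta_2}{(1 - \beta_2)}$

and $\nu_t = \dfrac{1}{(1 - \beta_2)}\, e_t$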

Copyright 1996

Lawrence C. Marsh

$y_t = c_t + i_t$, where $c_t = \pi_{11} + \pi_{21} i_t + \nu_t$

$y_t = \pi_{11} + (1 + \pi_{21})\, i_t + \nu_t$

It is sometimes useful to give this equation its own reduced form parameters as follows:

$y_t = \pi_{12} + \pi_{22} i_t + \nu_t$

13.11

Copyright 1996

Lawrence C. Marsh

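$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$
$y_t = \pi_{12} + \pi_{22} i_t + \nu_t$

13.12

Since $c_t$ and $y_t$ are related through the identity $y_t = c_t + i_t$, the error term, $\nu_t$, of these two equations is the same, and it is easy to show that:

$\pi_{11} = \pi_{12} = \dfrac{\beta_1}{(1 - \beta_2)}$

$\pi_{22} = (1 + \pi_{21}) = \dfrac{1}{(1 - \beta_2)}$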

Copyright 1996

Lawrence C. Marsh

13.13

Identification
The structural parameters are $\beta_1$ and $\beta_2$.

The reduced form parameters are $\pi_{11}$ and $\pi_{21}$.

Once the reduced form parameters are estimated, the identification problem is to determine if the original structural parameters can be expressed uniquely in terms of the reduced form parameters.

$\hat{\beta}_1 = \dfrac{\hat{\pi}_{11}}{(1 + \hat{\pi}_{21})}$

$\hat{\beta}_2 = \dfrac{\hat{\pi}_{21}}{(1 + \hat{\pi}_{21})}$
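
Solving $\pi_{21} = \beta_2/(1-\beta_2)$ and $\pi_{11} = \beta_1/(1-\beta_2)$ for the structural parameters gives $\beta_2 = \pi_{21}/(1+\pi_{21})$ and $\beta_1 = \pi_{11}/(1+\pi_{21})$. The sketch below, assuming NumPy and a simulated Keynesian model with $\beta_1 = 10$ and $\beta_2 = 0.8$ (values chosen only for illustration), recovers the structural parameters from the estimated reduced form and compares the result with least squares applied directly to the consumption equation.

```python
import numpy as np

rng = np.random.default_rng(10)
T = 2000
beta1, beta2 = 10.0, 0.8                     # assumed structural parameters
i = rng.uniform(10, 20, T)                   # exogenous investment
e = rng.normal(0, 1, T)

c = (beta1 + beta2 * i + e) / (1 - beta2)    # reduced form for consumption
y = c + i                                    # income identity

def ols(X, z):
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return b

# Least squares on the structural equation c_t = beta1 + beta2 y_t + e_t
b_ols = ols(np.column_stack([np.ones(T), y]), c)

# Reduced form c_t = pi11 + pi21 i_t + nu_t, then solve back
pi11, pi21 = ols(np.column_stack([np.ones(T), i]), c)
b2_ils = pi21 / (1 + pi21)
b1_ils = pi11 / (1 + pi21)
print(b_ols, (b1_ils, b2_ils))
```

In this simulation the direct least squares slope is pulled above 0.8 because $y_t$ is positively correlated with $e_t$, while the estimate recovered from the reduced form stays close to the assumed value.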

Copyright 1996

Lawrence C. Marsh

Identification

13.14

An equation is under-identified if its structural (behavioral) parameters cannot be expressed in terms of the reduced form parameters.

An equation is exactly identified if its structural (behavioral) parameters can be uniquely expressed in terms of the reduced form parameters.

An equation is over-identified if there is more than one solution for expressing its structural (behavioral) parameters in terms of the reduced form parameters.

Copyright 1996

Lawrence C. Marsh

The Identification Problem

13.15

A system of $M$ equations containing $M$ endogenous variables must exclude at least $M - 1$ variables from a given equation in order for the parameters of that equation to be identified and to be able to be consistently estimated.

Two Stage Least Squares

13.16

$y_{t1} = \beta_1 + \beta_2 y_{t2} + \beta_3 x_{t1} + e_{t1}$
$y_{t2} = \alpha_1 + \alpha_2 y_{t1} + \alpha_3 x_{t2} + e_{t2}$

Problem: right-hand endogenous variables
$y_{t2}$ and $y_{t1}$ are correlated with the error terms.

13.17

Problem: right-hand endogenous variables
$y_{t2}$ and $y_{t1}$ are correlated with the error terms.

Solution: First, derive the reduced form equations.

$y_{t1} = \beta_1 + \beta_2 y_{t2} + \beta_3 x_{t1} + e_{t1}$
$y_{t2} = \alpha_1 + \alpha_2 y_{t1} + \alpha_3 x_{t2} + e_{t2}$

Solve two equations for two unknowns, $y_{t1}$, $y_{t2}$:

$y_{t1} = \pi_{11} + \pi_{21} x_{t1} + \pi_{31} x_{t2} + \nu_{t1}$
$y_{t2} = \pi_{12} + \pi_{22} x_{t1} + \pi_{32} x_{t2} + \nu_{t2}$

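2SLS: Stage I

13.18

$y_{t1} = \pi_{11} + \pi_{21} x_{t1} + \pi_{31} x_{t2} + \nu_{t1}$
$y_{t2} = \pi_{12} + \pi_{22} x_{t1} + \pi_{32} x_{t2} + \nu_{t2}$

Use least squares to get fitted values:

$\hat y_{t1} = \hat\pi_{11} + \hat\pi_{21} x_{t1} + \hat\pi_{31} x_{t2}$
$\hat y_{t2} = \hat\pi_{12} + \hat\pi_{22} x_{t1} + \hat\pi_{32} x_{t2}$

$y_{t1} = \hat y_{t1} + \hat\nu_{t1}$
$y_{t2} = \hat y_{t2} + \hat\nu_{t2}$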

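2SLS: Stage II

13.19

Substitute $y_{t1} = \hat y_{t1} + \hat\nu_{t1}$ and $y_{t2} = \hat y_{t2} + \hat\nu_{t2}$
for $y_{t1}$, $y_{t2}$ on the right-hand side of:

$y_{t1} = \beta_1 + \beta_2 y_{t2} + \beta_3 x_{t1} + e_{t1}$
$y_{t2} = \alpha_1 + \alpha_2 y_{t1} + \alpha_3 x_{t2} + e_{t2}$

$y_{t1} = \beta_1 + \beta_2 (\hat y_{t2} + \hat\nu_{t2}) + \beta_3 x_{t1} + e_{t1}$
$y_{t2} = \alpha_1 + \alpha_2 (\hat y_{t1} + \hat\nu_{t1}) + \alpha_3 x_{t2} + e_{t2}$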

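2SLS: Stage II (continued)

13.20

$y_{t1} = \beta_1 + \beta_2 \hat y_{t2} + \beta_3 x_{t1} + u_{t1}$
$y_{t2} = \alpha_1 + \alpha_2 \hat y_{t1} + \alpha_3 x_{t2} + u_{t2}$

where $u_{t1} = \beta_2 \hat\nu_{t2} + e_{t1}$ and $u_{t2} = \alpha_2 \hat\nu_{t1} + e_{t2}$

Run least squares on each of the above equations
to get 2SLS estimates:

$\tilde\beta_1, \tilde\beta_2, \tilde\beta_3, \tilde\alpha_1, \tilde\alpha_2$ and $\tilde\alpha_3$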
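The two stages can be sketched numerically for the first structural equation. This is a minimal simulation under assumed parameter values, using NumPy; the variable names, the sample size and the "true" coefficients are all hypothetical and chosen only to make the two stages visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical "true" structural parameters (assumptions for this simulation).
b1, b2, b3 = 1.0, 0.5, 1.0      # equation for y1
a1, a2, a3 = 2.0, 0.3, 1.5      # equation for y2

x1, x2 = rng.normal(size=n), rng.normal(size=n)
e1, e2 = rng.normal(size=n), rng.normal(size=n)

# Solve the two structural equations for y1 and y2 (the reduced form).
y1 = (b1 + b2 * (a1 + a3 * x2 + e2) + b3 * x1 + e1) / (1.0 - b2 * a2)
y2 = a1 + a2 * y1 + a3 * x2 + e2


def ols(X, y):
    """Least squares coefficients for y = X b + error."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)
Z = np.column_stack([const, x1, x2])          # all exogenous variables

# Stage I: fitted values of the right-hand endogenous variable y2.
y2_hat = Z @ ols(Z, y2)

# Stage II: replace y2 by its fitted value in the y1 equation.
beta_2sls = ols(np.column_stack([const, y2_hat, x1]), y1)

# Ordinary least squares on the structural equation, for comparison.
beta_ols = ols(np.column_stack([const, y2, x1]), y1)

print("2SLS:", beta_2sls)
print("OLS: ", beta_ols)
```

In a run like this the 2SLS coefficient on the endogenous regressor should land near the assumed value 0.5, while the OLS coefficient is pulled away from it by the correlation between the regressor and the error term. Note that the standard errors printed by a plain second-stage regression are not the correct 2SLS standard errors.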

Chapter 14

14.1

Nonlinear Least Squares

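Review of Least Squares Principle

14.2

(minimize the sum of squared errors)

(A.) Regression model with only an intercept term:

$y_t = \alpha + e_t$
$e_t = y_t - \alpha$
$\sum e_t^2 = \sum (y_t - \alpha)^2$
$SSE = \sum (y_t - \alpha)^2$

$\frac{\partial SSE}{\partial \alpha} = -2 \sum (y_t - \hat\alpha) = 0$
$\sum y_t - \sum \hat\alpha = 0$
$\sum y_t - T \hat\alpha = 0$

Yields an exact analytical solution:

$\hat\alpha = \frac{1}{T} \sum y_t = \bar y$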

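Review of Least Squares

14.3

(B.) Regression model without an intercept term:

$y_t = \beta x_t + e_t$
$e_t = y_t - \beta x_t$
$\sum e_t^2 = \sum (y_t - \beta x_t)^2$
$SSE = \sum (y_t - \beta x_t)^2$

$\frac{\partial SSE}{\partial \beta} = -2 \sum x_t (y_t - \hat\beta x_t) = 0$
$\sum x_t y_t - \hat\beta \sum x_t^2 = 0$
$\hat\beta \sum x_t^2 = \sum x_t y_t$

This yields an exact analytical solution:

$\hat\beta = \frac{\sum x_t y_t}{\sum x_t^2}$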

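Review of Least Squares

14.4

(C.) Regression model with both an intercept and a slope:

$y_t = \alpha + \beta x_t + e_t$
$SSE = \sum (y_t - \alpha - \beta x_t)^2$

$\frac{\partial SSE}{\partial \alpha} = -2 \sum (y_t - \hat\alpha - \hat\beta x_t) = 0$
$\frac{\partial SSE}{\partial \beta} = -2 \sum x_t (y_t - \hat\alpha - \hat\beta x_t) = 0$

$\bar y - \hat\alpha - \hat\beta \bar x = 0$
$\sum x_t y_t - \hat\alpha \sum x_t - \hat\beta \sum x_t^2 = 0$

This yields an exact analytical solution:

$\hat\alpha = \bar y - \hat\beta \bar x$

$\hat\beta = \frac{\sum (x_t - \bar x)(y_t - \bar y)}{\sum (x_t - \bar x)^2}$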

Nonlinear Least Squares

14.5

(D.) Nonlinear Regression model:

$y_t = x_t^{\beta} + e_t$
$SSE = \sum (y_t - x_t^{\beta})^2$

PROBLEM: An exact analytical solution to this does not exist.

$\frac{\partial SSE}{\partial \beta} = -2 \sum x_t^{\hat\beta} \ln(x_t) (y_t - x_t^{\hat\beta}) = 0$
$\sum [x_t^{\hat\beta} \ln(x_t)\, y_t] - \sum [x_t^{2\hat\beta} \ln(x_t)] = 0$

Must use numerical search algorithm to
try to find value of $\beta$ to satisfy this.
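The simplest numerical search is a grid search over candidate exponents. The sketch below assumes the model $y_t = x_t^{\beta} + e_t$ and uses noise-free illustrative data generated with an exponent of 1.5; the data, the grid and the function name `sse` are assumptions made for this example only.

```python
def sse(beta, x, y):
    """Sum of squared errors for the model y_t = x_t**beta + e_t."""
    return sum((yt - xt ** beta) ** 2 for xt, yt in zip(x, y))


# Illustrative, noise-free data generated with beta = 1.5 (an assumption).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xt ** 1.5 for xt in x]

# Crude numerical search: evaluate SSE on a grid from 0.00 to 3.00.
grid = [i / 100 for i in range(301)]
best_beta = min(grid, key=lambda b: sse(b, x, y))
print(best_beta, sse(best_beta, x, y))
```

A grid search is easy to follow but scales poorly with the number of parameters, which is one reason derivative-based methods are usually preferred in practice.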

14.6

Find Minimum of Nonlinear SSE

[Figure: curve of $SSE = \sum (y_t - x_t^{\beta})^2$; vertical axis SSE.]

Conclusion
The least squares principle
is still appropriate when the
model is nonlinear, but it is
harder to find the solution.

14.7

Optional Appendix
Nonlinear least squares
optimization methods:

The Gauss-Newton Method

14.8

14.9

The Gauss-Newton Algorithm


1. Apply the Taylor Series Expansion to the
nonlinear model around some initial b(o).
2. Run Ordinary Least Squares (OLS) on the linear
part of the Taylor Series to get b(m).
3. Perform a Taylor Series around the new b(m) to
get b(m+1) .
4. Relabel b(m+1) as b(m) and rerun steps 2.-4.
5. Stop when (b(m+1) - b(m)) becomes very small.

The Gauss-Newton Method

14.10

$y_t = f(X_t, b) + \epsilon_t$ for $t = 1, \ldots, n$.

Do a Taylor Series Expansion around the vector $b = b_{(o)}$ as follows:

$f(X_t,b) = f(X_t,b_{(o)}) + f'(X_t,b_{(o)})(b - b_{(o)}) + \tfrac{1}{2}(b - b_{(o)})^T f''(X_t,b_{(o)})(b - b_{(o)}) + R_t$

$y_t = f(X_t,b_{(o)}) + f'(X_t,b_{(o)})(b - b_{(o)}) + \epsilon_t^*$

where $\epsilon_t^* \equiv \tfrac{1}{2}(b - b_{(o)})^T f''(X_t,b_{(o)})(b - b_{(o)}) + R_t + \epsilon_t$

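14.11

$y_t = f(X_t,b_{(o)}) + f'(X_t,b_{(o)})(b - b_{(o)}) + \epsilon_t^*$

$y_t - f(X_t,b_{(o)}) = f'(X_t,b_{(o)})\, b - f'(X_t,b_{(o)})\, b_{(o)} + \epsilon_t^*$

$y_t - f(X_t,b_{(o)}) + f'(X_t,b_{(o)})\, b_{(o)} = f'(X_t,b_{(o)})\, b + \epsilon_t^*$

$y_{t(o)}^* = f'(X_t,b_{(o)})\, b + \epsilon_t^*$

This is linear in $b$.

where $y_{t(o)}^* \equiv y_t - f(X_t,b_{(o)}) + f'(X_t,b_{(o)})\, b_{(o)}$

Gauss-Newton just runs OLS on this
transformed truncated Taylor series.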

14.12

Gauss-Newton just runs OLS on this
transformed truncated Taylor series.

$y_{t(o)}^* = f'(X_t,b_{(o)})\, b + \epsilon_t^*$ for $t = 1, \ldots, n$

or, in matrix terms,

$y_{(o)}^* = f'(X,b_{(o)})\, b + \epsilon^*$

$\hat b = [f'(X,b_{(o)})^T f'(X,b_{(o)})]^{-1} f'(X,b_{(o)})^T y_{(o)}^*$

This is analogous to linear OLS where $y = Xb + \epsilon$ led to the solution
$\hat b = (X^T X)^{-1} X^T y$, except that $X$ is replaced with the matrix of first
partial derivatives, $f'(X,b_{(o)})$, and $y$ is replaced by $y_{(o)}^*$
(i.e. "y" = $y_{(o)}^*$ and "X" = $f'(X,b_{(o)})$).

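14.13

Recall that: $y_{(o)}^* \equiv y - f(X,b_{(o)}) + f'(X,b_{(o)})\, b_{(o)}$

Now define: $y_{(o)}^{**} \equiv y - f(X,b_{(o)})$

Therefore: $y_{(o)}^* = y_{(o)}^{**} + f'(X,b_{(o)})\, b_{(o)}$

Now substitute in for $y_{(o)}^*$ in the Gauss-Newton solution:

$\hat b = [f'(X,b_{(o)})^T f'(X,b_{(o)})]^{-1} f'(X,b_{(o)})^T y_{(o)}^*$

to get:

$\hat b = b_{(o)} + [f'(X,b_{(o)})^T f'(X,b_{(o)})]^{-1} f'(X,b_{(o)})^T y_{(o)}^{**}$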

14.14

$\hat b = b_{(o)} + [f'(X,b_{(o)})^T f'(X,b_{(o)})]^{-1} f'(X,b_{(o)})^T y_{(o)}^{**}$

Now call this $\hat b$ value $b_{(1)}$ as follows:

$b_{(1)} = b_{(o)} + [f'(X,b_{(o)})^T f'(X,b_{(o)})]^{-1} f'(X,b_{(o)})^T y_{(o)}^{**}$

More generally, in going from iteration m to
iteration (m+1) we obtain the general expression:

$b_{(m+1)} = b_{(m)} + [f'(X,b_{(m)})^T f'(X,b_{(m)})]^{-1} f'(X,b_{(m)})^T y_{(m)}^{**}$

14.15

Thus, the Gauss-Newton (nonlinear OLS) solution
can be expressed in two alternative, but equivalent,
forms:

1. replacement form:
$b_{(m+1)} = [f'(X,b_{(m)})^T f'(X,b_{(m)})]^{-1} f'(X,b_{(m)})^T y_{(m)}^*$

2. updating form:
$b_{(m+1)} = b_{(m)} + [f'(X,b_{(m)})^T f'(X,b_{(m)})]^{-1} f'(X,b_{(m)})^T y_{(m)}^{**}$
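The updating form can be sketched for a one-parameter model, where the matrix products reduce to simple sums. The example assumes the model $y_t = x_t^{b} + e_t$ with derivative $x_t^{b} \ln(x_t)$; the data are noise-free and generated with an assumed exponent of 1.5, and the function name, starting value and tolerance are illustrative choices.

```python
import math


def gauss_newton(x, y, b0, tol=1e-10, max_iter=50):
    """Gauss-Newton for the one-parameter model y_t = x_t**b + e_t.

    Uses the updating form b(m+1) = b(m) + (f'^T f')^-1 f'^T (y - f),
    where f' is the derivative of x_t**b with respect to b.
    """
    b = b0
    for _ in range(max_iter):
        f = [xt ** b for xt in x]                             # f(X, b(m))
        fprime = [ft * math.log(xt) for ft, xt in zip(f, x)]  # df/db at b(m)
        resid = [yt - ft for yt, ft in zip(y, f)]             # y - f(X, b(m))
        step = sum(d * r for d, r in zip(fprime, resid)) / sum(d * d for d in fprime)
        b += step
        if abs(step) < tol:                                   # change is very small
            break
    return b


# Illustrative, noise-free data with true exponent 1.5 (an assumption).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xt ** 1.5 for xt in x]
b_hat = gauss_newton(x, y, b0=1.0)
print(b_hat)
```

Each pass is an OLS regression of the current residuals on the current derivatives. Convergence is not guaranteed from every starting value, so in applied work it is common to try several starting points.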

14.16

For example, consider Durbin's Method of estimating
the autocorrelation coefficient under a first-order
autoregression regime:

$y_t = b_1 + b_2 X_{t2} + \ldots + b_K X_{tK} + \epsilon_t$ for $t = 1, \ldots, n$.

$\epsilon_t = \rho\, \epsilon_{t-1} + u_t$ where $u_t$ satisfies the conditions
$E u_t = 0$, $E u_t^2 = \sigma_u^2$, $E u_t u_s = 0$ for $s \neq t$.

Therefore, $u_t$ is nonautocorrelated and homoskedastic.

Durbin's Method is to set aside a copy of the equation,
lag it once, multiply by $\rho$ and subtract the new equation
from the original equation, then move the $y_{t-1}$ term to
the right side and estimate $\rho$ along with the b's by OLS.

14.17

$y_t = b_1 + b_2 X_{t2} + b_3 X_{t3} + \epsilon_t$ for $t = 1, \ldots, n$, where $\epsilon_t = \rho\, \epsilon_{t-1} + u_t$

Lag once and multiply by $\rho$:

$\rho\, y_{t-1} = \rho b_1 + \rho b_2 X_{t-1,2} + \rho b_3 X_{t-1,3} + \rho\, \epsilon_{t-1}$

Subtract from the original and move $y_{t-1}$ to right side:

$y_t = b_1(1 - \rho) + b_2 (X_{t2} - \rho X_{t-1,2}) + b_3 (X_{t3} - \rho X_{t-1,3}) + \rho\, y_{t-1} + u_t$

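14.18

The structural (restricted, behavioral) equation is:

$y_t = b_1(1 - \rho) + b_2 (X_{t2} - \rho X_{t-1,2}) + b_3 (X_{t3} - \rho X_{t-1,3}) + \rho\, y_{t-1} + u_t$

Now Durbin separates out the terms as follows:

$y_t = b_1(1 - \rho) + b_2 X_{t2} - \rho b_2 X_{t-1,2} + b_3 X_{t3} - \rho b_3 X_{t-1,3} + \rho\, y_{t-1} + u_t$

The corresponding reduced form (unrestricted) equation is:

$y_t = \alpha_1 + \alpha_2 X_{t,2} + \alpha_3 X_{t-1,2} + \alpha_4 X_{t,3} + \alpha_5 X_{t-1,3} + \alpha_6 y_{t-1} + u_t$

$\alpha_1 = b_1(1 - \rho)$, $\alpha_2 = b_2$, $\alpha_3 = -\rho b_2$, $\alpha_4 = b_3$, $\alpha_5 = -\rho b_3$, $\alpha_6 = \rho$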

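14.19

$\alpha_1 = b_1(1 - \rho)$, $\alpha_2 = b_2$, $\alpha_3 = -\rho b_2$, $\alpha_4 = b_3$, $\alpha_5 = -\rho b_3$, $\alpha_6 = \rho$

Given OLS estimates $\hat\alpha_1, \hat\alpha_2, \hat\alpha_3, \hat\alpha_4, \hat\alpha_5, \hat\alpha_6$
we can get three separate and distinct estimates for $\rho$:

$\hat\rho = -\frac{\hat\alpha_3}{\hat\alpha_2}$, $\hat\rho = -\frac{\hat\alpha_5}{\hat\alpha_4}$, $\hat\rho = \hat\alpha_6$

These three separate estimates of $\rho$ are in conflict!
It is difficult to know which one to use as the
legitimate estimate of $\rho$. Durbin used the last one.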

14.20

The problem with Durbin's Method is that it ignores
the inherent nonlinear restrictions implied by this
structural model. To get a single (i.e. unique) estimate
for $\rho$, the implied nonlinear restrictions must be
incorporated directly into the estimation process.
Consequently, the above structural equation should be
estimated using a nonlinear method such as the
Gauss-Newton algorithm for nonlinear least squares.

$y_t = b_1(1 - \rho) + b_2 X_{t2} - \rho b_2 X_{t-1,2} + b_3 X_{t3} - \rho b_3 X_{t-1,3} + \rho\, y_{t-1} + u_t$

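14.21

$y_t = b_1(1 - \rho) + b_2 X_{t2} - \rho b_2 X_{t-1,2} + b_3 X_{t3} - \rho b_3 X_{t-1,3} + \rho\, y_{t-1} + u_t$

$f'(X_t,b) \equiv \left[ \frac{\partial y_t}{\partial b_1} \;\; \frac{\partial y_t}{\partial b_2} \;\; \frac{\partial y_t}{\partial b_3} \;\; \frac{\partial y_t}{\partial \rho} \right]$

$\frac{\partial y_t}{\partial b_1} = (1 - \rho)$

$\frac{\partial y_t}{\partial b_2} = (X_{t,2} - \rho X_{t-1,2})$

$\frac{\partial y_t}{\partial b_3} = (X_{t,3} - \rho X_{t-1,3})$

$\frac{\partial y_t}{\partial \rho} = (-b_1 - b_2 X_{t-1,2} - b_3 X_{t-1,3} + y_{t-1})$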

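14.22

$b_{(m+1)} = [f'(X,b_{(m)})^T f'(X,b_{(m)})]^{-1} f'(X,b_{(m)})^T y_{(m)}^*$

where $y_{t(m)}^* \equiv y_t - f(X_t,b_{(m)}) + f'(X_t,b_{(m)})\, b_{(m)}$

Iterate until convergence.

$b_{(m)} = [\, b_{1(m)} \;\; b_{2(m)} \;\; b_{3(m)} \;\; \rho_{(m)} \,]^T$

$f'(X_t,b_{(m)}) \equiv \left[ \frac{\partial y_t}{\partial b_{1(m)}} \;\; \frac{\partial y_t}{\partial b_{2(m)}} \;\; \frac{\partial y_t}{\partial b_{3(m)}} \;\; \frac{\partial y_t}{\partial \rho_{(m)}} \right]$

$f(X_t,b) = b_1(1 - \rho) + b_2 X_{t2} - \rho b_2 X_{t-1,2} + b_3 X_{t3} - \rho b_3 X_{t-1,3} + \rho\, y_{t-1}$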

Chapter 15

15.1

Distributed Lag Models

15.2

The Distributed Lag Effect


Effect
at time t

Economic action
at time t

Effect
at time t+1

Effect
at time t+2

15.3

Unstructured Lags

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_n x_{t-n} + e_t$

n unstructured lags
no systematic structure imposed on the $\beta$'s
the $\beta$'s are unrestricted

15.4

Problems with Unstructured Lags

1. n observations are lost with n-lag setup.
2. high degree of multicollinearity among the $x_{t-j}$'s.
3. many degrees of freedom used for large n.
4. could get greater precision using structure.

15.5

The Arithmetic Lag Structure

proposed by Irving Fisher (1937)
the lag weights decline linearly

Imposing the relationship:

$\beta_i = (n - i + 1)\gamma$

$\beta_0 = (n+1)\gamma$
$\beta_1 = n\gamma$
$\beta_2 = (n-1)\gamma$
$\beta_3 = (n-2)\gamma$
...
$\beta_{n-2} = 3\gamma$
$\beta_{n-1} = 2\gamma$
$\beta_n = \gamma$

only need to estimate one coefficient, $\gamma$,
instead of n+1 coefficients, $\beta_0, \ldots, \beta_n$.

15.6

Arithmetic Lag Structure

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_n x_{t-n} + e_t$

Step 1: impose the restriction: $\beta_i = (n - i + 1)\gamma$

$y_t = \alpha + (n+1)\gamma x_t + n\gamma x_{t-1} + (n-1)\gamma x_{t-2} + \ldots + \gamma x_{t-n} + e_t$

Step 2: factor out the unknown coefficient, $\gamma$.

$y_t = \alpha + \gamma [(n+1)x_t + n x_{t-1} + (n-1)x_{t-2} + \ldots + x_{t-n}] + e_t$

15.7

Arithmetic Lag Structure

$y_t = \alpha + \gamma [(n+1)x_t + n x_{t-1} + (n-1)x_{t-2} + \ldots + x_{t-n}] + e_t$

Step 3: Define $z_t$.

$z_t = [(n+1)x_t + n x_{t-1} + (n-1)x_{t-2} + \ldots + x_{t-n}]$

Step 4: Decide number of lags, n.

For n = 4: $z_t = [5x_t + 4x_{t-1} + 3x_{t-2} + 2x_{t-3} + x_{t-4}]$

Step 5: Run least squares regression on:

$y_t = \alpha + \gamma z_t + e_t$
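Steps 3 to 5 can be sketched with NumPy for n = 4. The data below are simulated and noise-free, built from an assumed intercept of 2.0 and an assumed common coefficient of 0.3, so the regression should return those values; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lags = 4
x = rng.normal(size=60)

# Weights (n+1), n, ..., 1 applied to x_t, x_{t-1}, ..., x_{t-n}.
weights = np.arange(n_lags + 1, 0, -1)            # [5, 4, 3, 2, 1]
t_index = np.arange(n_lags, len(x))               # first n observations are lost
z = np.array([weights @ x[t - np.arange(n_lags + 1)] for t in t_index])

# Noise-free y built from hypothetical intercept 2.0 and coefficient 0.3.
y = 2.0 + 0.3 * z

X = np.column_stack([np.ones_like(z), z])
alpha_hat, gamma_hat = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hat = gamma_hat * weights                    # implied lag weights beta_0..beta_4
print(alpha_hat, gamma_hat, beta_hat)
```

Only one slope is estimated, and the five lag weights follow from it by multiplying with the fixed arithmetic weights.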

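15.8

Arithmetic Lag Structure

[Figure: lag weights $\beta_i$ plotted against lag $i$ from 0 to n+1, declining linearly from $\beta_0 = (n+1)\gamma$ through $\beta_1 = n\gamma$, $\beta_2 = (n-1)\gamma$, ... to $\beta_n = \gamma$ (linear lag structure).]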

15.9

Polynomial Lag Structure

proposed by Shirley Almon (1965)
the lag weights fit a polynomial

$\beta_i = \gamma_0 + \gamma_1 i + \gamma_2 i^2 + \ldots + \gamma_p i^p$ where $i = 0, \ldots, n$

n = the length of the lag
p = degree of polynomial

For example, a quadratic polynomial:

$\beta_i = \gamma_0 + \gamma_1 i + \gamma_2 i^2$ where $i = 0, \ldots, n$

With p = 2 and n = 4:

$\beta_0 = \gamma_0$
$\beta_1 = \gamma_0 + \gamma_1 + \gamma_2$
$\beta_2 = \gamma_0 + 2\gamma_1 + 4\gamma_2$
$\beta_3 = \gamma_0 + 3\gamma_1 + 9\gamma_2$
$\beta_4 = \gamma_0 + 4\gamma_1 + 16\gamma_2$

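15.10

Polynomial Lag Structure

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \beta_3 x_{t-3} + \beta_4 x_{t-4} + e_t$

Step 1: impose the restriction: $\beta_i = \gamma_0 + \gamma_1 i + \gamma_2 i^2$

$y_t = \alpha + \gamma_0 x_t + (\gamma_0 + \gamma_1 + \gamma_2) x_{t-1} + (\gamma_0 + 2\gamma_1 + 4\gamma_2) x_{t-2}$
$\quad + (\gamma_0 + 3\gamma_1 + 9\gamma_2) x_{t-3} + (\gamma_0 + 4\gamma_1 + 16\gamma_2) x_{t-4} + e_t$

Step 2: factor out the unknown coefficients: $\gamma_0, \gamma_1, \gamma_2$.

$y_t = \alpha + \gamma_0 [x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4}]$
$\quad + \gamma_1 [x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4}]$
$\quad + \gamma_2 [x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4}] + e_t$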

15.11

Polynomial Lag Structure

$y_t = \alpha + \gamma_0 [x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4}]$
$\quad + \gamma_1 [x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4}]$
$\quad + \gamma_2 [x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4}] + e_t$

Step 3: Define $z_{t0}$, $z_{t1}$ and $z_{t2}$ for $\gamma_0$, $\gamma_1$, and $\gamma_2$.

$z_{t0} = [x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4}]$
$z_{t1} = [x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4}]$
$z_{t2} = [x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4}]$

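15.12

Polynomial Lag Structure

Step 4: Regress $y_t$ on $z_{t0}$, $z_{t1}$ and $z_{t2}$.

$y_t = \alpha + \gamma_0 z_{t0} + \gamma_1 z_{t1} + \gamma_2 z_{t2} + e_t$

Step 5: Express the $\hat\beta_i$'s in terms of $\hat\gamma_0$, $\hat\gamma_1$, and $\hat\gamma_2$.

$\hat\beta_0 = \hat\gamma_0$
$\hat\beta_1 = \hat\gamma_0 + \hat\gamma_1 + \hat\gamma_2$
$\hat\beta_2 = \hat\gamma_0 + 2\hat\gamma_1 + 4\hat\gamma_2$
$\hat\beta_3 = \hat\gamma_0 + 3\hat\gamma_1 + 9\hat\gamma_2$
$\hat\beta_4 = \hat\gamma_0 + 4\hat\gamma_1 + 16\hat\gamma_2$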
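The five steps of the polynomial lag can be sketched with NumPy for a quadratic polynomial and four lags. The series is simulated and noise-free, and the intercept of 1 and polynomial coefficients (0.5, 0.3, -0.1) are assumptions, so the regression should reproduce them; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_lags, degree = 4, 2
x = rng.normal(size=80)

lags = np.arange(n_lags + 1)                               # i = 0, 1, ..., 4
powers = np.vander(lags, degree + 1, increasing=True)      # columns: 1, i, i^2

# Row t of X_lags holds x_t, x_{t-1}, ..., x_{t-4}.
X_lags = np.array([x[t - lags] for t in range(n_lags, len(x))])
Z = X_lags @ powers                                        # columns z_t0, z_t1, z_t2

# Noise-free y from hypothetical intercept 1 and polynomial coefficients.
gamma_true = np.array([0.5, 0.3, -0.1])
y = 1.0 + Z @ gamma_true

coef = np.linalg.lstsq(np.column_stack([np.ones(len(Z)), Z]), y, rcond=None)[0]
gamma_hat = coef[1:]
beta_hat = powers @ gamma_hat                              # beta_i = g0 + g1*i + g2*i^2
print(gamma_hat, beta_hat)
```

Three coefficients are estimated instead of five, and the five lag weights are recovered by evaluating the fitted polynomial at i = 0, ..., 4.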

15.13

Polynomial Lag Structure

[Figure 15.3: polynomial lag weights $\beta_i$ plotted against lag $i$.]

15.14

Geometric Lag Structure

infinite distributed lag model:

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + e_t$

$y_t = \alpha + \sum_{i=0}^{\infty} \beta_i x_{t-i} + e_t$   (15.3.1)

geometric lag structure:

$\beta_i = \beta \phi^i$ where $|\phi| < 1$ and $i = 0, 1, 2, \ldots$

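15.15

Geometric Lag Structure

infinite unstructured lag:

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \beta_3 x_{t-3} + \ldots + e_t$

Substitute $\beta_i = \beta \phi^i$:

$\beta_0 = \beta$
$\beta_1 = \beta\phi$
$\beta_2 = \beta\phi^2$
$\beta_3 = \beta\phi^3$
...

infinite geometric lag:

$y_t = \alpha + \beta (x_t + \phi x_{t-1} + \phi^2 x_{t-2} + \phi^3 x_{t-3} + \ldots) + e_t$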

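15.16

Geometric Lag Structure

$y_t = \alpha + \beta (x_t + \phi x_{t-1} + \phi^2 x_{t-2} + \phi^3 x_{t-3} + \ldots) + e_t$

impact multiplier: $\beta$

interim multiplier (3-period): $\beta + \beta\phi + \beta\phi^2$

long-run multiplier: $\beta (1 + \phi + \phi^2 + \phi^3 + \ldots) = \frac{\beta}{1 - \phi}$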

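15.17

Geometric Lag Structure

[Figure 15.5: lag weights $\beta_i$ plotted against lag $i$, with geometrically declining weights $\beta_0 = \beta$, $\beta_1 = \beta\phi$, $\beta_2 = \beta\phi^2$, $\beta_3 = \beta\phi^3$, $\beta_4 = \beta\phi^4$.]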

15.18

Geometric Lag Structure

$y_t = \alpha + \beta (x_t + \phi x_{t-1} + \phi^2 x_{t-2} + \phi^3 x_{t-3} + \ldots) + e_t$

Problem:
How to estimate the infinite number
of geometric lag coefficients?
Answer:
Use the Koyck transformation.

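15.19

The Koyck Transformation

Lag everything once, multiply by $\phi$ and subtract from original:

$y_t = \alpha + \beta (x_t + \phi x_{t-1} + \phi^2 x_{t-2} + \phi^3 x_{t-3} + \ldots) + e_t$

$\phi\, y_{t-1} = \phi\alpha + \beta (\phi x_{t-1} + \phi^2 x_{t-2} + \phi^3 x_{t-3} + \ldots) + \phi\, e_{t-1}$

$y_t - \phi\, y_{t-1} = \alpha(1 - \phi) + \beta x_t + (e_t - \phi\, e_{t-1})$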

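15.20

The Koyck Transformation

$y_t - \phi\, y_{t-1} = \alpha(1 - \phi) + \beta x_t + (e_t - \phi\, e_{t-1})$

Solve for $y_t$ by adding $\phi\, y_{t-1}$ to both sides:

$y_t = \alpha(1 - \phi) + \phi\, y_{t-1} + \beta x_t + (e_t - \phi\, e_{t-1})$

$y_t = \alpha(1 - \phi) + \phi\, y_{t-1} + \beta x_t + \nu_t$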

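15.21

The Koyck Transformation

$y_t = \alpha(1 - \phi) + \phi\, y_{t-1} + \beta x_t + (e_t - \phi\, e_{t-1})$

Defining $\beta_1 = \alpha(1 - \phi)$, $\beta_2 = \phi$, and $\beta_3 = \beta$,
use ordinary least squares:

$y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 x_t + \nu_t$

The original structural
parameters can now be
estimated in terms of
these reduced form
parameter estimates.

$\hat\beta = \hat\beta_3$

$\hat\phi = \hat\beta_2$

$\hat\alpha = \frac{\hat\beta_1}{1 - \hat\beta_2}$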
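The recovery of the structural parameters from the transformed regression can be sketched with NumPy. The series below is generated from the transformed model itself without an error term, using assumed values of 10, 0.6 and 2 for the intercept, the decay rate and the impact coefficient; the labels `d1`, `d2`, `d3` for the regression coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
alpha, phi, beta = 10.0, 0.6, 2.0        # hypothetical structural values
x = rng.normal(size=T)

# Generate y from the transformed model without an error term,
# so the regression below recovers the parameters exactly.
y = np.empty(T)
y[0] = alpha + beta * x[0]
for t in range(1, T):
    y[t] = alpha * (1 - phi) + phi * y[t - 1] + beta * x[t]

# Regress y_t on a constant, y_{t-1} and x_t.
X = np.column_stack([np.ones(T - 1), y[:-1], x[1:]])
d1, d2, d3 = np.linalg.lstsq(X, y[1:], rcond=None)[0]

beta_hat = d3
phi_hat = d2
alpha_hat = d1 / (1 - d2)
long_run = beta_hat / (1 - phi_hat)      # sum of all lag weights
print(alpha_hat, phi_hat, beta_hat, long_run)
```

With real data the transformed error is a moving average of the original errors and is correlated with the lagged dependent variable, so least squares on this equation is not as well behaved as this noise-free sketch suggests.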

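15.22

Geometric Lag Structure

$y_t = \hat\alpha + \hat\beta (x_t + \hat\phi x_{t-1} + \hat\phi^2 x_{t-2} + \hat\phi^3 x_{t-3} + \ldots) + \hat e_t$

$\hat\beta_0 = \hat\beta$
$\hat\beta_1 = \hat\beta \hat\phi$
$\hat\beta_2 = \hat\beta \hat\phi^2$
$\hat\beta_3 = \hat\beta \hat\phi^3$
...

$y_t = \hat\alpha + \hat\beta_0 x_t + \hat\beta_1 x_{t-1} + \hat\beta_2 x_{t-2} + \hat\beta_3 x_{t-3} + \ldots + \hat e_t$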

15.23

Durbin's h-test
for autocorrelation

Estimates inconsistent if geometric lag model is autocorrelated,
but Durbin-Watson test is biased in favor of no autocorrelation.

$h = \left(1 - \frac{d}{2}\right) \sqrt{\frac{T - 1}{1 - (T - 1)[se(b_2)]^2}}$

h = Durbin's h-test statistic
d = Durbin-Watson test statistic
se(b2) = standard error of the estimate b2
T = sample size
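The arithmetic of the h statistic can be sketched as a small function. The formula follows the slide, with (T - 1) in both places; the function name and the input values are hypothetical and chosen only to illustrate the calculation.

```python
import math


def durbin_h(d, T, se_b2):
    """Durbin's h from the Durbin-Watson d, the sample size T and the
    standard error of the coefficient on the lagged dependent variable."""
    denom = 1.0 - (T - 1) * se_b2 ** 2
    if denom <= 0:
        # The square root is not defined in this case.
        raise ValueError("h is undefined when (T - 1) * se(b2)^2 >= 1")
    return (1.0 - d / 2.0) * math.sqrt((T - 1) / denom)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
h = durbin_h(d=1.8, T=50, se_b2=0.1)
print(round(h, 4))
```

In large samples h is usually compared with standard normal critical values. The explicit check is there because the statistic cannot be computed when the term under the square root is not positive.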

15.24

Adaptive Expectations

$y_t = \alpha + \beta x_t^* + e_t$

$y_t$ = credit card debt
$x_t^*$ = expected (anticipated) income
($x_t^*$ is not observable)

15.25

Adaptive Expectations

adjust expectations
based on past realization:

$x_t^* - x_{t-1}^* = \lambda (x_{t-1} - x_{t-1}^*)$

15.26

Adaptive Expectations

$x_t^* - x_{t-1}^* = \lambda (x_{t-1} - x_{t-1}^*)$

rearrange to get:

$x_t^* = \lambda x_{t-1} + (1 - \lambda) x_{t-1}^*$

or

$\lambda x_{t-1} = [x_t^* - (1 - \lambda) x_{t-1}^*]$

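15.27

Adaptive Expectations

$y_t = \alpha + \beta x_t^* + e_t$

Lag this model once and multiply by $(1 - \lambda)$:

$(1 - \lambda) y_{t-1} = (1 - \lambda)\alpha + (1 - \lambda)\beta x_{t-1}^* + (1 - \lambda) e_{t-1}$

subtract this from the original to get:

$y_t = \alpha\lambda + (1 - \lambda) y_{t-1} + \beta [x_t^* - (1 - \lambda) x_{t-1}^*] + e_t - (1 - \lambda) e_{t-1}$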

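15.28

Adaptive Expectations

$y_t = \alpha\lambda + (1 - \lambda) y_{t-1} + \beta [x_t^* - (1 - \lambda) x_{t-1}^*] + e_t - (1 - \lambda) e_{t-1}$

Since $\lambda x_{t-1} = [x_t^* - (1 - \lambda) x_{t-1}^*]$ we get:

$y_t = \alpha\lambda + (1 - \lambda) y_{t-1} + \beta\lambda x_{t-1} + u_t$

where $u_t = e_t - (1 - \lambda) e_{t-1}$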

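15.29

Adaptive Expectations

$y_t = \alpha\lambda + (1 - \lambda) y_{t-1} + \beta\lambda x_{t-1} + u_t$

Use ordinary least squares regression on:

$y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 x_{t-1} + u_t$

and we get:

$\hat\lambda = (1 - \hat\beta_2)$

$\hat\alpha = \frac{\hat\beta_1}{1 - \hat\beta_2}$

$\hat\beta = \frac{\hat\beta_3}{1 - \hat\beta_2}$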

15.30

Partial Adjustment

$y_t^* = \alpha + \beta x_t + e_t$

inventories partially adjust, $0 < \gamma < 1$,
towards optimal or desired level, $y_t^*$:

$y_t - y_{t-1} = \gamma (y_t^* - y_{t-1})$

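15.31

Partial Adjustment

$y_t - y_{t-1} = \gamma (y_t^* - y_{t-1})$
$= \gamma (\alpha + \beta x_t + e_t - y_{t-1})$
$= \gamma\alpha + \gamma\beta x_t - \gamma y_{t-1} + \gamma e_t$

Solving for $y_t$:

$y_t = \gamma\alpha + (1 - \gamma) y_{t-1} + \gamma\beta x_t + \gamma e_t$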

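15.32

Partial Adjustment

$y_t = \gamma\alpha + (1 - \gamma) y_{t-1} + \gamma\beta x_t + \gamma e_t$
$y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 x_t + \nu_t$

Use ordinary least squares regression to get:

$\hat\gamma = (1 - \hat\beta_2)$

$\hat\alpha = \frac{\hat\beta_1}{1 - \hat\beta_2}$

$\hat\beta = \frac{\hat\beta_3}{1 - \hat\beta_2}$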

Chapter 16

16.1

Time Series Analysis

16.2

Previous Chapters used Economic Models


1. economic model for dependent variable of interest.
2. statistical model consistent with the data.
3. estimation procedure for parameters using the data.
4. forecast variable of interest using estimated model.
Time Series Analysis does not use this approach.

16.3

Time Series Analysis does not generally


incorporate all of the economic relationships
found in economic models.
Time Series Analysis uses
more statistics and less economics.
Time Series Analysis is useful for short term forecasting only.
Long term forecasting requires incorporating more involved
behavioral economic relationships into the analysis.

16.4

Univariate Time Series Analysis can be used


to relate the current values of a single economic
variable to:
1. its past values
2. the values of current and past random errors
Other variables are not used
in univariate time series analysis.

16.5

Three types of Univariate Time Series Analysis


processes will be discussed in this chapter:
1. autoregressive (AR)
2. moving average (MA)
3. autoregressive moving average (ARMA)

16.6

Multivariate Time Series Analysis can be


used to relate the current value of each of
several economic variables to:
1. its past values.

2. the past values of the other forecasted variables.


3. the values of current and past random errors.
Vector autoregressive models discussed later in
this chapter are multivariate time series models.

16.7

First-Order Autoregressive Processes, AR(1):

$y_t = \delta + \theta_1 y_{t-1} + e_t$, $t = 1, 2, \ldots, T$.   (16.1.1)

$\delta$ is the intercept.
$\theta_1$ is a parameter generally between -1 and +1.
$e_t$ is an uncorrelated random error with
mean zero and variance $\sigma_e^2$.

16.8

Autoregressive Process of order p, AR(p):

$y_t = \delta + \theta_1 y_{t-1} + \theta_2 y_{t-2} + \ldots + \theta_p y_{t-p} + e_t$   (16.1.2)

$\delta$ is the intercept.
The $\theta_i$'s are parameters generally between -1 and +1.
$e_t$ is an uncorrelated random error with
mean zero and variance $\sigma_e^2$.

16.9

Properties of least squares estimator:


AR models always have one or more lagged
dependent variables on the right hand side.
Consequently, least squares is no longer a
best linear unbiased estimator (BLUE),
but it does have some good asymptotic
properties including consistency.

16.10

AR(2) model of U.S. unemployment rates

$y_t = 0.5051 + 1.5537\, y_{t-1} - 0.6515\, y_{t-2}$
(0.1267) (0.0707) (0.0708)

coefficient on $y_{t-1}$: positive
coefficient on $y_{t-2}$: negative

Note: Q1-1948 through Q1-1978 from J.D.Cryer (1986) see unempl.dat

16.11

Choosing the lag length, p, for AR(p):


The Partial Autocorrelation Function (PAF)
The PAF is the sequence of correlations between
(yt and yt-1), (yt and yt-2), (yt and yt-3), and so on,
given that the effects of earlier lags on y t are
held constant.

16.12

Partial Autocorrelation Function

Data simulated from this model: $y_t = 0.5\, y_{t-1} + 0.3\, y_{t-2} + e_t$

[Figure: sample partial autocorrelations $\hat\theta_{kk}$ plotted against lag $k$, with bounds at $\pm 2/\sqrt{T}$.]

$\theta_{kk}$ is the last (kth) coefficient in a kth order AR process.

This sample PAF suggests a second
order process AR(2) which is correct.
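The idea that the partial autocorrelation at lag k is the last coefficient of a fitted AR(k) can be sketched directly with NumPy. The function name `sample_pacf`, the sample size and the random seed are assumptions; the simulated model uses the coefficients 0.5 and 0.3 shown on the slide.

```python
import numpy as np


def sample_pacf(y, max_lag):
    """Partial autocorrelations: for each k, fit an AR(k) by least squares
    and keep the coefficient on y_{t-k}."""
    y = np.asarray(y, dtype=float)
    pacf = []
    for k in range(1, max_lag + 1):
        rows = len(y) - k
        # Columns: constant, y_{t-1}, ..., y_{t-k} for t = k, ..., len(y) - 1.
        X = np.column_stack([np.ones(rows)] + [y[k - j:len(y) - j] for j in range(1, k + 1)])
        coef = np.linalg.lstsq(X, y[k:], rcond=None)[0]
        pacf.append(coef[-1])
    return np.array(pacf)


rng = np.random.default_rng(4)
T = 5000
e = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + e[t]

pacf = sample_pacf(y, max_lag=5)
band = 2 / np.sqrt(T)          # approximate two-standard-error bound
print(np.round(pacf, 3), round(band, 3))
```

For an AR(2) process the first two partial autocorrelations should stand well outside the band and the later ones should fall inside it, apart from sampling noise.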

16.13

Using AR Model for Forecasting:

unemployment rate: $y_{T-1} = 6.63$ and $y_T = 6.20$

$\hat y_{T+1} = \hat\delta + \hat\theta_1 y_T + \hat\theta_2 y_{T-1}$
$= 0.5051 + (1.5537)(6.2) - (0.6515)(6.63) = 5.8186$

$\hat y_{T+2} = \hat\delta + \hat\theta_1 \hat y_{T+1} + \hat\theta_2 y_T$
$= 0.5051 + (1.5537)(5.8186) - (0.6515)(6.2) = 5.5062$

$\hat y_{T+3} = \hat\delta + \hat\theta_1 \hat y_{T+2} + \hat\theta_2 \hat y_{T+1}$
$= 0.5051 + (1.5537)(5.5062) - (0.6515)(5.8186) = 5.2693$
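The forecast recursion can be sketched in a few lines. The coefficient estimates and the last two observations are the ones given in the slides; the function name and argument names are illustrative.

```python
def ar2_forecast(delta, theta1, theta2, y_last, y_prev, steps):
    """Iterate the fitted AR(2) forward, feeding forecasts back in as lags."""
    forecasts = []
    for _ in range(steps):
        y_next = delta + theta1 * y_last + theta2 * y_prev
        forecasts.append(y_next)
        y_prev, y_last = y_last, y_next
    return forecasts


# Estimates and the last two observations as given in the slides.
forecasts = ar2_forecast(0.5051, 1.5537, -0.6515, y_last=6.20, y_prev=6.63, steps=3)
print([round(f, 4) for f in forecasts])
```

The printed values can differ from the slide in the last decimal place because the slide carries rounded forecasts forward, while the loop uses unrounded ones.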

16.14

Moving Average Process of order q, MA(q):

$y_t = \mu + e_t + \alpha_1 e_{t-1} + \alpha_2 e_{t-2} + \ldots + \alpha_q e_{t-q}$   (16.2.1)

$\mu$ is the intercept.
The $\alpha_i$'s are unknown parameters.
$e_t$ is an uncorrelated random error with
mean zero and variance $\sigma_e^2$.

16.15

An MA(1) process:

$y_t = \mu + e_t + \alpha_1 e_{t-1}$   (16.2.2)

Minimize sum of least squares deviations:

$S(\mu, \alpha_1) = \sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T} (y_t - \mu - \alpha_1 e_{t-1})^2$   (16.2.3)

16.16

Stationary vs. Nonstationary


stationary:
A stationary time series is one whose mean, variance,
and autocorrelation function do not change over time.

nonstationary:
A nonstationary time series is one whose mean,
variance or autocorrelation function change over time.

16.17

First Differencing is often used to transform


a nonstationary series into a stationary series:

yt = z t - z t-1
where z t is the original nonstationary series
and yt is the new stationary series.
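First differencing is a one-line operation. The sketch below builds a small random-walk-like series by accumulating a list of made-up integer shocks and then differences it; the numbers and the function name are purely illustrative.

```python
def first_difference(z):
    """Return y_t = z_t - z_{t-1}; one observation is lost."""
    return [z[t] - z[t - 1] for t in range(1, len(z))]


# A random-walk-like series built by accumulating hypothetical shocks.
shocks = [1, -2, 3, 0, -1, 2]
z = []
level = 0
for e in shocks:
    level += e
    z.append(level)

y = first_difference(z)
print(z, y)
```

Differencing the accumulated series returns the shocks themselves (from the second one on), which is why a series of this kind becomes stationary after one difference.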

16.18

Choosing the lag length, q, for MA(q):


The Autocorrelation Function (AF)
The AF is the sequence of correlations between
(yt and yt-1), (yt and yt-2), (yt and yt-3), and so on,
without holding the effects of earlier lags
on yt constant.
The PAF controlled for the effects of previous lags
but the AF does not control for such effects.

16.19

Autocorrelation Function

Data simulated from this model: $y_t = e_t - 0.9\, e_{t-1}$

[Figure: sample autocorrelations $r_{kk}$ plotted against lag $k$, with bounds at $\pm 2/\sqrt{T}$.]

$r_{kk}$ is the last (kth) coefficient in a kth order MA process.

This sample AF suggests a first order process MA(1) which is correct.

16.20

Autoregressive Moving Average
ARMA(p,q)

An ARMA(1,2) has one autoregressive lag
and two moving average lags:

$y_t = \delta + \theta_1 y_{t-1} + e_t + \alpha_1 e_{t-1} + \alpha_2 e_{t-2}$

16.21

Integrated Processes

A time series with an upward or downward


trend over time is nonstationary.
Many nonstationary time series can be made
stationary by differencing them one or more times.
Such time series are called integrated processes.

16.22

The number of times a series must be


differenced to make it stationary is the
order of the integrated process, d.
An autocorrelation function, AF,
with large, significant autocorrelations
for many lags may require more than
one differencing to become stationary.
Check the new AF after each differencing
to determine if further differencing is needed.

16.23

Unit Root

$z_t = \theta_1 z_{t-1} + \mu + e_t + \alpha_1 e_{t-1}$   (16.3.2)

$-1 < \theta_1 < 1$: stationary ARMA(1,1)
$\theta_1 = 1$: nonstationary process

$\theta_1 = 1$ is called a unit root

16.24

Unit Root Tests

$z_t - z_{t-1} = (\theta_1 - 1) z_{t-1} + \mu + e_t + \alpha_1 e_{t-1}$

$\Delta z_t = \theta_1^* z_{t-1} + \mu + e_t + \alpha_1 e_{t-1}$   (16.3.3)

where $\Delta z_t = z_t - z_{t-1}$ and $\theta_1^* = \theta_1 - 1$

Testing $\theta_1^* = 0$ is equivalent to testing $\theta_1 = 1$

16.25

Unit Root Tests

$H_0: \theta_1^* = 0$ vs. $H_1: \theta_1^* < 0$   (16.3.4)

Computer programs typically use one of
the following tests for unit roots:
Dickey-Fuller Test
Phillips-Perron Test

16.26

Autoregressive Integrated Moving Average
ARIMA(p,d,q)

An ARIMA(p,d,q) model represents an
AR(p) - MA(q) process that has been
differenced (integrated, I(d)) d times.

$y_t = \delta + \theta_1 y_{t-1} + \ldots + \theta_p y_{t-p} + e_t + \alpha_1 e_{t-1} + \ldots + \alpha_q e_{t-q}$

16.27

The Box-Jenkins approach:


1. Identification
determining the values of p, d, and q.

2. Estimation
linear or nonlinear least squares.

3. Diagnostic Checking
model fits well with no autocorrelation?

4. Forecasting
short-term forecasts of future yt values.

16.28

Vector Autoregressive (VAR) Models

Use VAR for two or more interrelated time series:

$y_t = \theta_0 + \theta_1 y_{t-1} + \ldots + \theta_p y_{t-p} + \phi_1 x_{t-1} + \ldots + \phi_p x_{t-p} + e_t$
$x_t = \delta_0 + \delta_1 y_{t-1} + \ldots + \delta_p y_{t-p} + \alpha_1 x_{t-1} + \ldots + \alpha_p x_{t-p} + u_t$

16.29

Vector Autoregressive (VAR) Models

1. extension of AR model.
2. all variables endogenous.
3. no structural (behavioral) economic model.
4. all variables jointly determined (over time).
5. no simultaneous equations (same time).

16.30

The random error terms in a VAR model
may be correlated if they are affected by
relevant factors that are not in the model
such as government actions or
national/international events, etc.

Since VAR equations all have exactly the
same set of explanatory variables, the usual
seemingly unrelated regression estimation
produces exactly the same estimates as
least squares on each equation separately.

16.31

Least Squares is Consistent


Consequently, regardless of whether
the VAR random error terms are
correlated or not, least squares estimation
of each equation separately will provide
consistent regression coefficient estimates.

16.32

VAR Model Specification

To determine length of the lag, p, use:
1. Akaike's AIC criterion
2. Schwarz's SIC criterion

These methods were discussed in Chapter 15.

16.33

Spurious Regressions

$y_t = \beta_1 + \beta_2 x_t + \epsilon_t$ where $\epsilon_t = \theta_1 \epsilon_{t-1} + \nu_t$

$-1 < \theta_1 < 1$: I(0) (i.e. d=0)
$\theta_1 = 1$: I(1) (i.e. d=1)

If $\theta_1 = 1$, least squares estimates of $\beta_2$ may
appear highly significant even when true $\beta_2 = 0$.

16.34

Cointegration

$y_t = \beta_1 + \beta_2 x_t + \epsilon_t$

If $x_t$ and $y_t$ are nonstationary I(1),
we might expect that $\epsilon_t$ is also I(1).

However, if $x_t$ and $y_t$ are nonstationary I(1)
but $\epsilon_t$ is stationary I(0), then $x_t$ and $y_t$ are
said to be cointegrated.

16.35

Cointegrated VAR(1) Model

VAR(1) model:

$y_t = \theta_0 + \theta_1 y_{t-1} + \phi_1 x_{t-1} + e_t$
$x_t = \delta_0 + \delta_1 y_{t-1} + \alpha_1 x_{t-1} + u_t$

If $x_t$ and $y_t$ are both I(1) and are cointegrated,
use an Error Correction Model, instead of VAR(1).

16.36

Error Correction Model

$\Delta y_t = y_t - y_{t-1}$ and $\Delta x_t = x_t - x_{t-1}$

$\Delta y_t = \theta_0 + (\theta_1 - 1) y_{t-1} + \phi_1 x_{t-1} + e_t$
$\Delta x_t = \delta_0 + \delta_1 y_{t-1} + (\alpha_1 - 1) x_{t-1} + u_t$
(continued)

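16.37

Error Correction Model

$\Delta y_t = \theta_0^* + \gamma_1 (y_{t-1} - \beta_1 - \beta_2 x_{t-1}) + e_t$
$\Delta x_t = \delta_0^* + \gamma_2 (y_{t-1} - \beta_1 - \beta_2 x_{t-1}) + u_t$

$\theta_0^* = \theta_0 + \gamma_1 \beta_1$
$\delta_0^* = \delta_0 + \gamma_2 \beta_1$

$\beta_2 = \frac{1 - \alpha_1}{\delta_1}$

$\gamma_1 = \frac{\phi_1 \delta_1}{\alpha_1 - 1}$

$\gamma_2 = \delta_1$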

16.38

Estimating an Error Correction Model

Step 1:
Estimate by least squares:

$y_{t-1} = \beta_1 + \beta_2 x_{t-1} + \epsilon_{t-1}$

to get the residuals:

$\hat\epsilon_{t-1} = y_{t-1} - \hat\beta_1 - \hat\beta_2 x_{t-1}$

16.39

Estimating an Error Correction Model

Step 2:
Estimate by least squares:

$\Delta y_t = \theta_0^* + \gamma_1 \hat\epsilon_{t-1} + e_t$
$\Delta x_t = \delta_0^* + \gamma_2 \hat\epsilon_{t-1} + u_t$
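The two estimation steps can be sketched on simulated data with NumPy. The data generating process below (a random walk for x and a level relationship with intercept 1 and slope 2 plus stationary noise for y) is an assumption made for illustration, as are the variable names.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 500

# Simulated cointegrated pair (all values are assumptions):
# x is a random walk and y = 1 + 2 x + stationary error.
x = np.cumsum(rng.normal(size=T))
y = 1.0 + 2.0 * x + rng.normal(size=T)


def ols(X, y):
    """Least squares coefficients for y = X b + error."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Step 1: cointegrating regression in levels and its residuals.
X_levels = np.column_stack([np.ones(T), x])
b1_hat, b2_hat = ols(X_levels, y)
resid = y - b1_hat - b2_hat * x

# Step 2: regress first differences on the lagged residual.
dy, dx = np.diff(y), np.diff(x)
X_ecm = np.column_stack([np.ones(T - 1), resid[:-1]])
theta0_star, gamma1 = ols(X_ecm, dy)
delta0_star, gamma2 = ols(X_ecm, dx)
print(b2_hat, gamma1, gamma2)
```

A negative adjustment coefficient in the first equation means that y moves back toward the long-run relationship after a deviation. In this simulation x does not respond to the deviation, so its adjustment coefficient should be close to zero.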

16.40

Using cointegrated I(1) variables in a


VAR model expressed solely in terms
of first differences and lags of first
differences is a misspecification.
The correct specification is to use an

Error Correction Model

Chapter 17

17.1

Guidelines for
Research Project

17.2

What Book Has Covered

Formulation
economic ====> econometric.
Estimation
selecting appropriate method.
Interpretation
how the $x_t$'s impact on the $y_t$.
Inference
testing, intervals, prediction.

17.3

Topics for This Chapter


1. Types of Data by Source
2. Nonexperimental Data
3. Text Data vs. Electronic Data
4. Selecting a Topic
5. Writing an Abstract
6. Research Report Format

17.4

Types of Data by Source

i) Experimental Data
from controlled experiments.

ii) Observational Data
passively generated by society.

iii) Survey Data
data collected through interviews.

17.5

Time vs. Cross-Section

Time Series Data


data collected at distinct points in time
(e.g. weekly sales, daily stock price, annual
budget deficit, monthly unemployment.)
Cross Section Data
data collected over samples of units, individuals,
households, firms at a particular point in time.
(e.g. salary, race, gender, unemployment by state.)

17.6

Micro vs. Macro

Micro Data:
data collected on individual economic
decision making units such as individuals,
households or firms.
Macro Data:
data resulting from a pooling or aggregating
over individuals, households or firms at the
local, state or national levels.

17.7

Flow vs. Stock

Flow Data:
outcome measured over a period of time,
such as the consumption of gasoline during
the last quarter of 1997.
Stock Data:
outcome measured at a particular point in
time, such as crude oil held by Chevron in
US storage tanks on April 1, 1997.

17.8

Quantitative vs. Qualitative

Quantitative Data:
outcomes such as prices or income that may
be expressed as numbers or some transformation
of them (e.g. wages, trade deficit).
Qualitative Data:
outcomes that are of an either-or nature
(e.g. male, home owner, Methodist, bought car
last year, voted in last election).

17.9

International Data

International Financial Statistics (IMF monthly).


Basic Statistics of the Community (OECD annual).
Consumer Price Indices in the European
Community (OECD annual).
World Statistics (UN annual).
Yearbook of National Accounts Statistics (UN).
FAO Trade Yearbook (annual).

17.10

United States Data

Survey of Current Business (BEA monthly).


Handbook of Basic Economic Statistics (BES).
Monthly Labor Review (BLS monthly).
Federal Reserve Bulletin (FRB monthly).
Statistical Abstract of the US (BC annual).
Economic Report of the President (CEA annual).
Economic Indicators (CEA monthly).
Agricultural Statistics (USDA annual).
Agricultural Situation Reports (USDA monthly).

17.11

State and Local Data

State and Metropolitan Area Data Book


(Commerce and BC, annual).
CPI Detailed Report (BLS, annual).
Census of Population and Housing
(Commerce, BC, annual).
County and City Data Book
(Commerce, BC, annual).

17.12

Citibase on CD-ROM

Financial series: interest rates, stock market, etc.


Business formation, investment and consumers.
Construction of housing.
Manufacturing, business cycles, foreign trade.
Prices: producer and consumer price indexes.
Industrial production.
Capacity and productivity.
Population.

17.13

Citibase on CD-ROM
(continued)

Labor statistics: unemployment, households.


National income and product accounts in detail.
Forecasts and projections.
Business cycle indicators.
Energy consumption, petroleum production, etc.
International data series including trade
statistics.

17.14

Resources for Economists

Resources for Economists by Bill Goffe


http://econwpa.wustl.edu/EconFAQ/EconFAQ.html

Bill Goffe provides a vast database of information


about the economics profession including economic
organizations, working papers and reports,
and economic data series.

17.15

Internet Data Sources

A few of the items on Bill Goffe's Table of Contents:

Shortcut to All Resources.


Macro and Regional Data.
Other U.S. Data.
World and Non-U.S. Data.
Finance and Financial Markets.
Data Archives.
Journal Data and Program Archives.

17.16

Useful Internet Addresses

http://seamonkey.ed.asu.edu/~behrens/teach/WWW_data.html
http://www.sims.berkeley.edu/~hal/pages/interesting.html
http://www.stls.frb.org FED RESERVE BK - ST. LOUIS
http://www.bls.gov BUREAU OF LABOR STATISTICS
http://nber.harvard.edu NATL BUR. ECON. RESEARCH
http://www.inform.umd.edu:8080/EdRes/Topic/EconData/.www/econdata.html UNIVERSITY OF MARYLAND
http://www.bog.frb.fed.us FED BOARD OF GOVERNORS
http://www.webcom.com/~yardeni/economic.html

17.17

Data from Surveys

The survey process has four distinct aspects:


i) identifying the population of interest.
ii) designing and selecting the sample.
iii) collecting the information.
iv) data reduction, estimation and inference.

17.18

Controlled Experiments

Controlled experiments were done on these topics:


1. Labor force participation: negative income tax:
guaranteed minimum income experiment.
2. National cash housing allowance experiment:
impact on demand and supply of housing.
3. Health insurance: medical cost reduction:
sensitivity of income groups to price change.
4. Peak-load pricing and electricity use:
daily use pattern of residential customers.

17.19

Economic Data Problems

I. poor implicit experimental design


(i) collinear explanatory variables.
(ii) measurement errors.
II. inconsistent with theory specification
(i) wrong level of aggregation.
(ii) missing observations or variables.
(iii) unobserved heterogeneity.
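One simple first check for collinear explanatory variables is their pairwise sample correlation; a value close to 1 or -1 suggests the data carry little independent information about each variable. The sketch below is an illustration under assumptions: the income and wealth figures are invented, and a pairwise correlation is only one of several possible diagnostics.

```python
import math


def correlation(x, y):
    """Sample correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical explanatory variables (invented values): wealth is
# constructed to move almost exactly in step with income.
income = [30, 40, 50, 60, 70]
wealth = [61, 79, 102, 118, 141]

r = correlation(income, wealth)  # close to 1: the two are nearly collinear
```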

17.20

Selecting a Topic

General tips for selecting a research topic:

What am I interested in?


Well-defined, relatively simple topic.
Ask your professor for ideas and references.
Journal of Economic Literature (ECONLIT)
Make sure appropriate data are available.
Avoid extremely difficult econometrics.
Plan your work and work your plan.

17.21

Writing an Abstract

An abstract of fewer than 500 words should include:


(i) concise statement of the problem.
(ii) key references to available information.
(iii) description of research design including:
(a) economic model
(b) statistical model
(c) data sources
(d) estimation, testing and prediction
(iv) contribution of the work

17.22

Research Report Format


1. Statement of the Problem.
2. Review of the Literature.
3. The Economic Model.
4. The Statistical Model.
5. The Data.
6. Estimation and Inference Procedures.
7. Empirical Results and Conclusions.
8. Possible Extensions and Limitations.
9. Acknowledgments.
10. References.
