
Topic 3

Qualitative Response
Regression Models

Models with Binary Dependent Variables
In this topic we examine models
that are used to describe choice
behavior, and which do not have
the usual continuous dependent
variable.
Many of the choices that
individuals and firms make are
either-or in nature.
Such questions lead us to the
problem of constructing a
statistical model of binary, either-or, choices.
Choices can be represented by a
binary variable that takes the value
1 if one outcome is chosen, and
takes the value 0 otherwise.

We will focus on the simplest of such models, namely the binary (dichotomous, or dummy) dependent variable regression models:
(I). Linear Probability Model

LPM: A numerical example


Strengths & shortcomings
(II). Index Models
The Logit Model
The Probit Model

(I). The Linear Probability Model (LPM)
A numerical example:

Suppose we have a qualitative dependent variable (a yes or no response by 40 families on home ownership).

The dependent variable, Y, takes a value of 0 or 1; we can think of there being an underlying probability that a particular household will own a home or not, given its level of income, X.
Hence the name linear
probability.

Table 1: Hypothetical data on home ownership and income level

LPM: A numerical example


Figure 1: Scatter plot of house ownership, Y, against income (in thousands), X

The LPM for Y is:
Pr(Yi = 1 | Xi) = β0 + β1Xi + ui
where X is the usual explanatory variable(s) and the marginal effect of X on the probability is β1.
Such a model can be estimated by OLS.

LPM: A numerical example


Estimation method: OLS is used to estimate LPMs.
The model:
Yi = β0 + β1Xi + ui
where Y = 1 if the family owns a house, 0 if it does not, and
X = family income ($'000).
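The OLS mechanics for this one-regressor LPM can be written out with the closed-form formulas. A minimal sketch, stdlib only; the data here are hypothetical and are not the 40-family sample of Table 1:

```python
# Fit a linear probability model Y = b0 + b1*X by OLS
# (closed form, single regressor). Hypothetical data.
x = [8, 10, 12, 14, 16, 18, 20, 22, 24, 26]   # income, $'000
y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]            # 1 = owns a house

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx          # slope: marginal effect of income on P(own)
b0 = ybar - b1 * xbar   # intercept

def predict(income):
    """Fitted probability of home ownership (may fall outside [0, 1])."""
    return b0 + b1 * income
```

Because Y is binary, the fitted values are interpreted as probabilities, which is exactly what leads to the shortcomings discussed below.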

LPM: A numerical example


Result interpretation:
Ŷi = -0.8423 + 0.0915Xi
se = (0.1655)*** (0.0110)***
t = (-5.0891) (8.3202), R² = 0.6456
(Figure: fitted LPM line, house ownership Y against income in thousand dollars, X.)

The intercept = -0.8423 gives the probability that a family with zero income will own a house.
The slope = 0.0915 (the change in the probability of owning a home for a unit change in income) means that for a unit change in income (here $1,000), on average the probability of owning a house increases by 0.0915, or about 9%.

LPM: A numerical example


Forecasting:
The probability of owning a house for a given level of income, e.g. for X = 12 ($12,000), the estimated probability of owning a house is
Ŷi = -0.8423 + 0.0915(12) = 0.2557

LPM: Strengths and Shortcomings


Strengths:
1. Easy to estimate and easy to interpret.
2. The LPM gives good estimates of the marginal effects near the centre of the distribution of the Xs.

Shortcomings:
The LPM is plagued by several problems; some are surmountable, but some are fundamental.

LPM: Strengths and Shortcomings


1. Unreasonable predicted values: we can get predictions < 0 or > 1.
Ŷi = -0.8423 + 0.0915Xi
X = 8:  Ŷi = -0.8423 + 0.0915(8) = -0.1103
X = 22: Ŷi = -0.8423 + 0.0915(22) = 1.1707
(Figure: fitted line crossing below 0 and above 1, house ownership Y against income in thousand dollars, X.)
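The out-of-range predictions can be checked directly from the fitted equation reported in the example:

```python
# Fitted LPM from the example: P(own) = -0.8423 + 0.0915 * income
def p_own(income_thousands):
    return -0.8423 + 0.0915 * income_thousands

low = p_own(8)     # a negative "probability"
high = p_own(22)   # a "probability" above one
print(round(low, 4), round(high, 4))
```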

LPM: Strengths and Shortcomings


2. The LPM assumes that the rate of change of probability per unit change in the value of the explanatory variables is constant, given by the value of the slope.

LPM: Strengths and Shortcomings


3. Since the dependent variable takes only two values, the error term takes only two values. This implies that the errors can no longer be viewed as normal (they follow a binomial distribution).
4. The errors are also heteroscedastic.
5. R-squared is no longer a good measure of fit.

Points 3 and 4 imply that while the OLS estimators are unbiased, they are inefficient, and inferences using t and F tests are no longer valid in small samples.

LPM: Strengths and Shortcomings


What do we need?
We need a (probability) model that has these two features:
(1) the probability never goes above 1 or below 0; and
(2) the slope of the curve must diminish as the probability gets closer to one, i.e. a nonlinear model.

Nonlinear models!

(II). Index Models


Index Models restrict the way in
which the explanatory variables
affect the dependent variable.
Index models use link functions
to transform discrete valued
variables (i.e. nominal or ordinal
variables) into continuous variables
that can be better modelled using
regression.

Back to our house ownership example:
Y is discrete, taking on the values 0 or 1 (if someone buys a home, for instance).
- We can imagine a continuous variable Y* that reflects a person's desire to buy the home.
- Y* would vary continuously with some explanatory variable like income.

Index Models
General form of the models:
Pr(Y = 1 | X) = G(β0 + β1X)
where G is usually a cumulative distribution function (CDF).
Through a link function, a variable which is in {0, 1} can be transformed into a continuous variable.
Two widely used link functions:
- In logit models: G is the CDF of the logistic distribution.
- In probit models: G is the CDF of the standard normal distribution.
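Both link functions can be written with the standard library alone (the normal CDF via `math.erf`). A sketch of the general form Pr(Y = 1 | X) = G(β0 + β1X), with hypothetical coefficients:

```python
import math

def logistic_cdf(z):
    """Logit link: CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit link: CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Either G maps the linear index b0 + b1*X from the real line into (0, 1).
b0, b1 = -4.0, 0.3   # hypothetical coefficients
for x in (0, 10, 20, 30):
    z = b0 + b1 * x
    print(x, round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

Whatever the value of the index, both G functions return a value strictly between 0 and 1, which is feature (1) demanded above.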

Index Models
Since Y* is unobserved, we do not know the distribution of the errors.
In order to use maximum likelihood estimation (ML), we need to make some assumption about the distribution of the errors.
The difference between logit and probit models lies in this assumption about the distribution of the errors.
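For intuition about what ML estimation does here, a minimal Newton-Raphson sketch for the logit case, stdlib only; the tiny data set is hypothetical (and deliberately not perfectly separable, so the MLE exists):

```python
import math

def logit_mle(x, y, iters=25):
    """Fit P(y=1|x) = 1/(1+exp(-(b0+b1*x))) by Newton-Raphson
    maximum likelihood (single regressor plus intercept)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# hypothetical micro data: 1 = owns a house
x = [2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = logit_mle(x, y)
```

A probit ML routine has the same structure; only the assumed CDF (and hence the gradient and weights) changes.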

Index Models
Probit: standard normal distribution.
Logit: standard logistic distribution.

The logit model

LPM vs. logit: how it works
In the logit model the probability Pi that the observed value Y takes the value 1 is
Pi = E(Y = 1 | Xi) = 1 / (1 + e^-(β0 + β1Xi))

Are the Pi produced by the logit now limited to 0 and 1?
- If β0 + β1Xi → +∞, then Pi = 1 / (1 + e^-∞) = 1/1 = 1.
- If β0 + β1Xi → -∞, then Pi = 1 / (1 + e^∞) = 0.
(Figure: P_LPM and P_LOGIT plotted against income, X.)

The logit model

Estimation of the logit model
Probability of yes:
Pi = E(Y = 1 | Xi) = 1 / (1 + e^-(β0 + β1Xi))
Probability of no:
1 - Pi = 1 / (1 + e^(β0 + β1Xi))

One possibility is to convert this nonlinear function into a linear regression function.
Note that the odds ratio is:
Pi / (1 - Pi) = [1 + e^(β0 + β1Xi)] / [1 + e^-(β0 + β1Xi)] = e^(β0 + β1Xi)
Log-odds ratio: by taking the natural log of the odds ratio we get
ln[Pi / (1 - Pi)] = β0 + β1Xi + ui
so that the 'log-odds ratio' is a linear function of Xi, but the probability is still a nonlinear function of Xi.
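The identity above is easy to verify numerically; here using the coefficient estimates reported later in this example:

```python
import math

b0, b1 = -12.6562, 0.8342   # logit estimates from the example

for x in (10, 15, 20, 25):
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    log_odds = math.log(p / (1.0 - p))
    # the log of the odds ratio recovers the linear index exactly
    assert abs(log_odds - (b0 + b1 * x)) < 1e-9
```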

The logit model

Can the logit model be estimated using OLS?
Infeasible if we are dealing with individual, or micro, level data (we need maximum likelihood).
However, for grouped or replicated data it is possible, but beware of the heteroscedasticity problem (use WLS).
Example: corresponding to each income level, Xi, there are Ni families, ni among whom are home owners (ni ≤ Ni).
[See example 15.7 in Gujarati and Porter (2005) for further details.]

Table 1: Hypothetical grouped data

X ($'000)   Ni    ni
 6          40     8
 8          50    12
10          60    18
13          80    28
15         100    45
20          70    36
25          65    39
30          50    33
35          40    30
40          25    20
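With grouped data, the empirical log-odds can be regressed on X by weighted least squares, with weights Ni·p̂i·(1−p̂i) to correct the heteroscedasticity. A stdlib sketch using the table above:

```python
import math

# (X in $'000, Ni families, ni owners) from the grouped-data table
rows = [(6, 40, 8), (8, 50, 12), (10, 60, 18), (13, 80, 28),
        (15, 100, 45), (20, 70, 36), (25, 65, 39), (30, 50, 33),
        (35, 40, 30), (40, 25, 20)]

xs, Ls, ws = [], [], []
for x, N, n in rows:
    p = n / N                          # empirical probability of owning
    xs.append(x)
    Ls.append(math.log(p / (1 - p)))   # empirical log-odds
    ws.append(N * p * (1 - p))         # WLS weight (inverse error variance)

# closed-form weighted least squares for one regressor
W = sum(ws)
xw = sum(w * x for w, x in zip(ws, xs)) / W
Lw = sum(w * L for w, L in zip(ws, Ls)) / W
b1 = sum(w * (x - xw) * (L - Lw) for w, x, L in zip(ws, xs, Ls)) / \
     sum(w * (x - xw) ** 2 for w, x in zip(ws, xs))
b0 = Lw - b1 * xw
```

This works only because every group probability is strictly between 0 and 1; a group with p̂ = 0 or 1 would make the log-odds undefined, which is one reason micro data require ML instead.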

The logit model

Features of logit
1. Whereas the LPM assumes that Pi is linearly related to Xi, the logit model assumes that the log of the odds ratio is linearly related to Xi.
2. The logit goes from -∞ to +∞, but Pi goes from 0 to 1.
3. The logit is linear in Xi, but the probabilities themselves are not.
(Figures: LOG_ODDS, log of odds ratio against X ($'000); P_LOGIT, probability against X ($'000).)

The logit model

If the logit is positive, it means that when the value of the regressor(s) increases, the odds that the regressand equals 1 (meaning some event of interest happens) increase (Y↑ as X↑).
If the logit is negative, it means that when the value of the regressor(s) increases, the odds that the regressand equals 1 decrease (Y↓ as X↑).

The logit model


Marginal effect: a complication arises in interpreting the estimated βs.
With a linear probability model, a β estimate measures the ceteris paribus effect of a change in the explanatory variable on the probability that Y equals 1.
In the logit model, the derivative is nonlinear and depends on the value of X:
∂Pi/∂Xi = β1 Pi (1 - Pi)
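Evaluating the derivative at different income levels makes the point concrete, using the coefficient estimates reported in this example:

```python
import math

b0, b1 = -12.6562, 0.8342   # logit estimates from the example

def marginal_effect(x):
    """dP/dX = b1 * P * (1 - P), evaluated at income x ($'000)."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return b1 * p * (1.0 - p)

# the effect varies with X and peaks where P = 0.5, i.e. at X = -b0/b1
for x in (10, 15, -b0 / b1, 20, 25):
    print(round(x, 2), round(marginal_effect(x), 4))
```

The maximum possible marginal effect is β1/4, reached where the predicted probability equals one half.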


The logit model

Forecasting
The probability of owning a house for a given level of income, e.g. for X = 12 ($12,000), the estimated probability of owning a house is
Pi = E(Y = 1 | Xi) = 1 / (1 + e^-(β0 + β1Xi))
   = 1 / (1 + e^-(-12.6562 + 0.8342(12)))
   = 0.0646
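This forecast can be reproduced in one line (with the rounded coefficients the result is about 0.066; the slide reports 0.0646, the small gap reflecting rounding of the coefficients):

```python
import math

b0, b1 = -12.6562, 0.8342   # logit estimates from the example
p = 1.0 / (1.0 + math.exp(-(b0 + b1 * 12)))
print(round(p, 4))
```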

The Probit Model


In the probit model, we assume the
error in the utility index model is
normally distributed, ui

N(0,2)

Where F is the standard normal


cumulative density function (c.d.f.)

26

The Probit Model

The CDFs of the logit and the probit look quite similar.
Marginal effect: calculating the derivative is moderately complicated:
∂Pi/∂Xi = φ(β0 + β1Xi) β1
where φ is the density function of the standard normal distribution.

The Probit Model

Forecasting
The probability of owning a house for a given level of income, e.g. for X = 12 ($12,000), the estimated probability of owning a house is
P(Y = 1 | X) = P(Z ≤ β0 + β1X)
             = P(Z ≤ -6.6105 + 0.4323(12))
             = P(Z ≤ -1.4229)
             = 0.0764
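The probit forecast can be reproduced with the standard normal CDF written via `math.erf` (this gives roughly 0.077; small discrepancies with the reported 0.0764 come from rounding):

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b0, b1 = -6.6105, 0.4323   # probit estimates from the example
z = b0 + b1 * 12           # linear index, about -1.4229
p = normal_cdf(z)
print(round(z, 4), round(p, 4))
```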

Model Evaluation

Goodness of fit:
McFadden R-squared = 1 - lnL_UR / lnL_R

Overall significance:
Likelihood ratio (LR) test:
LR = 2(lnL_UR - lnL_R)
where lnL_UR is the log-likelihood of the full (unrestricted) model and lnL_R is the log-likelihood of the restricted model.
The test statistic follows a chi-squared distribution, with degrees of freedom equal to the number of restrictions.
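Both statistics are simple to compute once the two log-likelihoods are in hand; the values below are hypothetical, purely for illustration:

```python
# Likelihood ratio test and McFadden R-squared from two fitted
# log-likelihoods (hypothetical values for illustration).
lnL_ur = -10.2   # unrestricted (full) model
lnL_r = -15.0    # restricted (e.g. intercept-only) model

lr = 2 * (lnL_ur - lnL_r)          # LR statistic
mcfadden_r2 = 1 - lnL_ur / lnL_r   # McFadden R-squared

# chi-squared critical value at the 5% level with 1 degree of freedom
CHI2_CRIT_1DF = 3.841
print(lr, round(mcfadden_r2, 2), lr > CHI2_CRIT_1DF)
```

With these hypothetical values LR = 9.6 > 3.841, so the restriction would be rejected at the 5% level.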
