Outline
Logit and probit models for binary dependent variables
Tobit model for corner solutions
Why do we care?
Let's start with a review of the linear probability model (LPM) to examine some of its shortcomings. The model is given by:
$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$$
where
$$P(y = 1 \mid x) = E(y \mid x) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$
We can get predictions that are either greater than 1 or less than 0! The independent variables cannot be linearly related to the dependent variable for all possible values.
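The first shortcoming is easy to see numerically. The following is a minimal sketch on made-up data (the sample, the helper `ols_fit`, and the single regressor are all assumptions for illustration, not part of the course example): an OLS fit to a binary outcome produces fitted "probabilities" below 0 at one end of the range and above 1 at the other.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: y switches from 0 to 1 once x reaches 5.
x = list(range(10))
y = [1 if xi >= 5 else 0 for xi in x]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]

# The LPM fitted values at the extremes fall outside the unit interval.
print(f"fitted P(y=1 | x=0) = {fitted[0]:.3f}")
print(f"fitted P(y=1 | x=9) = {fitted[-1]:.3f}")
```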
3.
Latent variable model:
$$y^* = \beta_0 + x\beta + e, \qquad y = 1[y^* > 0]$$
We assume that e is independent of x and that e has either the standard logistic distribution or the standard normal distribution. Under either assumption e is symmetrically distributed about 0, which implies that $1 - G(-z) = G(z)$ for all real numbers z. The response probability is then:
$$P(y = 1 \mid x) = P(y^* > 0 \mid x) = P(\beta_0 + x\beta + e > 0 \mid x) = P(e > -(\beta_0 + x\beta) \mid x) = 1 - G(-(\beta_0 + x\beta)) = G(\beta_0 + x\beta)$$
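The symmetry property used in the last step can be checked directly. This is a small sketch using only the Python standard library; the function names `logistic_cdf` and `normal_cdf` are chosen here for illustration.

```python
import math

def logistic_cdf(z):
    """Standard logistic cdf: the G used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal cdf: the G used by the probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Compare 1 - G(-z) with G(z) at a few arbitrary points.
points = [-2.0, -0.5, 0.0, 1.3]
gaps = {}
for name, G in (("logit", logistic_cdf), ("probit", normal_cdf)):
    gaps[name] = max(abs((1.0 - G(-z)) - G(z)) for z in points)
    print(name, "largest gap between 1 - G(-z) and G(z):", gaps[name])
```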
$$G(\beta_0 + \beta_1 (c + 1) + \beta_2 x_2 + \dots + \beta_k x_k) - G(\beta_0 + \beta_1 c + \beta_2 x_2 + \dots + \beta_k x_k)$$
Again, this effect depends on x. Note, however, that the sign of $\beta_1$ is enough to know whether the discrete variable has a positive or negative effect. This is because G(·) is strictly increasing.
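To make both points concrete, here is a hedged sketch with hypothetical probit coefficients (`beta0`, `beta1` and the values of the remaining index are invented for illustration). The size of the discrete effect changes with the other regressors and with the starting value c, but its sign always matches the sign of `beta1`.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def discrete_effect(G, beta0, beta1, c, other_index):
    """Change in P(y=1|x) when x1 moves from c to c + 1.

    other_index stands for beta2*x2 + ... + betak*xk, held fixed.
    """
    return (G(beta0 + beta1 * (c + 1) + other_index)
            - G(beta0 + beta1 * c + other_index))

beta0, beta1 = 0.5, -0.9  # hypothetical values, not estimates from the slides

effects = {}
for other_index in (-1.0, 0.0, 1.0):
    for c in (0, 1):
        effects[(other_index, c)] = discrete_effect(
            normal_cdf, beta0, beta1, c, other_index)
        print(f"other index {other_index:+.1f}, x1 from {c} to {c + 1}: "
              f"{effects[(other_index, c)]:+.4f}")
```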
$$f(y \mid x_i; \beta) = [G(x_i\beta)]^{y}\,[1 - G(x_i\beta)]^{1 - y}, \qquad y = 0, 1$$
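Taking logs of this density and summing over observations gives the log-likelihood that probit and logit estimation maximizes. A minimal sketch, assuming each row of `X` carries a leading 1 for the intercept; the data and the trial coefficient vectors are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, X, y, G):
    """Sum over i of y_i*log G(x_i beta) + (1 - y_i)*log[1 - G(x_i beta)]."""
    total = 0.0
    for xi, yi in zip(X, y):
        index = sum(b * v for b, v in zip(beta, xi))
        p = G(index)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical sample: first column is the constant term.
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 2.1]]
y = [0, 1, 0, 1]

ll_zero = log_likelihood([0.0, 0.0], X, y, logistic_cdf)   # every p = 0.5
ll_alt = log_likelihood([-1.0, 1.5], X, y, logistic_cdf)   # a better fit
print("log-likelihood at beta = 0:      ", ll_zero)
print("log-likelihood at trial beta:    ", ll_alt)
```

A maximum likelihood routine searches over `beta` for the largest value of this function; the sketch only evaluates it at two trial points.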
Logit and Probit Models for Binary Response: Testing Multiple Hypotheses
We can also test multiple exclusion restrictions (i.e., that two or more regression parameters are equal to 0). There are two options commonly used:
A Wald test
A likelihood ratio test
Wald test:
In the linear model, the Wald statistic can be transformed to be essentially the same as the F statistic. The formula can be found in Wooldridge (2002, Chapter 15). It has an asymptotic chi-squared distribution, with degrees of freedom equal to the number of restrictions being tested. In Stata we can use the test command following probit or logit estimation.
Likelihood ratio (LR) test:
If both the restricted and unrestricted models are easy to estimate (as is the case when testing exclusion restrictions), then the LR test is very attractive. It is based on the difference in the log-likelihood functions for the unrestricted and restricted models:
$$LR = 2(L_{ur} - L_r)$$
Because the MLE maximizes the log-likelihood function, dropping variables generally leads to a smaller (or at least no larger) log-likelihood, in much the same way that dropping variables in a linear model leads to a smaller $R^2$.
The LR statistic is asymptotically chi-squared with degrees of freedom equal to the number of restrictions. We can use lrtest in Stata.
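As a worked illustration of the LR statistic, the sketch below computes LR and its asymptotic p-value. The two log-likelihood values and the number of restrictions are hypothetical, and `chi2_sf` is a small helper written here (a series expansion of the incomplete gamma function, adequate for moderate test statistics) so that the example needs only the standard library.

```python
import math

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (L_ur - L_r)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

def chi2_sf(stat, df, terms=500):
    """P(chi-squared with df degrees of freedom > stat), via the series
    for the regularized lower incomplete gamma function."""
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = 1.0 / math.gamma(a + 1.0)
    series = term
    for n in range(1, terms):
        term *= x / (a + n)
        series += term
    lower = math.exp(-x + a * math.log(x)) * series
    return 1.0 - lower

# Hypothetical log-likelihoods from an unrestricted and a restricted probit,
# where the restricted model drops two regressors.
loglik_ur, loglik_r, restrictions = -401.3, -405.8, 2

lr = lr_statistic(loglik_ur, loglik_r)
p_value = chi2_sf(lr, restrictions)
print(f"LR = {lr:.2f}, p-value = {p_value:.4f}")
```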
Logit and Probit Models for Binary Response: Interpreting Probit and Logit Estimates
Recall that, unlike in the linear probability model, the estimated coefficients from probit or logit estimation do not tell us the magnitude of the partial effect of a change in an independent variable on the predicted probability. The partial effect depends not just on the coefficient estimates, but also on the values of all the independent variables.
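For roughly continuous variables the marginal effect is approximately:
$$\Delta \hat{P}(y = 1 \mid x) \approx \left[ g(\hat\beta_0 + x\hat\beta)\,\hat\beta_j \right] \Delta x_j$$
For discrete variables the estimated change in the predicted probability is given by:
$$G(\hat\beta_0 + \hat\beta_1 (c + 1) + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k) - G(\hat\beta_0 + \hat\beta_1 c + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k)$$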
Thus, we need to pick interesting values of x at which to evaluate the partial effects.
Often the sample averages are used, which gives the partial effect at the average (PEA). We could also use lower or upper quartiles, for example, to see how the partial effects change as some elements of x get large or small. If $x_k$ is a binary variable, then it often makes sense to use a value of 0 or 1 in the partial effect equation, rather than the average value of $x_k$.
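An alternative approach is to calculate the average partial effect (APE). For a continuous explanatory variable, $x_j$, the APE is:
$$n^{-1} \sum_{i=1}^{n} \left[ g(\hat\beta_0 + x_i\hat\beta)\,\hat\beta_j \right] = \left[ n^{-1} \sum_{i=1}^{n} g(\hat\beta_0 + x_i\hat\beta) \right] \hat\beta_j$$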
The two scale factors (at the mean for PEA and averaged over the sample for the APE) differ since the first uses a nonlinear function of the average and the second uses the average of a nonlinear function
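The difference between the two scale factors can be seen in a few lines. This is a sketch for a probit with one regressor; the coefficient values and the sample of x values are hypothetical, and `normal_pdf` plays the role of g.

```python
import math

def normal_pdf(z):
    """Standard normal density: g for the probit model."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Hypothetical probit estimates and a hypothetical sample of x values.
beta0_hat, beta1_hat = -0.5, 0.8
x_sample = [-2.0, -1.0, 0.0, 0.5, 1.0, 3.0]
n = len(x_sample)

# PEA: a nonlinear function evaluated at the average of x.
x_bar = sum(x_sample) / n
pea_scale = normal_pdf(beta0_hat + beta1_hat * x_bar)

# APE: the average of the nonlinear function over the sample.
ape_scale = sum(normal_pdf(beta0_hat + beta1_hat * xi) for xi in x_sample) / n

pea = pea_scale * beta1_hat
ape = ape_scale * beta1_hat
print(f"PEA scale factor = {pea_scale:.4f}, PEA = {pea:.4f}")
print(f"APE scale factor = {ape_scale:.4f}, APE = {ape:.4f}")
```

In this made-up sample the x values are spread widely around their mean, so the density averaged over observations is noticeably smaller than the density at the average.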
Example 17.1
Coefficient estimates (standard errors in parentheses; robust standard errors for OLS)

Independent variable      OLS                 Probit             Logit
Husband's income          -0.0034 (0.0015)    -0.012 (0.005)     -0.021 (0.008)
Years of education         0.038  (0.007)      0.131 (0.025)      0.221 (0.043)
Age                       -0.016  (0.002)     -0.053 (0.008)     -0.088 (0.014)
# kids <= 6 years old     -0.262  (0.032)     -0.868 (0.119)     -1.44  (0.20)
# kids > 6 years old       0.013  (0.014)      0.036 (0.043)      0.060 (0.075)
True or false:
The Probit and Logit model estimates suggest that the linear probability model was underestimating the negative impact of having young children on the probability of women participating in the labour force.
How does the predicted probability change as the number of young children increases from 0 to 1? What about from 1 to 2?
We'll evaluate the effects at:
Husband's income = 20.13
Education = 12.3
Experience = 10.6
Age = 42.5
# older children = 1
From the probit estimates:
Going from 0 to 1 small child decreases the probability of labour force participation by 0.334.
Going from 1 to 2 small children decreases the probability of labour force participation by 0.256.
Notice that the impact of one extra child is now nonlinear (there is a diminishing impact). This differs from the linear probability model, which says that any increase of one young child has the same impact.
We now need to think about how to estimate this model. There are two cases to consider:
When y = 0
When y > 0
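The two cases translate into two kinds of contribution to the log-likelihood. The sketch below assumes the standard Tobit setup, $y^* = x\beta + u$ with $u \mid x \sim \text{Normal}(0, \sigma^2)$ and $y = \max(0, y^*)$; that setup is an assumption here, since the model definition is not reproduced in this part of the slides. The data and parameter values are hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_log_likelihood(beta, sigma, X, y):
    """Tobit log-likelihood for a corner at zero (assumed standard setup)."""
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        if yi == 0:
            # Case y = 0: contribution is P(y* <= 0 | x) = 1 - Phi(x beta / sigma).
            total += math.log(1.0 - normal_cdf(xb / sigma))
        else:
            # Case y > 0: contribution is the normal density of y given x.
            total += math.log(normal_pdf((yi - xb) / sigma) / sigma)
    return total

# Hypothetical data: first column is the constant term.
X = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0], [1.0, 3.0]]
y = [0.0, 1.8, 0.0, 3.5]

loglik = tobit_log_likelihood([0.2, 1.0], 1.0, X, y)
print("Tobit log-likelihood at the trial parameters:", loglik)
```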
Definition of y Definition of y*
Take home message: Conditional expectations in the Tobit model are much more complicated than in the linear model. E(y|x) is a nonlinear function of both x and $\beta$. Moreover, this conditional expectation can be shown to be positive for any values of x and $\beta$.
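$$\frac{\partial E(y \mid y > 0, x)}{\partial x_j} = \beta_j \left\{ 1 - \lambda(x\beta/\sigma) \left[ x\beta/\sigma + \lambda(x\beta/\sigma) \right] \right\}$$
$$\frac{\partial E(y \mid x)}{\partial x_j} = \beta_j \, \Phi(x\beta/\sigma)$$
where $\lambda(c) = \phi(c)/\Phi(c)$ is the inverse Mills ratio.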
Like in probit or logit models, the partial effect will depend on all explanatory variables and parameters
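The two Tobit scale factors can be checked against numerical derivatives of the conditional expectations. A sketch, assuming the standard expressions $E(y \mid x) = \Phi(x\beta/\sigma)\,x\beta + \sigma\phi(x\beta/\sigma)$ and $E(y \mid y > 0, x) = x\beta + \sigma\lambda(x\beta/\sigma)$; the index value `xb` and `sigma` are hypothetical.

```python
import math

def phi(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def Phi(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def inv_mills(c):
    return phi(c) / Phi(c)

def expected_y(xb, sigma):
    """E(y|x) in the Tobit model."""
    c = xb / sigma
    return Phi(c) * xb + sigma * phi(c)

def expected_y_positive(xb, sigma):
    """E(y|y>0, x) in the Tobit model."""
    return xb + sigma * inv_mills(xb / sigma)

def scale_unconditional(xb, sigma):
    """Factor multiplying beta_j in dE(y|x)/dx_j."""
    return Phi(xb / sigma)

def scale_conditional(xb, sigma):
    """Factor multiplying beta_j in dE(y|y>0,x)/dx_j."""
    c = xb / sigma
    lam = inv_mills(c)
    return 1.0 - lam * (c + lam)

xb, sigma, h = 0.7, 1.5, 1e-6  # hypothetical index, sigma, and step size

# Central differences with respect to the index x*beta.
num_uncond = (expected_y(xb + h, sigma) - expected_y(xb - h, sigma)) / (2 * h)
num_cond = (expected_y_positive(xb + h, sigma)
            - expected_y_positive(xb - h, sigma)) / (2 * h)

print("unconditional:", scale_unconditional(xb, sigma), "numerical:", num_uncond)
print("conditional:  ", scale_conditional(xb, sigma), "numerical:", num_cond)
```

Multiplying either factor by $\beta_j$ gives the corresponding partial effect of $x_j$; both factors lie strictly between 0 and 1.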
$$E(y \mid x) = \Phi(x\beta/\sigma)\, x\beta + \sigma\, \phi(x\beta/\sigma)$$
We can use this to answer the following question: What is the impact of moving from 0 to 1 young children on the total number of hours worked? We'll evaluate for a hypothetical person close to the mean values:
Husband's income: 20.12896
Education: 12
Experience: 11
Age: 43
# older children: 1
Specification Issues
The Tobit model relies on the assumptions of normality and homoskedasticity in the latent variable model. Recall that, using OLS, we did not need to assume a distributional form for the error term in order to have unbiased (or consistent) estimates of the parameters. Thus, although using Tobit may provide us with a more realistic description of the data (for example, no negative predicted values), we have to make stronger assumptions than when using OLS. In a Tobit model, if any of the assumptions fail, it is hard to know what the estimated coefficients mean.
One important limitation of Tobit models is that the expectation of y, conditional on a positive value, is closely linked to the probability that y > 0. The effect of $x_j$ on P(y>0|x) is proportional to $\beta_j$, as is the effect on E(y|y>0,x). Moreover, for both expressions the factor multiplying $\beta_j$ is positive. Thus, if you want a model where an explanatory variable has opposite effects on P(y>0|x) and E(y|y>0,x), then Tobit is inappropriate. One way to informally evaluate a Tobit model is to estimate a probit model where:
w = 1 if y > 0
w = 0 if y = 0
The coefficient on $x_j$ in the above probit model, say $\gamma_j$, is directly related to the coefficient on $x_j$ in the Tobit model, $\beta_j$:
$$\gamma_j = \beta_j / \sigma$$
Thus, we can check whether the probit estimate of $\gamma_j$ differs from the Tobit estimate of $\beta_j / \sigma$.
For example, if the estimates differ in sign, this may suggest that the Tobit model is inappropriate.
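The informal check amounts to simple arithmetic on the two sets of estimates. A sketch with invented numbers (the variable names `x1`, `x2` and all estimates are hypothetical, and the helper names are chosen here): divide each Tobit coefficient by the Tobit estimate of sigma and compare with the probit coefficient.

```python
def implied_probit_coefs(tobit_beta, tobit_sigma):
    """Tobit-implied probit coefficients: beta_j / sigma."""
    return {name: b / tobit_sigma for name, b in tobit_beta.items()}

def sign_conflicts(probit_gamma, tobit_beta, tobit_sigma):
    """Names of regressors whose probit and Tobit-implied coefficients
    have opposite signs."""
    implied = implied_probit_coefs(tobit_beta, tobit_sigma)
    return [name for name in probit_gamma
            if probit_gamma[name] * implied[name] < 0]

# Hypothetical estimates, for illustration only.
tobit_beta = {"x1": 80.0, "x2": -900.0}
tobit_sigma = 1100.0
probit_gamma = {"x1": 0.07, "x2": 0.10}

implied = implied_probit_coefs(tobit_beta, tobit_sigma)
conflicts = sign_conflicts(probit_gamma, tobit_beta, tobit_sigma)
for name in probit_gamma:
    print(f"{name}: probit {probit_gamma[name]:+.3f}, "
          f"Tobit-implied {implied[name]:+.3f}")
print("sign conflicts:", conflicts)
```

This is only an informal comparison: sampling error means the two sets of numbers will never match exactly, so large discrepancies or sign reversals on important variables are what to look for.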
If we find evidence that the Tobit model is inappropriate, we can use hurdle or two-part models. These models have the feature that P(y>0|x) and E(y|y>0,x) depend on different parameters, and thus $x_j$ can have dissimilar effects on the two functions (see Wooldridge (2002, Chapter 16)).
Practice questions
17.2, 17.3 C17.1, C17.2, C17.3
As there is only one explanatory variable and it takes only two values, there are only two different predicted probabilities: the estimated loan approval probabilities for white and nonwhite applicants. Hence, the predicted probabilities, whether we use a probit, logit, or LPM model, are simply the cell frequencies:
0.708 for nonwhite applicants
0.908 for white applicants
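This can be verified by backing out the coefficients each model must have in order to reproduce the two cell frequencies. A sketch using only the two frequencies quoted above; the variable names are chosen here, and the inverse cdf comes from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist

p_nonwhite, p_white = 0.708, 0.908  # cell frequencies quoted in the text
std_normal = NormalDist()

def log_odds(p):
    return math.log(p / (1.0 - p))

# (intercept, coefficient on white) implied by each model.
lpm = (p_nonwhite, p_white - p_nonwhite)
probit = (std_normal.inv_cdf(p_nonwhite),
          std_normal.inv_cdf(p_white) - std_normal.inv_cdf(p_nonwhite))
logit = (log_odds(p_nonwhite), log_odds(p_white) - log_odds(p_nonwhite))

# Predicted approval probability for white applicants under each model.
lpm_white = lpm[0] + lpm[1]
probit_white = std_normal.cdf(probit[0] + probit[1])
logit_white = 1.0 / (1.0 + math.exp(-(logit[0] + logit[1])))

for name, coefs, pred in (("LPM", lpm, lpm_white),
                          ("probit", probit, probit_white),
                          ("logit", logit, logit_white)):
    print(f"{name}: intercept {coefs[0]:.3f}, white {coefs[1]:.3f}, "
          f"predicted P(approve | white) = {pred:.3f}")
```

The coefficients differ across the three models because they are on different scales, but the predicted probabilities coincide.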
Partial Effect at the Average White 0.106 (0.024) 0.097 (0.022) 0.129 (0.020)