Lecture 9
Models for Censored and Truncated Data – Truncated Regression and Sample Selection
[Figure: scatter of (x, y) with y censored at 5, and the pdf of y*, f(y*), split at the censoring point into Prob(y* < 5) and Prob(y* > 5).]
• The pdf of the observable variable, y, is a mixture of a discrete part (probability mass at y = 5) and a continuous part (over Prob[y* > 5]).
RS – EC 2: Lecture 9
[Figure: pdf of y* with truncation at 5 – the region Prob(y* < 5) is discarded and only Prob(y* > 5) remains; scatter of (x, y) for the truncated sample.]
Truncated regression
• Truncated regression differs from censored regression in the following way:
- Censored regressions: the dependent variable may be censored, but the censored observations can still be included in the regression.
- Truncated regressions: a subset of observations is dropped entirely, so only the truncated sample is available for the regression.
• CASES:
(A-1) Sample selection is done randomly:
s is independent of ε and x => E(ε|x, s) = E(ε|x).
Since the CLM assumptions are satisfied, E(ε|x) = 0
=> OLS is unbiased.
• For example, suppose the sample is selected only if income is at most 150K:
E(ε|x, s=1) = E(ε|x, y ≤ 150K)
= E(ε|x, β0 + β1x + ε ≤ 150K)
= E(ε|x, ε ≤ 150K − β0 − β1x)
≠ E(ε|x) = 0 => OLS is biased.
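The contrast between the two cases can be illustrated with a short simulation: random selection leaves the OLS slope unbiased, while selecting on y (keeping only observations below a cutoff, as in the 150K example) attenuates it. The coefficients and cutoff below are illustrative, not from the lecture's data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(10, 2, n)          # e.g., years of education
eps = rng.normal(0, 20, n)
y = 10 + 8 * x + eps              # true model: beta0 = 10, beta1 = 8

def ols_slope(x, y):
    # slope of a simple OLS regression of y on x
    return np.cov(x, y)[0, 1] / np.var(x)

# (A-1) random selection: keep a random 30% of observations
keep = rng.random(n) < 0.3
b_random = ols_slope(x[keep], y[keep])

# selection on y: keep only observations below a cutoff (truncation from above)
cut = y < 100
b_trunc = ols_slope(x[cut], y[cut])

print(b_random)  # close to the true slope 8
print(b_trunc)   # noticeably attenuated toward 0
```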
• CASE (A-2) can be more complicated: the selection rule is based on the x-variable but also involves a random error that may be correlated with εi.
Example: x is IQ. A survey participant responds if IQ > v.
Now the sample selection is based on the x-variable and a random error, v.
Q: If we run OLS using only the truncated data, will it cause a bias?
Two cases:
- (1) If v is independent of ε, it does not cause a bias.
- (2) If v is correlated with ε, this is the same situation as case (B-2): OLS will be biased.
Truncated Regression
• Data truncation is case (B-1): the truncation is based on the y-variable.
• We know that OLS on the truncated data is biased. The model that produces unbiased estimates is based on ML estimation.
Truncated Regression
[Figure: Wealth vs. Education. Observations with wealth above 150K are dropped from the data; applying OLS to the truncated data gives a biased regression line relative to the true regression.]
f(εi | yi < ci) = f(εi | εi < ci − xi'β) = f(εi) / P(εi < ci − xi'β)
              = f(εi) / P(εi/σ < (ci − xi'β)/σ) = f(εi) / Φ((ci − xi'β)/σ)
              = [1/Φ((ci − xi'β)/σ)] · (1/√(2πσ²)) exp(−εi²/(2σ²))
              = (1/σ) φ(εi/σ) / Φ((ci − xi'β)/σ)
Truncated Normal
• Moments:
Let y* ~ N(µ*, σ²) and α = (c − µ*)/σ.
- First moment:
E[y*|y* > c] = µ* + σλ(α) <= This is the truncated regression.
=> If the truncation is from below –i.e., λ(α) > 0–, the mean of the truncated variable is greater than the original mean.
Note: For the standard normal distribution, λ(α) is the mean of the truncated distribution.
- Second moment:
Var[y*|y* > c] = σ²[1 − δ(α)], where δ(α) = λ(α)[λ(α) − α]
=> Truncation reduces variance! This result is general: it applies to upper or lower truncation, given that 0 ≤ δ(α) ≤ 1.
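Both moment formulas can be checked numerically against `scipy.stats.truncnorm`; the parameter values below are arbitrary illustrations.

```python
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma, c = 1.0, 2.0, 0.5      # y* ~ N(mu, sigma^2), truncated from below at c
alpha = (c - mu) / sigma

# inverse Mills ratio for truncation from below: lambda(alpha) = phi(alpha)/(1 - Phi(alpha))
lam = norm.pdf(alpha) / (1 - norm.cdf(alpha))
delta = lam * (lam - alpha)

mean_formula = mu + sigma * lam            # E[y*|y* > c] = mu + sigma*lambda(alpha)
var_formula = sigma**2 * (1 - delta)       # Var[y*|y* > c] = sigma^2 [1 - delta(alpha)]

# the same moments from scipy's truncated normal
a, b = (c - mu) / sigma, np.inf
mean_scipy = truncnorm.mean(a, b, loc=mu, scale=sigma)
var_scipy = truncnorm.var(a, b, loc=mu, scale=sigma)

print(mean_formula, mean_scipy)   # agree
print(var_formula, var_scipy)     # agree; variance is below sigma^2 = 4
```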
Truncated Normal
Model: yi* = Xiβ + εi
Data: y = y* | y* > 0
[Figure: densities f(y*|X) and the truncated density f(y|y* > 0, X); the area F(0|X) below zero is removed, shifting the mean from Xiβ to Xiβ + σλ.]
• Truncated (from below –i.e., y*>0) regression model:
E(yi| yi* > 0,Xi) = Xiβ + σλi > E(yi|Xi)
Li = (1/σ) φ((yi − xi'β)/σ) / Φ(xi'β/σ)

• The log-likelihood function is given by

Log L(β, σ) = Σi log Li = −(N/2)[log(2π) + log(σ²)] − (1/(2σ²)) Σi εi²   <- log(joint density of N values of y*)
              − Σi log Φ(xi'β/σ)                                         <- log(joint probability of y* > 0)
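A minimal sketch of maximizing this log-likelihood with `scipy.optimize` on simulated data (truncation from below at 0, OLS starting values; the data-generating parameter values are illustrative, not the lecture's):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(1.0, 1.0, n)
y_star = 1.0 + 2.0 * x + rng.normal(0, 2.0, n)   # true beta = (1, 2), sigma = 2

# truncation from below: only observations with y* > 0 are in the sample
keep = y_star > 0
y = y_star[keep]
X = np.column_stack([np.ones(keep.sum()), x[keep]])

def neg_loglik(theta):
    b, log_s = theta[:2], theta[2]
    s = np.exp(log_s)                 # enforce sigma > 0
    e = y - X @ b
    # -log Li = 0.5*log(2*pi*s^2) + e^2/(2 s^2) + log Phi(x'b/s)
    return np.sum(0.5 * np.log(2 * np.pi * s**2) + e**2 / (2 * s**2)
                  + norm.logcdf(X @ b / s))

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # biased, but a fine starting value
res = minimize(neg_loglik, np.r_[b_ols, np.log(y.std())], method="BFGS")
b_hat, s_hat = res.x[:2], np.exp(res.x[2])
print(b_hat, s_hat)   # near (1, 2) and 2, unlike the OLS estimates
```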
∂E[yi | Xi, yi* > 0]/∂Xk,i = βk + ∂E[εi | yi* > 0]/∂Xk,i
                           = βk + σ ∂λi/∂Xk,i
                           = βk + σ (λi² − αiλi)(−βk/σ)
                           = βk [1 − (λi² − αiλi)] = βk(1 − δi)
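The attenuation factor (1 − δi) is easy to evaluate at given parameter values; the numbers below are illustrative (truncation from below at 0, so αi = −xi'β/σ):

```python
import numpy as np
from scipy.stats import norm

beta = np.array([1.0, 2.0])       # illustrative coefficients (constant, slope)
sigma = 2.0
xi = np.array([1.0, 0.5])         # one observation (constant, regressor)

xb_s = xi @ beta / sigma                  # x_i'beta / sigma
lam = norm.pdf(xb_s) / norm.cdf(xb_s)     # inverse Mills ratio, lambda_i
alpha_i = -xb_s                           # truncation at 0 => alpha_i = -x_i'beta/sigma
delta = lam * (lam - alpha_i)             # delta_i = lambda_i^2 - alpha_i*lambda_i

marginal_effect = beta[1] * (1 - delta)   # dE[y | y* > 0]/dx = beta_k (1 - delta_i)
print(marginal_effect)                    # smaller in magnitude than beta[1] = 2
```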
Model: yi = β0 + β1 xi + εi
yi: family income in JPY 10,000
xi: husband's education
• Three cases:
EX1. Use all observations to estimate the model.
EX2. Truncate the sample from above (keep y < 800), then run OLS on the truncated sample.
EX3. Run the truncated regression model on the data truncated from above.
Truncated regression
Limit: lower = -inf                       Number of obs  =      6274
       upper = 800                        Wald chi2(11)  =    569.90
Log likelihood = -39618.629               Prob > chi2    =    0.0000
• The bias caused by this type of truncation is called sample selection bias.
• This model involves two decisions: (1) participation and (2) amount.
It is a generalization of the Tobit Model.
Example: Age affects the decision to donate to charity. But it can have
a different effect on the amount donated. We may find that age has a
positive effect on the decision to donate, but given a positive donation,
younger individuals donate more than older individuals.
- Conditional expectation:
E[y|y>0,x] = xi’β + σ12 λ(xi’α)
- Unconditional Expectation
E[y|x] = Prob(y>0|x) * E[y|y>0, x] + Prob(y=0|x)* 0
= Prob(y>0|x) * E[y|y>0, x]
= Φ(xi’α) * [xi’β + σ12 λ(xi’α )]
- Expectations:
- Conditional expectation:
E[y|y>0, x] = xi'β + σ12 λ(zi'α/σ1)
- Unconditional expectation:
E[y|x] = Φ(zi'α/σ1) * [xi'β + σ12 λ(zi'α/σ1)]
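With illustrative values for xi'β, zi'α/σ1, and σ12 (none from the lecture's data), the two expectations are straightforward to evaluate, taking λ(·) = φ(·)/Φ(·):

```python
from scipy.stats import norm

xb = 1.5        # x_i'beta, illustrative
za_s1 = 0.8     # z_i'alpha / sigma_1, illustrative
sigma12 = 0.4   # cov(eps_1, eps_2), illustrative

lam = norm.pdf(za_s1) / norm.cdf(za_s1)   # inverse Mills ratio
cond = xb + sigma12 * lam                 # E[y | y > 0, x]
p_pos = norm.cdf(za_s1)                   # Prob(y > 0 | x)
uncond = p_pos * cond                     # E[y|x] = P(y>0)*E[y|y>0,x] + P(y=0)*0
print(cond, uncond)
```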
• Estimation:
- ML: complicated, but efficient.
- Two-step: easier, but not efficient, and the usual standard errors are not correct.
∂E[yi | yi > 0]/∂wik = βk − αk(ρσ1)δi = βk − αk(ρσ1)[λi² − (−zi'α/σ1)λi]
µ2|1 = µ2 + ρ(σ2/σ1)(x1 − µ1)

Note: µ2|1 = µ2 + [σ12/(σ1σ2)](σ2/σ1)(x1 − µ1) = µ2 + (σ12/σ1²)(x1 − µ1)
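The conditional-mean identity can be checked by simulation (arbitrary parameter values):

```python
import numpy as np

rng = np.random.default_rng(3)
mu1, mu2 = 1.0, -0.5
s1, s2, rho = 2.0, 1.5, 0.6
cov = rho * s1 * s2                       # sigma_12

# draw from the bivariate normal
Sigma = [[s1**2, cov], [cov, s2**2]]
x1, x2 = rng.multivariate_normal([mu1, mu2], Sigma, size=1_000_000).T

# E[x2 | x1 near a]: compare simulated mean with mu2 + (sigma12/sigma1^2)(a - mu1)
a = 2.0
band = np.abs(x1 - a) < 0.05
sim = x2[band].mean()
formula = mu2 + (cov / s1**2) * (a - mu1)
print(sim, formula)   # close
```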
(2) Observations with y > 0 –i.e., yi* = zi'α + ε1,i > 0 => Di = 1:
- contribution: f(y|Di=1, x, z) * Prob(Di=1|x, z)
(2.a) f(y|Di=1, x) = P(D=1|y, x) f(y|x) / P(D=1|x)   (Bayes' Rule)
- f(y|x) = (1/σ2) φ((yi − xi'β)/σ2)
• Problems:
- Not efficient (relative to MLE)
- Getting Var[b] is not easy (we are estimating α too).
We can test for sample selection bias (H0: ρ = 0) using the estimate of the coefficient on λ, bλ, from the second step of Heckman's two-step procedure.
Identification issue II
We are not sure about the functional form. We may not be comfortable
interpreting nonlinearities as evidence for endogeneity of the
covariates.
. *******************************
. *Next, estimate the probit *
. *selection equation *
. *******************************
. probit s educ exper expersq nwifeinc age kidslt6 kidsge6
             Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
educ      .1090655    .0156096   6.99    0.000     .0783835    .1397476
exper     .0438873    .0163534   2.68    0.008     .0117434    .0760313
expersq  -.0008591    .0004414  -1.95    0.052    -.0017267    8.49e-06
lambda    .0322619    .1343877   0.24    0.810    -.2318889    .2964126
_cons    -.5781032     .306723  -1.88    0.060    -1.180994     .024788
              Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
lwage
educ       .1090655     .015523   7.03    0.000     .0786411      .13949
exper      .0438873    .0162611   2.70    0.007     .0120163    .0757584
expersq   -.0008591    .0004389  -1.96    0.050    -.0017194    1.15e-06
_cons     -.5781032    .3050062  -1.90    0.058    -1.175904     .019698
s
educ       .1309047    .0252542   5.18    0.000     .0814074     .180402
exper      .1233476    .0187164   6.59    0.000     .0866641    .1600311
expersq   -.0018871       .0006  -3.15    0.002     -.003063   -.0007111
nwifeinc  -.0120237    .0048398  -2.48    0.013    -.0215096   -.0025378
age       -.0528527    .0084772  -6.23    0.000    -.0694678   -.0362376
kidslt6   -.8683285    .1185223  -7.33    0.000    -1.100628    -.636029
kidsge6     .036005    .0434768   0.83    0.408     -.049208    .1212179
_cons      .2700768     .508593   0.53    0.595    -.7267473    1.266901
mills
lambda     .0322619    .1336246   0.24    0.809    -.2296376    .2941613
rho         0.04861
sigma     .66362875
lambda    .03226186    .1336246

Note: H0: ρ = 0 cannot be rejected. There is little evidence that sample selection bias is present.