
W.J. Burke

Estimating and Interpreting Cragg's Tobit Alternative using Stata

Appendix: Stata log from example using Zambian fertilizer subsidy data
------------------------------------------------------------------------------
       log:  C:\~\craggit.log
  log type:  text
 opened on:  12 Feb 2009, 21:28:48
.
. */Load data: Re-write to match your own file structure
.
. clear
. set mem 8m
(8192k)
. use "C:\~\zam_fert_ex.dta", clear
.
. */Examine data for dependent variables
.
. describe
Contains data from C:\Documents and Settings\aec_user\Desktop\Craggit\zam_fert_ex.dta
  obs:         6,378
 vars:             7                          11 Feb 2009 14:41
 size:       114,804 (98.6% of memory free)
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
prov            byte   %8.0g       prov       province
educ            byte   %8.0g       ad09       highest level of education
                                                completed
age             byte   %9.0g                  age of household head
qbasal_g        int    %9.0g                  subsidized basal fertilizer
                                                (kg) obtained from gov't
disttown        float  %9.0g                  distance to nearest district
                                                town (km) from center of sea
cland           float  %9.0g                  unit:hectare, owned by
                                                households, only asked in 99/00
                                                survey
basal_g         byte   %9.0g                  binary indicator for subsidized
                                                fertilizer purchases
------------------------------------------------------------------------------
Sorted by:  prov
.
. tab basal_g

    basal_g |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      5,536       86.80       86.80
          1 |        842       13.20      100.00
------------+-----------------------------------
      Total |      6,378      100.00

. sum qbasal_g if basal_g==1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    qbasal_g |       842     179.171    215.1514          1       3300

.
. */Estimate Cragg's double hurdle model
.
. craggit basal_g disttown cland educ age, sec(qbasal_g disttown cland educ age)
Estimating Cragg's tobit alternative
Assumes conditional independence

initial:       log likelihood =     -<inf>  (could not be evaluated)
feasible:      log likelihood = -1.316e+08
rescale:       log likelihood = -692711.38
rescale eq:    log likelihood = -7868.8964
Iteration 0:   log likelihood = -7868.8964
(output omitted)
Iteration 12:  log likelihood = -7498.6146

                                                  Number of obs   =       6378
                                                  Wald chi2(4)    =     310.27
Log likelihood = -7498.6146                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Tier1        |
    disttown |  -.0069824   .0010228    -6.83   0.000     -.008987   -.0049779
       cland |   .1013351    .008374    12.10   0.000     .0849223    .1177478
        educ |   .0510291   .0054314     9.40   0.000     .0403837    .0616744
         age |   .0065328   .0014688     4.45   0.000     .0036541    .0094116
       _cons |  -1.717704   .0929843   -18.47   0.000     -1.89995   -1.535458
-------------+----------------------------------------------------------------
Tier2        |
    disttown |  -4.081985    12.8194    -0.32   0.750    -29.20755    21.04358
       cland |   346.8211   195.7674     1.77   0.076    -36.87603    730.5182
        educ |   248.1769   178.1743     1.39   0.164    -101.0384    597.3922
         age |   12.17913   20.46613     0.60   0.552    -27.93375    52.29201
       _cons |  -10130.17    7305.61    -1.39   0.166    -24448.91    4188.557
-------------+----------------------------------------------------------------
sigma        |
       _cons |   1110.695   391.0378     2.84   0.005     344.2749    1877.115
------------------------------------------------------------------------------
.
. */Notice the coefficients and standard errors (SE) are identical to those from
. */tier-by-tier regression. However, now all the coefficient estimates are in the
. */system's active memory simultaneously, which will be useful. First, we can
. */calculate x1g and x2b for each observation as scalars (new variables).
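
The match between the joint estimates and tier-by-tier estimation can be seen from the form of the log-likelihood. The block below is the standard textbook statement of Cragg's model under conditional independence, not something printed in this log; the symbols \gamma, \beta and \sigma are notation assumed here for the Tier1, Tier2 and sigma equations, and w_i = 1[y_i > 0] corresponds to basal_g.

```latex
\begin{align*}
\ell_i(\gamma,\beta,\sigma)
  &= 1[w_i = 0]\,\log\!\left[1 - \Phi(x_{1i}\gamma)\right]
   + 1[w_i = 1]\,\log \Phi(x_{1i}\gamma) \\
  &\quad + 1[w_i = 1]\left\{
       \log \phi\!\left(\frac{y_i - x_{2i}\beta}{\sigma}\right)
       - \log \sigma
       - \log \Phi\!\left(\frac{x_{2i}\beta}{\sigma}\right)
     \right\}
\end{align*}
```

The first line involves only \gamma and is a probit log-likelihood; the second involves only (\beta, \sigma) and is a truncated normal regression log-likelihood on the y_i > 0 subsample. Because the two parts share no parameters, maximizing the sum gives the same point estimates as maximizing each part separately.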
.
. predict x1g, eq(Tier1)
.
. predict x2b, eq(Tier2)
.
. */We can also calculate the standard deviation of the second tier latent
. */variable's error term for each observation.
.
. predict sigma, eq(sigma)
.
. */Note: craggit is equipped to handle heteroskedastic standard errors where
. */the SE is a function of observables using the 'het(varlist)' option. If
. */that is used, predict will generate a unique SE for each observation.
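
As a rough illustration of the het(varlist) option mentioned in the note, the sketch below lets the second-tier error standard deviation depend on cland. The choice of cland and the variable name sigma_het are assumptions made here for illustration, and the sketch also assumes the equation is still called sigma when het() is used. It is meant to be run separately in a do-file, since it replaces the estimates held in memory.

```stata
* Hypothetical illustration only: cland as the het() variable is an assumption,
* not part of the original analysis.
craggit basal_g disttown cland educ age, ///
    sec(qbasal_g disttown cland educ age) het(cland)

* Assumes the equation keeps the name "sigma"; the prediction should then
* vary across observations instead of being a single constant.
predict sigma_het, eq(sigma)
summarize sigma_het
```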
.
. */Now, all the information we need to calculate the predicted values and
. */partial effects for every observation is either predicted as a new variable
. */or stored in Stata's active memory.
.
. */To calculate the probability y=0, from equation (1)
.
. gen Pw0 = 1 - normal(x1g)
.
. */To calculate the probability y>0, from equation (2)
.
. gen Pw1 = normal(x1g)
.
. */To calculate expected values it is useful to generate a new variable for
. */the Inverse Mills Ratio scalar:
.
. gen IMR = normden(x2b/sigma)/normal(x2b/sigma)
.
. */To calculate the expected value of y, given that y > 0, from equation (3):
.
. gen Eyyx2 = x2b + sigma*IMR
.
. */To calculate the unconditional expected value of y from equation (4):
.
. gen Eyx1x2 = normal(x1g)*(x2b + sigma*IMR)
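
Equations (1) to (4) belong to the main text and are not reproduced in this appendix. The block below is a reconstruction read directly off the gen statements for Pw0, Pw1, IMR, Eyyx2 and Eyx1x2; the notation (x_{1i}\gamma for x1g, x_{2i}\beta for x2b, \sigma for sigma) is assumed here and may differ from the paper's.

```latex
\begin{align}
P(y_i = 0 \mid x_{1i}) &= 1 - \Phi(x_{1i}\gamma) \tag{1} \\
P(y_i > 0 \mid x_{1i}) &= \Phi(x_{1i}\gamma) \tag{2} \\
E(y_i \mid y_i > 0, x_{2i}) &= x_{2i}\beta + \sigma\,\lambda(x_{2i}\beta/\sigma) \tag{3} \\
E(y_i \mid x_{1i}, x_{2i}) &= \Phi(x_{1i}\gamma)\left[x_{2i}\beta + \sigma\,\lambda(x_{2i}\beta/\sigma)\right] \tag{4}
\end{align}
% lambda is the inverse Mills ratio stored in the variable IMR:
% \lambda(c) = \phi(c)/\Phi(c)
```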
.
. */We can also calculate partial effects for a given independent variable. For
. */example, we can see the partial effect of "Hectares of land under
. */cultivation" on probability of fertilizer use and fertilizer demand.
.
. */To calculate the partial effect on the probability that y > 0
. */from equation (5):
.
. gen dPw1_dxj = [Tier1]_b[cland]*normden(x1g)
.
. */To calculate the partial effect on the expected value of y, given y > 0,
. */from equation (6):
.
. gen dEyyx2_dxj=[Tier2]_b[cland]*(1-IMR*(x2b/sigma+IMR))
.
. */To calculate the partial effect on the unconditional expected value of y
. */from equation (7):
.
. gen dEy_dxj=[Tier1]_b[cland]*normd(x1g)*(x2b+sigma*IMR) ///
> +[Tier2]_b[cland]*normal(x1g)*(1-IMR*(x2b/sigma+IMR))
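
The three partial-effect statements can be written out as formulas. As with the other equation numbers, (5) to (7) refer to the main text; what follows is reconstructed from the gen statements for dPw1_dxj, dEyyx2_dxj and dEy_dxj, with \gamma_j and \beta_j as assumed notation for [Tier1]_b[cland] and [Tier2]_b[cland].

```latex
% c_i = x_{2i}\beta/\sigma and \lambda(c) = \phi(c)/\Phi(c) (the variable IMR)
\begin{align}
\frac{\partial P(y_i > 0 \mid x_{1i})}{\partial x_j}
  &= \gamma_j\,\phi(x_{1i}\gamma) \tag{5} \\
\frac{\partial E(y_i \mid y_i > 0, x_{2i})}{\partial x_j}
  &= \beta_j\left[1 - \lambda(c_i)\{c_i + \lambda(c_i)\}\right] \tag{6} \\
\frac{\partial E(y_i \mid x_{1i}, x_{2i})}{\partial x_j}
  &= \gamma_j\,\phi(x_{1i}\gamma)\left[x_{2i}\beta + \sigma\lambda(c_i)\right]
   + \beta_j\,\Phi(x_{1i}\gamma)\left[1 - \lambda(c_i)\{c_i + \lambda(c_i)\}\right] \tag{7}
\end{align}
```

Expression (7) is the product rule applied to the unconditional mean: the first term is (5) times the conditional mean, the second is the participation probability times (6).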
.
. */We can calculate the Average Partial Effect (APE) of 'cland' for the sample
. */on, say, the unconditional expected value using the summarize command.
.
. sum dEy_dxj

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     dEy_dxj |      6378    4.690667    7.955384    .444227   356.5416

.
.
. */The APE is 4.69: for each additional hectare owned, demand goes up 4.69 kg on average.
. */We can also see how this APE varies across, say, provinces using 'tabulate'
.
. tab prov, sum(dEy_dxj)

            |         Summary of dEy_dxj
   province |        Mean   Std. Dev.       Freq.
------------+------------------------------------
    central |   5.4383738    9.235746         814
  copperbel |   4.9780074   3.8727589         410
    eastern |   4.3869186   11.489428        1954
    luapula |   4.1558484   1.7418157         168
     lusaka |   5.1324321   6.5510018         264
   northern |   4.9731253    3.668755         658
   nwestern |   3.5899366   3.4598165         330
   southern |   5.6969612   6.2086771         972
    western |   3.5020153   2.4753813         808
------------+------------------------------------
      Total |   4.6906668   7.9553839        6378
.
. */Of course, the standard deviations from these summaries cannot be used for
. */inference. To evaluate the statistical significance of the APE, we can use
. */a series of commands to "bootstrap" a standard error. This requires
. */recalculating every step of the process of obtaining the APE numerous times.
.
. **Inference using bootstrapping
.
. capture program drop APEboot
.
. program define APEboot, rclass
  1. preserve
  2. craggit basal_g disttown cland educ age, sec(qbasal_g disttown cland educ
> age)
  3. predict bsx1g, eq(Tier1)
  4. predict bsx2b, eq(Tier2)
  5. predict bssigma, eq(sigma)
  6. gen bsIMR = normden(bsx2b/bssigma)/normal(bsx2b/bssigma)
  7. gen bsdEy_dxj=[Tier1]_b[cland]*normd(bsx1g)*(bsx2b+bssigma*bsIMR) ///
> +[Tier2]_b[cland]*normal(bsx1g)*(1-bsIMR*(bsx2b/bssigma+bsIMR))
  8. sum bsdEy_dxj
  9. return scalar ape_xj=r(mean)
 10. matrix ape_xj=r(ape_xj)
 11. restore
 12. end
.
. bootstrap ape_xj = r(ape_xj), reps(100): APEboot
(running APEboot on estimation sample)

Warning:  Since APEboot is not an estimation command or does not set
          e(sample), bootstrap has no way to determine which observations are
          used in calculating the statistics and so assumes that all
          observations are used.  This means no observations will be excluded
          from the resampling due to missing values or other reasons.

          If the assumption is not true, press Break, save the data, and drop
          the observations that are to be excluded.  Be sure that the dataset
          in memory contains only the relevant data.

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
x................xx.......xx.........x............    50
..........x...x.......x.x.........................   100

Bootstrap results                               Number of obs      =      6378
                                                Replications       =        90

      command:  APEboot
       ape_xj:  r(ape_xj)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ape_xj |   4.690667   .9082595     5.16   0.000     2.910511    6.470823
------------------------------------------------------------------------------
Note: One or more parameters could not be estimated in 10 bootstrap
      replicates; standard error estimates include only complete
      replications.
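
For readers who want to re-run the bootstrap, here is a do-file version of the same program without the log's prompts and line numbers. It is a sketch, not the author's file: it assumes craggit is installed and the example data are in memory, it uses normalden(), the current documented name for the standard normal density (normd() and normden() in the log are older names for the same function), and it adds a seed() value chosen arbitrarily here so that the replications can be reproduced. The matrix line from the log is left out because bootstrap collects the scalar returned in r(ape_xj), so the line appears to be unnecessary.

```stata
* Do-file sketch of the bootstrap program shown in the log.
* Assumes: craggit is installed; zam_fert_ex.dta is in memory.
capture program drop APEboot
program define APEboot, rclass
    preserve
    craggit basal_g disttown cland educ age, ///
        sec(qbasal_g disttown cland educ age)
    predict bsx1g, eq(Tier1)
    predict bsx2b, eq(Tier2)
    predict bssigma, eq(sigma)
    * Inverse Mills ratio evaluated at x2b/sigma
    gen bsIMR = normalden(bsx2b/bssigma)/normal(bsx2b/bssigma)
    * Partial effect of cland on the unconditional expected value
    gen bsdEy_dxj = [Tier1]_b[cland]*normalden(bsx1g)*(bsx2b + bssigma*bsIMR) ///
        + [Tier2]_b[cland]*normal(bsx1g)*(1 - bsIMR*(bsx2b/bssigma + bsIMR))
    * The sample mean of the partial effect is the APE
    summarize bsdEy_dxj, meanonly
    return scalar ape_xj = r(mean)
    restore
end

* seed(12345) is an arbitrary, hypothetical value; any fixed seed would do
bootstrap ape_xj = r(ape_xj), reps(100) seed(12345): APEboot
```

The `///` line continuations work in a do-file but not when typed at the interactive prompt.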

.
. */Depending on the model, this can be a very time consuming process,
. */especially if you need inference on the APE of multiple variables.
. */An alternative would be to approximate the SE using the Delta Method
.
. **Inference using the Delta Method
.
. quietly craggit basal_g disttown cland educ age, sec(qbasal_g disttown cland
> educ age)

.
. sum x1g

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         x1g |      6378   -1.182504    .3477446   -2.23537   2.200933

. local x1gbar = r(mean)

. sum x2b

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         x2b |      6378   -7721.258     1254.86   -10146.9     4173.1

. local x2bbar = r(mean)


.
. nlcom [Tier1]_b[cland]*normd(`x1gbar')*(`x2bbar'+[sigma]_b[_cons]* ///
> (normden(`x2bbar'/[sigma]_b[_cons])/normal(`x2bbar'/[sigma]_b[_cons]))) ///
> +[Tier2]_b[cland]*normal(`x1gbar')*(1-(normden(`x2bbar'/ ///
> [sigma]_b[_cons])/normal(`x2bbar'/[sigma]_b[_cons]))*(`x2bbar'/ ///
> [sigma]_b[_cons]+(normden(`x2bbar'/[sigma]_b[_cons])/normal(`x2bbar'/ ///
> [sigma]_b[_cons]))))

       _nl_1:  [Tier1]_b[cland]*normd(-1.182503929681159)*(-7721.258346323161+[
> sigma]_b[_cons]* (normden(-7721.258346323161/[sigma]_b[_cons])/normal(-7721.2
> 58346323161/[sigma]_b[_cons]))) +[Tier2]_b[cland]*normal(-1.182503929681159)*
> (1-(normden(-7721.258346323161/ [sigma]_b[_cons])/normal(-7721.258346323161/[
> sigma]_b[_cons]))*(-7721.258346323161/ [sigma]_b[_cons]+(normden(-7721.258346
> 323161/[sigma]_b[_cons])/normal(-7721.258346323161/ [sigma]_b[_cons]))))

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   3.849253   3.012931     1.28   0.201    -2.055983    9.754489
------------------------------------------------------------------------------
.
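
The nlcom expression is hard to read because the inverse Mills ratio is written out in full each time it appears. One possible way to make it easier to check, sketched below, is to build the repeated pieces as local macros first. The macro names c and imr are hypothetical helpers introduced here, normalden() is used as the current name for the density function, and the sketch assumes the craggit results are the active estimates and that x1g and x2b already exist.

```stata
* Assumes craggit has just been run and x1g, x2b were created by predict.
quietly summarize x1g, meanonly
local x1gbar = r(mean)
quietly summarize x2b, meanonly
local x2bbar = r(mean)

* Hypothetical helper macros: plain text that is pasted into the nlcom expression
local c   (`x2bbar'/[sigma]_b[_cons])
local imr (normalden(`c')/normal(`c'))

* Same expression as in the log, evaluated at the sample means of x1g and x2b
nlcom [Tier1]_b[cland]*normalden(`x1gbar')*(`x2bbar' + [sigma]_b[_cons]*`imr') ///
    + [Tier2]_b[cland]*normal(`x1gbar')*(1 - `imr'*(`c' + `imr'))
```

Because the macros are expanded as text before nlcom parses the expression, [sigma]_b[_cons] is still treated as an estimated parameter, so the delta-method standard error should match the one in the log.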
. */Now we can use the calculated APE and the SE from the delta method:
.
. di normden(4.6906668/3.012931)
.11873967
.
. */The p-value is .119
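
Note that normden() returns the height of the standard normal density at z = 4.6906668/3.012931, not a tail probability. A two-sided p-value is more conventionally computed from the cumulative distribution function normal(), as in the sketch below; the scalar name z is introduced here for illustration. In this example the result is roughly 0.12, which happens to be close to the density value displayed in the log.

```stata
* z statistic: APE from summarize divided by the delta-method SE
scalar z = 4.6906668/3.012931
display "z = " z

* Two-sided p-value from the standard normal CDF
display "two-sided p = " 2*(1 - normal(abs(z)))
```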
.
. log close
       log:  C:\~\craggit.log
  log type:  text
 closed on:  12 Feb 2009, 21:37:37
------------------------------------------------------------------------------
