JM Oakes
Our interest here is in math scores from a standardized achievement test. The treatment
variable, experimental = 1 or control = 0, is named condition, or cond for short.
Fit a mixed model using xtmixed. We are regressing the outcome variable, math, on
cond (treatment or control) with a random effect for school, since subjects are nested in
schools. The var option tells Stata to report variances instead of the default standard
deviations.
. xtmixed math cond || school: , var
Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:
Iteration 1:

                                                Number of obs      =       311
                                                Number of groups   =        20
                                                Obs per group: min =         9
                                                               avg =      15.6
                                                               max =        25
                                                Wald chi2(1)       =      4.50
                                                Prob > chi2        =    0.0339

------------------------------------------------------------------------------
        math |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cond |   19.17173   9.040171     2.12   0.034     1.453321    36.89014
       _cons |   509.6286   6.381632    79.86   0.000     497.1209    522.1364
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
school: Identity             |
                  var(_cons) |   262.7821   135.4762      95.66709    721.8204
-----------------------------+------------------------------------------------
               var(Residual) |   2150.299   178.1555      1827.997    2529.428
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 14.19   Prob >= chibar2 = 0.0001
What do we make of this? First, the model converged quickly, in two iterations. There are
311 subjects nested in 20 schools. The condition effect is 19.17. This is the estimated
effect of the intervention on subjects; it's the delta, Δ. The standard error of this effect is
9.04.

Importantly, this SE is correct since it accounts for the nesting in schools. The ratio of
19.17 to 9.04 is 2.12, which Stata labels as a z. This label and the corresponding p-value
(0.034) would be fine if the standard error were calculated on many degrees of freedom.
Such practice reflects the idea of asymptotic inference, which may be viewed as assuming
infinite subjects or df. In practice, though, upwards of 30-40 groups per arm would
probably suffice (we say more about the required number of groups and/or subjects below).

It follows that what is not correct for these data is the associated p-value, 0.034, and the
confidence intervals. Why? Because, again, Stata assumes infinite degrees of freedom;
Stata's z test is valid asymptotically. But we cannot make this assumption since we have
only 20 schools. And as you know, the df for a treatment effect in a GRT is 2(number of
groups per condition - 1), which in this case is 18, a tad less than infinity!
How can we get Stata to give us the correct p-value? Simple. We have to tell Stata to
evaluate our effect estimate and corresponding standard error with a t-distribution and
the correct df. This is easy with Stata's built-in t-distribution function, ttail.
. display 2 * (ttail(18, (_coef[cond] / _se[cond])))
Basically, the ttail function takes the form ttail(df, t-ratio). We type display to
have Stata show the value of this function with the appropriate inputs. We have to
calculate the df manually, but every GRT student knows how to do this. Then we simply
pick off the treatment effect coefficient, stored internally in Stata as _coef[cond], and
the corresponding standard error, _se[cond]. Multiplying the result by two gives us a
two-tailed p-value instead of the default one-tailed value, which is always appropriate
in GRTs.
The result is

.04809365

In other words, p = 0.048, a value some would say is statistically significant. Note,
though, that this value is greater than the incorrect but default p-value of 0.034,
which is based on asymptotic theory.
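For readers who want to check this arithmetic outside Stata, here is an illustrative Python sketch (not part of the Stata session). It reproduces the two-tailed t p-value by numerically integrating the Student-t density with Simpson's rule, so no statistics library is needed:

```python
import math

def t_pdf(x, df):
    """Student-t density with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, upper=60.0, n=20000):
    """2 * P(T > |t|), integrating the density on [|t|, upper] by Simpson's rule."""
    a = abs(t)
    h = (upper - a) / n
    s = t_pdf(a, df) + t_pdf(upper, df)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return 2 * s * h / 3

df = 2 * (10 - 1)                # 2 * (groups per condition - 1) = 18
t = 19.17173 / 9.040171          # coefficient / standard error from the output
print(round(two_tailed_p(t, df), 5))   # ~0.04809, vs. the asymptotic 0.034
```

The upper integration limit of 60 is arbitrary; at 18 df the t tail beyond it is negligible.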
Parenthetically, you could also type

. display 2 * (ttail(18, 2.12))

But there would be rounding error. It's better to use the high-precision system variables.
Note that if our treatment variable were not named cond but, say, Tx, we would type

. display 2 * (ttail(18, (_coef[Tx] / _se[Tx])))

This yields a t test with the appropriate df (supplied by you!); the nice thing here is
that the 95% CIs are also corrected. The downside is that some extraneous information
comes along with it (set apart here with horizontal lines).
+----------------------------------------------------------------------------------+
|        eq     parm   estimate   stderr   dof        t       p     min95     max95 |
|----------------------------------------------------------------------------------|
|      math     cond     19.172    9.040    18    2.121   0.048     0.179    38.164 |
|----------------------------------------------------------------------------------|
|      math    _cons    509.629    6.382    18   79.859   0.000   496.221   523.036 |
|  lns1_1_1    _cons      2.786    0.258    18   10.807   0.000     2.244     3.327 |
|   lnsig_e    _cons      3.837    0.041    18   92.616   0.000     3.750     3.924 |
+----------------------------------------------------------------------------------+
Moving on
What is the intraclass correlation coefficient (ICC) and variance component ratio (VCR)
of this analysis?
The Stata output shows the school-level variance to be 262.7821 and the individual or
residual variance to be 2150.299. Ignore the standard errors and associated 95%CI for
these values.
It follows that the ICC and VCR are ratios of these values. We again use system
variables and the display command to get what we want.
To see the components, along with all other model coefficients, tell Stata to display the
results from the most recently estimated model. The results are stored in a behind-the-scenes
matrix named e(b). To display the matrix, use the matrix list command.
. matrix list e(b)

e(b)[1,4]
          math:      math:  lns1_1_1:   lnsig_e:
           cond      _cons      _cons      _cons
y1     19.17173  509.62865  2.7856627  3.8366811
It is important to appreciate that Stata stores the variance components (last two
columns above) in natural-log, standard-deviation form. Obviously we need to do a
little manipulation, exponentiating and squaring the stored results, to get the
variances on the scale we want.

To get the group-level variance we exponentiate and square _coef[lns1_1_1:] itself;
the residual variance comes from _coef[lnsig_e:] the same way:

. display exp(_coef[lnsig_e:])*exp(_coef[lnsig_e:])
2150.2993

This is ugly text, no? To get the ICC and VCR we'll have to take ratios of these, which
will be even uglier! So, let's define nicely named macro variables to hold the quantities.

. local var_g = exp(_coef[lns1_1_1:])*exp(_coef[lns1_1_1:])
. local var_e = exp(_coef[lnsig_e:])*exp(_coef[lnsig_e:])

The VCR is

. di `var_g' / `var_e'
.12220725
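As an illustrative cross-check outside Stata (a sketch, not part of the handout's Stata session), the same back-transformation and ratios can be computed in Python from the two stored log standard deviations, 2.7856627 and 3.8366811:

```python
import math

# stored values are natural-log standard deviations; exponentiate and square
var_g = math.exp(2.7856627) ** 2    # school-level variance, ~262.78
var_e = math.exp(3.8366811) ** 2    # residual variance, ~2150.30

vcr = var_g / var_e                 # variance component ratio, ~0.1222
icc = var_g / (var_g + var_e)       # intraclass correlation, ~0.1089

print(round(vcr, 5), round(icc, 5))
```

The ICC uses the total variance in the denominator; the VCR uses only the residual variance, which is why it is always a bit larger.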
Now let's refit the model, adjusting for the subject-level covariates pov, gender, and read.

. xtmixed math cond pov gender read || school: , var

                                                Number of obs      =       311
                                                Number of groups   =        20
                                                Obs per group: min =         9
                                                               avg =      15.6
                                                               max =        25
                                                Wald chi2(4)       =    155.74
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        math |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cond |   5.771752   5.449119     1.06   0.290    -4.908324    16.45183
         pov |  -12.14481   7.861378    -1.54   0.122    -27.55283    3.263208
      gender |   1.842925   4.593764     0.40   0.688    -7.160687    10.84654
        read |   7.197725   .6182529    11.64   0.000     5.985972    8.409479
       _cons |   -268.623    67.2485    -3.99   0.000    -400.4277   -136.8184
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
school: Identity             |
                  var(_cons) |   38.59627    46.5025      3.638982    409.3651
-----------------------------+------------------------------------------------
               var(Residual) |   1571.626   130.5526      1335.492    1849.513
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 1.07   Prob >= chibar2 = 0.1507
What do we make of this output? Pretty much the same conclusions as the simpler,
unadjusted model above. The first thing to note is that the sample size has not changed;
there remain 311 subjects nested in 20 schools. This means there are no missing values
for any of the variables included in the model, which is good.

The estimated treatment effect is now 5.77 with a standard error of 5.45, yielding a ratio
of 1.06.

The proper p-value is again calculated on 18 df and evaluated with the t-distribution.

. display 2 * (ttail(18, (_coef[cond] / _se[cond])))
.30351236

Not surprisingly, the treatment effect is not significantly different from zero.

It is appropriate to treat the effects of poverty, gender, and reading score as nuisance
variables, since they would not be needed if randomization had better balanced such
background characteristics. Ignore these effects.
The variance components may again be captured by exploiting the model's matrix of
estimated coefficients.
. mat list e(b)

e(b)[1,7]
          math:      math:      math:      math:       math:  lns1_1_1:   lnsig_e:
           cond        pov     gender       read       _cons      _cons      _cons
y1    5.7717525  -12.14481  1.8429245  7.1977253  -268.62305  1.8265779   3.679933

Redefining the macro variables and taking the ratio:

. local var_g = exp(_coef[lns1_1_1:])*exp(_coef[lns1_1_1:])
. local var_e = exp(_coef[lnsig_e:])*exp(_coef[lnsig_e:])

. di `var_g' / `var_e'
.02455818
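The same illustrative Python sketch applied to the adjusted model's stored values (1.8265779 and 3.679933) shows how much school-level variation the covariates absorbed:

```python
import math

# adjusted model's log standard deviations, back-transformed to variances
var_g = math.exp(1.8265779) ** 2    # school variance, ~38.60 (was ~262.78)
var_e = math.exp(3.679933) ** 2     # residual variance, ~1571.63 (was ~2150.30)

vcr = var_g / var_e                 # ~0.0246, down from ~0.1222
print(round(vcr, 5))
```

Most of the drop comes from the school-level component: the covariates explain away a large share of the between-school differences.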
How about adjusted means? These are useful since giving concrete adjusted values of
the outcome measure frequently helps readers/reviewers get a sense of the treatment
effect on the scale of the outcome measure itself.

In SAS we use LSMEANS, but in Stata we use adjust. The basic syntax is

. adjust [variables in model to adjust for] , by(cond) [other options]

More concretely,
. adjust pov gender read , by(cond) se format(%9.3f)
The top line of the top box indicates that the dependent variable is math score, along
with other information novices may ignore. The second line of the top box shows the values
used in the prediction equation, which by default are the means of the covariates.
Recall that the mean of a (0,1) variable is the proportion "yes" or "true" for the
measure. Thus, by default Stata's adjust command yields what SAS requires the OM
(for observed margins) option to produce.

The bottom box of the output shows the model-adjusted means for the treatment and
control conditions. The treatment condition mean is 522.46 and the control is 516.69.
Notice that the difference is 5.77, which is the estimated treatment effect.

It's easy to also request 95% confidence intervals, but we discourage doing so unless you
are confident the asymptotic assumptions are defensible; in other words, unless the number
of groups in your study exceeds 30 per arm.
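The fact that the adjusted-mean difference reproduces the cond coefficient is plain linear-model arithmetic: both conditions are predicted at the same covariate values, so every term except the treatment term cancels. A Python sketch (the covariate means below are made-up placeholders, not values from the data; any common values give the same difference):

```python
# coefficients from the adjusted model above
b = {"_cons": -268.623, "cond": 5.771752, "pov": -12.14481,
     "gender": 1.842925, "read": 7.197725}

def predict(cond, pov, gender, read):
    """Linear predictor for the adjusted model."""
    return (b["_cons"] + b["cond"] * cond + b["pov"] * pov
            + b["gender"] * gender + b["read"] * read)

# hypothetical covariate means, NOT from the data; the point is they cancel
pov_m, gender_m, read_m = 0.4, 0.5, 40.0
diff = predict(1, pov_m, gender_m, read_m) - predict(0, pov_m, gender_m, read_m)
print(round(diff, 6))   # equals the cond coefficient, 5.771752
```

This is why adjust, like LSMEANS, recovers 522.46 - 516.69 = 5.77 exactly.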
What about the values of the random effects, sometimes called Empirical Bayes (EB)
estimates or BLUPs? In class, we call these bumps.

. predict re, reffects

Now list them, but only one per group (i.e., school). To do this, generate a variable to
condition on, such that only one value per school is displayed.
. egen single = tag(school)
. list cond school re if single, noobs
  +---------------------------+
  | cond   school         re  |
  |---------------------------|
  |    0        2   4.464902  |
  |    1        3   2.849438  |
  |    0       12    1.03236  |
  |    1       19   1.836211  |
  |    0       23   4.210702  |
  |---------------------------|
  |    1       24  -7.145421  |
  |    0       25  -1.866368  |
  |    1       27  -1.431592  |
  |    0       31  -6.018359  |
  |    1       32  -1.188469  |
  |---------------------------|
  |    0       35  -.8096194  |
  |    1       41  -3.062465  |
  |    0       43  -.3606333  |
  |    1       44   1.590811  |
  |    0       45  -3.073813  |
  |---------------------------|
  |    1       53   1.860326  |
  |    0       70   1.314626  |
  |    1       74   1.790454  |
  |    0       75   1.106202  |
  |    1       86   2.900706  |
  +---------------------------+
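The egen tag() step flags exactly one observation per school: the first one encountered. A purely illustrative Python sketch of that logic:

```python
def tag(groups):
    """1 for the first occurrence of each group value, else 0 (like egen tag())."""
    seen = set()
    out = []
    for g in groups:
        out.append(0 if g in seen else 1)
        seen.add(g)
    return out

schools = [2, 2, 2, 3, 3, 12, 12, 12]   # made-up school ids
print(tag(schools))                     # [1, 0, 0, 1, 0, 1, 0, 0]
```

Conditioning the list on the tag then shows one row per school, as above.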
We find it informative to graphically compare the distribution of BLUPs from the
random-effects model to the BLUEs, or fixed effects; in class we call the BLUEs shifts.

There are many ways to estimate fixed effects in Stata. One might be tempted to use
Stata's areg command. areg is short for absorb regression; it effectively fits a model
with dummy variables for each school, but doesn't display those results. (Just remember,
this is a pedagogical exercise; fixed-effect models are not appropriate for GRT data.)
The (seemingly) appropriate command is

. areg math cond pov gender read, absorb(school)

                                                Number of obs      =       311
                                                F(  3,   288)      =     36.53
                                                Prob > F           =    0.0000
                                                R-squared          =    0.4070
                                                Adj R-squared      =    0.3617
                                                Root MSE           =    39.672

------------------------------------------------------------------------------
        math |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cond |  (dropped)
         pov |  -5.209037   8.679059    -0.60   0.549    -22.29147    11.87339
      gender |   2.100189   4.678475     0.45   0.654    -7.108151    11.30853
        read |   6.780649   .6553348    10.35   0.000     5.490797    8.070502
       _cons |  -221.0522   71.67932    -3.08   0.002     -362.134   -79.97044
-------------+----------------------------------------------------------------
      school |   F(19, 288) =   1.279   0.196          (20 categories)
------------------------------------------------------------------------------
Notice the desired cond effect is dropped. This is because condition is perfectly collinear
with the school indicator variables. The areg model does not permit nested effects; nor
does the other useful command, xtreg with the fe option. In other words, we cannot
easily drop one school per treatment condition so that the cond effect can be estimated.

What can be done? A good, and too frequently overlooked, command is anova. Stata's
anova command permits nested effects with the vertical bar, such that a|b implies a is
nested in b. Thus, we can type something like

. anova math cond gender pov read school|cond, continuous(read) regress
      Source |       SS       df       MS              Number of obs =     311
-------------+------------------------------           F( 22,   288) =    8.98
       Model |  311072.476    22   14139.658           Prob > F      =  0.0000
    Residual |  453283.016   288  1573.89936           R-squared     =  0.4070
-------------+------------------------------           Adj R-squared =  0.3617
       Total |  764355.492   310  2465.66288           Root MSE      =  39.672

------------------------------------------------------------------------------
        math        Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------------------------------------------------------
       _cons    -210.2625   72.07997    -2.92   0.004    -352.1328   -68.39213
        cond
          1      -14.8864   13.96549    -1.07   0.287    -42.37376    12.60097
          2     (dropped)
      gender
          1     -2.100189   4.678475    -0.45   0.654    -11.30853    7.108151
          2     (dropped)
         pov
          1      5.209037   8.679059     0.60   0.549    -11.87339    22.29147
          2     (dropped)
        read     6.780649   .6553348    10.35   0.000     5.490797    8.070502
 school|cond
        2 1      10.41479   12.52372     0.83   0.406    -14.23484    35.06442
        3 2     -2.270355   12.77133    -0.18   0.859    -27.40734    22.86663
       12 1      4.428507   15.92129     0.28   0.781    -26.90833    35.76534
       19 2     -3.225934   14.31962    -0.23   0.822    -31.41031    24.95845
       23 1       15.7866    14.7453     1.07   0.285    -13.23562    44.80883
       24 2     -31.24932   13.06868    -2.39   0.017    -56.97156   -5.527088
       25 1     -12.05313   14.34601    -0.84   0.402    -40.28945    16.18319
       27 2     -17.45405   14.76243    -1.18   0.238    -46.50998    11.60188
       31 1     -29.38767   14.33302    -2.05   0.041    -57.59843   -1.176906
       32 2     -17.99994   16.72715    -1.08   0.283     -50.9229    14.92301
       35 1     -4.355241   14.42814    -0.30   0.763    -32.75321    24.04273
       41 2     -20.88286   14.12454    -1.48   0.140    -48.68328    6.917559
       43 1     -2.261484   13.08057    -0.17   0.863    -28.00713    23.48416
       44 2     -6.041949    14.3659    -0.42   0.674    -34.31742    22.23352
       45 1     -13.80596   13.81846    -1.00   0.319    -41.00395    13.39203
       53 2      1.074164   16.61647     0.06   0.949    -31.63095    33.77928
       70 1      4.113912   15.55775     0.26   0.792     -26.5074    34.73522
       74 2     -2.264692   14.62513    -0.15   0.877    -31.05039      26.521
       75 1     (dropped)
       86 2     (dropped)
------------------------------------------------------------------------------
This is correct, but there seems to be no easy and general way to extract the school-level
fixed effects for further manipulation. Recall, the point of this exercise is to extract
fixed school effects to compare with the (proper) random school effects.
Incidentally, to estimate the model in SAS we type the following and get the same results:
proc mixed data=grt.postonly_x;
class school cond;
model math = cond pov gender read school(cond) / solution;
run;
Data Set                      GRT.POSTONLY_X
Dependent Variable            MATH
Covariance Structure          Diagonal
Estimation Method             REML
Residual Variance Method      Profile
Fixed Effects SE Method       Model-Based
Degrees of Freedom Method     Residual

(output omitted)
                                                 Standard
Effect          SCHOOL    COND    Estimate          Error      DF    t Value    Pr > |t|

Intercept                          -207.15        72.3880     288      -2.86      0.0045
COND                         0    -14.8864        13.9655     288      -1.07      0.2873
COND                         1           0              .       .          .           .
POV                                -5.2090         8.6791     288      -0.60      0.5489
GENDER                              2.1002         4.6785     288       0.45      0.6538
READ                                6.7806         0.6553     288      10.35      <.0001
SCHOOL(COND)         2       0     10.4148        12.5237     288       0.83      0.4063
SCHOOL(COND)        12       0      4.4285        15.9213     288       0.28      0.7811
SCHOOL(COND)        23       0     15.7866        14.7453     288       1.07      0.2852
SCHOOL(COND)        25       0    -12.0531        14.3460     288      -0.84      0.4015
SCHOOL(COND)        31       0    -29.3877        14.3330     288      -2.05      0.0412
SCHOOL(COND)        35       0     -4.3552        14.4281     288      -0.30      0.7630
SCHOOL(COND)        43       0     -2.2615        13.0806     288      -0.17      0.8629
SCHOOL(COND)        45       0    -13.8060        13.8185     288      -1.00      0.3186
SCHOOL(COND)        70       0      4.1139        15.5577     288       0.26      0.7916
SCHOOL(COND)        75       0           0              .       .          .           .
SCHOOL(COND)         3       1     -2.2704        12.7713     288      -0.18      0.8590
SCHOOL(COND)        19       1     -3.2259        14.3196     288      -0.23      0.8219
SCHOOL(COND)        24       1    -31.2493        13.0687     288      -2.39      0.0174
SCHOOL(COND)        27       1    -17.4541        14.7624     288      -1.18      0.2381
SCHOOL(COND)        32       1    -17.9999        16.7271     288      -1.08      0.2828
SCHOOL(COND)        41       1    -20.8829        14.1245     288      -1.48      0.1404
SCHOOL(COND)        44       1     -6.0419        14.3659     288      -0.42      0.6744
SCHOOL(COND)        53       1      1.0742        16.6165     288       0.06      0.9485
SCHOOL(COND)        74       1     -2.2647        14.6251     288      -0.15      0.8770
SCHOOL(COND)        86       1           0              .       .          .           .
Okay, back to Stata. We want to construct simple (0,1) dummy variables for school,
without excluding any school from this process. It is convenient to use the xi command
for this. But let's do a little work to make things come out nicely later.

. xi, noomit i.cond*i.school

The result is fine, but a little confusing, since every interaction between condition
and school is generated as a dummy. Since our schools are nested in condition, we
don't have this fully crossed structure. To make things more visually appealing, let's
drop all of the newly generated interaction terms that are not legitimate. To see which
ones, tabulate school by condition: the combinations with 0 subjects are not legitimate
(e.g., school 2, cond 1).
. tab school cond

           |         cond
    school |         0          1 |     Total
-----------+----------------------+----------
         2 |        25          0 |        25
         3 |         0         25 |        25
        12 |        10          0 |        10
        19 |         0         15 |        15
        23 |        13          0 |        13
        24 |         0         22 |        22
        25 |        14          0 |        14
        27 |         0         14 |        14
        31 |        14          0 |        14
        32 |         0          9 |         9
        35 |        14          0 |        14
        41 |         0         16 |        16
        43 |        21          0 |        21
        44 |         0         15 |        15
        45 |        16          0 |        16
        53 |         0          9 |         9
        70 |        11          0 |        11
        74 |         0         14 |        14
        75 |        18          0 |        18
        86 |         0         16 |        16
-----------+----------------------+----------
     Total |       156        155 |       311
drop
Now let's fit a regular fixed-effect regression model with a fixed effect for school, but
leave out one school per condition as the reference. Any school per condition will do,
but let's follow the lead of the anova analysis above and omit schools 75 and 86.
. reg math cond gender pov read conXsch_0_2-conXsch_0_70 conXsch_1_3-conXsch_1_74

      Source |       SS       df       MS              Number of obs =     311
-------------+------------------------------           F( 22,   288) =    8.98
       Model |  311072.476    22   14139.658           Prob > F      =  0.0000
    Residual |  453283.016   288  1573.89936           R-squared     =  0.4070
-------------+------------------------------           Adj R-squared =  0.3617
       Total |  764355.492   310  2465.66288           Root MSE      =  39.672

------------------------------------------------------------------------------
        math |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cond |    14.8864   13.96549     1.07   0.287    -12.60097    42.37376
      gender |   2.100189   4.678475     0.45   0.654    -7.108151    11.30853
         pov |  -5.209037   8.679059    -0.60   0.549    -22.29147    11.87339
        read |   6.780649   .6553348    10.35   0.000     5.490797    8.070502
 conXsch_0_2 |   10.41479   12.52372     0.83   0.406    -14.23484    35.06442
conXsch_0_12 |   4.428507   15.92129     0.28   0.781    -26.90833    35.76534
conXsch_0_23 |    15.7866    14.7453     1.07   0.285    -13.23562    44.80883
conXsch_0_25 |  -12.05313   14.34601    -0.84   0.402    -40.28945    16.18319
conXsch_0_31 |  -29.38767   14.33302    -2.05   0.041    -57.59843   -1.176906
conXsch_0_35 |  -4.355241   14.42814    -0.30   0.763    -32.75321    24.04273
conXsch_0_43 |  -2.261484   13.08057    -0.17   0.863    -28.00713    23.48416
conXsch_0_45 |  -13.80596   13.81846    -1.00   0.319    -41.00395    13.39203
conXsch_0_70 |   4.113912   15.55775     0.26   0.792     -26.5074    34.73522
 conXsch_1_3 |  -2.270355   12.77133    -0.18   0.859    -27.40734    22.86663
conXsch_1_19 |  -3.225934   14.31962    -0.23   0.822    -31.41031    24.95845
conXsch_1_24 |  -31.24932   13.06868    -2.39   0.017    -56.97156   -5.527088
conXsch_1_27 |  -17.45405   14.76243    -1.18   0.238    -46.50998    11.60188
conXsch_1_32 |  -17.99994   16.72715    -1.08   0.283     -50.9229    14.92301
conXsch_1_41 |  -20.88286   14.12454    -1.48   0.140    -48.68328    6.917559
conXsch_1_44 |  -6.041949    14.3659    -0.42   0.674    -34.31742    22.23352
conXsch_1_53 |   1.074164   16.61647     0.06   0.949    -31.63095    33.77928
conXsch_1_74 |  -2.264692   14.62513    -0.15   0.877    -31.05039      26.521
       _cons |    -222.04   72.31614    -3.07   0.002    -364.3752   -79.70485
------------------------------------------------------------------------------
Notice this result is the same as the anova above, except for the estimated intercept,
which we ignore anyway.

Now we have to do some work to extract and re-merge the school-level fixed effects. And
remember, the estimated school-specific coefficients are differences relative to the
omitted reference school in each condition, so we have to extract the condition means too.
parmest , saving(temp, replace)
preserve
use temp, clear
gen str3 id = substr(parm, -2,2)
replace id = subinstr(id, "_" ,"0" , 1)
keep est id
destring id, force replace
keep if id ~= .
save fe, replace
restore
sort school
merge school using fe
drop _merge
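The substr/subinstr/destring steps above pull a numeric school id out of dummy names like conXsch_0_2: take the last two characters, turn a leading underscore into a zero, and convert to a number. A purely illustrative Python version of the same parse:

```python
def school_id(parm):
    """Extract the school id from a dummy name like 'conXsch_0_2' or 'conXsch_1_74'.

    Mirrors the Stata steps: substr(parm, -2, 2), then '_' -> '0', then destring.
    Returns None for parameters that are not school dummies (like 'cond' or 'read').
    """
    tail = parm[-2:].replace("_", "0")
    return int(tail) if tail.isdigit() else None

print(school_id("conXsch_0_2"))    # 2
print(school_id("conXsch_1_74"))   # 74
print(school_id("read"))           # None
```

Returning None plays the role of Stata's missing value from destring, force; those rows are then dropped, just as keep if id ~= . does above.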
replace fe = 0 if fe == .
mean fe if single, over(cond)
gen fe2 = fe - _coef[fe:0] if cond == 0
replace fe2 = fe - _coef[fe:1] if cond == 1

list cond school re fe2 if single, noobs
  +---------------------------------------+
  | cond   school          re         fe2 |
  |---------------------------------------|
  |    0        2    4.464902    13.12676 |
  |    0       12     1.03236    7.140474 |
  |    0       23    4.210702    18.49857 |
  |    0       25   -1.866368   -9.341164 |
  |    0       31   -6.018359    -26.6757 |
  |    0       35   -.8096194   -1.643273 |
  |    0       43   -.3606333    .4504827 |
  |    0       45   -3.073813     -11.094 |
  |    0       70    1.314626     6.82588 |
  |    0       75    1.106202    2.711967 |
  |---------------------------------------|
  |    1        3    2.849438    7.761139 |
  |    1       19    1.836211    6.805561 |
  |    1       24   -7.145421   -21.21783 |
  |    1       27   -1.431592   -7.422559 |
  |    1       32   -1.188469   -7.968448 |
  |    1       41   -3.062465   -10.85137 |
  |    1       44    1.590811    3.989546 |
  |    1       53    1.860326    11.10566 |
  |    1       74    1.790454    7.766802 |
  |    1       86    2.900706     10.0315 |
  +---------------------------------------+
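The centering logic can be checked by hand: set the omitted reference school's effect to 0, average the effects within a condition, and subtract that mean from each. A Python sketch using the cond = 0 coefficients from the regression above:

```python
# school -> reference-coded fixed effect, cond == 0 (reference school 75 gets 0)
fe = {2: 10.41479, 12: 4.428507, 23: 15.7866, 25: -12.05313, 31: -29.38767,
      35: -4.355241, 43: -2.261484, 45: -13.80596, 70: 4.113912, 75: 0.0}

mean_fe = sum(fe.values()) / len(fe)            # condition-0 mean effect
shifts = {s: v - mean_fe for s, v in fe.items()}  # centered "shifts"

print(round(shifts[2], 5))    # ~13.12676, matching the fe2 column above
```

The same subtraction with the cond = 1 coefficients reproduces the second half of the fe2 column.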
We find the dotplot graphing command most useful for visualizing the two distributions
together. The following code does this, and adds a few bells and whistles for appearance.
[Figure: dotplot titled "Comparative Histograms: Estimated school effects, N=20",
plotting Shifts & Bumps on the vertical axis (-35 to 20) for the RE Model and the FE Model.]