1 Tremblay, P. F., & Gardner, R. C. (1996). On the growth of structural equation modeling in psychological journals.
Structural Equation Modeling, 3, 93-104.
Copyright © 2009 by Mike Cheung, SEM 5
2 Hershberger, S. L. (2003). The growth of structural equation modeling: 1994-2001. Structural Equation Modeling, 10,
35-46.
Motivation (X) 1 2 3 4 5 7 8 12 13 15
Performance (Y) 9 13 10 12 16 12 20 13 17 22
3 Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, N.J.: L. Erlbaum Associates.
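[Figure: path diagram of the simple regression x1 -> y with coefficient b1, variance var(x1) and error variance var(e_y)]
Model-implied covariance matrix of y and x1:
\[
\begin{bmatrix} \operatorname{var}(y) & \operatorname{cov}(y, x_1) \\ \operatorname{cov}(y, x_1) & \operatorname{var}(x_1) \end{bmatrix}
=
\begin{bmatrix} b_1^2 \operatorname{var}(x_1) + \operatorname{var}(e_y) & b_1 \operatorname{var}(x_1) \\ b_1 \operatorname{var}(x_1) & \operatorname{var}(x_1) \end{bmatrix}
\]
Sample covariance matrix:
\[
\begin{bmatrix} 10.606 & 5.536 \\ 5.536 & 7.781 \end{bmatrix}
\]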
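As a rough check, the model-implied and sample matrices above can be equated element by element and solved by hand. The sketch below is plain Python, not part of the Mplus workflow; the variable names are illustrative, and N = 100 is an assumption (it is not stated on the slide).

```python
# Sample covariance matrix of (y, x1) from the slide.
var_y, cov_yx, var_x = 10.606, 5.536, 7.781

# Equate sample and model-implied elements:
#   cov(y, x1) = b1 * var(x1)               ->  b1
#   var(y)     = b1^2 * var(x1) + var(e_y)  ->  var(e_y)
b1 = cov_yx / var_x
var_e = var_y - b1 ** 2 * var_x

# Assumption: N = 100 (not stated on the slide). ML variance estimates
# use a divisor of N, so an (N - 1)-based value is rescaled by (N - 1) / N.
n = 100
var_e_ml = var_e * (n - 1) / n

print(f"b1 = {b1:.3f}")
print(f"var(e_y) = {var_e:.3f}, rescaled = {var_e_ml:.3f}")
```

The slope comes out near 0.71 and the rescaled error variance near 6.60. Because the model is just identified, the implied matrix reproduces the sample matrix exactly.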
MODEL:
y ON x1; ! y is regressed ON x1
! Note. Variances of the independent variable (x1)
! and the error (y) will be estimated by default
Selected output:
MODEL RESULTS
Two-Tailed
Estimate S.E. Est./S.E. P-Value
Y ON
X1 0.712 0.093 7.687 0.000
Intercepts
Y 1.773 0.610 2.908 0.004
Residual Variances
Y 6.600 0.933 7.071 0.000
Interpretations:
1.The regression equation is y = 1.77 + 0.71 x1.
2.The regression coefficient of x1 is 0.71, which is statistically significant
at .05.
3.The estimates divided by their corresponding standard errors (SEs)
approximately follow a standard normal distribution. If their absolute
values are larger than 1.96, they are statistically significant at .05.
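The Est./S.E. column and its two-tailed p-value can be approximated from the rounded estimates. This is a minimal sketch in plain Python; the function name `wald_z_test` is made up for illustration, and the results differ slightly from the Mplus columns because the inputs are rounded.

```python
import math

def wald_z_test(estimate, se):
    """Return (z, two-tailed p), treating estimate / se as standard normal."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Rounded estimates and SEs copied from the output above.
for name, est, se in [("Y ON X1", 0.712, 0.093), ("Intercept of Y", 1.773, 0.610)]:
    z, p = wald_z_test(est, se)
    print(f"{name}: z = {z:.3f}, p = {p:.3f}, significant at .05: {abs(z) > 1.96}")
```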
[Figure: path diagram of the multiple regression: x1 -> y (b1) and x2 -> y (b2), with cov(x1, x2) and var(x2)]
1.Input: ex1b.inp
2.Output: ex1b.out
TITLE: A multiple regression model
DATA: FILE = ex1.txt;
VARIABLE: NAMES ARE y x1-x2;
USEVAR ARE ALL; ! Use all variables in the analysis
MODEL:
y ON x1 x2; ! y is regressed ON x1 and x2
! Note. Covariances among the independent variables
! are estimated by default
x1 WITH x2; ! Optional: Request the variance/covariance of x1 and x2
OUTPUT: SAMPSTAT; ! Request the sample statistics for checking
Selected output:
MODEL RESULTS Two-Tailed
Estimate S.E. Est./S.E. P-Value
Y ON
X1 0.671 0.101 6.650 0.000
X2 0.106 0.107 0.992 0.321
X1 WITH
X2 2.965 0.785 3.775 0.000
Means
X1 5.975 0.278 21.526 0.000
X2 6.021 0.262 22.976 0.000
Intercepts
Y 1.379 0.725 1.901 0.057
Variances
X1 7.703 1.089 7.071 0.000
X2 6.867 0.971 7.071 0.000
Residual Variances
Y 6.535 0.924 7.071 0.000
Interpretations:
1.The estimated regression coefficients for x1 and x2 are 0.671 and 0.106,
respectively.
2.The coefficient of x1 is statistically significant while the coefficient of
x2 is not.
[Figure: two path diagrams relating X1 and X2 to Y1 and Y2, with disturbances D1 and D2 whose paths are fixed at 1.00]
3.Model identification:
1.Model identification concerns whether there is a unique solution for
the model being tested.
2.If the model is not identified, there will be no solution for the
proposed model. Thus, we cannot test the proposed model.
4.Degrees of freedom (dfs) of a model:
1.Let p* = p(p+1)/2 be the number of pieces of information available in the
covariance matrix, where p is the number of observed variables.
2.Let q be the number of free parameters in the model.
3.Then df = p* - q.
5.A necessary but not sufficient condition for the identification of any
SEM model is df ≥ 0 (see footnote 4).
1.Underidentification:
1.df < 0
2.no solution
3.e.g., 3=x+y
2.Just identification:
1.df=0
2.perfect fit
3.e.g., 3=x+y and 1=x-y then x=2 and y=1
3.Overidentification:
1.df > 0
2.no perfect solution
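The counting rule and the three cases above can be sketched in a few lines of Python. The function names are illustrative only, and the classification reflects the necessary (not sufficient) df condition, so a model labelled as identified here may still fail to be identified for other reasons.

```python
def model_df(p, q):
    """Degrees of freedom: p* = p(p+1)/2 pieces of information minus q free parameters."""
    p_star = p * (p + 1) // 2
    return p_star - q

def identification_status(df):
    """Classify a model by the sign of its df (necessary condition only)."""
    if df < 0:
        return "underidentified"
    if df == 0:
        return "just identified"
    return "overidentified"

# Simple regression of y on x1: p = 2 observed variables,
# q = 3 free parameters (b1, var(x1), var(e_y)).
df = model_df(p=2, q=3)
print(df, identification_status(df))
```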
4 Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
[Figure: mediation model with paths X -> M (a), M -> Y (b) and X -> Y (c)]
4.To test the mediation effect, we have to test the significance of the
product term a*b.
5.The problem is that the sampling distribution of the product term is
usually non-normal.
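One common way around the non-normal sampling distribution of a*b is a bootstrap confidence interval. The sketch below is a plain percentile bootstrap in standard-library Python, run on simulated (hypothetical) data. It is not the Mplus implementation, and the interval type Mplus reports may differ. All function names are illustrative.

```python
import random

def cov(u, v):
    """Sample covariance with an (n - 1) divisor."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def indirect_effect(x, m, y):
    """a*b, with a from M on X and b from Y on M controlling for X (OLS)."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(x, m, y, reps=1000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for a*b: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * reps)]
    upper = stats[round((1 - alpha / 2) * reps) - 1]
    return lower, upper

# Simulated (hypothetical) data: true a = 0.5, b = 0.4, no direct effect.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]

estimate = indirect_effect(x, m, y)
lower, upper = percentile_bootstrap_ci(x, m, y)
print(f"a*b = {estimate:.3f}, 95% percentile CI = ({lower:.3f}, {upper:.3f})")
```

If the interval excludes 0, the indirect effect is judged statistically significant at the corresponding level.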
5 MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to
test the significance of the mediated effect. Psychological Methods, 7, 83-104.
6 Cheung, M.W.L. (2007). Comparison of approaches to constructing confidence intervals for mediating effects using
structural equation models. Structural Equation Modeling, 14, 227-246.
1.Data: ex2.dat
2.Input: ex2a.inp
3.Output: ex2a.out
TITLE: Simple mediating effect
DATA: FILE IS ex2.dat; ! Raw data are required for bootstrap
VARIABLE: NAMES X M Y; ! X: Independent variable, M: Mediator
! Y: Dependent variable
USEVARIABLES ARE ALL;
MODEL:
Y ON M X; ! Paths M -> Y and X -> Y
M ON X; ! Path X -> M
OUTPUT: SAMPSTAT;
Selected output:
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
Value 0.000
Degrees of Freedom 0
P-Value 0.0000
MODEL RESULTS
Two-Tailed
Estimate S.E. Est./S.E. P-Value
Y ON
M 0.396 0.094 4.228 0.000
X -0.033 0.105 -0.310 0.756
M ON
X 0.517 0.100 5.190 0.000
Intercepts
M -0.644 0.337 -1.913 0.056
Y 0.231 0.321 0.721 0.471
Residual Variances
M 11.182 1.581 7.071 0.000
Y 9.788 1.384 7.071 0.000
Interpretations:
1.The model is just identified. That is, the df=0. In other words, we cannot
tell whether or not the proposed model fits the data.
2.The direct effect is -0.03, p=.76.
3.The indirect effect is 0.396*0.517=0.20.
4.However, we don't know if the indirect effect is significant.
5.We may request a bootstrap CI on the indirect effect in Mplus:
1.Input: ex2b.inp
2.Output: ex2b.out
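TITLE: Simple mediating effect with bootstrap CI
DATA: FILE IS ex2.dat; ! Raw data are required for bootstrap
VARIABLE: NAMES X M Y; ! X: Independent variable, M: Mediator
! Y: Dependent variable
USEVARIABLES ARE ALL;
! Note. The ANALYSIS (bootstrap) and OUTPUT (confidence interval)
! commands of ex2b.inp are not reproduced here
MODEL:
M ON X (p1); ! Path X -> M, labelled p1
Y ON M (p2); ! Path M -> Y, labelled p2
Y ON X; ! Direct effect X -> Y
MODEL CONSTRAINT:
NEW(ind_effect); ! Create a new variable for indirect effect
ind_effect = p1*p2;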
Selected output:
CONFIDENCE INTERVALS OF MODEL RESULTS
Lower .5% Lower 2.5% Estimate Upper 2.5% Upper .5%
Y ON
M 0.177 0.222 0.396 0.584 0.642
X -0.307 -0.243 -0.033 0.165 0.225
M ON
X 0.248 0.316 0.517 0.694 0.752
Intercepts
M -1.482 -1.316 -0.644 0.031 0.227
Y -0.592 -0.368 0.231 0.858 1.066
Residual Variances
M 6.727 7.600 11.182 15.489 16.978
Y 6.104 6.953 9.788 12.283 13.055
New/Additional Parameters
IND_EFFE 0.076 0.100 0.204 0.334 0.386
Interpretations:
1.The estimated indirect effect is 0.20 with a 95% CI (0.10, 0.33).
2.Since the 95% CI does not include 0, the estimated indirect effect is
statistically significant at .05.
[Figure: two-factor CFA model in which F1 is measured by X1 and X2, and F2 by X3 and X4]
3.Model identification:
1.If there is no constraint,
1.p*=(4*5)/2=10,
2.q = 3 factor variances/covariance + 4 factor loadings + 4 error
variances = 11,
3.df = p* - q = -1!
4.Metric of a latent variable:
1.What is the mean of a latent variable?
2.What is the variance of a latent variable?
5.Solutions:
1.To overcome the identification problem in our example, we have to:
2.Approach 1:
1.Fix the factor variances at some specific positive values, usually 1.0
2.This applies to independent (exogenous) latent variables only
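3.The factor loading matrix and the factor correlation matrix are
\[
\Lambda = \begin{bmatrix} a_{11} & 0 \\ a_{21} & 0 \\ 0 & a_{32} \\ 0 & a_{42} \end{bmatrix}
\quad \text{and} \quad
\Phi = \begin{bmatrix} 1.0 & \\ \operatorname{cor}(f_2, f_1) & 1.0 \end{bmatrix},
\quad df = 1
\]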
4.We set the scale of the latent factors having a mean of 0 and a
variance of 1.0.
3.OR
4.Approach 2:
1.Fix a loading from the latent variable to one observed variable at a
specific non-zero value, usually 1.0
2.This applies to independent and dependent (endogenous) latent
variables
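4.The factor loading matrix and the factor covariance matrix are
\[
\Lambda = \begin{bmatrix} 1.0 & 0 \\ a_{21} & 0 \\ 0 & 1.0 \\ 0 & a_{42} \end{bmatrix}
\quad \text{and} \quad
\Phi = \begin{bmatrix} \operatorname{var}(f_1) & \\ \operatorname{cov}(f_2, f_1) & \operatorname{var}(f_2) \end{bmatrix},
\quad df = 1
\]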
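Either approach fixes two parameters, so the number of free parameters drops from 11 to 9. A worked version of the count, using p* and q as defined earlier (the breakdown of q by approach is spelled out here for illustration):

```latex
\begin{aligned}
p^* &= \frac{4 \times 5}{2} = 10 \\
q_{\text{Approach 1}} &= \underbrace{1}_{\text{factor correlation}} + \underbrace{4}_{\text{loadings}} + \underbrace{4}_{\text{error variances}} = 9 \\
q_{\text{Approach 2}} &= \underbrace{3}_{\text{factor variances/covariance}} + \underbrace{2}_{\text{loadings}} + \underbrace{4}_{\text{error variances}} = 9 \\
df &= p^* - q = 10 - 9 = 1
\end{aligned}
```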
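[Figure: the two identification approaches side by side: factor variances fixed at 1.00 (Approach 1) and one loading per factor fixed at 1.00 (Approach 2)]
Approach 1 (fix the factor variances at 1.0):
MODEL:
! Free the loading of y1
f1 BY y1* y2;
! Free the loading of y3
f2 BY y3* y4;
! Fix the factor variances at 1.0
f1@1.0;
f2@1.0;
Approach 2 (fix one loading per factor at 1.0):
MODEL:
! Fix the loading of y1 at 1.0
f1 BY y1@1.0 y2;
! Provide starting value if necessary
f2 BY y3@1.0 y4*0.5;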
10.Parameter estimation:
1.We try to find the parameter estimates such that the model implied
covariance matrix (theory) is as close to the sample covariance matrix
(data) as possible.
2.We usually use the maximum likelihood (ML) estimation method. When
the sample size is reasonably large and the data are multivariate
normal, ML is a good method.
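"As close as possible" is usually measured with the standard ML discrepancy function F = log|Σ| + tr(SΣ⁻¹) - log|S| - p, which the slides do not spell out. The sketch below evaluates it for 2x2 matrices only, reusing the covariance matrix of y and x1 from the simple regression example. The function names are illustrative, and no optimizer is included.

```python
import math

def f_ml(s, sigma):
    """ML discrepancy for 2x2 symmetric matrices given as (a, b, c) = [[a, b], [b, c]]:
    F = log|Sigma| + tr(S Sigma^-1) - log|S| - p, with p = 2."""
    s11, s12, s22 = s
    m11, m12, m22 = sigma
    det_s = s11 * s22 - s12 ** 2
    det_m = m11 * m22 - m12 ** 2
    # tr(S Sigma^-1), using the closed-form inverse of a 2x2 matrix.
    trace = (s11 * m22 - 2 * s12 * m12 + s22 * m11) / det_m
    return math.log(det_m) + trace - math.log(det_s) - 2

def implied(b1, var_x, var_e):
    """Model-implied covariance matrix of (y, x1) for y = b1*x1 + e."""
    return (b1 ** 2 * var_x + var_e, b1 * var_x, var_x)

s = (10.606, 5.536, 7.781)  # sample var(y), cov(y, x1), var(x1)
b1_hat = s[1] / s[2]
var_e_hat = s[0] - b1_hat ** 2 * s[2]

f_at_solution = f_ml(s, implied(b1_hat, s[2], var_e_hat))
f_elsewhere = f_ml(s, implied(0.5, s[2], var_e_hat))  # an arbitrary worse slope
print(f"F at solution = {f_at_solution:.6f}, F with b1 = 0.5: {f_elsewhere:.4f}")
```

F is zero when the implied matrix reproduces the sample matrix and grows as the two move apart. Estimation amounts to searching for the parameter values that minimize F.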
8 Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did
they really say? Organizational Research Methods, 9, 202-220.
9 Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance structure analysis: Conventional criteria
versus new alternatives. Structural Equation Modeling, 6(1), 1-55.
10 Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches
to setting cutoff values for fit indices and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural
Equation Modeling, 11, 320-342.
[Figure: two-factor CFA models with equality constraints: both loadings of F1 are labelled a and both loadings of F2 are labelled b]
MODEL:
! Constrain both factor loadings
f1 BY y1* (1)
y2 (1);
! Constrain both factor loadings
f2 BY y3* (2)
y4 (2);
OUTPUT: SAMP;
Selected output:
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
Value 4.777
Degrees of Freedom 3
P-Value 0.1889
MODEL RESULTS
Two-Tailed
Estimate S.E. Est./S.E. P-Value
F1 BY
Y1 0.458 0.054 8.462 0.000
Y2 0.458 0.054 8.462 0.000
F2 BY
Y3 0.544 0.048 11.403 0.000
Y4 0.544 0.048 11.403 0.000
F2 WITH
F1 0.360 0.125 2.880 0.004
Intercepts
Y1 2.002 0.052 38.151 0.000
Y2 2.454 0.053 46.400 0.000
Y3 2.007 0.052 38.801 0.000
Y4 2.511 0.055 46.032 0.000
Variances
F1 1.000 0.000 999.000 999.000
F2 1.000 0.000 999.000 999.000
Residual Variances
Y1 0.617 0.068 9.032 0.000
Y2 0.630 0.069 9.117 0.000
Y3 0.507 0.061 8.264 0.000
Y4 0.597 0.067 8.974 0.000
[Figure: two versions of the two-factor CFA model (F1: X1, X2; F2: X3, X4), one diagram marked 1.00 and the other 0.00]
MODEL:
OUTPUT: SAMP;
Selected output:
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
Value 16.314
Degrees of Freedom 2
P-Value 0.0003
Chi-Square Test of Model Fit for the Baseline Model
Value 71.367
Degrees of Freedom 6
P-Value 0.0000
CFI/TLI
CFI 0.781
TLI 0.343
Loglikelihood
H0 Value -1570.267
H1 Value -1562.110
Information Criteria
RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.154
90 Percent C.I. 0.091 0.228
Probability RMSEA <= .05 0.005
SRMR (Standardized Root Mean Square Residual)
Value 0.058
Note. This model does not fit the data well. It is clear that it is not a good
model.
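CFI and TLI can be recomputed from the two chi-square tests in the output. This sketch assumes that the second test (71.367, df = 6) is the baseline-model chi-square. The formulas are the usual definitions of the two indices, and the function names are illustrative.

```python
def cfi(chisq_m, df_m, chisq_b, df_b):
    """Comparative fit index from model (m) and baseline (b) chi-squares."""
    num = max(chisq_m - df_m, 0.0)
    den = max(chisq_b - df_b, chisq_m - df_m, 0.0)
    return 1.0 - num / den if den > 0 else 1.0

def tli(chisq_m, df_m, chisq_b, df_b):
    """Tucker-Lewis index (not truncated to the 0-1 range here)."""
    ratio_b = chisq_b / df_b
    ratio_m = chisq_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)

# Tested model: chi-square 16.314 on 2 df; assumed baseline: 71.367 on 6 df.
print(f"CFI = {cfi(16.314, 2, 71.367, 6):.3f}")
print(f"TLI = {tli(16.314, 2, 71.367, 6):.3f}")
```

Both values agree with the CFI and TLI printed in the output.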
11 Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-
step approach. Psychological Bulletin, 103, 411-423.
12 Wheaton, B., Muthén, B., Alwin, D., & Summers, G. (1977). Assessing reliability and stability in panel models. In D.
R. Heise (Ed.): Sociological Methodology 1977 (pp. 84-136). San Francisco: Jossey-Bass.
4.The proposed SEM is:
[Figure: structural model with the latent variables SES, AL67 and AL71]
6.Mplus syntax:
MODEL:
alien67 BY anomia67 power67;
alien71 BY anomia71 power71;
ses BY educ SEI;
OUTPUT: SAMP;
Interpretations:
4. $\chi^2(df = 6) = 71.55$, $p < .0001$, CFI = 0.97, TLI = 0.92 and
RMSEA = 0.108. The model fits the data marginally.
5.Before analyzing a full SEM, we may improve the model fit by model
modifications.
6.Since Anomia subscale and the Powerlessness subscale were
measured twice (1967 and 1971), it is reasonable to expect that the
measurement errors might be correlated.
7.We may use the modification indices to provide some hints. We may
add MODINDICES in OUTPUT (see ex4b.inp).
1.Modification indices (MI):
1.For each fixed parameter specified, there is a MI for it;
2.The predicted drop in the overall $\chi^2$ value if that parameter is freed;
3.We can free the parameters with large MIs if they are theoretically
justified.
2.Expected parameter change (EPC) value:
1.For each fixed parameter specified, there is an EPC for it;
2.The predicted change, in either a positive or negative direction, for
each parameter if the parameter is freed;
3.The directions should be consistent with our theory.
M.I. E.P.C. Std E.P.C. StdYX E.P.C.
WITH Statements
! Correlated errors
anomia67 WITH anomia71;
power67 WITH power71;
OUTPUT: SAMP;
Selected output:
1. $\chi^2(df = 4) = 4.74$, $p = .32$; CFI = 1.00; TLI = 1.00 and RMSEA = 0.014.
The model has an excellent fit.
2. We may compare whether this model (with correlated errors) is better
than the previous one (without correlated errors) by using a chi-square
difference test: $\Delta\chi^2 = 66.81$, $\Delta df = 2$, $p < .001$.
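The chi-square difference test can be reproduced from the two reported chi-square values. The sketch below uses a closed-form tail probability that holds only for even degrees of freedom; for the general case a statistics library (for example `scipy.stats.chi2.sf`) would be the usual choice. The function name is illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Reported values: model without and with correlated errors.
chisq_restricted, df_restricted = 71.55, 6
chisq_free, df_free = 4.74, 4

delta_chisq = chisq_restricted - chisq_free
delta_df = df_restricted - df_free
p_value = chi2_sf_even_df(delta_chisq, delta_df)
print(f"delta chi-square = {delta_chisq:.2f}, delta df = {delta_df}, p = {p_value:.2e}")
```

The difference test is only valid for nested models, which holds here because the first model is the second one with the two error covariances fixed at zero.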
MODEL:
alien67 BY anomia67 power67;
alien71 BY anomia71 power71;
ses BY educ SEI;
! Correlated errors
anomia67 WITH anomia71;
power67 WITH power71;
! Structural model
alien67 ON ses (p1);
alien71 ON alien67 (p2);
alien71 ON ses;
MODEL CONSTRAINT:
NEW(ind_effect); ! Create a new variable for indirect effect
ind_effect = p1*p2;
Selected output:
1. $\chi^2(df = 4) = 4.74$, $p = .32$; CFI = 1.00; TLI = 1.00 and RMSEA = 0.014.
The model fits the data very well.
2. The direct effect is -0.227, p < .001.
3. The indirect effect is -0.575*0.607= -0.349, p < .05.
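The indirect effect and its Wald ratio can be checked from the rounded Mplus estimates for this model. A small sketch; the variable names mirror the labels p1 and p2 in MODEL CONSTRAINT, and the standard error is the one Mplus reports for IND_EFFE.

```python
# Rounded estimates copied from the Mplus output for this model.
p1 = -0.575          # ALIEN67 ON SES
p2 = 0.607           # ALIEN71 ON ALIEN67
se_indirect = 0.041  # S.E. reported for IND_EFFE

indirect = p1 * p2
z = indirect / se_indirect  # differs slightly from the printed ratio due to rounding

print(f"indirect effect = {indirect:.3f}, z = {z:.2f}")
```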
MODEL RESULTS Two-Tailed
Estimate S.E. Est./S.E. P-Value
ALIEN67 BY
ANOMIA67 1.000 0.000 999.000 999.000
POWER67 0.979 0.062 15.896 0.000
ALIEN71 BY
ANOMIA71 1.000 0.000 999.000 999.000
POWER71 0.922 0.059 15.498 0.000
SES BY
EDUC 1.000 0.000 999.000 999.000
SEI 0.522 0.042 12.363 0.000
ALIEN67 ON
SES -0.575 0.056 -10.195 0.000
ALIEN71 ON
ALIEN67 0.607 0.051 11.897 0.000
SES -0.227 0.052 -4.334 0.000
ANOMIA67 WITH
ANOMIA71 1.623 0.314 5.176 0.000
POWER67 WITH
POWER71 0.339 0.261 1.298 0.194
Variances
SES 6.797 0.649 10.474 0.000
Residual Variances
ANOMIA67 4.731 0.453 10.440 0.000
POWER67 2.564 0.403 6.360 0.000
ANOMIA71 4.399 0.515 8.541 0.000
POWER71 3.070 0.434 7.070 0.000
EDUC 2.802 0.507 5.527 0.000
SEI 2.646 0.181 14.598 0.000
ALIEN67 4.841 0.467 10.359 0.000
ALIEN71 4.083 0.404 10.104 0.000
New/Additional Parameters
IND_EFFE -0.349 0.041 -8.538 0.000
13 Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
14 Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147-
177.
5.Example:
1.Data: ex5.dat:
1.We use “999” as the missing value in the data.
2.Other values are possible.
3.Be careful in choosing the missing values!
2.Input: ex5a.inp
3.Output: ex5a.out
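To see what a missing-value code such as 999 implies for listwise deletion, the sketch below recodes and filters a small hypothetical data set in plain Python. The data values and function names are made up for illustration and are not taken from ex5.dat.

```python
MISSING_CODE = 999  # the missing-value code used in this example

def to_missing(rows, code=MISSING_CODE):
    """Replace the missing-value code with None."""
    return [[None if v == code else v for v in row] for row in rows]

def listwise_delete(rows):
    """Keep only cases with no missing values on any variable."""
    return [row for row in rows if all(v is not None for v in row)]

# Hypothetical raw data: three variables, five cases.
raw = [
    [2.1, 3.4, 1.0],
    [999, 2.2, 0.5],
    [1.8, 999, 999],
    [2.5, 3.1, 1.2],
    [3.0, 2.9, 999],
]

data = to_missing(raw)
complete = listwise_delete(data)
print(f"{len(data)} cases in total, {len(complete)} complete cases")
```

Listwise deletion keeps only the complete cases, which is why it analyses fewer observations than FIML. A code that can also occur as a legitimate value would be wrongly recoded as missing, hence the warning about choosing it carefully.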
TITLE: Using FIML to handle missing data
MODEL:
f1 BY x1-x3*;
f2 BY x4-x6*;
f1@1.0;
f2@1.0;
OUTPUT: SAMP;
Selected output:
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 500
TITLE: Using listwise deletion to handle missing data
MODEL:
f1 BY x1-x3*;
f2 BY x4-x6*;
f1@1.0;
f2@1.0;
OUTPUT: SAMP;
Selected output:
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 338