
Cross-Validation and Information Criteria in Causal Modeling
Author(s): Christian Homburg
Source: Journal of Marketing Research, Vol. 28, No. 2 (May 1991), pp. 137-144
Published by: American Marketing Association
Stable URL: http://www.jstor.org/stable/3172803

CHRISTIAN HOMBURG*
Many applications of causal modeling in marketing involve selection among several competing causal models. The author investigates whether common criteria for model selection such as cross-validation indices and information criteria are likely to lead to discovery of the correct population model. Guidance on the use of these selection criteria in practice is provided for substantive marketing researchers. Results indicate that the adequacy of cross-validation depends critically on the method used for sample-splitting. The author suggests the application of Snee's DUPLEX algorithm in this context. For situations in which the assumption of multinormally distributed variables is justified, information criteria are found to be highly appropriate for model selection, outperforming cross-validation methods in several respects.

Cross-Validation and Information Criteria in Causal Modeling

The analysis of linear models involving latent (i.e., unmeasured) variables and corresponding indicator (i.e.,
observed) variables is widely known in the marketing
area. Marketing researchers usually refer to this methodology as "causal modeling," though technically "covariance structure modeling" is a more appropriate term (see, e.g., Fornell 1986). The main reason for the increasing use of the causal modeling approach is probably
its ability to take errors in measurement into account so
that relations between the error-free parts of observed
responses can be analyzed. Most of the pioneering applications in marketing are by Bagozzi (see Aaker and
Bagozzi 1979; Bagozzi 1976, 1980b), who also wrote a
monograph on causal analysis in marketing (Bagozzi
1980a) and edited the 1982 special issue of the Journal
of MarketingResearch on the same topic (Bagozzi 1982).
Overviews of applications in different areas of marketing
research are provided by Fornell (1986) and Homburg
(1989).

Recent methodological literature has discussed the application of causal modeling in an exploratory framework (see, e.g., Bentler 1986; Cudeck and Browne 1983; Homburg 1989; Homburg and Dobratz 1991; MacCallum 1986). Interestingly, however, the idea of applying causal analysis in an exploratory way was formulated previously by Jöreskog (1971, 1977), who suggested carrying out methods of stepwise model modification in the context of causal modeling. These methods, frequently referred to as "specification searches," are not the focus of this article; they are discussed in detail by Saris, de Pijper, and Zegwaart (1979), MacCallum (1986), and Homburg and Dobratz (1991), among others. Examples of marketing applications of such search
procedures are Sujan's (1986) analysis of salespeople's
motivation and Gaul and Homburg's (1988) investigation of the use of data analysis techniques by German
market research agencies.
A second way of applying causal analysis in an exploratory research context consists of specifying several
alternative models and selecting from that set of models
the one that seems most appropriate for describing the
structures underlying the observed data. Cross-validation
methods and information criteria are two well-known approaches in this context. The simulation study reported
here investigated and compared the quality of these selection methods in order to provide guidance for substantive marketing researchers on the use of these approaches in practice.

*Christian Homburg is head of the Marketing and Strategy Department, KSB AG, Frankenthal, Germany. As the article is based on the author's doctoral dissertation, he thanks Wolfgang Gaul of Karlsruhe University for many helpful discussions on model selection. Thanks are due to Adolfo Varillas for computational assistance and to Stefan Sütterlin for useful suggestions. Special thanks go to three anonymous JMR reviewers, whose comments improved a previous version of the article.

Journal of Marketing Research, Vol. XXVIII (May 1991), 137-44

After a brief discussion of the application of cross-validation methods and information criteria for selecting
among several alternative causal models, the simulation
study is described in which the diverse approaches to
model selection were analyzed and compared on the basis of artificial data. Concluding remarks address implications for the use in practice of the methods investigated.
CROSS-VALIDATION AND INFORMATION CRITERIA
Consider a situation in which several alternative causal models have been specified, one of which is to be selected as most appropriate for describing the structures underlying the observed data. The basic idea in cross-validation is to seek a model that will have the greatest predictive validity in future samples rather than a model that best reproduces structures of one specific sample that may be inappropriate to future observations from the same population. Traditionally, cross-validation has been a widely used method of model comparison. The researcher divides the sample into two subsamples, estimates the parameters on one, and validates on the other (work by Mosier 1951 is an early example). The application of cross-validation procedures to causal models was suggested by Cudeck and Browne (1983); an example from the marketing literature is the MacKenzie, Lutz, and Belch (1986) study of attitude toward the ad as a mediator of advertising effectiveness.
Given two subsamples, A and B, the researcher fits
each of the competing models to the data from subsample A and measures the prediction accuracy of one specific model by the cross-validation index
(1)    $F_{A|B} = F(S_B, \hat{\Sigma}_A)$,

where $S_B$ denotes the sample covariance matrix of subsample B and $\hat{\Sigma}_A$ is the covariance matrix reproduced by the model on the basis of subsample A. In this formula, F denotes an arbitrary discrepancy function (see, e.g., Browne 1984) commonly used for parameter estimation in causal modeling. The choice of the model with the
greatest estimated predictive validity is made by selecting the model that yields the smallest cross-validation
index. Cudeck and Browne (1983) suggested that this
process should be repeated with subsample B as the calibration sample and subsample A as the validation sample (double cross-validation), yielding cross-validation
indices $F_{B|A}$. Ideally, the same model yields the lowest
values with respect to both indices. If this is not the case,
one should decide on a subset of two or more models
for further consideration.
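To make equation (1) concrete, the following minimal Python sketch (not part of the original article) computes the two indices of a double cross-validation, using the standard maximum likelihood discrepancy $F_{ML}(S, \Sigma) = \ln|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) - \ln|S| - p$ (with p the number of observed variables) as the discrepancy function F. The estimation routine `fit_model` is a hypothetical stand-in for whatever SEM software is used to obtain the model-implied covariance matrix.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """Standard ML discrepancy F(S, Sigma) for p observed variables."""
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

def double_cross_validation(models, X_A, X_B, fit_model):
    """Equation (1) applied in both directions: fit on one subsample and
    evaluate the discrepancy against the other subsample's covariance
    matrix. 'fit_model' is a hypothetical SEM estimation routine that
    returns the model-implied covariance matrix."""
    S_A = np.cov(X_A, rowvar=False)
    S_B = np.cov(X_B, rowvar=False)
    indices = {}
    for name, model in models.items():
        Sigma_A = fit_model(model, S_A)  # calibrate on subsample A
        Sigma_B = fit_model(model, S_B)  # calibrate on subsample B
        indices[name] = (ml_discrepancy(S_B, Sigma_A),   # F_{A|B}
                         ml_discrepancy(S_A, Sigma_B))   # F_{B|A}
    return indices  # ideally one model minimizes both indices
```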
The literature on cross-validation provides little guidance on how the researcher should split the sample (see, e.g., Dorans and Drasgow 1980). A common solution is to halve the sample randomly. To the best of our knowledge, in all applications of cross-validation to causal modeling in which sample-splitting is carried out, some random method is used (see, e.g., Balderjahn 1986; Cudeck 1985; Cudeck and Browne 1983). As this approach does not take into account the information available in the data, one runs the risk of building calibration and validation samples that do not have similar statistical properties and, therefore, are not really comparable. Consequently, the results obtained via the cross-validation procedure are likely to be biased. A suitable algorithm for carrying out sample-splitting for cross-validation purposes should ensure that neither the calibration sample nor the validation sample alone contains a subset of observations that could bias the cross-validation. Rather, the entire range of observations should be represented in both subsets. Snee (1977) proposed an algorithm called DUPLEX that seems to be appropriate for sample-splitting in cross-validation.
First, the data matrix is standardized and orthogonalized via a Cholesky type of decomposition as described by Kennard and Stone (1969). Next, Euclidean distances between all possible pairs of objects are calculated from the orthonormalized data matrix. The two objects that are farthest apart are assigned to the calibration sample; then the two objects in the remaining sample that are farthest apart are assigned to the validation set. In the next step the object that is farthest from the two objects in the calibration sample (average linkage distance) is assigned to the calibration sample; then the same step is carried out for the validation sample. This alternation between the two subsets continues until all objects in the sample have been assigned either to the calibration or to the validation subsample.
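The following Python sketch implements the splitting logic exactly as described in the preceding paragraph (standardization, Cholesky-based orthonormalization, and alternating farthest-point assignment with average linkage distances); it is an illustration of the published description, not Snee's original code.

```python
import numpy as np

def duplex_split(X):
    """DUPLEX-style split: orthonormalize the standardized data, then
    alternately assign the most distant remaining objects to the
    calibration and validation subsets."""
    # Standardize columns and orthonormalize via a Cholesky factorization
    # of the correlation matrix (Kennard and Stone 1969).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    L = np.linalg.cholesky(np.corrcoef(Z, rowvar=False))
    T = Z @ np.linalg.inv(L).T  # orthonormalized data matrix
    # Euclidean distances between all pairs of objects.
    D = np.linalg.norm(T[:, None, :] - T[None, :, :], axis=2)
    remaining = set(range(X.shape[0]))
    # Seed the calibration and then the validation subset with the most
    # distant remaining pair of objects.
    subsets = []
    for _ in range(2):
        _, i, j = max((D[i, j], i, j) for i in remaining
                      for j in remaining if i < j)
        subsets.append([i, j])
        remaining -= {i, j}
    calibration, validation = subsets
    # Alternate between the subsets, each time adding the remaining object
    # farthest from the subset in terms of average linkage distance.
    turn = 0
    while remaining:
        subset = subsets[turn]
        farthest = max(remaining, key=lambda k: D[k, subset].mean())
        subset.append(farthest)
        remaining.remove(farthest)
        turn = 1 - turn
    return calibration, validation
```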
To the best of our knowledge, the DUPLEX algorithm
has not yet been applied in cross-validation of causal
models. The only reported application in a marketing
context of which we are aware is the analysis of the effectiveness of industrial print advertisements across
product categories by Hanssens and Weitz (1980).
Information criteria are another well-known concept for selecting among several alternative models. These criteria take the form of a penalized likelihood function, that is, the negative natural logarithm of the likelihood function plus a penalty term that increases with the number of parameters (see, e.g., Sclove 1987). Obviously, one will select the model that produces the minimum value of the information criterion. The most popular information criterion is Akaike's (1974) AIC, which is based on the principle of minimizing the negative entropy, also known as the Kullback-Leibler information quantity (see Bozdogan 1987 for an excellent study of the general AIC theory and its analytical extensions). Another information criterion was introduced by Schwarz (1978), who worked from the Bayesian viewpoint.
The application of these two indices in causal modeling was suggested by Cudeck and Browne (1983), who also gave the formulas

(2)    $AIC = F_{ML}(S, \hat{\Sigma}) + 2t/n$

and

(3)    $SIC = F_{ML}(S, \hat{\Sigma}) + t \log(n)/n$

for the computation of the two information criteria. Here $F_{ML}$ denotes the discrepancy function associated with the maximum likelihood approach (see, e.g., Jöreskog 1967, 1969), t denotes the number of parameters to be estimated in the model, and n is the sample size. These expressions, referred to as "rescaled information criteria," are slightly more convenient computationally than the original ones, given the output of the LISREL computer program.
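As a minimal illustration of equations (2) and (3), the rescaled criteria can be computed directly from the minimized discrepancy value, the parameter count, and the sample size; the candidate values in the sketch below are hypothetical placeholders, not output of the study.

```python
import numpy as np

def rescaled_aic(f_ml, t, n):
    """Equation (2): rescaled AIC from the minimized ML discrepancy f_ml,
    the number of free parameters t, and the sample size n."""
    return f_ml + 2 * t / n

def rescaled_sic(f_ml, t, n):
    """Equation (3): rescaled Schwarz information criterion."""
    return f_ml + t * np.log(n) / n

# Hypothetical (f_ml, t) values for three competing models at n = 200;
# the model with the smallest criterion value is selected.
candidates = {"baseline": (0.41, 12), "true": (0.22, 15), "saturated": (0.20, 21)}
n = 200
best = min(candidates, key=lambda name: rescaled_sic(*candidates[name], n))
print(best)  # -> "true" with these illustrative numbers
```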
An important problem with information criteria is that they are associated with the maximum likelihood approach which, in causal modeling, is based on the assumption that the observed variables have a multivariate normal distribution (see, e.g., Jöreskog and Sörbom 1982, 1984). This assumption obviously is too restrictive in many applications. Cross-validation, in contrast, is not associated with a specific estimation method. It can be applied in the context of maximum likelihood estimation as well as in connection with least squares or generalized least squares methods that, for large samples, are not associated with restrictive distributional assumptions (see, e.g., Browne 1984 for a discussion of such estimation methods). Methods of cross-validation therefore seem to be somewhat more flexible than information criteria. However, information criteria have some important benefits. For example, they do not require access to the raw data but can be applied in situations in which the data to be analyzed are available only as covariance or correlation matrices, which is not true for cross-validation. Another drawback of cross-validation is that it underutilizes available information in both the calibration and validation stages, as it does not use all of the data in either of the two steps (see, e.g., Cooil, Winer, and Rados 1987). Small samples aggravate the problem, especially in view of the fact that the causal modeling approach is based entirely on large-sample statistical theory.

ANALYSIS

Purpose

Though cross-validation and information criteria are well-established concepts for selecting among alternative models in various fields of multivariate analysis, their usefulness in causal modeling has not been demonstrated. Cudeck and Browne (1983) illustrated properties of these criteria in a series of analyses based on longitudinal empirical data. The study reported here, in contrast, was designed to explore the adequacy of the different selection criteria in an objective framework based on artificial data generated from a previously specified true population model. The analysis consisted of applying the different selection criteria to several models, one of which was the true population model, to find out whether or not the criteria detect this true model as being optimal among the models considered.

Specifically investigated was the issue of whether there is a unique best criterion for comparing alternative causal models. Furthermore, the effects of sample size and of the structure of the true population model on the reliability of selection criteria were analyzed. Another issue of interest was whether cross-validation can be improved by using the DUPLEX algorithm for sample-splitting rather than some random method.

These issues are of considerable importance for applications of causal modeling in marketing because in many empirical studies competing hypotheses about marketing phenomena are to be evaluated. Often it is not obvious which criterion leads to the identification of a best model. Subjective decisions are more common than applications of explicit selection criteria.

Design

As the main objective of the study was to find out under which conditions the selection criteria considered succeed in detecting, from several competing models, the previously defined true population model, the first step consisted of specifying some true model, which then was used to construct artificial data. Two different population models were used, the structures of which are shown in Figure 1. Population model A was taken from Bearden, Sharma, and Teel (1982); it was used in their simulation study for analyzing sample-size effects on the χ² measure and other statistics used in evaluating causal models. Model B is the author's own example. The models are shown in the notation of the LISREL approach (see, e.g., Jöreskog and Sörbom 1982).

The main reason for carrying out the analysis on two different population models is to analyze whether the structure of the model under investigation significantly affects the likelihood of success of selection criteria. The essential difference between the two models is the number of indicators used to define a latent variable. In model A each of the latent variables (circles in Figure 1) has three indicators (squares), whereas in model B five latent variables are measured by two indicators and one latent variable has only one indicator. Gerbing and Anderson (1985) observed that parameter estimates in causal modeling tend to be more reliable when latent variables are defined by more than two indicators. One objective of the analysis was to find out whether a similar effect occurs in connection with selection criteria.

The next step was to assign values to all of the parameters in the two models. For the model structures shown in Figure 1, in connection with the assumption of uncorrelated error variables, these true parameters then define a true population covariance matrix for each of the two models. If all model variables have zero means, a common assumption in causal modeling (see, e.g., Jöreskog and Sörbom 1984), these population covariance matrices define two distinct multivariate normal distributions. Samples from these distributions can be generated by using subroutine GGNSM from IMSL (1982). Samples of 50, 75, 100, 150, 200, 300, 400, 500, 750, and 1000 objects were used. For each population model and each sample size, 10 different samples were generated, so that the analysis was based on a total of 200 samples for which the corresponding true models were known.

[Figure 1. TRUE POPULATION MODELS: path diagrams of population models A and B in LISREL notation (latent variables as circles, indicator variables as squares).]
This concept of generating artificial data is employed commonly in simulation studies in causal modeling (see, e.g., Bearden, Sharma, and Teel 1982; Boomsma 1985; Gerbing and Anderson 1985; Homburg and Dobratz 1991; MacCallum 1986). The values of the population parameters and the two population covariance matrices are not reported here. For model A these values are the same as the ones specified by Bearden, Sharma, and Teel (1982). It should be emphasized that parameter values were chosen in such a way as to achieve convergent and discriminant validity of indicators (see, e.g., Bagozzi 1980a).
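The data-generation step can be reproduced along the following lines; numpy's multivariate normal generator is used here as a modern stand-in for IMSL's GGNSM subroutine, and the identity matrix passed at the end is only a placeholder for the unreported population covariance matrices.

```python
import numpy as np

def generate_samples(Sigma_pop, sizes, replications=10, seed=0):
    """Draw zero-mean multivariate normal samples with population
    covariance matrix Sigma_pop, replicating the sampling scheme of the
    study (10 samples per sample size and population model)."""
    rng = np.random.default_rng(seed)
    p = Sigma_pop.shape[0]
    return {n: [rng.multivariate_normal(np.zeros(p), Sigma_pop, size=n)
                for _ in range(replications)]
            for n in sizes}

sizes = (50, 75, 100, 150, 200, 300, 400, 500, 750, 1000)
# The true population covariance matrices are not reported in the article;
# a 2x2 identity matrix serves here only as a placeholder.
samples = generate_samples(np.eye(2), sizes)
```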
After samples had been generated, the misspecified models were formulated. This approach of fitting misspecified models to artificial data has been used previously in the field of causal modeling to study several related issues (see, e.g., MacCallum 1986; Saris, de Pijper, and Zegwaart 1979). One must distinguish between misspecifications in the measurement model and misspecifications in the structural equation model, that is, in the relations between latent variables. The issue of misspecified measurement models has been discussed by Anderson and Gerbing (1982, 1988), Gerbing and Hunter (1982), Fornell (1983), and Gerbing and Anderson (1984). Here, analysis is restricted to structural misspecifications. The main reason for this restriction is that meaningful model selection seems doubtful if misspecifications are present in both the measurement and the structural models (see also remarks by MacCallum 1986). The structure of the measurement model should be established prior to the analysis of structural relations between latent variables (see also the two-step approach recommended by Anderson and Gerbing 1988). Several procedures for improving measurement models can be used fairly independently of the structural equation model under consideration (see, e.g., Anderson and Gerbing 1982).
Twelve misspecified models were formulated for each of the two population models. Six of them involved estimation of more parameters than were in the corresponding population model, whereas the other six models had more degrees of freedom than the corresponding true model. The study was not restricted to the analysis of nested models, as both cross-validation and information criteria allow for the comparison of non-nested models. For each of the population models and each sample size, the true model was combined randomly with two misspecified models, one having more parameters and one having fewer parameters than the true model. Thus, for each population model and each sample size there was a group of three models; each of these groups contained the true model, one model with fewer parameters than the true model, and one model with more parameters than the true model. The reason for this specific design is twofold. First, it is important to analyze a selection criterion's ability to detect both an overparameterized model and an underparameterized model. Second, this design seems to match adequately a common situation in empirical causal modeling in which the researcher has a certain baseline model containing only those causal relations considered necessary to describe the data and a model involving many causal parameters, whereas the true model is suspected to be somewhere between those two models.
Results
The most important result pertains to the ability of the criteria to detect the true population model. Three different situations are possible: the true model is identified as the unique optimal model (situation 1), the criterion suggests more than one model as being optimal and the true model is among them (situation 2), and the criterion suggests a wrong model (situation 3). The corresponding frequencies are reported in Table 1. As an example, for a sample of 200 objects and population model A, method CVD (cross-validation associated with the DUPLEX method) identified the true population model as the unique optimal one (situation 1) in seven of 10 cases. In two
cases, more than one model was identified as being optimal and the true population model was among them (situation 2), and in one case CVD suggested a wrong model.

Table 1
RESULTS FROM SIMULATION STUDY: FREQUENCIES FOR SITUATIONS 1/2/3ᵃ

              Population model A                       Population model B
Sample
size     CVR       CVD       AIC       SIC        CVR       CVD       AIC       SIC
50       1/1/8     4/3/3     4/0/6     4/0/6      0/3/7     2/2/6     3/0/7     4/0/6
75       2/4/4     3/4/3     5/0/5     6/0/4      0/4/6     3/2/5     4/0/6     4/0/6
100      3/3/4     5/3/2     6/0/4     7/0/3      0/7/3     4/1/5     6/0/4     7/0/3
150      3/4/3     6/1/3     7/0/3     7/0/3      2/4/4     4/3/3     6/0/4     7/0/3
200      4/3/3     7/2/1     8/0/2     9/0/1      5/2/3     3/5/2     7/0/3     7/0/3
300      7/2/1     6/3/1     7/0/3     8/1/1      3/6/1     5/3/2     7/0/3     8/0/2
400      7/3/0     7/2/1     8/0/2     9/0/1      6/1/3     7/3/0     7/0/3     8/0/2
500      8/2/0     9/1/0     9/0/1     9/0/1      6/1/3     7/2/1     8/0/2     9/0/1
750      4/1/5     3/3/4     10/0/0    10/0/0     2/5/3     6/3/1     9/0/1     9/1/0
1000     1/3/6     2/2/6     10/0/0    10/0/0     1/3/6     3/2/5     9/0/1     10/0/0

Total    40/26/34  52/24/24  74/0/26   79/1/20    25/36/39  44/26/30  66/0/34   74/1/25

Total (both population models):
CVR: 65/62/73 (32.5/31.0/36.5%)    CVD: 96/50/54 (48.0/25.0/27.0%)
AIC: 140/0/60 (70.0/0/30.0%)       SIC: 153/2/45 (76.5/1.0/22.5%)

ᵃCVR is cross-validation with sample-splitting by a random method; CVD is cross-validation with sample-splitting by the DUPLEX algorithm; AIC is the Akaike information criterion; SIC is the Schwarz information criterion.
Example: For a sample size of 200 objects and population model A, method CVD identified the true model as the unique optimal model in seven cases; in two cases, more than one model was identified as being optimal and the true population model was among them; in one case, CVD suggested a misspecified model.

One finding from Table 1 is that all of the selection criteria investigated produce better results for population model A than for population model B, in the sense that results associated with situation 1 are much more common in connection with model A than with model B. Obviously, the different structures of the two population models affect the quality of selection criteria in the sense that results tend to be less reliable when latent variables are defined by few indicators. This finding is consistent with results obtained by Anderson and Gerbing (1984) and Gerbing and Anderson (1985), who report on some more problems in connection with models in which the variables-to-factor ratio (Fornell 1983) is too small. Thus, model selection criteria should not be applied to causal models in which several factors are defined by only one indicator.
A second interesting observation is the effect of sample size on the quality of cross-validation results. Clearly, cross-validation is not appropriate for small samples (with fewer than 300 cases). This result is not surprising, as causal modeling is based on asymptotic statistical theory, and in view of the sample-splitting necessary for carrying out cross-validation. For medium-sized samples (300 through 500 objects), cross-validation results are very reliable in the sense that decisions in favor of a misspecified model (i.e., situation 3) hardly occur. Interestingly, however, the results become invalid in large samples. The reason for this effect is a clear tendency of cross-validation to favor overparameterized models in connection with large samples. This finding confirms a statement by Cudeck and Browne (1983) that a saturated model is likely to have the greatest predictive validity if the sample size is very large. A similar effect does not occur in connection with information criteria. Their results clearly improve as sample size increases and, for sufficiently large samples, those criteria are almost sure to detect the true population model.
With respect to different methods for sample-splitting, cross-validation obviously produces significantly better results with the DUPLEX algorithm than with the random method. Hence, results from cross-validation are much more reliable when the information available in the data is used for sample-splitting so as to create subsets, each of which represents the entire range of observations, instead of randomly assigning an object to either of two samples. A drawback of the DUPLEX algorithm is that the method is computationally very demanding. As an example, for a sample of 1000 objects, the DUPLEX algorithm required about 40 minutes of CPU time on a Siemens 7881 mainframe.

For several reasons, it is not easy to compare cross-validation and information criteria. Situation 2 occurs much more frequently in cross-validation as these criteria are two-dimensional: situation 2 will occur whenever one of the two cross-validation indices detects the true population model and the other index makes a wrong decision. Summing the frequencies for situations 1 and 2 shows that if cross-validation is based on the DUPLEX method it is able to compete with information criteria, but it cannot if a random method is used for sample-splitting. A major drawback of cross-validation that is not overcome by means of the DUPLEX algorithm is the drastic effect of sample size.
Another problem in comparing cross-validation and information criteria is that these selection criteria are based on different assumptions. Information criteria are formulated according to the maximum likelihood approach which, in causal modeling, is based on the assumption of multinormally distributed indicator variables, whereas cross-validation is not associated with specific distributional assumptions. Therefore, a substantive researcher must analyze whether the data to be analyzed can be described by a multivariate normal distribution (see Bentler 1985 for corresponding tests available in the EQS computer program). If so, information criteria should be used to select among competing causal models. However, if significant departures from multinormality occur and if, additionally, the sample to be analyzed is medium-sized (300 through 500 objects), cross-validation (in connection with the DUPLEX algorithm) is appropriate.
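The article leaves the choice of normality test to the analyst (pointing to those implemented in EQS). One concrete possibility, shown here purely as an illustration and not as the article's procedure, is to compute Mardia's multivariate skewness and kurtosis statistics directly:

```python
import numpy as np
from scipy import stats

def mardia_tests(X):
    """Mardia's multivariate skewness and kurtosis tests (asymptotic
    p-values); a sketch of one way to check the multinormality assumption
    before relying on information criteria."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(Xc.T @ Xc / n)  # ML covariance estimate
    G = Xc @ S_inv @ Xc.T                 # g_ij = (x_i - m)' S^-1 (x_j - m)
    b1 = (G ** 3).sum() / n ** 2          # multivariate skewness
    b2 = (np.diag(G) ** 2).sum() / n      # multivariate kurtosis
    df = p * (p + 1) * (p + 2) / 6
    p_skew = stats.chi2.sf(n * b1 / 6, df)
    z_kurt = (b2 - p * (p + 2)) / np.sqrt(8 * p * (p + 2) / n)
    p_kurt = 2 * stats.norm.sf(abs(z_kurt))
    return p_skew, p_kurt
```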
Though cross-validation and information criteria are well-established concepts for selecting among several alternative models in various fields of multivariate analysis, substantive researchers often use fit indices as a basis for model selection in causal analysis. It is interesting, therefore, to observe how well-known fit indices in the field of causal modeling perform in the context of the simulation study.¹ Obviously, fit indices that do not take into account the number of parameters to be estimated in a model are not suitable for purposes of model selection, as they often (always, in the case of nested models) suggest an overparameterized model. For example, error rates (i.e., frequencies for situation 3) of more than 90% are observed for GFI, RMR (see, e.g., Jöreskog and Sörbom 1982), and the incremental fit index suggested by Bentler and Bonett (1980). Measures of fit that do take into account the number of parameters to be estimated are provided by the AGFI (Jöreskog and Sörbom 1982) and the p-value associated with the χ² statistic. For both of these measures, error rates between 45 and 50% are observed, even in connection with reasonably large samples. Thus, they do not provide an alternative to the application of cross-validation and information criteria.

¹The discussion of fit indices is based on a suggestion by an anonymous JMR reviewer.
CONCLUSIONS
Many applications of causal modeling in marketing research involve the selection of a "best approximating" model among a class of competing models with different numbers of parameters. Whenever a marketing researcher has a thorough theoretical understanding of the structures underlying the data, a confirmatory (i.e., hypothesis-testing) research context is appropriate. However, as Rust and Schmittlein (1985) pointed out, decision areas in marketing are seldom so well understood that only one reasonable model can be constructed to describe the phenomena at hand. In such situations, model selection will provide an alternative to hypothesis testing.
The study reported here investigated whether well-known model selection criteria such as cross-validation indices and information criteria are appropriate for causal models. The results show that the variables-to-factor ratios (Fornell 1983) of the models under investigation must not be too small if selection criteria are applied. Furthermore, the samples used for the analysis should contain at least 200 cases when information criteria are to be applied, and the reliability of the results will improve as sample size increases. Cross-validation, in contrast, is found to be appropriate only for medium-sized samples (300 through 500 objects).
The findings suggest that investigators can take several steps to enhance the likelihood of success in the empirical application of model selection criteria. First, it is important to check whether the data to be analyzed can be described approximately by a multivariate normal distribution. If so, the information criteria suggested by Akaike (1974) and Schwarz (1978) are appropriate for model selection. If considerable departures from normality are observed, a distribution-free approach such as cross-validation can be applied (if the sample size is in the suitable range). Here the sample-splitting approach must help ensure that neither subsample alone contains a subset of observations that could bias the cross-validation. The simulation study illustrates that the DUPLEX algorithm (Snee 1977) is appropriate for this purpose. Additionally, in several empirical applications (see, e.g., Homburg 1989), it has been observed that contradictions between the two cross-validation indices $F_{A|B}$ and $F_{B|A}$ hardly ever occur in connection with the DUPLEX method, whereas they do occur if sample-splitting is carried out by some random method. This property is another desirable aspect of the DUPLEX approach because, as Cudeck and Browne (1983) pointed out, in situations in which no one model is optimal in both cross-validations, further considerations should be based on more than one model, which makes the interpretation of results very difficult.
It is important that research continue in this area. Future research should, for example, assess the effects of moderate violations of the normality assumption on the applicability of information criteria. Further, it would be interesting to compare the performance of the DUPLEX method with that of other algorithms for sample-splitting (see, e.g., Picard and Cook 1984). A shortcoming of the DUPLEX algorithm is the computational cost involved. Possibly other methods can be found that are less demanding computationally and perform as well as the DUPLEX approach.
Finally, a caveat about the approaches discussed here
is in order. Clearly, model selection involves human
judgment (Cudeck and Browne 1983). Any statistical
procedure, including the ones described here, cannot lead to scientific progress unless moderated by careful thought and judgment on the part of the researcher. Specifically, as Fornell (1983) pointed out, substantive theory must be included in the analysis. Marketing researchers should keep that fact in mind whenever applying one of the approaches discussed here.
REFERENCES
Aaker, David A. and Richard P. Bagozzi (1979), "Unobservable Variables in Structural Equation Models With an Application in Industrial Selling," Journal of Marketing Research, 16 (May), 147-58.

Akaike, Hirotugu (1974), "A New Look at the Statistical Model Identification," IEEE Transactions on Automatic Control, 19, 716-23.

——— (1987), "Factor Analysis and AIC," Psychometrika, 52, 317-32.

Anderson, James C. and David W. Gerbing (1982), "Some Methods for Respecifying Measurement Models to Obtain Unidimensional Construct Measurement," Journal of Marketing Research, 19 (November), 453-60.

——— and ——— (1984), "The Effect of Sampling Error on Convergence, Improper Solutions, and Goodness-of-Fit Indices for Maximum Likelihood Confirmatory Factor Analysis," Psychometrika, 49, 155-73.

——— and ——— (1988), "Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach," Psychological Bulletin, 103 (May), 411-23.

Bagozzi, Richard P. (1976), "Toward a General Theory for the Explanation of the Performance of Salespeople," unpublished doctoral dissertation, Northwestern University.

——— (1980a), Causal Models in Marketing. New York: John Wiley & Sons, Inc.

——— (1980b), "Performance and Satisfaction in an Industrial Salesforce: An Examination of Their Antecedents and Simultaneity," Journal of Marketing, 44 (Spring), 65-77.

——— (1982), "Introduction to Special Issue on Causal Modeling," Journal of Marketing Research, 19 (November), 403.

Balderjahn, Ingo (1986), Das umweltbewußte Konsumentenverhalten. Berlin: Duncker & Humblot.

Bearden, William O., Subash Sharma, and Jesse E. Teel (1982), "Sample Size Effects on Chi Square and Other Statistics Used in Evaluating Causal Models," Journal of Marketing Research, 19 (November), 425-30.

Bentler, Peter M. (1985), Theory and Implementation of EQS: A Structural Equations Program. Los Angeles: BMDP Statistical Software, Inc.

——— (1986), Lagrange Multiplier and Wald Tests for EQS and EQS/PC. Los Angeles: BMDP Statistical Software, Inc.

——— and Douglas G. Bonett (1980), "Significance Tests and Goodness-of-Fit in the Analysis of Covariance Structures," Psychological Bulletin, 88, 588-606.

Boomsma, Anne (1985), "Nonconvergence, Improper Solutions, and Starting Values in LISREL Maximum Likelihood Estimation," Psychometrika, 50, 229-42.

Bozdogan, Hamparsum (1987), "Model Selection and Akaike's Information Criterion (AIC): The General Theory and Its Analytical Extensions," Psychometrika, 52, 345-70.

Browne, Michael W. (1984), "Asymptotically Distribution-Free Methods for the Analysis of Covariance Structures," British Journal of Mathematical and Statistical Psychology, 37, 62-83.

Cooil, Bruce, Russell S. Winer, and David L. Rados (1987), "Cross-Validation for Prediction," Journal of Marketing Research, 24 (August), 271-9.

Cudeck, Robert (1985), "A Structural Comparison of Conventional and Adaptive Versions of the ASVAB," Multivariate Behavioral Research, 20 (July), 305-22.

——— and Michael W. Browne (1983), "Cross-Validation of Covariance Structures," Multivariate Behavioral Research, 18 (May), 147-67.

Dorans, Neil J. and Fritz Drasgow (1980), "A Note on Cross-Validating Prediction Equations," Journal of Applied Psychology, 65 (December), 728-9.

Fornell, Claes (1983), "Issues in the Application of Covariance Structure Analysis: A Comment," Journal of Consumer Research, 9 (March), 443-8.

——— (1986), "A Second Generation of Multivariate Analysis: Classification of Methods and Implications for Marketing Research," working paper, University of Michigan.

Gaul, Wolfgang and Christian Homburg (1988), "The Use of Data Analysis Techniques by German Market Research Agencies," Journal of Business Research, 17 (August), 67-79.

Gerbing, David W. and James C. Anderson (1984), "On the Meaning of Within-Factor Correlated Measurement Errors," Journal of Consumer Research, 11 (June), 572-80.

——— and ——— (1985), "The Effects of Sampling Error and Model Characteristics on Parameter Estimation for Maximum Likelihood Confirmatory Factor Analysis," Multivariate Behavioral Research, 20, 255-71.

——— and John E. Hunter (1982), "The Metric of the Latent Variables in the LISREL IV Analysis," Educational and Psychological Measurement, 42, 423-7.

Hanssens, Dominique M. and Barton A. Weitz (1980), "The Effectiveness of Industrial Print Advertisements Across Product Categories," Journal of Marketing Research, 17 (August), 294-306.

Homburg, Christian (1989), Exploratorische Ansätze der Kausalanalyse als Instrument der Marketingplanung. Frankfurt: Verlag Peter Lang.

——— and Andreas Dobratz (1991), "Causal Analysis via Specification Searches," Statistical Papers (forthcoming).

IMSL (1982), IMSL Library: Reference Manual, Vol. 2, 9th ed. Houston: International Mathematical and Statistical Libraries.

Jöreskog, Karl G. (1967), "Some Contributions to Maximum Likelihood Factor Analysis," Psychometrika, 32, 443-82.

——— (1969), "A General Approach to Confirmatory Maximum Likelihood Factor Analysis," Psychometrika, 34, 183-202.

——— (1971), "Statistical Analysis of Sets of Congeneric Tests," Psychometrika, 36, 109-33.

——— (1977), "Structural Equation Models in the Social Sciences: Specification, Estimation, and Testing," in Applications of Statistics, P. R. Krishnaiah, ed. Amsterdam: North-Holland Publishing Company, 265-87.

——— and Dag Sörbom (1982), "Recent Developments in Structural Equation Modeling," Journal of Marketing Research, 19 (November), 404-16.

——— and ——— (1984), LISREL VI: Analysis of Linear Structural Relationships. Mooresville, IN: Scientific Software, Inc.

Kennard, Ronald W. and Larry A. Stone (1969), "Computer Aided Design of Experiments," Technometrics, 11 (January), 137-48.

MacCallum, Robert (1986), "Specification Searches in Covariance Structure Modeling," Psychological Bulletin, 100, 107-20.

MacKenzie, Scott B., Richard J. Lutz, and George E. Belch (1986), "The Role of Attitude Toward the Ad as a Mediator of Advertising Effectiveness: A Test of Competing Explanations," Journal of Marketing Research, 23 (May), 130-43.

Mosier, Charles I. (1951), "Problems and Design of Cross-Validation," Educational and Psychological Measurement, 11, 5-11.

Picard, Richard R. and R. Dennis Cook (1984), "Cross-Validation of Regression Models," Journal of the American Statistical Association, 79 (September), 575-83.

Rust, Roland T. and David C. Schmittlein (1985), "A Bayesian Cross-Validated Likelihood Method for Comparing Alternative Specifications of Quantitative Models," Marketing Science, 4 (1), 20-40.

Saris, Willem E., Willem M. de Pijper, and Paul Zegwaart (1979), "Detection of Specification Errors in Linear Structural Equation Models," in Sociological Methodology 1979, K. F. Schuessler, ed. San Francisco: Jossey-Bass Inc., Publishers.

Schwarz, Gideon (1978), "Estimating the Dimension of a Model," Annals of Statistics, 6, 461-4.

Sclove, Stanley L. (1987), "Application of Model Selection Criteria to Some Problems in Multivariate Analysis," Psychometrika, 52, 333-43.

Snee, Ronald D. (1977), "Validation of Regression Models: Methods and Examples," Technometrics, 19, 415-28.

Sujan, Harish (1986), "Smarter Versus Harder: An Exploratory Attributional Analysis of Salespeople's Motivation," Journal of Marketing Research, 23 (February), 41-9.
