JOURNAL OF BUSINESS LOGISTICS, Vol. 15, No. 2, 1994
by
Steven C. Dunn
Idaho State University
Robert F. Seaker
Penn State University
and
Matthew A. Waller
Western Michigan University
paradigm, especially when latent variables are involved. Surveys and structured
interviews account for almost half of the research published in this category. Survey
research in business logistics often does not use the data from the survey for
hypothesis testing. Instead the data is used to make inferences that are not statistically
verifiable. This is partially a result of the fact that latent variables have not been
scientifically measured, a necessary step for testing theory.
The purpose of this paper is to suggest a logistics research methodology for
the scientific analysis and testing of latent variables. The methodology is drawn
from well established techniques utilized in other areas of business research. This
paper is divided into three sections. The first provides an analysis of logistics research
methods, accomplished through an extension of the review of the articles in five
"top-tier" logistics research journals performed by Dunn et al. The second section
assesses the need for, and describes the arguments in favor of, scientific research
within the framework of logistics. The concepts of theory-testing and the role of
science are discussed. In section three, a detailed technical description of the
methodology for construct measurement is presented as a guideline for interested
logistics researchers. The conclusion calls for the application of this methodology
in future logistics research.
FIGURE 1
When the two axes of the Meredith et al. matrix are applied, the majority
of logistics research tends to fall into three areas. The interpretive/perceptive (27%),
artificial/axiomatic (24%), and logical-positivist/perceptive (23%) paradigms account
for three-quarters of the total research articles.
Figure 2 presents the findings of Dunn et al.
FIGURE 2
Within science, there is the assumption that there is a real, tangible world
where phenomena are independent of human interpretation or conjecture. It contains
rules for analyzing natural phenomena. Science aims to observe and explain these
natural phenomena. It provides a dependability to knowing. Science is not concerned
with ideas lying within the metaphysical or theological realm. If there is to be
progress in the construction of knowledge, it is real world phenomena that will
provide observable, measurable structure to ongoing research processes. Therefore,
methodology used to test theories rooted in this objective world should also be
as free of human persuasion as possible.
Science can provide such tests. The scientific approach has a self-correcting
characteristic: scientific activities and conclusions are controlled and verified
so that the end result is dependable knowledge.
Scientific research is composed of objective observation and systematic
procedures. The purpose of the scientific method is to test theory. However, theory
can be built or adjusted based on unexpected test results. For example, a control
variable may unexpectedly provide a strong moderating effect on an independent
variable's explanation of the dependent variable. At this juncture, the original theory
may require some additional thought.
Science attempts to provide a consistent and unbiased view of observable
phenomena and how they relate to each other. It does so while maintaining objectivity
in both measurement and testing. Testing procedures are systematic and controlled
in that the investigation itself provides a strong assurance that the results are accurate
and consistent.
Science alone is not a panacea. There are weaknesses associated with the
scientific method. Howson and Urbach have exposed some of the problems of
induction as they relate to knowledge. They probed for underlying assumptions
in scientific methodologies. For example, Hume's principle of the uniformity in
nature emphasizes that those subscribing to scientific research assume an a priori
"truth" that the future will resemble the past. It allows the scientist to assuredly
predict future occurrences based upon current experimental results.
Relationships between constructs built on perceptions (e.g., customer
satisfaction), prevalent in the behavioral sciences, have a tendency to be less
generalizable through time. In the natural sciences such as oceanography, tides can
be accurately and consistently predicted by the lunar positioning. In contrast, the
behavioral sciences (where, for example, transportation buying behavior might be
Research within the field of logistics may best be approached through the
application of multiple methods, both quantitative and qualitative. Eisenhardt,
Jick, and Campbell and Fiske, among others, suggest that this triangulated approach
to research can create better assurances that variances are trait-related and not
method-related.
Logistics knowledge should be approached with an array of diverse
methodological tools to ensure the coexistence of both theory-testing and
theory-building. Weick provides a model based on Thorngate's postulate of commensurate
complexity illustrating the strengths and weaknesses in the various approaches to
research in the social sciences. In this conception, the power of a research outcome
can be measured by its adeptness at providing the virtues of generalizability, internal
validity, or simplicity. There appears to be a form of mutual exclusivity regarding
each.
As any single research method attempts to satisfy any two of the three virtues,
the third virtue is increasingly alienated. For example, as empirical science attempts
to provide both external and internal validity in its outcomes, simplicity is sacrificed.
In contrast, as case study research attempts to capture both internal validity and
simplicity as its virtues, it loses generalizability.
Figure 3 incorporates Weick's model into an encompassing research framework.
By combining the large array of multiple logistics theories and multiple methods
into a collective research effort, a more powerful means of extracting new, meaningful,
and dependable knowledge is made available.
Research in business logistics tends to span most of the approaches in Figure
3. Although there is much survey research on business logistics topics, it frequently
lacks scientific rigor. This is especially true in terms of construct measurement.
The next section of this paper provides a framework for construct measurement
that should facilitate scientific survey research in business logistics.
FIGURE 3
[Figure 3 places the research approaches, including lab experiments, analytic models, and single and multiple case studies, within this encompassing framework.]
JOURNALOF BUSINESS LOGISTICS, Vol. 15, No. 2, 1994 155
Items or questions can then be listed that would measure, indirectly, the
constructs. For example, to measure decentralization several questions might be
asked. The following are examples of possible survey questions:
1. Routine logistics decisions are not formalized. (Circle one.)
A. Strongly Agree
B. Agree
C. Neutral
D. Disagree
E. Strongly Disagree
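Before statistical analysis, lettered Likert responses of this kind are coded numerically. A minimal Python sketch; the letter-to-score mapping and the reverse-coding convention are our illustrative assumptions, not part of the original survey:

```python
# Likert coding sketch (assumed convention: 5 = Strongly Agree ... 1 = Strongly Disagree).
LIKERT = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

def code_item(response, reverse=False):
    """Convert one lettered response to a numeric score.

    reverse=True flips the scale, which is needed when negatively
    worded items must point in the same direction as the construct.
    """
    score = LIKERT[response]
    return 6 - score if reverse else score

# Three hypothetical respondents answering the item above.
scores = [code_item(r) for r in ["A", "B", "E"]]  # [5, 4, 1]
```

Whether the example item needs reverse coding depends on how the decentralization construct is oriented; the point is only that all items in a scale must be scored in a consistent direction.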
These are only two of the possible items that might reflect a decentralized
logistics function. As mentioned above, suppose the researcher developed seven
such questions to measure, indirectly, the extent of decentralization in a logistics
function. Should all of these questions be included in the measurement of
decentralization? What statistical techniques are available to decide which questions
to use in measuring this construct? Do these seven questions adequately cover
the scope of what is meant by decentralization? Do these seven questions measure
decentralization per se or do they also measure another construct which is included
in the theory?
The balance of this section is aimed at answering these and other questions
related to latent variable measurement. Figure 4 is a flow chart of the process
of scale development and validation. It is important to note that this process is
iterative as well as sequential.
FIGURE 4
[Flow chart of scale development and validation. The recoverable portion shows criterion-related validity (predictive and concurrent validity) followed by nomological validity, with "yes" paths leading on to theory testing and "no" paths looping back for revision.]
Content Validity
In order to measure latent variables, constructs should be carefully defined
from the literature and the author's understanding of the constructs. A set of tentative
items can then be produced to measure each construct. This approach helps ensure
content validity:
One can imagine a domain of meaning that a particular construct is intended
to measure. Content validity refers to the degree that one has
representatively sampled from that domain of meaning. ...[T]he researcher
should search the literature carefully to determine how various authors
have used the concept which is to be measured. ...[Also], researchers should
rely on their own observations and insights and ask whether they yield
additional facets to the construct under consideration.
Content validity exists when the scope of the construct is adequately reflected by
the items as a group. Unfortunately, there is no rigorous way to assess content
validity. Multiple items are typically used to measure constructs so that construct
measurement will be thorough. Another term for measure is scale, that is, a
collection of items mapped into one variable. If content validity does not exist,
then there is no reason to proceed with the analysis because the desired construct
is not being properly represented by the group of items. This means that the
researchers will not be able to use the scale to test the hypothesis.
A logistics researcher may wish to measure customer service. If the researcher
uses only number of stockouts as a measure, the scale may lack content validity
because it does not sufficiently span the scope of the meaning of customer service.
The researcher may wish to change the name of the construct to "number of
stockouts." However, if the researcher is really interested in measuring customer
service, then other items should be included in the scale.
Substantive Validity
It is not possible for a scale to have content validity without having substantive
validity. If a measure has substantive validity, then its items are conceptually
or theoretically linked to the construct. Content validity differs from substantive
validity in that content validity deals with a set of items (scale) whereas substantive
validity deals with each individual item of a construct. Anderson and Gerbing suggest
purifying the set of items, using a test for substantive validity. Item purification
involves eliminating those items that do not agree with the rest of the set.
Many times researchers attempt to purify a set of items in a pretest by using
item-to-total correlations or contribution to Cronbach's coefficient alpha. The problem
with this approach is that the correlations are typically not statistically significant
because the sample size is often small in a pilot survey. As a result, the researcher
may be eliminating items that should not be eliminated or may be retaining items
that will weaken construct validity.
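The small-pilot problem can be made concrete: the critical value of Pearson's r grows as the sample shrinks, so an item-to-total correlation that would be clearly significant in a full survey can be non-significant in a pilot. A sketch, where the critical t values are the standard two-tailed 5% figures and the variable names are ours:

```python
import math

def critical_r(n, t_crit):
    """Smallest Pearson r significant at the level implied by t_crit,
    with df = n - 2 (two-tailed test of r = 0)."""
    df = n - 2
    return t_crit / math.sqrt(t_crit**2 + df)

# Pilot of n = 15 (df = 13): two-tailed 5% critical t is about 2.16,
# so an item-to-total correlation below roughly 0.51 is not significant.
r_needed_small = critical_r(15, 2.16)

# Full survey of n = 200 (df = 198): critical t is about 1.97,
# and an r of roughly 0.14 already reaches significance.
r_needed_large = critical_r(200, 1.97)
```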
One solution would be to increase the sample size of the pretest. If the researcher
is to use exploratory factor analysis, the sample size must be fairly large. Exploratory
factor analysis is a statistical technique used to find which variables or items are
agreeing with one another. Those that do not agree with a given scale can be
eliminated as long as content validity is not jeopardized. This raises another problem,
namely, cost. To use exploratory factor analysis there should be at least ten
observations for each item and at least 200 observations even if there are fewer
than twenty items on the questionnaire. Therefore, use of exploratory factor analysis
in a pilot study is potentially very costly.
One solution to this dilemma is to assess substantive validity in a pretest setting
using the approach of Anderson and Gerbing. They performed an empirical study
where their substantive validity assessment technique was able to predict which
items should have been retained and which should have been deleted. The results
of the substantive validity assessment were stable across multiple pretest samples.
The methodology for evaluating substantive validity is an item-sort task.
Experts or a sample of the population to be surveyed are given a sheet of paper
that defines each of the constructs and another sheet of paper that contains all
of the items in random order. The participants are asked to match the items to
the constructs that they best represent. When this method is used as the pretest,
then the pretest population should be representative of the study population.
Anderson and Gerbing have developed two indices for evaluating the
substantive validity. One index is the proportion of substantive agreement (PSA),
which measures how well the items were assigned to their hypothesized constructs.
PSA is the number of participants assigning an item to its hypothesized construct
divided by the number of participants. The PSA index is an element of the unit
interval. A high value of the PSA index indicates agreement as to the construct
that a given item is assigned to. For example, if an item has been assigned to
a given construct by eight of the ten participants, the PSA value is 0.80.
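The PSA computation follows directly from this definition. A sketch reproducing the paper's worked example; the construct labels and function name are ours:

```python
def psa(assignments, hypothesized):
    """Proportion of substantive agreement for one item.

    assignments: list of construct labels chosen by the participants.
    hypothesized: the construct the item was written to measure.
    """
    return assignments.count(hypothesized) / len(assignments)

# Paper's example: 8 of 10 participants pick the hypothesized construct.
example = ["C1"] * 8 + ["C2"] * 2
psa_value = psa(example, "C1")  # 0.80
```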
The other index is the substantive validity coefficient (CSV). CSV is the
difference between the number of participants assigning an item to its hypothesized
construct and the highest number of assignments of an item to any other construct;
that quantity is then divided by the number of observations. For example, suppose
that six of ten participants assigned an item to the hypothesized construct, three
assigned it to another construct, and one assigned it to another construct. In this
case CSV = (6-3)/10 or 0.30.
The CSV index is a member of the interval from -1.0 to +1.0. Some items
are often eliminated at this stage of the research methodology. A critical value
of 0.5 was used to determine which items should be eliminated in the Gerbing
and Anderson study. This stage helps assure substantive validity.
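The CSV computation, with the paper's worked example and the 0.5 cutoff, can be sketched as follows (function and label names are ours):

```python
def csv_index(assignments, hypothesized):
    """Substantive validity coefficient for one item:
    (correct assignments - highest count for any other construct) / n."""
    n = len(assignments)
    correct = assignments.count(hypothesized)
    others = [assignments.count(c) for c in set(assignments) if c != hypothesized]
    highest_other = max(others, default=0)
    return (correct - highest_other) / n

# Paper's example: 6 correct, 3 to a second construct, 1 to a third.
item = ["C1"] * 6 + ["C2"] * 3 + ["C3"] * 1
value = csv_index(item, "C1")   # (6 - 3) / 10 = 0.30
keep = value >= 0.5             # fails the 0.5 cutoff, so the item is eliminated
```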
The substantive validity test together with the pilot study is intended to improve
the efficiency of the questionnaire as much as possible before data are collected.
This helps ensure high quality research. A rigorous statistical analysis of bogus
data is of little value. If substantive validity does not exist for an item and the
item is still included in the scale, then that item will probably have a low correlation
with the other items in the scale. Therefore, it will probably be eliminated during
the ensuing scale refinement stage of the analysis.
The research of Hunt, Sparkman, and Wilcox shows that the most common
type of error detected in a pretest is missing alternative errors, i.e., questionnaire
errors that result from not including the correct alternative in the set of alternatives.
The use of inappropriate vocabulary was the second most frequently detected error.
Telephone interviews were found superior to personal interviews for detecting errors.
After the pretest has been accomplished, the survey can be sent out. Several
books and articles exist on survey research and related topics.
Scale Refinement
Scale reliability, hereafter referred to as reliability, refers to the internal
consistency of the items that are used to measure a latent construct. Internally
consistent items form a homogeneous set in that they vary together statistically.
Therefore, in this context the term "reliability" refers to the accuracy or precision
If a scale has a low alpha value (below 0.60 for a new scale), then the researchers
can examine the item correlation matrix of that scale. Those items with low
item-to-item correlations can be deleted from the scale as long as the scale will
retain its content validity.
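Coefficient alpha can be computed directly from item and total variances via the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the scale total). A numpy sketch with fabricated response data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's coefficient alpha.

    items: (n_respondents, k_items) array of scored responses.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Three perfectly parallel items yield alpha = 1.0; adding noise lowers it.
base = np.repeat(np.arange(1.0, 6.0), 4).reshape(-1, 1)   # 20 fabricated respondents
perfect = np.hstack([base, base, base])
alpha_perfect = cronbach_alpha(perfect)

rng = np.random.default_rng(0)
alpha_noisy = cronbach_alpha(perfect + rng.normal(0.0, 2.0, perfect.shape))
```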
Exploratory factor analysis is a technique that is used to find latent variables
(factors) that are represented by a group of items or observed variables. It is often
referred to as a variable reduction technique, and can be used to further refine
the scales by analyzing the observed variables. Items that are poorly related to
all factors or clearly represent more than one dimension are removed, resulting
in more reliable scales.
Gerbing and Anderson encouraged the use of both item-to-scale correlation
(e.g., item-total correlations) and exploratory factor analysis to provide preliminary
scales. Exploratory factor analysis differs from confirmatory factor analysis, which
will be discussed in the section on construct validity. However, both types of factor
analysis have the following underlying assumption in common: Factor analysis is
based on the fundamental assumption that some underlying factors, which are smaller
in number than the number of observed variables, are responsible for the covariation
among the observed variables.
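This underlying assumption can be illustrated with simulated data: when two latent factors generate six observed items, the correlation matrix has only two large eigenvalues. The sketch below uses an eigen-decomposition of the correlation matrix, a principal-component style simplification of full common-factor extraction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two latent factors each drive three observed items, plus measurement noise.
f1, f2 = rng.standard_normal((2, n))
items = np.column_stack(
    [f1 + 0.3 * rng.standard_normal(n) for _ in range(3)]
    + [f2 + 0.3 * rng.standard_normal(n) for _ in range(3)]
)

corr = np.corrcoef(items, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]

# Two eigenvalues dominate: two underlying factors account for the
# covariation among six observed variables (Kaiser criterion: eigenvalue > 1).
n_factors = int((eigvals > 1.0).sum())
```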
Scale refinement, within the context of this paper, is conducted on the same
data that is used to test the hypotheses. This is standard practice. However, it would
be better to refine the scales with one set of data and then test the hypotheses
with another set of data. The problem with this is that it is often too expensive
to acquire the extra data set.
Unidimensionality
Construct validity depends on how well the scale of a construct actually measures
that construct. A scale is construct valid ...
(1) to the degree that it assesses the magnitude and direction of a
representative sample of the characteristics of the construct; and (2) to
the degree that the measure is not contaminated with elements from the
domain of other constructs or error.
However, a scale cannot have construct validity unless it is unidimensional. It
is acceptable to have a multidimensional construct, but scales must be unidimensional.
During the unidimensionality refinement of the scale, the scale may lose its content validity
because it no longer reflects the scope of the construct. Content validity must always
be considered when eliminating items from a scale. Sometimes the researcher is
faced with a dilemma—namely, eliminate an item to achieve unidimensionality,
resulting in a loss of content validity or keep the item, maintaining content validity
while not achieving unidimensionality.
If the researcher uses a scale that is not unidimensional, then it will reduce
the scale's variance. In other words, items that are reflecting one construct in a
scale will, to some extent, off-set changes in items in the same scale that reflect
another construct. This may result in a reduction in the explanatory power of the
model as a whole.
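This offsetting effect is easy to demonstrate numerically: a scale whose items all reflect one construct accumulates shared variance in its total, while a scale mixing two uncorrelated constructs does not. A simulation sketch with fabricated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
latent_a = rng.standard_normal(n)
latent_b = rng.standard_normal(n)   # a second, unrelated construct

# Four items all reflecting construct A (a unidimensional scale)...
uni = np.column_stack([latent_a + 0.5 * rng.standard_normal(n) for _ in range(4)])

# ...versus two items from A and two from B (a mixed, two-dimensional scale).
mixed = np.column_stack(
    [latent_a + 0.5 * rng.standard_normal(n) for _ in range(2)]
    + [latent_b + 0.5 * rng.standard_normal(n) for _ in range(2)]
)

# Items tapping different constructs fail to covary, so the mixed
# scale's total score has markedly less variance.
var_uni = uni.sum(axis=1).var(ddof=1)      # roughly 16 + 1 = 17
var_mixed = mixed.sum(axis=1).var(ddof=1)  # roughly 4 + 4 + 1 = 9
```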
Reliability
Once unidimensionality has been established, reliability can be assessed using
Cronbach's coefficient alpha. Cronbach's coefficient alpha was discussed prior to
this section because the concept of reliability is useful in the discussion of
unidimensionality.
Reliability is of little significance until unidimensionality has been established.
It is possible to have a reliable scale that is measuring more than one construct,
a condition that precludes construct validity. This is why
unidimensionality should be established before reliability is established.
Construct Validity
Once unidimensionality has been established, construct validity can be
investigated. Construct validity is the extent to which a scale measures the construct
it was intended to measure. The concept of construct validity has been difficult
for scientists to operationalize. Convergent validity and discriminant validity are
the criteria most frequently used to support construct validity.
Convergent validity is the degree to which there is agreement between two
or more attempts to measure the same construct through dissimilar methods.
Discriminant validity depends on the degree to which scales measure distinct
constructs. When convergent validity and discriminant validity are found, construct
validity is supported.
For a thorough discussion of convergent validity and various ways of checking
for it see Bagozzi, Yi, and Phillips. In this discussion, assessment of convergent
Criterion-Related Validity
Criterion-related validity refers to how well a scale correlates with the criterion
it is trying to predict. If the criterion exists in the present, then it is called concurrent
validity; if the criterion exists in the future, then it is called predictive validity.
The following is an example of predictive validity: GMAT scores are used
by business schools to predict how well a given applicant for an MBA program
will perform if allowed to enter the program. Therefore, to measure the predictive
validity of GMAT scores, the researcher might find the correlation between GMAT
scores and grade-point-averages of graduating MBAs. The higher the correlation,
the greater the support for the test's predictive validity.
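The GMAT example reduces to a single correlation. A sketch with entirely hypothetical scores:

```python
import numpy as np

# Hypothetical GMAT scores at admission and GPAs of the same students at graduation.
gmat = np.array([540, 580, 610, 640, 660, 690, 710, 730])
gpa = np.array([2.9, 3.1, 3.0, 3.3, 3.4, 3.5, 3.6, 3.8])

# Predictive validity is supported to the extent the two correlate.
r = np.corrcoef(gmat, gpa)[0, 1]
```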
Suppose a researcher was using subjective items to measure a construct called
manufacturing quality. To assess concurrent validity, the researcher could conduct
Nomological Validity
Nomological validity of a construct exists when the construct relates to the
other research constructs in a way that is consistent with the underlying theory.
Therefore, at this stage of the analysis the line between construct validation and
theory testing becomes blurred. This obfuscation occurs because the approach to
the investigation of nomological validity and the testing of theory are identical.
If a construct, in the nomological network, does not behave in the way theory
predicts, does that show that the construct is not measuring the appropriate latent
variable? Or is the theory wrong? Therefore, if nomological validity does not hold,
the situation becomes equivocal. However, if the nomological validity of a construct
is supported, then the part of the theory which involves that construct is also supported.
CONCLUSION
This paper calls for a more scientific approach to empirical research in business
logistics, especially when latent variables are embedded in the theory. A
methodological road map is provided to guide researchers through the often enigmatic
maze of empirical science as it attempts to operationalize constructs. Figure 4 is
a flowchart of this process.
The process recommended in this paper has important ramifications for logistics
research. The paper is intended as a recommendation to logistics researchers as
an important step forward concerning survey research, specifically, the methodologies
used in that research.
Adoption of the procedures presented in this paper will enable a common set
of standards to be set for measuring latent variables in logistics. It will allow the
logistics research community to find common ground in its understanding of often
elusive concepts. This will provide for a stronger inference within the paradigm,
resulting in improved logistics practice.
Application of standard methodological procedures will bring a common
discipline and order, enhancing both academic and professional acceptance of
research. Additionally, it will enable logistics researchers to contribute to and draw
from the scholarly work in other disciplines. This is important because logistics
is a cross-functional discipline.
The methodology utilized for survey research in the perceptive paradigm can
have ramifications for other forms of logistics research. Although the deductive
forms of logistics research such as model building are very sophisticated, they
are hindered by a lack of generalizability. The more inductive case study approach
suffers from the same problem. Empirical science will facilitate generalizability,
hence improve theory building.
Logistics researchers have successfully generated a plethora of hypotheses and
propositions. Now as logisticians enter the realm of theory testing, articles addressing
methodology can be expected to follow. These contributions, collectively, will bring
about a more vigorous and pervasive theoretical base to the discipline. It is hoped
that this paper will provide encouragement for researchers who are interested in
this genre of research by reducing some of the barriers that may have inhibited
them.
As the logistics discipline accumulates sound theory it will become more
applicable for practice, thus reducing lag between research and application. The
dependability and generalizability of the theory will facilitate such an outcome.
Logistics researchers will not need to worry about compromising relevance for
theory when this approach is taken.
NOTES
E. Carmines and R. Zeller, Reliability and Validity Assessment, Sage University
Paper Series on Quantitative Applications in the Social Sciences, 07-017 (Beverly
Hills, Calif.: Sage Publishing, 1979).
Same reference as Note 28.
Same reference as Note 28.
J. Saraph, P. Benson, and R. Schroeder, "An Instrument for Measuring the
Critical Factors of Quality Management," Decision Sciences 20, no. 4 (1989):
810-829.
J. Kim and C. Mueller, Introduction to Factor Analysis, Sage University
Paper Series on Quantitative Applications in the Social Sciences, 07-013 (Beverly
Hills, Calif.: Sage Publications, 1978).
J. Anderson and D. Gerbing, "Structural Equation Modeling in Practice: A
Review and Recommended Two-Step Approach," Psychological Bulletin 103, no.
3 (1988): 411-423.
J. Kim and C. Mueller, Factor Analysis: Statistical Methods and Practical
Issues, Sage University Paper Series on Quantitative Applications in the Social
Sciences, 07-014 (Beverly Hills, Calif.: Sage Publications, 1978), p. 12.
J. Peter, "Construct Validity: A Review of Basic Issues and Marketing
Practices," Journal of Marketing Research 18 (1981): 133-145.
D. Gerbing and J. Anderson, "An Updated Paradigm for Scale Development
Incorporating Unidimensionality and Its Assessment," Journal of Marketing Research
25 (1988): 186-192.
K. Jöreskog and D. Sörbom, LISREL 7: A Guide to the Program and Applications,
2nd ed. (Chicago: SPSS, 1989).
D. Garvin, "Competing on the Eight Dimensions of Quality," Harvard Business
Review 65, no. 6 (1987): 101-109.
Same reference as Note 63.
Same reference as Note 63.
J. Long, Confirmatory Factor Analysis, Sage University Paper Series on
Quantitative Applications in the Social Sciences, 07-034 (Beverly Hills, Calif.: Sage
Publications, 1983), p. 12.
Same reference as Note 63.