
Feature Topic: Construct Measurement in Strategic Management

Exploring the Dimensions of Organizational Performance: A Construct Validity Study

Organizational Research Methods


16(1) 67-87
© The Author(s) 2013
Reprints and permission:
sagepub.com/journalsPermissions.nav
DOI: 10.1177/1094428112470007
orm.sagepub.com

P. Maik Hamann1, Frank Schiemann1,2, Lucia Bellora1, and Thomas W. Guenther1

Abstract
Organizational performance is a fundamental construct in strategic management. Recently, researchers proposed a framework for organizational performance that includes three dimensions: accounting returns, growth, and stock market performance. We test the construct validity of indicators of these dimensions by examining reliability, convergent validity, discriminant validity, and nomological validity. We conduct a confirmatory factor analysis with 19 analytically derived indicators on a sample of 37,262 firm-years for 4,868 listed U.S. organizations from 1990 to 2010. Our results provide evidence of four, rather than three, organizational performance dimensions. Stock market performance and growth are confirmed as separate dimensions, whereas accounting returns must be decomposed into profitability and liquidity dimensions. Robustness analyses indicate stability of our inferences for three dissimilar industries and for a period of 21 years but reveal that the organizational performance dimensions are subject to dynamics during years in which environmental instability is high. Our study provides an initial contribution to the clarification of the important organizational performance construct by defining four dimensions and validating indicators for each dimension. Thus, we provide essential groundwork for the measurement of organizational performance in future empirical studies.
Keywords
factor analysis, quantitative research, reliability and validity, measurement models, organizational
performance
Organizational performance (OP) is fundamental to strategic management research. Research in this
field builds on the assumption that strategy influences OP (Lubatkin & Shrieves, 1986). Furthermore,
1 Faculty of Business Management and Economics, Technische Universität Dresden, Dresden, Germany
2 School of Business, Economics, and Social Science, University of Hamburg, Hamburg, Germany

Supplementary material for this article is available on the journal's website at http://orm.sagepub.com/supplemental.
Corresponding Author:
P. Maik Hamann, Technische Universität Dresden (TU Dresden), Faculty of Business Management and Economics, Chair of Business Management, especially Management Accounting and Control, D-01062 Dresden, Germany.
Email: Lehrstuhl.controlling@mailbox.tu-dresden.de

Downloaded from orm.sagepub.com by guest on February 26, 2015


OP is the most common concept addressed in empirical studies in this field; for example, 28% of 439
empirical articles reviewed by March and Sutton (1997) and 29% of 722 articles reviewed by Richard,
Devinney, Yip, and Johnson (2009) include OP in their research design.
The OP construct refers to the phenomenon in which some organizations are more successful than
others. A construct is a conceptual term that researchers define to describe a real phenomenon and is
unobservable by nature (Edwards & Bagozzi, 2000). Consequently, OP is subject to the "problem of unobservables" in strategic management research (Godfrey & Hill, 1995, p. 519). This problem is
best described in reference to the predictive validity framework (PVF). The PVF includes two levels:
the conceptual level and the operational level (Bisbe, Batista-Foguet, & Chenhall, 2007). At the conceptual level, theories explain relationships between constructs through propositions. Subsequently,
these propositions are empirically tested at the operational level, at which researchers apply indicators to measure a construct. Indicators are observed scores or quantified records (Edwards &
Bagozzi, 2000). The link between the two levels (i.e., between constructs and their indicators) is
crucial to advances in theoretical relationships between constructs. Only if this link is rigorously
established can empirical findings at the operational level be used to test theoretical propositions
involving unobservables at the conceptual level. This link is established by examining construct
validity. Construct validity reflects "the correspondence between a construct and a measure taken as evidence of the construct" (Edwards, 2003, p. 329). Construct validity encompasses four criteria:
reliability, convergent validity, discriminant validity, and nomological validity (Schwab, 2005).
Paradoxically, in the past, a majority of strategic management researchers regarded construct validity
and the measurement of constructs as low-priority topics (Boyd, Gove, & Hitt, 2005). Consequently,
unobservables (e.g., OP) have often been measured by single indicators whose construct validity has
rarely been assessed. From the PVF, it follows that related theoretical inferences from such studies are
seriously undermined (Combs, Crook, & Shook, 2005; Starbuck, 2004; Venkatraman & Grant, 1986).
Because of its importance for strategic management research, a growing number of studies examine the measurement of OP. These studies are shown in Table 1 and encompass two groups: (a) factor analyses of the dimensionality of OP (Devinney, Yip, & Johnson, 2010; Fryxell & Barton, 1990;
Rowe & Morrow, 1999; Venkatraman & Ramanujam, 1987) and (b) reviews of the OP measurement
practices used in strategic management research (Murphy, Trailer, & Hill, 1996; Richard et al.,
2009; Tosi, Werner, Katz, & Gomez-Mejia, 2000). The first group of studies provides evidence
of the multidimensionality of OP. However, these studies disagree on the number of OP dimensions
and do not systematically examine the construct validity of indicators that measure these dimensions. Reviews of OP measurement practice provide evidence that empirical studies in strategic
management research employ a plethora of different and unrelated indicators (Murphy et al.,
1996); for example, Richard et al. (2009) reviewed 213 studies and identified 207 different OP
indicators. In this review, 49% of the studies measure OP with a single indicator despite the
multidimensional nature of OP, and 52% of the studies employ only cross-sectional data sets. However, none of the aforementioned studies develop a framework of the dimensions of OP at the
conceptual level or examine the construct validity of OP indicators based on such a framework.
Combs et al. (2005) directly address the first gap in the literature and develop a framework of the
OP dimensions based on a synthesis of prior studies that focus on OP dimensions and a review of OP
measurement practices. They divide OP into three dimensions: accounting returns, stock market
performance, and growth. Subsequently, they test the OP framework by conducting a confirmatory
factor analysis (CFA) based on a correlation matrix of five OP indicators derived from a meta-analysis. Despite the significant contribution made by Combs et al., their study has three limitations.
First, Combs et al. do not offer clear definitions of the OP dimensions. Specification of the conceptual domain and clear definitions of constructs are prerequisites for construct validity (Schwab,
2005). Second, the CFA with three factors and five OP indicators does not satisfy the two-indicator rule of model identification (Kline, 2011). Consequently, Combs et al. offer only

preliminary empirical evidence for their proposed OP framework. Third, Combs et al. do not test the construct validity of OP indicators, which represent their conceptual framework at the operational level of the PVF.

Table 1. Previous Studies That Examine the Dimensions of Organizational Performance.

Reviews and meta-analytic studies:
Combs, Crook, and Shook (2005)a: narrative review and meta-analytical CFA (238 primary studies)
Tosi, Werner, Katz, and Gomez-Mejia (2000)a: meta-analytical EFA (137 primary studies)
Richard, Devinney, Yip, and Johnson (2009): narrative review (213 studies)
Murphy, Trailer, and Hill (1996)b: narrative review (52 studies)

Studies using primary or secondary data:
Venkatraman and Ramanujam (1987): MTMM and CFA
Fryxell and Barton (1990): CFA
Murphy et al. (1996)b: PCA (CFA); 19 (8) indicators
Rowe and Morrow (1999): CFA
Devinney, Yip, and Johnson (2010): EFA

Dimensions identified across these studies, allocated to the framework of Combs et al. (2005): accounting returns (efficiency, liquidity, profit, profitability, sales efficiency, income efficiency, employee efficiency, absolute income, absolute financial performance, change in financial performance, return on equity short term, return on equity long term, accounting-based measures, cash flow/profitability dimension, financial (accounting) performance); growth (sales growth, profit growth, sales measures); stock market (market-based measures, market value, shareholder return, stock performance, market return); further categories outside the framework (subjective reputation rating, size, product market performance, internal performance indicators, operational performance).

Note: We allocate the dimensions of organizational performance to the framework of Combs et al. (2005). This framework also separates operational performance and organizational performance. CFA = confirmatory factor analysis; EFA = exploratory factor analysis; PCA = principal components analysis; MTMM = multitrait-multimethod matrix.
a Combs et al. (2005) and Tosi et al. (2000) only report the overall number of primary studies that they use in their reviews. This number is provided in parentheses.
b Murphy et al. (1996) conduct a narrative review and an empirical analysis that is based on the results of their review. Consequently, we include this study in both lists. Furthermore, the results of their exploratory PCA and their CFA differ insofar as the CFA encompasses only a subset of the indicators that are employed in the PCA. We present details pertaining to their CFA in parentheses.
We address the limitations of Combs et al. (2005). First, we discuss the semantic relationships of
OP to related performance constructs and clearly define the OP dimensions. Second, we empirically
test Combs et al.'s proposed OP framework. Third, we examine the construct validity of indicators of
the OP dimensions by addressing reliability, convergent validity, discriminant validity, and nomological validity. Our study closes the remaining gap in the literature because it proposes a measurement scheme for OP at the operational level and systematically examines its construct validity.
To achieve this aim, we conduct a CFA based on the following: (a) an analytically derived set of
19 OP indicators; (b) secondary, objective OP data; and (c) a large sample of listed U.S. organizations. Our sample consists of 37,262 firm-years for 4,868 listed U.S. organizations from three
dissimilar industries (industrial, consumer services, and technology) over a 21-year period beginning in 1990. We also analyze the robustness of our results for each of the 21 annual data sets and
for each of the three industries.
Our empirical results demonstrate that growth and stock market performance depict distinct
dimensions of OP. In contrast to Combs et al. (2005), the accounting returns dimension must be
decomposed into the two dimensions of liquidity and profitability. For each of the four OP dimensions, we identify a set of indicators that provides a reliable and construct-valid measurement
scheme of the related OP dimension. Our robustness analyses provide strong evidence of stable
inferences of the construct validity of the four-dimensional OP measurement scheme across both
time and industries. However, during periods of high environmental instability (e.g., after the burst
of the dot-com bubble in 2002 and during the financial crisis beginning in 2008), the four-dimensional measurement model's fit to the data is weakened.
Our study is among the few in strategic management that directly address the link between the
conceptual level and the operational level of an important construct in the field. We concur with
Venkatraman (2008) in that it is "this type of attention to the details of construct operationalization that is needed in strategy research" (p. 791) and that it will thus increase the rigor of future research.
We contribute to the literature on OP in three ways. First, we contribute to the clarity of the OP
construct in terms of definitions, semantic relationships, contextual conditions, and coherence (Suddaby, 2010).1 Second, we propose a construct-valid measurement scheme of OP and its four dimensions. Researchers are encouraged to apply the measurement scheme in future empirical studies.
Third, we highlight the importance of measuring OP using longitudinal data. Cross-sectional OP
data may be prone to weak construct validity during years of high environmental instability. Thus,
researchers employing cross-sectional OP data should carefully evaluate the validity of their
measurement.

Dimensions of Organizational Performance


The conceptual domain of OP can be specified only by relating this construct to the broader construct of organizational effectiveness. Organizational effectiveness is defined as "the degree to which organizations are attaining all the purposes they are supposed to" (Strasser, Eveland, Cummins, Deniston, & Romani, 1981, p. 323). Organizations obtain different effectiveness assessments
based on diverse constituencies. Therefore, organizational effectiveness encompasses OP and other
performance concepts (i.e., corporate environmental or social performance), which are relevant for
practice and research.
In the strategic management literature, researchers concentrate on operational performance and
OP (Venkatraman & Ramanujam, 1986). Operational performance refers to the fulfillment of operational goals within different value chain activities that may lead to subsequent OP (Combs et al.,


2005). Common performance indicators, such as growth in market share, product quality, patent
filings, or marketing effectiveness, measure distinct dimensions of operational performance.
In contrast, OP is defined as "the economic outcomes resulting from the interplay among an organization's attributes, actions, and environment" (Combs et al., 2005, p. 261). The definition of OP
corresponds to measurement practices in strategic management research because a majority of
researchers assess OP based on economic indicators (Murphy et al., 1996; Richard et al., 2009).
Thus, OP is synonymous with the concepts of financial performance or corporate economic performance (Fryxell & Barton, 1990). OP is relevant to both research and practice because in the legal
system (i.e., bankruptcy law or commercial law) and in economic theory, OP (i.e., economic outcomes) constitutes the final aim of economic activities.
Combs et al. (2005) propose a consistent OP framework with three dimensions: accounting
returns, stock market performance, and growth.
Accounting returns are defined as the historical performance of organizations that is assessed
through the use of financial accounting data as published in annual reports (Fryxell & Barton,
1990). As shown in Table 1, Combs et al. (2005) argue for a single accounting returns dimension,
whereas other studies identify several dimensions that are derived from accounting returns indicators. However, we expect at least two separate dimensions to be reflected by accounting returns
indicators. First, a liquidity dimension, which is defined as a firm's ability to meet its financial obligations based on cash flows generated from its current operations, is expected (Weygandt, Kimmel,
& Kieso, 2010). Second, a profitability dimension, defined as an organization's efficiency in utilizing production factors to generate earnings, is expected. Accounting research highlights the difference between earnings (e.g., net profit) and cash flows that is traced to revenue and expense accruals
(e.g., Dechow, 1994). Accruals mitigate timing and matching problems associated with the allocation of cash flows to single periods but are subject to distortions caused by discretionary accounting
choices (e.g., a depreciation method or the useful life of assets). Additionally, Rappaport (1993)
stresses the divergence between the accounting-based return on investment and the cash flow rate
of return.
Stock market performance reflects the perceptions of investors regarding organizations' future
performance (Fryxell & Barton, 1990). This dimension is measured using capital market indicators,
such as total shareholder return (TSR). However, capital market indicators are also influenced by the
momentum and volatility of capital markets, the economy, and psychological effects (Richard et al.,
2009). Stock market performance reflects future OP, in contrast with accounting returns, which
entail a historical perspective. As shown in Table 1, previous studies provide consistent evidence
regarding stock market performance as a distinct OP dimension.
Organizational growth is defined as a change in an organization's size over time. Organizational
growth is a dynamic construct that is commonly evaluated based on three concepts of size: sales,
employees, and assets (Weinzimmer, Nystrom, & Freeman, 1998). As shown in Table 1, previous
studies that investigate the OP dimensions focus on sales growth and disregard employment and
asset growth.
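Growth indicators of this kind are relative year-over-year changes in a size measure. A minimal sketch, with hypothetical firm-year values rather than data from our sample:

```python
def growth_rate(current: float, previous: float) -> float:
    """Relative change in a size measure from one year to the next."""
    return (current - previous) / previous

# Hypothetical firm-year observations for two size concepts
sales = {2009: 120.0, 2010: 150.0}   # sales in $ millions
employees = {2009: 500, 2010: 540}

print(growth_rate(sales[2010], sales[2009]))          # 0.25
print(growth_rate(employees[2010], employees[2009]))  # 0.08
```

The same function applies to assets or to accounting outcomes (cash flow growth, income growth); only the input series changes.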
Previous examinations of the dimensionality of OP are subject to three limitations. First, the
number of indicators used is small. For example, Fryxell and Barton (1990) use four indicators, and
Venkatraman and Ramanujam (1987) employ three indicators. However, a small number of indicators may not capture the entire conceptual domain of a construct. Second, indicators are often not
chosen analytically. For example, Murphy et al. (1996) chose 19 OP indicators based on their frequent usage by researchers. These indicators include absolute returns (e.g., net income), return ratios
(e.g., return on assets), size (e.g., number of employees), and ratios of balance sheet items (e.g., debt
to equity). Given the conceptual domain of OP, the adequacy of some of these indicators is questionable; for example, size and static balance sheet items differ conceptually from OP (Combs et al.,
2005; Tosi et al., 2000). If indicators are chosen inadequately, spurious factors may emerge or true


factors may be obscured in factor analyses (Fabrigar, Wegener, MacCallum, & Strahan, 1999).
Third, cash flow return indicators are absent in the majority of previous studies. Devinney et al.
(2010) and Rowe and Morrow (1999), who both include a single cash flow return indicator in their
factor analysis (cash flow return on sales and cash flow return on equity, respectively), are exceptions. This limitation is important because we expect the single accounting returns dimension proposed by Combs et al. (2005) to divide into two dimensions (i.e., liquidity and profitability) when the
convergence of cash flow returns and profitability indicators is examined systematically.

Research Design
Assessment of Construct Validity
During the process of construct validation, four criteria are evaluated: reliability, convergent validity, discriminant validity, and nomological validity (Schwab, 2005). We employ CFA to examine
construct validity. First, the theory-testing approach of CFA is appropriate for the evaluation of the
two competing models, the three-OP-dimension model and the four-OP-dimension model, that
emerged from our discussion of previous research. Second, this approach enables an examination
of the overall fit of a measurement model to a data set. Third, CFA permits researchers to test the
significance of factor loadings. Fourth, CFA supplies indices that provide insights into reliability,
convergent validity, and discriminant validity (Bagozzi, Yi, & Phillips, 1991; O'Leary-Kelly &
Vokurka, 1998). Table 2 presents the methods and indices that are applied to assess the criteria
of construct validity (see also Bagozzi & Yi, 1988).
Prior to the assessment of the construct validity criteria in a CFA, the overall fit of the measurement model to the data must be established (Anderson & Gerbing, 1988). The assessment of the
overall measurement model fit to the data is based on the chi-square statistic, the Comparative Fit
Index (CFI), the root mean square error of approximation (RMSEA), the standardized root mean
square residual (SRMR), and Akaike's Information Criterion (AIC). The methodological literature criticizes the use of definite cutoff criteria for these goodness-of-fit indices. Goodness-of-fit
indices are sensitive to the misspecification of a model and to sample size, model types, and data
non-normality. Consequently, definite cutoff criteria may yield a high Type I error (i.e., rejecting
acceptable misspecified models) if they are too conservative (Marsh, Hau, & Wen, 2004). We
account for this cutoff criteria ambiguity by differentiating between cutoff criteria for acceptable and
good fits of the measurement model to the data and by reporting more than one goodness-of-fit
index, as recommended by Hu and Bentler (1999).2 We compare the competing models of OP based
on their overall measurement model fit to the data. Hereafter, we employ the best fitting model to
examine the four criteria of construct validity.
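For illustration, the incremental and absolute fit indices behind these cutoffs can be computed directly from chi-square statistics. The sketch below implements the standard CFI (Bentler, 1990) and RMSEA (MacCallum et al., 1996) formulas; the fit statistics plugged in at the end are hypothetical, not our estimation results.

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA: per-degree-of-freedom model discrepancy, adjusted for sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model: float, df_model: int, chi2_indep: float, df_indep: int) -> float:
    """CFI: relative improvement of the model over the independence (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_indep = max(chi2_indep - df_indep, d_model)
    return 1.0 - d_model / d_indep if d_indep > 0 else 1.0

# Hypothetical chi-square statistics for a measurement model and its independence model
print(round(rmsea(chi2=350.0, df=146, n=5000), 3))  # 0.017 -> good fit (< .05)
print(round(cfi(350.0, 146, 12000.0, 171), 3))      # 0.983 -> good fit (> .95)
```

In practice these indices are reported by SEM software; the formulas are shown only to make the cutoff logic concrete.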
Reliability is defined as the ratio of systematic variance to total variance (i.e., the degree to which
an indicator is free of random error). Reliability is a necessary prerequisite for validity (Schwab,
2005). Convergent validity is defined as the extent to which multiple indicators represent a common
construct. A number of indicators of the same construct should exhibit high levels of covariance to
be considered valid measures of the construct in question (Bagozzi et al., 1991). In contrast, discriminant validity is defined as the degree of divergence among indicators that are designed to measure
different constructs (Edwards, 2003). The methods that we apply to assess these criteria of construct
validity are presented in Table 2.
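These reliability and validity statistics follow mechanically from standardized CFA estimates. The following sketch implements the standard formulas (composite reliability, average variance extracted, and the Fornell-Larcker comparison); the loadings and factor correlations used are hypothetical.

```python
def item_reliability(loading: float) -> float:
    """R2 of the indicator-on-factor equation: the squared standardized loading."""
    return loading ** 2

def construct_reliability(loadings: list[float]) -> float:
    """Composite reliability: squared summed loadings over total variance."""
    systematic = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)  # residual variance per indicator
    return systematic / (systematic + error)

def average_variance_extracted(loadings: list[float]) -> float:
    """Mean squared standardized loading (Fornell & Larcker, 1981)."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def fornell_larcker_holds(ave: float, factor_correlations: list[float]) -> bool:
    """Discriminant validity: AVE must exceed every squared inter-factor correlation."""
    return all(ave > r ** 2 for r in factor_correlations)

# Hypothetical standardized loadings for four indicators of one OP dimension
lam = [0.85, 0.80, 0.75, 0.70]
cr = construct_reliability(lam)        # ~0.86 -> acceptable (> .6)
ave = average_variance_extracted(lam)  # ~0.60 -> acceptable (> .5)
print(fornell_larcker_holds(ave, [0.55, 0.40, 0.30]))  # True
```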
Nomological validity is based on evidence pertaining to the relationships between measures of
the construct under investigation and measures of other constructs. This evidence should be consistent with relevant theory or with the results of previous empirical studies (Schwab, 2005). Consequently, we test the relationships between the dimensions of OP and the determinants and
consequences of OP.

Table 2. Statistics and Methods That Are Applied to Assess Construct Validity.

Overall fit of the measurement model to the data:
- Chi-square statistic of the likelihood ratio test: The H0 hypothesis of the likelihood ratio test is the exact fit of a specified model to a population (MacCallum, Browne, & Sugawara, 1996, p. 132). Acceptance of H0: p value > .05.
- Comparative Fit Index (CFI): The CFI describes the relative improvement in the fit of the model in comparison with the fit of the independence model. Thus, this index overcomes sample size effects (Bentler, 1990, pp. 245-246). Acceptable fit: CFI > .90; good fit: CFI > .95.
- Root mean square error of approximation (RMSEA): The RMSEA measures the discrepancy between the covariance matrix estimated from the model and the observed matrix. This criterion adjusts for the model degrees of freedom (MacCallum et al., 1996, p. 134). Acceptable fit: RMSEA < .08; good fit: RMSEA < .05.
- Standardized root mean square residual (SRMR): The SRMR measures the mean overall difference between observed and predicted correlations (Hu & Bentler, 1999, p. 1). Acceptable fit: SRMR < .08; good fit: SRMR < .05.
- Akaike's information criterion (AIC): The AIC is a predictive fit index that measures model fit based on the model's capacity to be replicated in future samples. This criterion considers the model degrees of freedom and allows for a comparison of different non-nested measurement models; lower values indicate a better fit (Akaike, 1974, p. 716).

Reliability:
- Item reliability: Item reliability is examined using the R2 value that is associated with each indicator-to-factor equation. This criterion measures the strength of the linear relationship between an indicator and its latent factor (Bagozzi & Baumgartner, 1994, p. 402). Acceptable item reliability: R2 > .4.
- Construct reliability: Construct reliability represents the proportion of systematic variance in a set of indicators (Bagozzi & Baumgartner, 1994, p. 403; Edwards, 2003, pp. 344-345). Acceptable construct reliability: > .6.
- Average variance extracted: Average variance extracted measures the amount of variance in a set of indicators that is accounted for by the latent factor in the model (Fornell & Larcker, 1981, pp. 45-46). Acceptable average variance extracted: > .5.

Convergent validity:
- Standardized factor loadings: Factor loadings with the theoretically predicted sign, an estimate above .5 (acceptable convergence) or above .7 (good convergence), and statistical significance constitute evidence of convergence (Carlson & Herdman, 2010, p. 1).

Discriminant validity:
- Fornell-Larcker criterion: The average variance extracted for a factor is compared with all squared correlations of this factor with other factors in the overall measurement model. If the average variance extracted is greater than the squared correlations in all cases, this result is a strong indicator of discriminant validity (Fornell & Larcker, 1981, p. 46).
- Chi-square difference test: The model fits of two measurement models are compared for each possible combination of pairs of factors. In the first model, the correlation between the two factors is constrained to 1.0, whereas this correlation parameter is freely estimated in the second model. Finally, a chi-square difference test between the chi-square values of these two models is performed. A statistically significant difference indicates adequate discriminant validity (Anderson & Gerbing, 1988, p. 416).

Nomological validitya:
- Antecedents and consequences: We model research and development intensity, capital investment intensity, market concentration, and market share as antecedents and survival as a consequence of organizational performance in a structural equation model.

a Capon, Farley, and Hoenig (1990); Lee (2009); and Bercovitz and Mitchell (2007, p. 72) provide empirical evidence for our analysis of nomological validity.

Capon, Farley, and Hoenig (1990) conducted a meta-analysis of the
determinants of financial performance. Their findings are based on 320 primary studies and provide
the most robust results pertaining to the determinants of OP. Four of the constructs that Capon et al.
examine are applicable in our analysis to establish nomological validity: research and development
(R&D) intensity (measured as R&D expenses divided by sales), capital investment intensity (measured as capital investment divided by sales), market concentration (measured as the Herfindahl
index), and market share (measured as an organization's sales divided by the total sales in its industry). Lee (2009) conducted a study on the determinants of firm performance (measured as the return on assets) in a sample of 7,158 listed U.S. organizations and replicated the findings of Capon et al. regarding these four constructs. Furthermore, OP has been found to strongly influence organizations' long-term business survival (Bercovitz & Mitchell, 2007). Thus, we include survival as a consequent construct in our analysis to establish nomological validity.3
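These nomological measures are simple ratios of accounting and industry aggregates. A minimal sketch under the definitions above, with hypothetical sales figures:

```python
def rd_intensity(rd_expenses: float, sales: float) -> float:
    """R&D expenses divided by sales."""
    return rd_expenses / sales

def capital_investment_intensity(capex: float, sales: float) -> float:
    """Capital investment divided by sales."""
    return capex / sales

def market_share(firm_sales: float, industry_sales: float) -> float:
    """An organization's sales divided by the total sales in its industry."""
    return firm_sales / industry_sales

def herfindahl_index(industry_firm_sales: list[float]) -> float:
    """Market concentration: sum of squared market shares across all firms."""
    total = sum(industry_firm_sales)
    return sum((s / total) ** 2 for s in industry_firm_sales)

# Hypothetical four-firm industry (sales in $ millions)
sales = [400.0, 300.0, 200.0, 100.0]
print(round(herfindahl_index(sales), 2))  # 0.3
print(market_share(400.0, sum(sales)))    # 0.4
```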

Identification of Indicators
We apply a systematic approach to derive a set of OP indicators, which are presented in Table 3.
First, we identify a number of OP indicators that have been used in prior empirical studies, as
reviewed by Combs et al. (2005) and Richard et al. (2009). These indicators are typically ratios that
measure accounting returns. Accounting returns ratios consist of a numerator that expresses the
accounting outcome and a denominator that indicates an organizations size. Second, we insert these
previously applied indicators in a matrix by presenting the accounting outcomes in rows and the size
measures in columns. Third, we add indicators to complete the matrix by systematically considering
all possible combinations of accounting outcomes and size measures. Previous factor analyses of OP
indicators do not systematically cover all of these combinations. In this matrix, we derive growth
indicators from all size measures and indicators of accounting outcome change. In addition, we
employ all stock market performance indicators, as identified by Combs et al. (2005). Our set of
indicators includes six hybrid indicators (see footnote a in Table 3). Hybrid indicators address overlapping areas of two or more OP dimensions.
As shown in Table 3, our set of indicators has three important characteristics. First, the models
that are derived from our set of indicators satisfy the two-indicator rule of model identification for
CFA (Kline, 2011).
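The construction of the indicator set can be mimicked programmatically: crossing the accounting outcomes with the size measures yields the return ratios, the growth indicators come from the size measures and the outcome changes, and the stock market indicators are added separately. The sketch below uses shorthand labels and is an illustration of the combinatorial logic, not our data pipeline.

```python
from itertools import product

# Rows: accounting outcomes; columns: size measures
outcomes = ["cash flow", "net income"]
size_measures = ["employees", "sales", "assets", "market value"]

# Eight accounting returns ratios: every outcome/size combination
return_ratios = [f"{o} / {s}" for o, s in product(outcomes, size_measures)]

# Growth indicators: change in each size measure and in each accounting outcome;
# market value growth is dropped because it nearly duplicates TSR
growth = [f"{s} growth" for s in size_measures if s != "market value"]
growth += [f"{o} growth" for o in outcomes]

stock_market = ["total shareholder return", "Sharpe ratio", "Jensen's alpha",
                "Treynor index", "Tobin's Q", "market-to-book ratio"]

indicators = return_ratios + growth + stock_market
print(len(indicators))  # 19
```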
Second, we directly examine the consequences of the inclusion of hybrid indicators. Hybrid indicators may represent a parsimonious manner of measuring two OP dimensions with a single indicator, but they lead to the interpretational confounding of these dimensions (Burt, 1976).
Consequently, Combs et al. (2005) recommend the avoidance of hybrid indicators but do not test
this avoidance empirically. For each of the two competing OP frameworks (i.e., the three-OP-dimension model and the four-OP-dimension model), we define two submodels. The first submodel
includes all 19 OP indicators, whereas the second submodel excludes the six hybrid indicators. Thus,
we perform CFAs for four competing models. We expect a lower fit to the data for the models with
hybrid indicators because hybrid indicators are subject to confounded measurement (Burt, 1976).
Third, we are confident that our set of indicators is content-valid regarding the dimensions of OP.
Content validity is a prerequisite for construct validity and refers to the extent to which an indicator
represents the conceptual domain of a construct (Edwards, 2003). Content validity is established by
(a) defining the construct of interest and (b) selecting indicators that are theoretically and logically
connected to the construct. We developed definitions of the OP dimensions in the previous section,
and we discuss the relations between these OP dimensions and our set of indicators in the following
section.
Table 3. Set of Indicators That Are Used for Construct Validation.

Accounting returns (accounting outcome by size measure):
- Cash flow: cash flow return per employee; cash flow return on sales; cash flow return on assets; cash flow return on market valuea
- Net income: return per employee; return on sales; return on assets; return on market valuea

Growth:
- Size measures: employment growth; sales growth; assets growth; market value growthb
- Change in accounting outcomes: cash flow growtha; income growtha

Stock market performance: total shareholder return, Sharpe ratio, Jensen's alpha, Treynor index, Tobin's Qa, and market-to-book ratioa

Note: Part of the indicators have been used in empirical studies that measure organizational performance (OP), as reviewed by Combs, Crook, and Shook (2005) and Richard, Devinney, Yip, and Johnson (2009); the remaining indicators have been developed to complete the matrix analytically.
a We exclude hybrid indicators from two of the four competing models because these indicators may conflate two of the OP dimensions. Thus, we are able to directly investigate the effect of hybrid indicators in the proposed models.
b We exclude market value growth from our analysis because total shareholder return (TSR) is defined as the sum of the change in share price plus any dividends paid out in relation to the previous year's share price. Consequently, TSR and market value growth are highly correlated (r = .964; p < .001).

Accounting returns are measured using eight indicators that are related to different aspects of this dimension. Employees, assets, and the market value of equity constitute conceptually different production factors. Sales provide a basis on which to examine economic product market success. Cash
flow represents liquidity generated from operating activities, and net profit represents the profitability of these activities.4 Consequently, we are able to examine whether the four cash flow return indicators and the four net income return indicators belong to a single accounting returns dimension or to
two distinct dimensions: a liquidity dimension and a profitability dimension. Absolute accounting
return measures are excluded from the CFA because they reflect size. Consequently, absolute
accounting return measures loaded together with absolute size measures on a single factor in an
exploratory factor analysis conducted by Tosi et al. (2000). The two indicators that are based on the
market value of equity also address the stock market performance dimension and are thus hybrid
indicators.
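As a minimal sketch (not the authors' code), the eight accounting-return indicators can be formed by relating cash flow and net income to the four bases described above. All variable names are illustrative, not Worldscope field names.

```python
def return_indicators(cash_flow, net_income, employees, sales, assets, market_value):
    """Relate cash flow and net income to four conceptually different bases."""
    bases = {"employees": employees, "sales": sales,
             "assets": assets, "market_value": market_value}
    out = {}
    for name, base in bases.items():
        out[f"cash_flow_return_on_{name}"] = cash_flow / base
        out[f"return_on_{name}"] = net_income / base
    return out

# Toy figures: cash flow 120, net income 80, sales 1,600, assets 2,000
# (monetary units), 1,000 employees, market value of equity 1,000.
ind = return_indicators(120.0, 80.0, 1_000, 1_600.0, 2_000.0, 1_000.0)
# e.g., cash flow return on assets = 120 / 2,000 = 0.06
```

The two indicators with market value in the denominator are the hybrid ones discussed in the text.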
The growth dimension is measured by five indicators.5 Three of these indicators denote major
size concepts, namely, sales, employment, and assets (Weinzimmer et al., 1998), and two indicators
represent changes in accounting outcomes (cash flow growth and net income growth). The last two
indicators are hybrids because they address the growth dimension and either the accounting return
dimension or the liquidity and profitability dimensions.
Stock market performance is measured by six indicators that are divided into two groups. The first group of four indicators represents change in the market perceptions of an organization's value during a specific time period. Three of these indicators (Jensen's alpha, the Sharpe ratio, and the Treynor index) relate the stock market return of a share to its risk (Combs et al., 2005). TSR represents a shareholder's gain over a given time period. TSR is the change in share price plus dividends relative to the previous year's share price. The second group of two indicators represents ratios of market value to book value. Tobin's Q is calculated as the ratio of the market value of an organization's equity plus the book value of its liabilities to the book value of its total assets. Market-to-book ratio represents the market value of equity in relation to the book value of equity (Richard et al., 2009). Rappaport (1993) criticized the market-to-book ratio, stating that it is an unreliable proxy for stock market performance because it can be influenced by distortions of book value caused by discretionary accounting choices. Because these two indicators capture both the accounting perspective and the stock market dimension, we classify them as hybrid indicators in the sense of Combs et al. (2005).
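The three definitions given verbatim above can be written out directly; the following sketch uses invented toy inputs.

```python
def total_shareholder_return(price_now, price_prev, dividends):
    # Change in share price plus dividends, relative to the previous
    # year's share price.
    return (price_now - price_prev + dividends) / price_prev

def tobins_q(mv_equity, bv_liabilities, bv_total_assets):
    # Market value of equity plus book value of liabilities, over the
    # book value of total assets.
    return (mv_equity + bv_liabilities) / bv_total_assets

def market_to_book(mv_equity, bv_equity):
    # Market value of equity relative to book value of equity.
    return mv_equity / bv_equity

tsr = total_shareholder_return(55.0, 50.0, 1.0)  # (55 - 50 + 1) / 50 = 0.12
q = tobins_q(800.0, 400.0, 1_000.0)              # (800 + 400) / 1,000 = 1.2
mtb = market_to_book(800.0, 600.0)
```

Rappaport's point can be seen in `market_to_book`: any accounting choice that shifts the book value of equity moves the ratio even when the market's assessment is unchanged.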

Sample Description
Our sample consists of 37,262 firm-years for 4,868 U.S. organizations from three industries over a
period of 21 years beginning in 1990. We selected organizations from three dissimilar industries
based on the Industry Classification Benchmark: industrial, consumer services, and technology.
These three industries offer a sufficient sample size for both our primary and robustness analyses.
Furthermore, these three industries differ in their customer structure, business models, and production technology. We analyzed all U.S. organizations that belong to the three industries and are listed
on the capital market within the specified time period. Following the recommendations of Dess and
Robinson (1984), we used secondary, objective data. We obtained capital market data from the
Thomson Reuters Datastream database and accounting data from the Thomson Reuters Worldscope
database. We deleted all firm-years with missing values from our sample. For each indicator, we winsorized the topmost and bottommost percentiles (by year and industry). We standardized all indicators by year and industry to a mean of zero and a standard deviation of one. Because of the longitudinal nature of our sample, the observations are not independent. The univariate skewness
and kurtosis indicate non-normality for the standardized indicators in our sample.6 We employ
Mplus (version 6.1) for our analyses because this software allows us to conduct single-level CFAs
that are robust to the non-normality and non-independence of the observations. To do so,
we cluster firm-years by firm and employ the Mplus COMPLEX analysis and the MLR
estimator.7
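The winsorizing and standardizing steps described above can be sketched as follows (plain NumPy; in the study both steps are applied within each year-industry cell, for which a single vector stands in here).

```python
import numpy as np

def winsorize_and_standardize(x, pct=1.0):
    """Clip x at its bottom and top percentile, then z-standardize."""
    lo, hi = np.percentile(x, [pct, 100.0 - pct])
    w = np.clip(x, lo, hi)               # winsorize extreme observations
    return (w - w.mean()) / w.std()      # mean 0, standard deviation 1

# One simulated year-industry cell of indicator values:
rng = np.random.default_rng(0)
cell = rng.normal(loc=5.0, scale=3.0, size=1_000)
z = winsorize_and_standardize(cell)
```

By construction the transformed cell has mean zero and unit standard deviation, so pooled analyses are not dominated by between-year or between-industry level differences.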


Results
Comparison of the Competing CFA Models
As shown in Table 4, we conducted CFAs for four competing models: (a) a three-factor model with hybrid indicators (3FM-A), (b) a three-factor model without hybrid indicators (3FM-B), (c) a four-factor model with hybrid indicators (4FM-A), and (d) a four-factor model without hybrid indicators
(4FM-B). Thus, we are able to compare the models for three OP dimensions and those for four OP
dimensions, as well as the models for the inclusion of hybrid indicators and those for the exclusion of
hybrid indicators.
First, the comparison of the three-factor models (3FM-A and 3FM-B) and the four-factor models (4FM-A and 4FM-B) provides evidence of the superiority of the four-factor structure. The chi-square differences between the nested models with hybrid indicators (Δχ² = 16,443.36; p < .001) and the nested models without hybrid indicators (Δχ² = 13,368.36; p < .001) are both significant. Directly
comparing the AICs for the pairs of nested models (3FM-A vs. 4FM-A and 3FM-B vs. 4FM-B) indicates an improved fit of the measurement model to the data for the four-factor models. In addition,
neither the 3FM-A nor the 3FM-B demonstrate acceptable fit to the data, based on CFI, RMSEA,
and SRMR.
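The nested-model comparison can be reproduced from the reported fit statistics; a sketch using SciPy (figures taken from the text, not recomputed from the data):

```python
from scipy.stats import chi2

# 3FM-A (chi-square = 35,040, df = 141) vs. 4FM-A (chi-square = 18,597, df = 139)
delta_chi2 = 16_443.36          # reported chi-square difference
delta_df = 141 - 139            # the four-factor model frees two parameters
p_value = chi2.sf(delta_chi2, delta_df)   # survival function = 1 - CDF
# p_value is far below .001, favoring the four-factor structure
```

The same calculation for the models without hybrid indicators (Δχ² = 13,368.36, Δdf = 59 − 56 = 3) is likewise significant at any conventional level.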
Second, the comparison of models with hybrid indicators (3FM-A and 4FM-A) and models without hybrid indicators (3FM-B and 4FM-B) provides strong support for Combs et al.'s (2005) recommendation to avoid hybrid indicators when measuring OP. AIC is lowest for the two models without
hybrid indicators (3FM-B and 4FM-B). On the one hand, including hybrid indicators in measurement models increases the number of non-zero covariances in the sample covariance matrix. On the
other hand, in the model-implied covariance matrix, only some of these covariances (but not all) are
freed to be estimated by modeling cross-loadings of hybrid indicators toward different factors (Burt,
1976). This issue increases the level of misspecification in such models, which is made evident by
the goodness-of-fit indices. In addition, most factor loadings of the six hybrid indicators in models 3FM-A and 4FM-A are below .5. This result indicates low convergence of hybrid indicators with indicators directly measuring OP dimensions. Tobin's Q (λ3FM-A = λ4FM-A = .361) and market-to-book ratio (λ3FM-A = λ4FM-A = .249) also exhibit low convergence with the other stock market performance measures. This last finding underscores Rappaport's (1993) criticism of OP indicators that inappropriately couple an accounting criterion with a stock market criterion.
Third, all models display significant chi-square statistics. Thus, all models exhibit some degree of
misspecification. This misspecification arises partially from fixing the majority of parameters (e.g.,
cross-loadings of indicators to factors other than their primary factor) in all models to a value of zero
to develop a parsimonious measurement model of OP. However, correlations between the OP indicators, the sample covariance matrix of these indicators, and the modification indices indicate that
small cross-loadings (i.e., λ < .2) exist for all OP indicators. Thus, we examine whether the goodness-of-fit indices imply that the level of misspecification in the model is acceptable. In this respect, only 4FM-B exhibits a good fit to the data. All of the fit statistics meet the cutoff criteria for good fit (CFI = .950, RMSEA = .042, SRMR = .043). In addition, among the four models, the AIC is lowest for this model (AIC = 1,012,209). These findings suggest the superiority of 4FM-B in
comparison with the other models in terms of both fit and parsimony. Thus, we employ 4FM-B to
examine reliability, convergent validity, discriminant validity, and nomological validity.

Construct Validity of Model 4FM-B


As shown in Table 5, all factors in 4FM-B demonstrate evidence of reliability with regard to all three
criteria. The item reliabilities of all indicators are above .4, whereas the sales growth indicator has
the lowest value for item reliability (.440). The values for construct reliability and average variance extracted are above the thresholds for all factors, with the lowest values, .761 and .517, respectively, determined for the growth dimension.

Table 4. Comparison of Single-Level Confirmatory Factor Analysis Models.

Standardized factor loadings
Indicators                            3FM-A        3FM-B        4FM-A        4FM-B
Accounting (ACC)
  Cash flow return per employee       .536***      .552***
  Cash flow return on sales           .611***      .639***
  Cash flow return on assets          .652***      .610***
  Cash flow return on market value    .349***
  Return per employee                 .845***      .882***
  Return on sales                     .805***      .845***
  Return on assets                    .919***      .865***
  Return on market value              .624***
  Cash flow growth                    .113***
  Income growth                       .297***
Liquidity (LIQ)
  Cash flow return per employee                                 .766***      .805***
  Cash flow return on sales                                     .756***      .832***
  Cash flow return on assets                                    .899***      .832***
  Cash flow return on market value                              .637***
  Cash flow growth                                              .248***
Profitability (PRO)
  Return per employee                                           .844***      .889***
  Return on sales                                               .771***      .829***
  Return on assets                                              .919***      .865***
  Return on market value                                        .637***
  Income growth                                                 .300***
Growth (GRO)
  Employment growth                   .683***      .688***      .683***      .687***
  Sales growth                        .663***      .662***      .664***      .663***
  Assets growth                       .801***      .799***      .801***      .799***
  Cash flow growth                    .034***                   .032***
  Income growth                       .063***                   .066***
Stock market performance (SMA)
  Total shareholder return            .980***      .980***      .980***      .979***
  Sharpe ratio                        .849***      .850***      .850***      .851***
  Jensen's alpha                      .833***      .833***      .833***      .834***
  Treynor ratio                       .881***      .882***      .881***      .882***
  Tobin's Q                           .361***                   .361***
  Market-to-book ratio                .249***                   .249***
  Cash flow return on market value    .047***                   .015*
  Return on market value              .102***                   .107***

Correlations between factors
  ACC with GRO                        .238***      .217***
  ACC with SMA                        .177***      .162***
  GRO with SMA                        .227***      .219***      .227***      .219***
  LIQ with PRO                                                  .722***      .750***
  LIQ with GRO                                                  .093***      .099***
  LIQ with SMA                                                  .149***      .129***
  PRO with GRO                                                  .236***      .216***
  PRO with SMA                                                  .182***      .163***

Fit statistics
  Chi-square (df)                     35,040*** (141)  17,047*** (59)  18,597*** (139)  3,678*** (56)
  CFI (> .95)                         .685         .768         .834         .950
  RMSEA (< .05)                       .082         .088         .060         .042
  SRMR (< .05)                        .089         .070         .071         .043
  AIC                                 1,656,857    1,056,041    1,601,775    1,012,209

Note: n = 37,262 firm-years. The four models test the factor structure (three factors, 3FM, versus four factors, 4FM) and the inclusion of hybrid indicators (A) versus the exclusion of these indicators (B). Residuals of indicators that share the same denominator (e.g., sales or assets) are allowed to correlate in all four models for arithmetic reasons. Hybrid indicators are presented in italics. Fit statistics that indicate good model fit to the data are presented in boldface. Cutoff criteria of good model fit to the data are presented in parentheses for each goodness-of-fit index. CFI = Comparative Fit Index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; AIC = Akaike's Information Criterion.
*p < .05. ***p < .001.

Table 5. Evaluation of Construct Validity Based on the 4FM-B.

                                          Liquidity        Profitability     Growth           Stock Market Performance
Item reliability (> .40)(a)
  Cash flow return per employee           .648***
  Cash flow return on sales               .692***
  Cash flow return on assets              .692***
  Return per employee                                      .791***
  Return on sales                                          .687***
  Return on assets                                         .748***
  Employment growth                                                          .472***
  Sales growth                                                               .440***
  Assets growth                                                              .638***
  Total shareholder return                                                                    .960***
  Sharpe ratio                                                                                .723***
  Jensen's alpha                                                                              .695***
  Treynor ratio                                                                               .777***
Reliability of constructs
  Construct reliability (> .60)(a)        .863             .896              .761             .937
  Average variance extracted (> .50)(a)   .677             .742              .517             .789
Discriminant validity: Fornell-Larcker criterion(b)
  Liquidity                               .677
  Profitability                           .563             .742
  Growth                                  .010             .047              .517
  Stock market performance                .017             .027              .048             .789
Discriminant validity: Chi-square difference test(c)
  Liquidity                                                3,241.46*** (3)   8,706.33*** (3)  9,386.71*** (3)
  Profitability                                                              6,953.09*** (3)  11,473.98*** (3)
  Growth                                                                                      14,101.30*** (3)
Nomological validity: Antecedent constructs(d)
  Research and development intensity      .227***          .271***           .043***          .012*
  Capital investment intensity            .097***          .112***           .051***          .006
  Market concentration                    .022**           .018*             .034***          .012**
  Market share                            .053***          .054***           .014***          .013***
Nomological validity: Consequent constructs
  Survival (+)                            .022**           .040***           .007             .016***

Note: n = 37,262 firm-years.
(a) The thresholds for item reliability, construct reliability, and average variance extracted are given in parentheses.
(b) The Fornell-Larcker criterion of discriminant validity is satisfied if the average variance extracted for a factor is greater than its squared correlations with all other factors. The average variance extracted is presented on the diagonal.
(c) The chi-square difference test is performed between two two-factor models. In the first model, the correlation between the two factors is constrained to 1.0. In the second model, this correlation is freely estimated. A significant chi-square difference indicates discriminant validity. Differences in degrees of freedom are given in parentheses.
(d) Nomological validity is tested in the 4FM-B. In this model, all measures of the antecedent constructs are regressed on each single performance dimension, and each single performance dimension is regressed on survival. Comparative Fit Index (.934), root mean square error of approximation (.036), and standardized root mean square residual (.042) indicate the fit of this model. The expected signs are given in parentheses. The regression coefficients are presented in boldface if they are statistically significant and display the expected sign.
*p < .05. **p < .01. ***p < .001.
As Table 4 shows, 4FM-B provides evidence of convergent validity for all factors. First, the factor loadings of all indicators exhibit acceptable convergence (i.e., λ > .5). Second, if we consider the stronger criterion of good convergence (i.e., λ > .7), all indicators of the liquidity, profitability, and stock market performance factors are above this threshold. The convergence of the three growth indicators is slightly weaker. The factor-loading estimates of the employment growth indicator (λ = .687) and the sales growth indicator (λ = .663) are both statistically significant but slightly below the threshold for good convergence.
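The construct reliability and average variance extracted reported in Table 5 can be reproduced from these standardized loadings; a minimal sketch in plain Python, using the three growth loadings just cited:

```python
def construct_reliability(loadings):
    """Composite reliability from standardized factor loadings."""
    s = sum(loadings)
    residual = sum(1.0 - l ** 2 for l in loadings)  # indicator error variances
    return s ** 2 / (s ** 2 + residual)

def average_variance_extracted(loadings):
    """Mean squared standardized loading (item reliability) of a factor."""
    return sum(l ** 2 for l in loadings) / len(loadings)

growth_loadings = [0.687, 0.663, 0.799]
cr = construct_reliability(growth_loadings)         # about .761
ave = average_variance_extracted(growth_loadings)   # about .517
```

Both values match the lowest entries reported for the growth dimension, illustrating why that factor sits closest to the .60 and .50 thresholds.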
As shown in Table 5, all factors in model 4FM-B demonstrate discriminant validity. The Fornell-Larcker criterion holds for all factors. The chi-square difference tests, which compare fixed and freely estimated two-factor models for all pairs of factors, support this conclusion. As Table 4 shows, the liquidity and profitability factors exhibit the highest correlation (r = .750) among the four factors. All other correlation coefficients are considerably lower (i.e., r < .25). These results indicate
four dimensions of the OP construct.
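The Fornell-Larcker check is a one-line comparison: each factor's average variance extracted must exceed its squared correlation with every other factor. A sketch for the critical liquidity-profitability pair, using the values reported above:

```python
def fornell_larcker_holds(ave_a, ave_b, r):
    """True if both factors extract more variance from their own
    indicators than they share with each other."""
    return ave_a > r ** 2 and ave_b > r ** 2

shared_variance = 0.750 ** 2                    # .5625, the .563 in Table 5
ok = fornell_larcker_holds(0.677, 0.742, 0.750)  # True: both AVEs exceed .5625
```

Even for this most strongly correlated pair the criterion is met, which is why the four-dimensional structure survives the discriminant-validity tests.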
As shown in Table 5, our analyses provide evidence of nomological validity with regard to the
four OP dimensions. Ten out of 16 regression coefficients of the antecedents of OP are statistically
significant and display the expected signs, according to Capon et al.'s (1990) meta-analysis of 320
primary studies. The majority of OP indicators that are included in their meta-analysis belong to the
profitability and liquidity dimensions. Accordingly, all regression coefficients between the four
determinants and these two dimensions are statistically significant and in the expected directions.
Three regression coefficients of survival for the OP dimensions are significantly different from zero
and show the expected (i.e., positive) signs (liquidity, profitability, and stock market performance).
However, growth appears to be unrelated to the survival of companies in our sample.

Robustness Analyses
Table 6 presents the results of our robustness analyses. Our inferences regarding the construct validity of 4FM-B are stable across industries and time periods. We repeat our primary analyses of the
construct validity for each industry and each year separately. The overall model fit is acceptable for
all three fit indices in all three industries and in 18 out of 21 years.
The model fit is lowest for years with high environmental instability, as indicated by the volatility and the annual return of the S&P 500 index (e.g., in 2002, after the burst of the dot-com bubble, and in
2008 and 2009, during the financial crisis). In particular, sales growth and employment growth fail
to demonstrate item reliability and convergence for years with high environmental instability. With
regard to the growth dimension, only the asset growth indicator demonstrates acceptable values for
reliability and convergence for all years. The average variance extracted for the growth dimension is
below the threshold for almost every year after 2000 except 2005, indicating a weak construct reliability for the last 11 years. The discriminant validity of the dimensions of OP is evident in all industries. However, for five of the years, the correlation coefficient between the profitability and
liquidity dimensions is high (i.e., r > .8). Consequently, during these years, the two dimensions
do not discriminate as strongly as implied by the primary analysis. Overall, our results generally
remain unchanged across industries and time periods.

Discussion and Conclusions


The results of this study reveal the existence of four independent OP dimensions: liquidity, profitability, growth, and stock market performance. The evidence of the construct validity of the four-dimensional OP measurement scheme is strong and consistent across different time periods and

industries. We demonstrate that the accounting return dimension that Combs et al. (2005) propose should be decomposed into the two distinct dimensions of liquidity and profitability. Organizations differ in their ability to meet financial obligations based on cash flows generated from their current operations and in their efficiency in utilizing production factors to generate earnings. For years with high environmental instability, the overall fit of the measurement model to the data becomes weaker. This result contributes to the findings of Fryxell and Barton (1990), who provided evidence on changes in the structure of the measurement of OP between years with low and high environmental instability.

Table 6. Stability of the Measurement Model 4FM Across Industries and Time.

           Fit Statistics                                                      Reliability                        Validity                 S&P 500
Group      n       CFI   RMSEA  SRMR  AIC      Chi-Square (df)  Item Rel.  Constr. Rel.  AVE   Conv. Val.  Disc. Val.  Volatility  Annual Return
Industries
ICB 2000   16,243  .943  .044   .047  441,800  1,821.91 (56)    13/13      4/4           4/4   13/13       12/12       n/a         n/a
ICB 5000   10,005  .950  .041   .044  282,383    979.99 (56)    13/13      4/4           3/4   13/13       12/12       n/a         n/a
ICB 9000   11,104  .950  .048   .044  278,172  1,486.66 (56)    13/13      4/4           4/4   13/13       12/12       n/a         n/a
Time
1990          513  .950  .050   .062   14,986    131.53 (57)    11/13      4/4           4/4   11/13       12/12        4.9214     -3.10%
1991          970  .921  .071   .060   25,023    333.48 (56)    13/13      4/4           4/4   13/13       12/12        5.6230     30.47%
1992        1,026  .949  .058   .051   27,679    252.88 (57)    13/13      4/4           4/4   13/13       12/12        2.6822      7.62%
1993        1,097  .960  .054   .042   29,627    232.52 (56)    13/13      4/4           4/4   13/13       12/12        2.9739     10.08%
1994        1,163  .944  .063   .027   21,929    310.88 (56)    13/13      4/4           4/4   13/13       12/12        2.0597      0.85%
1995        1,601  .941  .064   .061   43,731    427.74 (56)    12/13      4/4           4/4   13/13       12/12        9.2497     37.05%
1996        1,820  .940  .062   .049   47,870    447.95 (56)    13/13      4/4           4/4   13/13       12/12        6.2587     22.96%
1997        1,976  .949  .055   .058   51,827    384.76 (56)    13/13      4/4           4/4   13/13       12/12        9.2017     33.36%
1998        2,125  .945  .058   .052   55,834    454.01 (56)    13/13      4/4           4/4   13/13       12/12        6.4972     28.58%
1999        2,124  .951  .055   .044   49,137    410.97 (56)    13/13      4/4           4/4   13/13       12/12        4.6309     21.04%
2000        2,419  .907  .063   .073   66,679    585.18 (56)    11/13      4/4           3/4   13/13       12/12        3.8732     -8.81%
2001        2,402  .905  .067   .071   64,264    662.28 (56)    12/13      4/4           3/4   12/13       12/12        7.0106    -11.89%
2002        2,352  .898  .070   .059   65,262    697.31 (56)    11/13      4/4           3/4   13/13       12/12       11.0146    -22.10%
2003        2,243  .915  .056   .037   48,433    448.34 (56)    13/13      4/4           3/4   13/13       10/12        8.6373     28.68%
2004        2,158  .949  .054   .047   42,020    413.94 (56)    12/13      4/4           3/4   13/13       10/12        2.9923     10.88%
2005        2,042  .929  .055   .057   47,612    404.52 (56)    13/13      4/4           4/4   13/13       10/12        2.8477      4.77%
2006        2,001  .955  .047   .056   52,609    306.65 (56)    13/13      4/4           3/4   13/13       12/12        4.3365     15.23%
2007        1,926  .946  .047   .069   52,965    290.61 (56)    10/13      4/4           3/4   13/13       11/12        3.3149      5.49%
2008        1,829  .855  .072   .062   54,705    581.51 (56)    10/13      4/4           3/4   13/13       12/12       15.1583    -37.00%
2009        1,777  .889  .066   .049   37,570    492.22 (56)    11/13      4/4           3/4   13/13       12/12       12.7387     26.46%
2010        1,698  .909  .066   .044   43,064    473.18 (56)    12/13      4/4           3/4   13/13       11/12        5.2141     15.06%

Note: Satisfied criteria are presented in boldface. Reliability and validity are reported as the proportion of indicators or factors that satisfy each criterion. The volatility of the S&P 500 is calculated as the normalized standard deviation based on the daily return index in each year. CFI = Comparative Fit Index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; AIC = Akaike's Information Criterion; ICB = Industry Classification Benchmark.
Our study contributes to research on OP measurement in three ways. First, regarding the conceptual level of the PVF, we clarify the OP construct. Second, regarding the link between the conceptual
and operational levels of the PVF, we establish the construct validity of 13 OP indicators. Third, we
reveal the dynamics of OP measurement over time.
First, construct clarity encompasses definitions, semantic relationships, contextual conditions,
and coherence (Suddaby, 2010). We offer definitions of OP and its dimensions, which capture essential characteristics, avoid circularity, and are parsimonious. Additionally, we integrate OP and its
dimensions into a hierarchy of related performance constructs, with organizational effectiveness
at the top of this hierarchy. Furthermore, we provide evidence that the nature of the OP construct
is highly sensitive to environmental instability as an important contextual condition. Researchers
who study OP should be aware of the four different (not interchangeable) dimensions and their contextual conditions.
Second, we develop a set of OP indicators at the operational level of the PVF. In addition, we test
the construct validity of these OP indicators based on the four OP dimensions. Thus, we contribute to
the establishment of a link between the conceptual and operational levels of an important construct
in strategic management research. We empirically confirm that hybrid indicators should be avoided
when measuring OP and its dimensions, as recommended by Combs et al. (2005). Additionally, we
propose a measurement scheme of 13 OP indicators that measure all four dimensions in a construct-valid and parsimonious manner.
Third, environmental instability influences the measurement structure of OP. Researchers should
carefully control for this factor if they address OP in their research design. We recommend that
researchers avoid a nonlongitudinal measurement of OP for years characterized by high environmental instability.
Our findings have important implications for future strategic management research. First, strategic management theories that address variations in OP must consider the four OP dimensions as an
entity or only concentrate on selected dimensions. This implication is further underscored by our
evaluation of antecedents of OP. Profitability and liquidity are influenced by all four tested antecedents, whereas our results regarding the growth and stock market performance dimensions present a
different picture of these relationships. This issue must be addressed at the conceptual level, and it
offers fruitful avenues for future theoretical research on the determinants of OP. Second, empirical
researchers addressing OP in their work are encouraged to use the four-dimensional OP measurement scheme, for which construct validity has been established. Our study contributes to the reduction of the plethora of OP indicators that may be employed in future empirical studies and thus may
facilitate an increase in rigor and relevance in strategic management research.
Our study has three limitations. First, we concentrate on listed organizations because the stock
market performance dimension is applicable to only these organizations. Thus, the question of
whether the construct validity of the other three OP dimensions also holds in organizations that are
not active in the capital market remains unanswered. Second, we employ secondary, objective OP
data according to the recommendation of Dess and Robinson (1984). The question of whether the
four OP dimensions are also applicable to other performance data, such as perceptual OP indicators,
remains unanswered. Third, OP is only one of several important performance constructs (e.g.,


operational performance or corporate environmental performance). These other performance constructs are also subject to measurement issues. Consequently, developing and testing construct-valid measurement schemes for these constructs offers possibilities for future research.8 Such
research may draw on the methodology of this study and employ the four-dimensional OP measurement scheme to test nomological validity.
"Valid measurement is the sine qua non of science. If the measures used in a discipline have not been demonstrated to have a high degree of validity, that discipline is not a science" (Peter, 1979, p. 6). In this respect, our study contributes to the valid measurement of the most important construct in
strategic management research.
Acknowledgments
We would like to thank Mark Orlitzky, Christoph Trumpp, and two anonymous reviewers for their insightful
comments on previous versions of this manuscript. All remaining mistakes are our own.

Declaration of Conflicting Interests


The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes
1. We thank an anonymous reviewer for highlighting the importance of construct clarity.
2. We thank an anonymous reviewer for drawing our attention to this relevant issue.
3. Survival is usually measured by a categorical variable that represents an organization's enduring presence in the market (Richard, Devinney, Yip, & Johnson, 2009, pp. 732-734). We measure survival as the proportion of years during which an organization is present in the stock market. We calculate the proportion during each year of our time period based on the equation s = (te − y)/(2010 − y), with s = survival, te = the year before a company is delisted or the last year within our time period (i.e., 2010), and y = year under consideration. This operationalization is coarse-grained, but it corresponds with Baker and Kennedy's (2002, p. 326) study.
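The survival equation in this note can be sketched directly (plain Python; the firm and the years are invented for illustration):

```python
def survival(te, y, last_year=2010):
    """Proportion-of-years survival measure: s = (te - y) / (last_year - y)."""
    return (te - y) / (last_year - y)

# A firm whose last listed year is 2005, evaluated in year y = 2000:
s = survival(te=2005, y=2000)  # 5 / 10 = 0.5
```

A firm that remains listed through 2010 receives s = 1 in every year, which is the upper bound of the measure.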
4. We acknowledge that there are alternative measures for assets (e.g., capital employed) and for net profit (e.g.,
EBIT). Thus, ratios that use net profit as the numerator and assets as the denominator are exemplary for an
entire set of potential accounting return indicators. In online supplement 3, we extend our findings to these
indicators.
5. We calculate growth rates in all instances based on the formula [(tn − tn−1)/tn−1] (Weinzimmer, Nystrom, & Freeman, 1998, p. 253).
6. We present an extended sample description including descriptive statistics (e.g., correlations, skewness, kurtosis, intraclass correlation coefficients, and design factors) in online supplement 1.
7. Multilevel confirmatory factor analysis (CFA) is another method that accounts for the dependence of our
data. We examine the robustness of our analysis with regard to this methodological decision by conducting
a two-level CFA. Results of this two-level CFA are similar to our primary results and are presented in online
supplement 2. We thank an anonymous reviewer for drawing our attention to this issue.
8. We thank an anonymous reviewer for highlighting this limitation.

References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control,
19(6), 716-723. doi:10.1109/TAC.1974.1100705
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411-423. doi:10.1037/0033-2909.103.3.411


Bagozzi, R. P., & Baumgartner, H. (1994). The evaluation of structural equation models and hypothesis testing.
In R. P. Bagozzi (Ed.), Principles of marketing research (pp. 386-422). Cambridge, MA: Blackwell.
Bagozzi, R. P., & Yi, Y. (1988). On the evaluation of structural equation models. Journal of the Academy of
Marketing Science, 16(1), 74-94. doi:10.1177/009207038801600107
Bagozzi, R. P., Yi, Y., & Phillips, L. W. (1991). Assessing construct validity in organizational research.
Administrative Science Quarterly, 36(3), 421-458.
Baker, G. P., & Kennedy, R. E. (2002). Survivorship and the economic grim reaper. Journal of Law, Economics,
and Organization, 18(2), 324-361. doi:10.1093/jleo/18.2.324
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238-246.
doi:10.1037/0033-2909.107.2.238
Bercovitz, J., & Mitchell, W. (2007). When is more better? The impact of business scale and scope on long-term business survival, while controlling for profitability. Strategic Management Journal, 28(1), 61-79. doi:10.1002/smj.568
Bisbe, J., Batista-Foguet, J.-M., & Chenhall, R. (2007). Defining management accounting constructs: A methodological note on the risks of conceptual misspecification. Accounting, Organizations and Society, 32(7-8),
789-820. doi:10.1016/j.aos.2006.09.010
Boyd, B. K., Gove, S., & Hitt, M. A. (2005). Construct measurement in strategic management research: Illusion
or reality. Strategic Management Journal, 26(3), 239-257. doi:10.1002/smj.444
Burt, R. S. (1976). Interpretational confounding of unobserved variables in structural equation models.
Sociological Methods & Research, 5(1), 3-52. doi:10.1177/004912417600500101
Capon, N., Farley, J. U., & Hoenig, S. (1990). Determinants of financial performance: A meta-analysis.
Management Science, 36(10), 1143-1159.
Carlson, K. D., & Herdman, A. O. (2010). Understanding the impact of convergent validity on research results.
Organizational Research Methods, 15(2), 17-32. doi:10.1177/1094428110392383
Combs, J. G., Crook, T. R., & Shook, C. L. (2005). The dimensionality of organizational performance and its
implications for strategic management research. In D. J. Ketchen (Ed.), Research methodology in strategy
and management (Vol. 2, pp. 259-286). Amsterdam: Elsevier.
Dechow, P. M. (1994). Accounting earnings and cash flows as measures of firm performance: The role of accounting accruals. Journal of Accounting and Economics, 18(1), 3-42. doi:10.1016/0165-4101(94)90016-7
Dess, G. G., & Robinson, R. B., Jr. (1984). Measuring organizational performance in the absence of objective
measures: The case of the privately-held firm and conglomerate business unit. Strategic Management
Journal, 5(3), 265-273. doi:10.1002/smj.4250050306
Devinney, T. M., Yip, G. S., & Johnson, G. (2010). Using frontier analysis to evaluate company performance.
British Journal of Management, 21(4), 921-938. doi:10.1111/j.1467-8551.2009.00650.x
Edwards, J. R. (2003). Construct validation in organizational behavior research. In J. Greenberg (Ed.),
Organizational behavior: The state of the science (2nd ed., pp. 327-371). Mahwah, NJ: Erlbaum.
Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and
measures. Psychological Methods, 5(2), 155-174. doi:10.1037/1082-989x.5.2.155
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor
analysis in psychological research. Psychological Methods, 4(3), 272-299. doi:10.1037/1082-989X.4.3.272
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39-50.
Fryxell, G. E., & Barton, S. L. (1990). Temporal and contextual change in the measurement structure of financial performance: Implications for strategy research. Journal of Management, 16(3), 553-569. doi:10.1177/
014920639001600303
Godfrey, P. C., & Hill, C. W. L. (1995). The problem of unobservables in strategic management research.
Strategic Management Journal, 16(7), 519-533. doi:10.1002/smj.4250160703


Hu, L.-t., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional
criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55. doi:10.1080/
10705519909540118
Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford
Publications.
Lee, J. (2009). Does size matter in firm performance? Evidence from US public firms. International Journal of
the Economics of Business, 16(2), 189-203. doi:10.1080/13571510902917400
Lubatkin, M., & Shrieves, R. E. (1986). Towards reconciliation of market performance measures to strategic
management research. Academy of Management Review, 11(3), 497-512. doi:10.5465/AMR.1986.4306197
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample
size for covariance structure modeling. Psychological Methods, 1(2), 130-149. doi:10.1037/1082-989x.1.2.
130
March, J. G., & Sutton, R. I. (1997). Organizational performance as a dependent variable. Organization
Science, 8(6), 698-706. doi:10.1287/orsc.8.6.698
Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural Equation Modeling, 11(3), 320-341. doi:10.1207/s15328007sem1103_2
Murphy, G. B., Trailer, J. W., & Hill, R. C. (1996). Measuring performance in entrepreneurship research. Journal of Business Research, 36(1), 15-23. doi:10.1016/0148-2963(95)00159-X
O'Leary-Kelly, S. W., & Vokurka, R. J. (1998). The empirical assessment of construct validity. Journal of Operations Management, 16(4), 387-405. doi:10.1016/S0272-6963(98)00020-5
Peter, J. P. (1979). Reliability: A review of psychometric basics and recent marketing practices. Journal of
Marketing Research, 16(1), 6-17.
Rappaport, A. (1993). Corporate performance standards and shareholder value. Journal of Business Strategy,
3(4), 28-38. doi:10.1108/eb038987
Richard, P. J., Devinney, T. M., Yip, G. S., & Johnson, G. (2009). Measuring organizational performance:
Towards methodological best practice. Journal of Management, 35(3), 718-804. doi:10.1177/
0149206308330560
Rowe, W. G., & Morrow, J. L. (1999). A note on the dimensionality of the firm financial performance construct using accounting, market, and subjective measures. Canadian Journal of Administrative Sciences, 16(1), 58-70. doi:10.1111/j.1936-4490.1999.tb00188.x
Schwab, D. P. (2005). Research methods for organizational studies (2nd ed.). Mahwah, NJ: Erlbaum.
Starbuck, W. H. (2004). Methodological challenges posed by measures of performance. Journal of
Management and Governance, 8(4), 337-343. doi:10.1007/s10997-004-4125-z
Strasser, S., Eveland, J. D., Cummins, G., Deniston, O. L., & Romani, J. H. (1981). Conceptualizing the goal and system models of organizational effectiveness: Implications for comparative evaluation research. Journal of Management Studies, 18(3), 321-340. doi:10.1111/j.1467-6486.1981.tb00105.x
Suddaby, R. (2010). Editor's comments: Construct clarity in theories of management and organization. Academy of Management Review, 35(3), 346-357. doi:10.5465/AMR.2010.51141319
Tosi, H. L., Werner, S., Katz, J. P., & Gomez-Mejia, L. R. (2000). How much does performance matter? A meta-analysis of CEO pay studies. Journal of Management, 26(2), 301-339. doi:10.1016/S0149-2063(99)00047-1
Venkatraman, N. (2008). Advancing strategic management insights. Organizational Research Methods, 11(4),
790-794. doi:10.1177/1094428108320417
Venkatraman, N., & Grant, J. H. (1986). Construct measurement in organizational strategy research: A critique
and proposal. Academy of Management Review, 11, 71-87. doi:10.5465/AMR.1986.4282628
Venkatraman, N., & Ramanujam, V. (1986). Measurement of business performance in strategy research: A
comparison of approaches. Academy of Management Review, 11, 801-814. doi:10.5465/AMR.1986.
4283976


Venkatraman, N., & Ramanujam, V. (1987). Measurement of business economic performance: An examination
of method convergence. Journal of Management, 13(1), 109-122. doi:10.1177/014920638701300109
Weinzimmer, L. G., Nystrom, P. C., & Freeman, S. J. (1998). Measuring organizational growth: Issues, consequences, and guidelines. Journal of Management, 24(2), 235-262. doi:10.1177/014920639802400205
Weygandt, J. J., Kimmel, P. D., & Kieso, D. E. (2010). Accounting principles. Hoboken, NJ: Wiley.

Author Biographies
P. Maik Hamann is a research assistant and PhD candidate in Management Control/Strategic Management at the Technische Universität Dresden. He received his bachelor's degree from the University of Paisley in Scotland and his German diploma degree from the Technische Universität Dresden in Germany. He is also a lecturer of management accounting at the Technische Universität Dresden and, part-time, for Dresden International University. His main research interests encompass the design of corporate planning systems, effects of corporate planning at the organizational level, contingency theory, measurement of organizational effectiveness, construct validity, and philosophy of science.
Frank Schiemann is an Assistant Professor of Accounting at the University of Hamburg, Germany. He received his PhD degree from the Technische Universität Dresden, Germany. He has been a lecturer of management accounting at the Technische Universität Dresden and the University of Hamburg and, part-time, for Dresden International University. His research focuses on firm valuation models as well as on the determinants and effects of firms' voluntary and mandatory disclosure decisions via different communication channels. His methodological focus is on quantitative analysis methods, especially panel data models.
Lucia Bellora is a PhD candidate in Management Accounting and Management Control at the Technische Universität Dresden, Germany. She received her German diploma degree from the Technische Universität Dresden. She is also a lecturer of management accounting at the Technische Universität Dresden and, part-time, at the Dresden International University and the International Graduate School Zittau. Her research interests include the design and performance effects of management control systems, the disclosure of extra-financial information, and the validity of construct measurement. Her methodological interest is directed especially towards quantitative analysis methods with a focus on structural equation modeling.
Thomas W. Guenther is a professor of Management Accounting and Control at the Technische Universität Dresden. He received his PhD and habilitation degrees from the University of Augsburg, Germany. He has been a visiting professor several times at the University of Virginia and has been teaching in MBA and executive programs at the Wirtschaftsuniversität Wien, Austria; the European Business School (EBS), Wiesbaden, Germany; and the Mannheim Business School, Germany. His work covers two fields of research: first, the design of management control systems within management accounting and strategic management research, and second, the measurement, valuation, and control of intangibles in financial and management accounting. He is editor-in-chief of the Journal of Management Control and is on the editorial board of the Business Administration Review (BARev). He also serves as a board member of the Schmalenbach Association. His methodological background is in quantitative analysis, especially structural equation modeling, meta-analyses, and panel data models.
