Anderson MJ 2001.permanova

INTRODUCTION
The analysis of multivariate data in ecology is becoming
increasingly important. Ecologists often need to test
hypotheses concerning the effects of experimental
factors on whole assemblages of species at once. This
is important for core ecological research and in studies
of biodiversity or environmental impacts in many
habitats, including marine subtidal environments
(Warwick et al. 1988; Gray et al. 1990; Chapman et al.
1995; Glasby 1997), mangroves (Skilleter 1996;
Kelaher et al. 1998), freshwater systems (Faith et al.
1995; Quinn et al. 1996) and terrestrial systems (Oliver
& Beattie 1996; Anderson & Clements, in press).
Univariate analysis of variance (ANOVA) provides an
extremely powerful and useful tool for statistical tests
of factors and their interactions in experiments
(Underwood 1981, 1997). Partitioning variation, as in
multifactorial ANOVA, is particularly important for test-
ing hypotheses in complex ecological systems with nat-
ural temporal and spatial variability. This partitioning
is also needed to test multivariate hypotheses in ecol-
ogy for experimental designs involving several factors.
This paper describes a new non-parametric test of
the general multivariate hypothesis of differences in the
composition and/or relative abundances of organisms
of different species (variables) in samples from different
groups or treatments. This test is a significant
advance on previous methods because it can be based
on any measure of dissimilarity and can partition
variation directly among individual terms in a multi-
factorial ANOVA model. The test is applicable to any
situation where the simultaneous responses of many
potentially non-independent variables (usually abun-
dances of species in an assemblage) have been meas-
ured in samples from a one-factor or multifactorial
ANOVA design.
Powerful multivariate statistical methods, such as the
traditional multivariate analysis of variance (MANOVA),
have existed for decades (Hotelling 1931; Wilks 1932;
Fisher 1936; Bartlett 1939; Lawley 1939; Pillai 1955),
but tests using these statistics rely on assumptions that
are not generally met by ecological data. The assump-
tion that the data conform to a multivariate normal dis-
tribution is particularly unrealistic for most ecological
data sets. This is because the distributions of abun-
dances of individual species are usually highly aggre-
gated or skewed (e.g. Gaston & McArdle 1994). Also,
abundances take discrete values rather than being
continuous, species with small means often have
asymmetric distributions because they are necessarily
Austral Ecology (2001) 26, 32–46
A new method for non-parametric multivariate analysis
of variance
MARTI J. ANDERSON
Centre for Research on Ecological Impacts of Coastal Cities, Marine Ecology Laboratories A11,
University of Sydney, New South Wales 2006, Australia
Abstract Hypothesis-testing methods for multivariate data are needed to make rigorous probability statements
about the effects of factors and their interactions in experiments. Analysis of variance is particularly powerful for
the analysis of univariate data. The traditional multivariate analogues, however, are too stringent in their assumptions
for most ecological multivariate data sets. Non-parametric methods, based on permutation tests, are preferable.
This paper describes a new non-parametric method for multivariate analysis of variance, after McArdle and
Anderson (in press). It is given here, with several applications in ecology, to provide an alternative and perhaps
more intuitive formulation for ANOVA (based on sums of squared distances) to complement the description provided
by McArdle and Anderson (in press) for the analysis of any linear model. It is an improvement on previous
non-parametric methods because it allows a direct additive partitioning of variation for complex models. It does
this while maintaining the flexibility and lack of formal assumptions of other non-parametric methods. The test-statistic
is a multivariate analogue to Fisher's F-ratio and is calculated directly from any symmetric distance or
dissimilarity matrix. P-values are then obtained using permutations. Some examples of the method are given for
tests involving several factors, including factorial and hierarchical (nested) designs and tests of interactions.
Key words: ANOVA, distance measure, experimental design, linear model, multifactorial, multivariate dissimilarity,
partitioning, permutation tests, statistics.
*Present address: Department of Statistics, University of
Auckland, Private Bag 92019, Auckland, New Zealand (Email:
mja@stat.auckland.ac.nz).
Accepted for publication March 2000.
Any dissimilarity measure between objects can be used (for
example, Gower, Jaccard, Bray–Curtis,
Dudi.Hillsmith) and the variance in the
distance matrix can be partitioned.
truncated at zero, and rare species contribute lots of
zeros to the data set. MANOVA test statistics are not
particularly robust to departures from multivariate nor-
mality (Mardia 1971; Olson 1974; Johnson & Field
1993). Finally, many of these test statistics are simply
impossible to calculate when there are more variables
than sampling units, which often occurs in ecological
applications.
Many non-parametric methods for tests of differ-
ences among a priori groups of observations (as in
MANOVA) have been developed (Mantel 1967; Mantel
& Valand 1970; Hubert & Schultz 1976; Mielke et al.
1976; Clarke 1988, 1993; Smith et al. 1990; Excoffier
et al. 1992; Edgington 1995; Pillar & Orlóci 1996;
Legendre & Anderson 1999). These methods gener-
ally have two things in common. First, they are based
on measures of distance or dissimilarity between pairs
of individual multivariate observations (which I will
refer to generally as distances) or their ranks. A statis-
tic is constructed to compare these distances among
observations in the same group versus those in differ-
ent groups, following the conceptual framework of
ANOVA. Second, they use permutations of the obser-
vations to obtain a probability associated with the null
hypothesis of no differences among groups.
These non-parametric methods generally fall into
two categories. First, there are those that can be based
on any chosen distance measure. There are many such
measures and these have different properties, which
make them appropriate for different kinds of data
(Legendre & Legendre 1998). For example, to express
differences in community structure, the semimetric
Bray–Curtis measure of ecological distance (Bray &
Curtis 1957) or Kulczynski's (1928) semimetric measure
are generally preferred over metric measures, like
Euclidean distance (Odum 1950; Hajdu 1981; Faith
et al. 1987; Clarke 1993). The methods that are flexible
enough to be used with any such distance measure
(e.g. Mantel 1967; Hubert & Schultz 1976; Smith
et al. 1990; Clarke 1993) have much to recommend
them for this reason.
The drawback to using these methods is that they are
not able to cope with multifactorial ANOVA. That is, they
are not able to partition variation across the many
factors that form part of the experimental design.
Consequently, for most complex designs, one must
analyse data as one-way analyses in multiple subsets
within particular levels of factors. These multiple one-
way analyses and qualitative interpretations of ordin-
ation plots are then used to infer something about
interactions or variability at different spatial scales (e.g.
Anderson & Underwood 1994; Kelaher et al. 1998).
Some of the proposed non-parametric methods do
allow partitioning for a complex design (e.g. Excoffier
et al. 1992; Edgington 1995; Pillar & Orlóci 1996), but
these are restricted for use with metric distance meas-
ures, which are not ideal for ecological applications.
Furthermore, even if these statistics were to be used,
there has been disagreement concerning appropriate
permutational strategies for complex ANOVA, particu-
larly for tests of interactions (e.g. Edgington 1995;
Manly 1997). There have been some recent examples
of direct statistical analyses of Bray–Curtis distances
(Faith et al. 1995; Underwood & Chapman 1998).
These are restricted, however, to very specific experimental
designs or hypotheses and cannot be used for
any multifactorial ANOVA design.
Ecologists need a non-parametric multivariate
method that can partition variation based on any dis-
tance measure in any ANOVA design. The method needs
to be robust, interpretable by reference to the experi-
mental design, and should lack formal assumptions
concerning distributions of variables. The purpose of
this paper is to outline just such a method and to give
some ecological examples of its use. The more general
mathematical theory underlying this method, along
with simulations and a comparison with the related
approach of Legendre and Anderson (1999), is
described elsewhere (McArdle & Anderson, in press).
STRATEGY FOR NON-PARAMETRIC
MULTIVARIATE ANALYSIS
An outline for a general approach to the analysis of
multivariate data in ecology was given by Clarke and
Green (1988) and Clarke (1993). For experimental
designs used to test hypotheses defined a priori, there
are essentially four steps: (i) a choice is made con-
cerning an appropriate transformation and/or standard-
ization (if any) to apply to the data, given the hypothesis
and the scales and nature of the species variables; (ii)
a choice is made concerning the distance measure to
be used as the basis of the analysis (e.g. Bray–Curtis,
Euclidean, χ² or other measure); (iii) ordination
(and/or clustering) is performed in order to visualize
patterns of resemblance among the observations based
on their community composition; and (iv) a non-
parametric multivariate test for differences among
groups is done to obtain a rigorous probabilistic
statement concerning multivariate effects of a priori
groups. Note that (iii) is not essential in terms of
the statistical test; ordination simply gives a visual
representation by reducing the dimensionality of the
data. In this paper, I focus on step (iv) of this pro-
cedure, which currently poses a problem for multi-
factorial designs.
DESCRIPTION OF THE TEST: ONE-WAY
DESIGN
The two essential considerations for the test are: (i) the
construction of the test-statistic, and (ii) the calculation
of a P-value using some method of permutation. I will
describe the method, which I shall simply call non-parametric
MANOVA, first for the one-way design and
then for more complex designs, followed by some eco-
logical examples. I deal here only with the case of
balanced ANOVA designs, but analogous statistics for any
linear model, including multiple regression and/or
unbalanced data, can be constructed, as described by
McArdle and Anderson (in press).
The test statistic: an F-ratio
The essence of analysis of variance is to compare vari-
ability within groups versus variability among different
groups, using the ratio of the F-statistic. The larger the
value of F, the more likely it is that the null hypothesis
(H0) of no differences among the group means (i.e.
locations) is false. For univariate ANOVA, partitioning
of the total sum of squares, SST, is achieved by calcu-
lating sums of squared differences (i) between indiv-
idual replicates and their group mean (SSW, the
within-group sum of squares; Table 1a), and (ii)
between group means and the overall sample mean
(SSA, the among-group sum of squares). Next, consider
the multivariate case where p variables are measured
simultaneously for each of n replicates in each of a
groups, yielding a matrix of data where rows are obser-
vations and columns are variables. A natural multi-
variate analogue may be obtained by simply adding up
the sums of squares across all variables (Table 1b). An
F-ratio can then be constructed, as in the univariate
case.
This multivariate analogue can also be thought of
geometrically (e.g. Caliński & Harabasz 1974; Mielke
et al. 1976; Edgington 1995; Pillar & Orlóci 1996), as
shown in Fig. 1 for the case of two groups and two vari-
ables (dimensions). Here, SSW is the sum of the
squared Euclidean distances between each individual
replicate and its group centroid (the point corres-
ponding to the averages for each variable, Fig. 1 and
Table 1. Calculations of within-group sums of squares for partitioning in (a) univariate ANOVA, (b) a multivariate analogue
obtained by summing across variables, (c) a multivariate analogue equivalent to (b) obtained using sums of squared Euclidean
distances, (d) the traditional MANOVA approach, which yields an entire matrix (W) of within-group sums of squares and cross
products, and (e) the partitioning using inter-point distances advocated here, equivalent to (b) and (c) if Euclidean distances
are used

Univariate
(a) One variable:
$$SS_W = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^{2}$$

Multivariate
(b) Summed across variables:
$$SS_W = \sum_{i=1}^{a} \sum_{j=1}^{n} \sum_{k=1}^{p} (y_{ijk} - \bar{y}_{i.k})^{2}$$
(c) Geometric approach (inner product, a scalar, based on Euclidean distances, correlations between variables ignored):
$$SS_W = \sum_{i=1}^{a} \sum_{j=1}^{n} (\mathbf{y}_{ij} - \bar{\mathbf{y}}_{i.})^{T} (\mathbf{y}_{ij} - \bar{\mathbf{y}}_{i.})$$
(d) Traditional MANOVA (outer product, a matrix, based on Euclidean distances, correlations between variables matter):
$$\mathbf{W} = \sum_{i=1}^{a} \sum_{j=1}^{n} (\mathbf{y}_{ij} - \bar{\mathbf{y}}_{i.}) (\mathbf{y}_{ij} - \bar{\mathbf{y}}_{i.})^{T}$$
(e) Inter-point geometric approach (a scalar, based on any distance measure, correlations between variables ignored):
$$SS_W = \frac{1}{n} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \epsilon_{ij}$$

$y_{ij}$, univariate observation of the jth replicate (j = 1, …, n) in the ith group (i = 1, …, a); $y_{ijk}$, observation of $y_{ij}$ for the kth
variable (k = 1, …, p); $\mathbf{y}_{ij}$, vector of length p, indicating a point in multivariate space according to p variables (dimensions) for
observation j in group i. A superscript T indicates the transpose of the vector, bars over letters indicate averages and a dot
subscript indicates averaging was done over that subscripted variable.
Fig. 1. A geometric representation of MANOVA for two
groups in two dimensions where the groups differ in location.
The within-group sum of squares is the sum of squared distances
from individual replicates to their group centroid. The
among-group sum of squares is the sum of squared distances
from group centroids to the overall centroid. Symbols in the
figure distinguish distances from points to group centroids,
distances from group centroids to the overall centroid, the
overall centroid, the group centroids and the individual observations.
Table 1c). Note that this additive partitioning using a
geometric approach yields one value for each of SSW,
SSA and SST as sums of squared Euclidean distances.
This geometric approach gives sums of squares equiv-
alent to the sum of the univariate sums of squares
(added across all variables) described in the previous
paragraph. This differs from the traditional MANOVA
approach, where partitioning is done for an entire
matrix of sums of squares and cross-products (e.g.
Mardia et al. 1979; Table 1d).
The key to the non-parametric method described
here is that the sum of squared distances between points
and their centroid is equal to (and can be calculated
directly from) the sum of squared interpoint distances
divided by the number of points. This important
relationship is illustrated in Fig. 2 for points in two
dimensions. The relationship between distances to
centroids and interpoint distances for the Euclidean
measure has been known for a long time (e.g.
Kendall & Stuart 1963; Gower 1966; Caliński &
Harabasz 1974; Seber 1984; Pillar & Orlóci 1996;
Legendre & Legendre 1998; see also equation B.1 in
Appendix B of Legendre & Anderson 1999). What is
important is the implication this has for analyses
based on non-Euclidean distances. Namely, an
additive partitioning of sums of squares can be obtained
for any distance measure directly from the distance
matrix, without calculating the central locations of
groups.
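The relationship can be checked numerically for the Euclidean case. The following Python sketch uses a small set of hypothetical two-dimensional points (illustrative values, not data from this study) and compares the sum of squared distances to the centroid with the sum of squared interpoint distances divided by the number of points. The function name `sq_dist` is an assumption made for this illustration.

```python
from itertools import combinations

# Hypothetical points in two dimensions (illustrative values only).
points = [(1.0, 2.0), (3.0, 5.0), (4.0, 1.0), (6.0, 4.0)]
n = len(points)

def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Centroid: the average of each variable across the points.
centroid = tuple(sum(coord) / n for coord in zip(*points))

# Sum of squared distances from each point to the centroid.
ss_centroid = sum(sq_dist(p, centroid) for p in points)

# Sum of squared interpoint distances (each pair counted once), divided by n.
ss_interpoint = sum(sq_dist(p, q) for p, q in combinations(points, 2)) / n

print(ss_centroid, ss_interpoint)
```

The second quantity never requires the centroid, which is the property exploited when the distance measure is not Euclidean.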
Why is this important? In the case of an analysis
based on Euclidean distances, the average for each vari-
able across the observations within a group constitutes
the measure of central location for the group in
Euclidean space, called a centroid. For many distance
measures, however, the calculation of a central location
may be problematic. For example, in the case of the
semimetric Bray–Curtis measure, a simple average
across replicates does not correspond to the central
location in multivariate Bray–Curtis space. An
appropriate measure of central location on the basis
of Bray–Curtis distances cannot be calculated
easily directly from the data. This is why additive
Fig. 2. The sum of squared distances from individual
points to their centroid is equal to the sum of squared inter-
point distances divided by the number of points.
Fig. 3. Schematic diagram for the calculation of (a) a distance matrix from a raw data
matrix and (b) a non-parametric MANOVA statistic for a one-way design (two groups)
directly from the distance matrix. SST, sum of squared distances in the half matrix
divided by N (total number of observations); SSW, sum of squared distances
within groups divided by n (number of observations per group). SSA = SST − SSW
and F = [SSA/(a − 1)]/[SSW/(N − a)], where a = the number of groups.
partitioning (in terms of average differences among
groups) has not been previously achieved using
Bray–Curtis (or other semimetric) distances. However,
the relationship shown in Fig. 2 can be applied to
achieve the partitioning directly from interpoint
distances.
Thus, consider a matrix of distances between every
pair of observations (Fig. 3a). If we let N = an, the total
number of observations (points), and let $d_{ij}$ be the distance
between observation i = 1, …, N and observation
j = 1, …, N, the total sum of squares is

$$SS_T = \frac{1}{N} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \qquad (1)$$

That is, add up the squares of all of the distances in
the subdiagonal (or upper-diagonal) half of the distance
matrix (not including the diagonal) and divide by N
(Fig. 3b). In a similar fashion, the within-group or
residual sum of squares is

$$SS_W = \frac{1}{n} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \epsilon_{ij} \qquad (2)$$

where $\epsilon_{ij}$ takes the value 1 if observation i and observation
j are in the same group, otherwise it takes the
value of zero. That is, add up the squares of all of the
distances between observations that occur in the same
group and divide by n, the number of observations per
group (Fig. 3b). Then SSA = SST − SSW and a pseudo
F-ratio to test the multivariate hypothesis is

$$F = \frac{SS_A/(a-1)}{SS_W/(N-a)} \qquad (3)$$

If the points from different groups have different cen-
tral locations (centroids in the case of Euclidean dis-
tances) in multivariate space, then the among-group
distances will be relatively large compared to the within-
group distances, and the resulting pseudo F-ratio will
be relatively large.
One can calculate the sums of squares in equations
(1) and (2) and the statistic in equation (3) from a
distance matrix obtained using any distance measure.
The statistic in equation (3) corresponds exactly to
the statistic in equation (4) of McArdle and Anderson
(in press), who have shown more generally how
partitioning for any linear model can be done directly
from the distance matrix, regardless of the distance
measure used. Another important aspect of the stat-
istic described above is that, in the case of a Euclidean
distance matrix calculated from only one variable,
equation (3) gives the same value as the traditional
parametric univariate F-statistic.
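Equations (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration for a balanced one-way design, not the NPMANOVA program itself; the function name `pseudo_f` and the data are hypothetical. The example uses absolute differences on a single variable (Euclidean distance in one dimension), for which the pseudo F-ratio should equal the usual univariate F-statistic.

```python
def pseudo_f(dist, groups):
    """Pseudo F-ratio of equations (1)-(3) for a balanced one-way design.

    dist   -- symmetric N x N matrix (list of lists) of distances
    groups -- list of N group labels, with n observations per group
    """
    N = len(groups)
    a = len(set(groups))   # number of groups
    n = N // a             # observations per group (balanced design assumed)
    ss_t = 0.0
    ss_w = 0.0
    # Loop over the upper half of the distance matrix, excluding the diagonal.
    for i in range(N - 1):
        for j in range(i + 1, N):
            d2 = dist[i][j] ** 2
            ss_t += d2
            if groups[i] == groups[j]:   # epsilon_ij = 1
                ss_w += d2
    ss_t /= N              # equation (1)
    ss_w /= n              # equation (2)
    ss_a = ss_t - ss_w
    return (ss_a / (a - 1)) / (ss_w / (N - a))   # equation (3)

# Hypothetical single variable measured in two groups of three replicates.
y = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
groups = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(u - v) for v in y] for u in y]

f_value = pseudo_f(dist, groups)
print(f_value)
```

For these values the univariate calculation gives SSA = 24 and SSW = 4 with 1 and 4 degrees of freedom, so F = 24, which is what the distance-based calculation should return. Any other distance matrix could be passed to the same function.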
This is proposed as a new non-parametric MANOVA
statistic that is intuitively appealing, due to its analogy
with univariate ANOVA, and that is extremely relevant
for ecological applications. The results (in terms of
sums of squares, mean squares and pseudo F-ratios)
obtained for individual terms in a multivariate analysis
can be interpreted in the same way as they usually are
for univariate ANOVA. The difference is that the hypoth-
esis being tested for any particular term is a multivariate
hypothesis.
OBTAINING A P-VALUE USING
PERMUTATIONS
The multivariate version of the F-statistic described
here is not distributed like Fisher's F-ratio under the
null hypothesis. This is so because (i) we do not expect
the individual variables to be normally distributed, and
(ii) we do not expect that the Euclidean distance will
necessarily be used for the analysis. Even if each of the
variables were normally distributed and the Euclidean
distance used, the mean squares calculated for the multivariate
data would not each consist of sums of independent
χ² variables, because, although individual
observations are expected to be independent, individ-
ual species variables are not independent of one
another. Thus, traditional tabled P-values cannot be
used. A distribution of the statistic under the null
hypothesis can be created, however, using permutations
of the observations (e.g. Edgington 1995; Manly
1997). The only situation in which one could use the
traditional tabled probabilities would be if one had a
single variable that could reasonably be assumed to be
normally distributed and one used Euclidean distances
for the analysis.
Suppose the null hypothesis is true and the groups
are not really different (in terms of their composition
and/or their relative abundances of species, as measured
by the Bray–Curtis distances). If this were the case,
then the multivariate observations (rows) would be
exchangeable among the different groups. Thus, the
labels on the rows that identify them as belonging to a
particular group could be randomly shuffled (permuted)
and a new value of F obtained (called, say, $F^{\pi}$).
This random shuffling and re-calculation of $F^{\pi}$ is then
repeated for all possible re-orderings of the rows rela-
tive to the labels. This gives the entire distribution of
the pseudo F-statistic under a true null hypothesis for
our particular data. Comparing the value of F obtained
with the original ordering of the rows to the distribution
created for a true null by permuting the labels, a P-value
is calculated as

$$P = \frac{(\text{No. of } F^{\pi} \geq F)}{(\text{Total no. of } F^{\pi})} \qquad (4)$$

Note that we consider the original observed value of F
to be a member of the distribution of $F^{\pi}$ under permutation
(i.e. it is one of the possible orderings of the
labels on the rows). The usual scientific convention of
an a priori significance level of α = 0.05 is generally
used for interpreting the significance of the result, as
in other statistical tests. It is also possible to view the
P-value as a measure of confidence concerning the null
hypothesis (Fisher 1955; Freedman & Lane 1983).
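The permutation procedure and equation (4) might be sketched as follows. This is an illustration under stated assumptions rather than the author's program: it draws a random subset of permutations instead of enumerating all of them, the names `pseudo_f` and `permutation_p_value` are hypothetical, the data are invented, and a small numerical tolerance is added to the comparison of permuted and observed values.

```python
import random

def pseudo_f(dist, groups):
    """Pseudo F-ratio for a balanced one-way design (equations 1-3)."""
    N = len(groups)
    a = len(set(groups))
    n = N // a
    ss_t = ss_w = 0.0
    for i in range(N - 1):
        for j in range(i + 1, N):
            d2 = dist[i][j] ** 2
            ss_t += d2
            if groups[i] == groups[j]:
                ss_w += d2
    ss_t /= N
    ss_w /= n
    return ((ss_t - ss_w) / (a - 1)) / (ss_w / (N - a))

def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """P-value of equation (4) from a random subset of label permutations."""
    rng = random.Random(seed)      # fixed seed only to make the sketch repeatable
    f_obs = pseudo_f(dist, groups)
    labels = list(groups)
    count = 1                      # the observed F is a member of the distribution
    for _ in range(n_perm):
        rng.shuffle(labels)        # shuffle group labels relative to the rows
        if pseudo_f(dist, labels) >= f_obs - 1e-12:
            count += 1
    return count / (n_perm + 1)

# Hypothetical single variable measured in two groups of three replicates.
y = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
groups = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(u - v) for v in y] for u in y]

p_value = permutation_p_value(dist, groups, n_perm=999, seed=0)
print(p_value)
```

With only six observations there are just 10 distinct outcomes, and the observed arrangement is the most extreme of them, so the estimated P-value should lie near 0.1 however many random permutations are drawn. Only the labels are shuffled; the distance matrix is calculated once.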
With a groups and n replicates per group, the num-
ber of distinct possible outcomes for the F-statistic in
a one-way test is $(an)!/[a!(n!)^{a}]$ (Clarke 1993). As it is
usually not practical to calculate all possible permuta-
tions, because of the time involved, P can be calculated
using a large random subset of all possible permutations
(Hope 1968). However, the precision of the P-value will
increase with increasing numbers of permutations.
Generally, at least 1000 permutations should be done
for tests with an α-level of 0.05 and at least 5000 permutations
should be done for tests with an α-level of
0.01 (Manly 1997).
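The number of distinct outcomes grows very quickly with the size of the design, which is why a random subset is normally used. A short Python sketch of the formula above (the function name `n_distinct_outcomes` is hypothetical) makes this concrete:

```python
from math import factorial

def n_distinct_outcomes(a, n):
    """Number of distinct outcomes for the F-statistic in a balanced
    one-way design with a groups and n replicates per group:
    (an)! / (a! (n!)^a)."""
    return factorial(a * n) // (factorial(a) * factorial(n) ** a)

# Two groups of three replicates: small enough to enumerate completely.
print(n_distinct_outcomes(2, 3))

# Three groups of 20 replicates: far too many to enumerate.
print(f"{n_distinct_outcomes(3, 20):.1e}")
```

Integer arithmetic is used throughout so that the count is exact before it is formatted for display.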
ASSUMPTIONS
The only assumption of the test is that the observations
(rows of the original data matrix) are exchangeable
under a true null hypothesis. To assume exchangeability
under the null hypothesis is generally to assume that
the observations are independent and that they have
similar distributions (e.g. Boik 1987; Hayes 1996). By
similar distributions, I mean similar multivariate dis-
persions of points, not that the points are necessarily
multivariate normal. The test described here is a test
for differences in location (means or centroids) among
groups of multivariate observations based on the
chosen distance measure. Like its univariate counter-
part, which is sensitive to heterogeneity of variances,
this test and its predecessors that use permutations, like
ANOSIM (Clarke 1993), will also be sensitive to differ-
ences in the dispersions of points, even if the locations
do not differ.
The sensitivity of ANOSIM to differences in dispersion
has been suggested as an advantage by Clarke (1993).
This is because it was introduced in the context of
detecting environmental impacts, for which detection
of differences of any kind between control and impacted
locations is very important for environmental reasons.
Here, I simply suggest that caution be exercised in
interpreting the results of tests of significance.
Determining if significant differences among groups
may be due to differences in dispersion versus differ-
ences in location (or some combination of the two) is
an important statistical and ecological issue. The use
of permutation tests to obtain P-values does not avoid
this issue.
A useful comparative index of multivariate dispersion
has been given by Warwick and Clarke (1993). Also, a
separate permutation test for significant differences in
multivariate dispersions (after removing effects of dif-
ferences in location), as an accompaniment to the non-
parametric MANOVA approach given here, will be
described elsewhere (Anderson, Dutilleul, Lapointe &
Legendre, unpubl. data).
DISTINCTION FROM TRADITIONAL TEST
STATISTICS
Although the statistic described here is sensitive to dif-
ferences in the relative dispersion of points among
groups, it takes no account of the correlations among
variables. In traditional MANOVA, the test-statistics (such
as Wilks' Lambda) use information contained in the
between-group and/or within-group sample variance–covariance
matrices (e.g. Table 1d, see Olson 1974;
Johnson & Field 1993). The traditional MANOVA tests
assume not only that the variance for each variable
remains constant across different groups (i.e. the
points in different groups have similar scatter), they also
assume that the relationships among the variables (their
covariances or correlations) do not differ across groups.
These differences in the sensitivities of different
multivariate test statistics are shown diagrammatically
in Fig. 4 for two variables (two dimensions). Figure 4(a)
shows two groups that differ in their correlation struc-
ture, but not in their variances or location. Figure 4(b)
shows two groups that differ only in their dispersions,
but not in their correlation or location. Although all
MANOVA statistics are designed to test for differences in
location, the traditional statistics will also be sensitive
to differences in correlations (Fig. 4a) as well as dif-
ferences in dispersion (Fig. 4b). The method of non-
parametric MANOVA described here will only be sensitive
to differences in dispersion (Fig. 4b). The correlations
among variables play no role in the analysis. If differ-
ences in the relationships amongst variables form a
hypothesis of interest, then some other non-parametric
techniques may be relevant (e.g. Biondini et al. 1991;
Krzanowski 1993).
ONE-WAY EXAMPLE: EFFECTS OF
GRAZERS
Consider the following example, taken from an eco-
logical study by Anderson and Underwood (1997). The
study was designed to test the hypothesis that grazing
by gastropods affects intertidal estuarine assemblages.
The experiment was done at an intertidal oyster farm
from January to July 1994 in Quibray Bay, south of
Sydney, New South Wales, Australia. Experimental
surfaces (10 cm × 10 cm) were enclosed in cages to
exclude gastropod grazers, while other surfaces were
left open to grazing. A third treatment consisted of
caged areas where natural densities of grazers were
included. This was a control for the effect of the cage
itself on assemblages. There were n = 20 surfaces in
each of the three treatments. The numbers of indi-
viduals of each of 21 taxa (invertebrates and algae)
colonizing each surface were recorded.
The rationale for increasing the severity of the trans-
formation to increase the relative contribution of rare
versus abundant species in the analysis, given by Clarke
and Green (1988), is followed here. Note that the trans-
formation is not done in an effort to make data con-
form to any assumptions of the analysis. In this
example, the data contained some species that occurred
on a very large relative scale of abundance (e.g.
Spirorbid worms occurred in the thousands), so the
data were transformed by taking double-square roots
before the analysis. To visualize the multivariate
patterns among observations, non-metric multi-
dimensional scaling (MDS) was performed on the
Bray–Curtis distances (Kruskal & Wish 1978), using
the PRIMER computer program. Non-parametric
MANOVA was then done on Bray–Curtis distances, as
described in the previous section, using the computer
program NPMANOVA, written by the author in FORTRAN.
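The transformation and distance calculation described above can be illustrated with a short Python sketch. The counts are hypothetical (not the grazing data), the function names are assumptions made for this example, and the Bray–Curtis distance is expressed here as a proportion between 0 and 1, whereas some software reports it on a 0 to 100 scale.

```python
def double_square_root(counts):
    """Double-square-root (fourth-root) transformation of abundances."""
    return [c ** 0.25 for c in counts]

def bray_curtis(u, v):
    """Bray-Curtis distance between two samples: the sum of absolute
    differences divided by the sum of all abundances in both samples."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator > 0 else 0.0

# Hypothetical counts of three taxa on two experimental surfaces.
surface_1 = [16, 0, 81]
surface_2 = [1, 16, 0]

d_raw = bray_curtis(surface_1, surface_2)
d_transformed = bray_curtis(double_square_root(surface_1),
                            double_square_root(surface_2))
print(d_raw, d_transformed)
```

Comparing the two values shows how the transformation reduces the influence of the most abundant taxon on the distance between samples.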
The number of possible permutations for the one-
way test in the case of the grazing experiment is
$9.6 \times 10^{25}$. A random subset of 4999 permutations was
used (Fig. 5). In this case, the null hypothesis of no
differences among groups was rejected, as the observed
value was much larger than any of the values obtained
under permutation (Fig. 5, Table 2).
A POSTERIORI TESTS
As in univariate ANOVA where there is a significant result
in a comparison of 3 or more treatments, we may wish
to ask for the multivariate case: wherein does the significant
difference lie? This can be done by using the
same test, given above for the one-way comparison of
groups, but where individual pair-wise comparisons
between particular groups are done. To continue with
the logic of the analogous univariate situation, we
can use a t-statistic (which is simply the square root of
the value of the F-statistic described above) for these
Fig. 4. Two variables in each of two groups of observations
where (a) the groups differ in correlation between variables,
but not in location or dispersion and (b) the groups differ
in dispersion, but not in location or correlation between
variables.
Fig. 5. Distribution of the non-parametric MANOVA
F-statistic for 4999 permutations of the data on assemblages
in different grazing treatments. The real value of F for these
data is very extreme by reference to this distribution
(F = 36.62): thus there are strong differences among the
assemblages in different grazing treatments.
Table 2. Non-parametric MANOVA on Bray–Curtis distances
for assemblages of organisms colonizing intertidal surfaces
in estuaries in three grazing treatments (grazers
excluded, grazers inside cages, and surfaces open to grazers)

Source      d.f.   SS           MS         F       P
Grazers     2      18 657.65    9328.83    36.61   0.0002
Residual    57     14 520.89    254.75
Total       59     33 178.54

Comparison*                   t       P
Open versus caged             8.071   0.0002
Open versus cage control      3.268   0.0002
Caged versus cage control     6.110   0.0002

*Pair-wise a posteriori tests among grazing treatments.
pairwise comparisons. These have the same inter-
pretation as univariate t-tests, but they test the general
multivariate hypothesis of no difference between the
groups on the basis of the Bray–Curtis (or other
chosen) distances. This is Student's univariate
t-statistic if Euclidean distances are chosen for the
analysis of only one variable. P-values for each test are
obtained using separate sets of permutations that are
only done across the pair of groups being compared.
In this example, a random subset of 4999 permutations
was used (out of a possible $6.8 \times 10^{9}$) for each pairwise
comparison.
For the analysis of the experimental removal of grazers,
there was a significant difference among all pairs
of treatments: assemblages colonizing surfaces in cages
differed from those in the open or in cage controls
(t = 8.07, P = 0.0002 and t = 6.11, P = 0.0002, respectively).
Grazers had a significant effect on the assemblages,
which is consistent with the pattern of
separation of points corresponding to different treatments
in the non-metric MDS plot (Fig. 6). The fact
that assemblages on open surfaces also differed significantly
from those in cage controls (t = 3.27,
P = 0.0002) suggested that there was some additional
artifact due to the presence of a cage in the experiment.
An important point here is that the a posteriori com-
parisons just described did not make any correction for
experiment-wise error rate (Day & Quinn 1989).
Similarly, the multivariate pair-wise tests available in
the computer program NPMANOVA are not corrected for
experiment-wise error rate. This means that with an a
priori signicance level of 0.05, one should expect
to obtain a signicant result in one out of every 20
independent tests by chance alone. Nevertheless, the
P-value obtained under permutation for any individual
pair-wise test is exact. Many of the methods used for
correcting error rates for multiple comparisons, such
as the Bonferroni method, are very conservative but
may be applied.
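As a small worked example of the Bonferroni option mentioned above, the three pair-wise P-values from Table 2 can be compared against a per-test threshold of 0.05 divided by the number of tests. The variable names are illustrative, and the experiment-wise rate shown assumes the three tests are independent, which is only an approximation for comparisons that share groups.

```python
# P-values for the three pair-wise comparisons reported in Table 2.
p_values = {
    "open vs caged": 0.0002,
    "open vs cage control": 0.0002,
    "caged vs cage control": 0.0002,
}

alpha = 0.05
k = len(p_values)

# Bonferroni: each test is judged against alpha / k.
threshold = alpha / k
significant = {name: p <= threshold for name, p in p_values.items()}

# Chance of at least one false positive in k uncorrected tests,
# under the simplifying assumption that the tests are independent.
experimentwise_rate = 1 - (1 - alpha) ** k
```

Under these assumptions all three comparisons remain significant after correction, because 0.0002 is well below 0.05 / 3.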
MORE COMPLEX DESIGNS
For more complex designs, we can start by consider-
ing the situation with two factors in a factorial (or
orthogonal) design. The principles used to partition the
variation directly from the distance matrix and to obtain
the statistics and permutation methods for individual
terms in the model can be readily generalized to other
multifactorial cases. The logic applied to multifactorial
ANOVA of univariate data (e.g. see Underwood 1981,
1997) also applies to the analysis of multivariate data
using this non-parametric procedure. For example,
tests of main effects should be examined after tests for
interactions.
Calculating the statistic
Let A designate factor 1 with a levels (treatments or
groups) and B designate factor 2 with b levels, with n
replicates in each of the ab combinations of the two
factors. The total number of observations is N = abn. The
total sum of squares in the analysis is calculated as for
the one-way case according to equation (1). To partition
the variation, the within-group sum of squares for factor
A, ignoring any influence of B, is calculated as
$$SS_{W(A)} = \frac{1}{bn} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \, \epsilon_{ij}^{(A)} \tag{5}$$
where $\epsilon_{ij}^{(A)}$ takes the value 1 if observation i and
observation j are in the same group of factor A, otherwise it
takes the value of zero. Similarly, the within-group sum
of squares for factor B, ignoring any influence of A, is
$$SS_{W(B)} = \frac{1}{an} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \, \epsilon_{ij}^{(B)} \tag{6}$$
Then, the corresponding sums of squares for each of
the main effects in the analysis are $SS_A = SS_T - SS_{W(A)}$
and $SS_B = SS_T - SS_{W(B)}$.
The residual sum of squares is calculated by con-
sidering the interpoint distances within each of the ab
combinations of factor A and B, thus:
$$SS_R = \frac{1}{n} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^{2} \, \epsilon_{ij}^{(AB)} \tag{7}$$
where $\epsilon_{ij}^{(AB)}$ takes the value 1 if observation i and
observation j are in the same combination of factors A and
B, otherwise it takes the value of zero. We then can
easily obtain the sum of squares corresponding to the
interaction term: $SS_{AB} = SS_T - SS_A - SS_B - SS_R$. It may
be easier to consider the squared distances being
summed in equations (5) through (7) by reference to
their physical location in the distance matrix itself, as
illustrated in Fig. 7.
Fig. 6. Non-metric MDS plot of assemblages colonizing
intertidal surfaces in Quibray Bay in each of three different
grazing treatments: cage control, open and caged.
In the case of a two-factor design where one factor
is nested in the other, the same general approach is
used. In this case, however, there is no interaction term
in the analysis and we have instead $SS_{B(A)} = SS_T - SS_A - SS_R$,
where B(A) denotes that factor B is nested in
factor A.
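The partitioning in equations (5) through (7) for the crossed design can be sketched directly from a distance matrix. This is a minimal illustration assuming a balanced design with n replicates per cell; the function name `two_way_ss`, the dictionary keys and the toy data are hypothetical. The indicator terms in the equations become boolean masks over the pairs i < j.

```python
import numpy as np

def two_way_ss(D, A, B):
    """Partition sums of squares for a balanced two-factor crossed design."""
    A, B = np.asarray(A), np.asarray(B)
    N = len(A)
    a, b = len(np.unique(A)), len(np.unique(B))
    n = N // (a * b)                       # replicates per cell (balanced design assumed)
    i, j = np.triu_indices(N, k=1)         # pairs with i < j, as in the double sums
    d2 = np.asarray(D, dtype=float)[i, j] ** 2
    same_A = A[i] == A[j]                  # indicator for equation (5)
    same_B = B[i] == B[j]                  # indicator for equation (6)
    ss_t = d2.sum() / N                    # total sum of squares
    ss_wA = d2[same_A].sum() / (b * n)     # equation (5)
    ss_wB = d2[same_B].sum() / (a * n)     # equation (6)
    ss_r = d2[same_A & same_B].sum() / n   # equation (7)
    ss_A = ss_t - ss_wA
    ss_B = ss_t - ss_wB
    ss_AB = ss_t - ss_A - ss_B - ss_r
    return {"T": ss_t, "A": ss_A, "B": ss_B, "AB": ss_AB, "R": ss_r}

# Illustrative check with one variable and Euclidean distances (2 x 2 design, n = 2).
x = np.array([1., 2., 3., 4., 5., 6., 9., 10.])
A = np.array([0, 0, 0, 0, 1, 1, 1, 1])
B = np.array([0, 0, 1, 1, 0, 0, 1, 1])
D = np.abs(x[:, None] - x[None, :])
ss = two_way_ss(D, A, B)
```

For this toy data the result matches the usual univariate two-way ANOVA sums of squares (total 72, A 50, B 18, interaction 2, residual 2), which is the behaviour expected when Euclidean distances are used on a single variable.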
Having obtained appropriate sums of squares, the
construction of the pseudo F-statistic for each term in
the analysis for non-parametric MANOVA then follows
the same rules and formulae as it would for the usual
univariate ANOVA. The construction of the F-ratio
depends on the experimental design, that is, whether
factors are nested or factorial and whether they are fixed
or random, exactly as for univariate ANOVA (e.g.
Underwood 1981, 1997; Winer et al. 1991).
Doing the permutations
The method of permutation required to obtain an exact
test is not so simple if there is more than one factor in
the design. The choice of an appropriate permutation
method is not trivial and should be considered care-
fully for each term in the model. Indeed, the lack of
exact tests or knowledge of how the available approxi-
mate permutation tests might behave for complex
models has been a sticking point in the development
of multivariate non-parametric methods (e.g. Crowley
1992; Clarke 1993). To construct exact tests, two
important issues must be considered (Anderson & ter
Braak, unpublished data). First, which units should be
permuted (i.e. what are exchangeable under the null
hypothesis) and second, should any restrictions be
imposed on the permutations to account for other
factors in the design?
Fig. 7. Schematic diagram of the interpoint distances used
to partition the variability in the multivariate data set and to
calculate the sum of squares for each term in a two-factor
orthogonal design (each factor has two groups or levels). (a)
$SS_{W(A)}$ = sum of squared distances within groups of A,
divided by (bn). (b) $SS_{W(B)}$ = sum of squared distances
within groups of B, divided by (an). (c) $SS_R$ = sum of
squared distances within combinations of AB, divided
by (n) (residual sum of squares). $SS_T$ = sum of squared
distances in the total half matrix, divided by (abn),
$SS_A = SS_T - SS_{W(A)}$, $SS_B = SS_T - SS_{W(B)}$,
$SS_{AB} = SS_T - SS_A - SS_B - SS_R$.
In many important situations, such as tests of interactions,
no exact permutation test can be done. Also,
there are times when the exact test imposes so many
restrictions as to render the test meaningless, due to
there being too few possible permutations left. In these
cases, approximate permutation tests should be used,
of which there are several alternatives, including
permutation of residuals and permutation of raw data
across all terms in the analysis (e.g. Freedman & Lane
1983; ter Braak 1992; Manly 1997). Some empirical
comparisons of these methods are provided by
Gonzalez and Manly (1998) and Anderson and
Legendre (1999).
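One of the approximate approaches named above, unrestricted permutation of raw data across all terms, can be sketched as follows. This is a hedged illustration rather than the NPMANOVA implementation: it assumes a balanced crossed design with both factors fixed (so every term is tested over the residual), and the names `two_way_f` and `permutation_test` are hypothetical. Permuting the observations is done by reordering rows and columns of the distance matrix together while the factor labels stay in place.

```python
import numpy as np

def two_way_f(D, A, B):
    """Pseudo F-ratios for A, B and A x B (balanced design, both factors fixed)."""
    A, B = np.asarray(A), np.asarray(B)
    N = len(A)
    a, b = len(np.unique(A)), len(np.unique(B))
    n = N // (a * b)
    i, j = np.triu_indices(N, k=1)
    d2 = D[i, j] ** 2
    same_A, same_B = A[i] == A[j], B[i] == B[j]
    ss_t = d2.sum() / N
    ss_A = ss_t - d2[same_A].sum() / (b * n)
    ss_B = ss_t - d2[same_B].sum() / (a * n)
    ss_r = d2[same_A & same_B].sum() / n
    ss_AB = ss_t - ss_A - ss_B - ss_r
    ms_r = ss_r / (N - a * b)
    return {
        "A": (ss_A / (a - 1)) / ms_r,
        "B": (ss_B / (b - 1)) / ms_r,
        "AB": (ss_AB / ((a - 1) * (b - 1))) / ms_r,
    }

def permutation_test(D, A, B, n_perm=4999, seed=0):
    """Approximate P-values by unrestricted permutation of the observations."""
    D = np.asarray(D, dtype=float)
    rng = np.random.default_rng(seed)
    f_obs = two_way_f(D, A, B)
    n_ge = {term: 1 for term in f_obs}           # observed arrangement included
    for _ in range(n_perm):
        perm = rng.permutation(len(A))
        f_perm = two_way_f(D[np.ix_(perm, perm)], A, B)
        for term in f_obs:
            if f_perm[term] >= f_obs[term]:
                n_ge[term] += 1
    return f_obs, {term: n_ge[term] / (n_perm + 1) for term in f_obs}

# Illustrative data: one variable, 2 x 2 design with n = 2, Euclidean distances.
x = np.array([1., 2., 3., 4., 5., 6., 9., 10.])
A = np.array([0, 0, 0, 0, 1, 1, 1, 1])
B = np.array([0, 0, 1, 1, 0, 0, 1, 1])
D = np.abs(x[:, None] - x[None, :])
f_obs, p_perm = permutation_test(D, A, B, n_perm=199)
```

Permutation of residuals under a reduced model would need the raw data or a suitable decomposition of the distance matrix, so it is not attempted in this sketch.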
ECOLOGICAL EXAMPLES
Two-way factorial design
The first example is from an experiment in Middle
Harbour (part of Sydney Harbour) to test for the effect
of shade and proximity to the seafloor on assemblages
of invertebrates and algae colonizing subtidal hard
surfaces near marinas (Glasby 1999). The experiment was
a two-way crossed (orthogonal) design with n = 4
replicate settlement plates (15 cm × 15 cm sandstone)
either far from or near to the seafloor (the factor of
position; all plates were at a similar depth of
approximately 2 m below low water) and in one of three
shading treatments: (i) shade (an opaque plexiglass
roof), (ii) a procedural control (a clear plexiglass roof),
and (iii) no shade. Organisms colonizing the plates after
33 weeks were counted and a total of 46 taxa were
included in the analyses. Organisms that occurred
only once across the entire data set were not included.
Non-parametric MANOVA was done on Bray–Curtis
distances calculated from double-root transformed data
using the FORTRAN program NPMANOVA. The sample
size was reasonably small for this study (n < 5), so the
test was done using unrestricted permutation of raw
data (e.g. Manly 1997; Gonzalez & Manly 1998) with
4999 random permutations. Similar results were
obtained using permutation of residuals under a
reduced model (not shown).
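The distance calculation described here can be sketched as below. Two assumptions are made and should be checked against the software actually used: "double-root" is read as the square root applied twice (a fourth-root transformation), and the Bray–Curtis dissimilarity is returned on a 0 to 1 scale rather than as a percentage. The function name `bray_curtis_matrix` and the toy counts are hypothetical.

```python
import numpy as np

def bray_curtis_matrix(counts):
    """Bray-Curtis dissimilarities among rows (samples) of fourth-root transformed counts."""
    y = np.sqrt(np.sqrt(np.asarray(counts, dtype=float)))   # double-root transformation
    # Sum over taxa of |y_ik - y_jk| and of (y_ik + y_jk) for every pair of samples.
    num = np.abs(y[:, None, :] - y[None, :, :]).sum(axis=2)
    den = (y[:, None, :] + y[None, :, :]).sum(axis=2)
    return num / den        # undefined (division by zero) if two samples are both empty

# Illustrative counts: three samples (rows) by two taxa (columns).
counts = np.array([[16, 0],
                   [0, 16],
                   [16, 0]])
D = bray_curtis_matrix(counts)
```

In this toy case samples 0 and 1 share no taxa, so their dissimilarity is 1, while samples 0 and 2 are identical, so theirs is 0.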
There was no significant interaction of shade and
position, but both main effects were significant
(Table 3, Fig. 8). It was then of interest to compare the
groups corresponding to different shading treatments
using a posteriori tests (Table 3). It was not necessary
to do this for the effect of position, because this factor
only had two groups. Assemblages of organisms on
settlement plates near the bottom were extremely
different from those far away from the bottom (Fig. 8). Also,
assemblages on shaded plates were significantly
different from those on either the procedural control or on
unshaded plates, which themselves did not differ
(Table 3, Fig. 8). This analysis also shows how the effect
of position relative to the bottom was much greater than
the effect of shading on assemblages in this experiment
(compare the values of their mean squares in Table 3).
The non-parametric approach advocated here allows
tests of significance, but it also allows relative sizes of
effects to be compared directly through the partitioning
of the variation and examination of mean squares.
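The mean squares and F-ratios reported in Table 3 can be recomputed from the sums of squares and degrees of freedom, which makes the comparison of effect sizes concrete. The sketch assumes both factors are fixed, so that every term is tested over the residual mean square; the dictionary layout and variable names are illustrative.

```python
# (degrees of freedom, sum of squares) for each term, taken from Table 3.
table3 = {
    "position": (1, 5595.40),
    "shade": (2, 3566.44),
    "position x shade": (2, 1238.94),
    "residual": (18, 7440.66),
}

# Mean square = SS / d.f.
mean_squares = {term: ss / df for term, (df, ss) in table3.items()}

# Pseudo F-ratio for each term over the residual mean square
# (appropriate when both factors are fixed).
f_ratios = {
    term: ms / mean_squares["residual"]
    for term, ms in mean_squares.items()
    if term != "residual"
}

# Relative size of the two main effects, judged by their mean squares.
position_to_shade = mean_squares["position"] / mean_squares["shade"]
```

The recomputed ratios agree with the tabulated values to rounding, and the mean square for position is roughly three times that for shade.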
Table 3. Non-parametric MANOVA on Bray–Curtis distances for assemblages
of organisms colonizing subtidal sandstone settlement panels after 33 weeks
in an estuary at different distances from the seafloor (positions either
near or far) and in three different shading treatments

Source              d.f.   SS          MS        F        P
Position              1     5595.40    5595.40   13.536   0.0002
Shade                 2     3566.44    1783.22    4.314   0.0006
Position × shade      2     1238.94     619.47    1.499   0.1394
Residual             18     7440.66     413.37
Total                23    17 841.43

Comparison*                 t       P
Shade versus control        1.783   0.0154
Shade versus no shade       1.987   0.0018
Control versus no shade     0.866   0.5560

*Pair-wise a posteriori tests among shading treatments.

Fig. 8. Two-factor non-metric MDS plot of subtidal
assemblages colonizing sandstone settlement plates after
33 weeks in Middle Harbour that were either near to or far
from the seafloor and in one of three shading treatments
(shaded, control and no shade). 1, far from the seafloor;
2, close to the seafloor.

Three-way design, including nesting
The second example comes from an experiment to test
the hypothesis that the size of a patch available for
colonization would affect the succession of assemblages
in equal areas on those patches (Anderson 1998). This
hypothesis was tested using wooden panels of three
different sizes (10 cm × 10 cm, 20 cm × 20 cm and
40 cm × 40 cm). Two panels of a given size were
attached to sticks that were then strapped to the
structure of an intertidal oyster farm in the Port
Stephens estuary in New South Wales, Australia in
January of 1995 (see Anderson 1998 for details). Six
sticks (two for each of the three patch sizes) were then
collected independently after periods of 3, 6, 9, 12
and 18 months of exposure to colonization. The
experimental design thus consisted of three factors: time
(5 periods of submersion), patch (3 sizes) and sticks
(2 sticks per time × patch combination, a random
nested factor), with n = 2 panels per stick. Organisms
colonizing panels were quantified in a 10 cm × 10 cm
area from each panel (chosen randomly from the
larger-sized panels). A total of 33 taxa were included in
multivariate analyses.
The analysis was done using NPMANOVA on Bray–Curtis
distances calculated on double-root transformed
data, as for the previous examples. In this case,
however, an exact permutation test for the nested factor
(sticks) was carried out by permuting the observations
randomly across sticks, but only within the 5 × 3
combinations of levels of time × patch. Then, for the test
of the upper-level terms (the main effects of time, patch
size and their interaction), individual replicates on a
stick were permuted together as a unit (i.e. whole sticks
were permuted). This is done so that the upper-level
effects can be tested against the variability across sticks,
not across individual replicates, as is necessary under
the null hypothesis for a nested hierarchy (e.g. Clarke
1993). For all tests, a subset of 4999 permutations was
used.
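The two permutation schemes described in this paragraph can be sketched as index permutations. This is an illustrative reading of the text, assuming a balanced design laid out as 5 times × 3 patch sizes × 2 sticks × 2 panels; the function names and the label construction are hypothetical. The returned array gives, for each position, the index of the observation moved into it, and can be used to reorder the rows and columns of a distance matrix.

```python
import numpy as np

def permute_within_cells(cell, rng):
    """Shuffle observations only among those sharing a cell (test of the nested factor)."""
    cell = np.asarray(cell)
    perm = np.arange(len(cell))
    for c in np.unique(cell):
        idx = np.flatnonzero(cell == c)
        perm[idx] = rng.permutation(idx)
    return perm

def permute_whole_units(unit, rng):
    """Shuffle whole units (e.g. sticks), keeping each unit's replicates together."""
    unit = np.asarray(unit)
    units = np.unique(unit)
    shuffled = rng.permutation(units)
    members = {u: np.flatnonzero(unit == u) for u in units}   # equal sizes assumed
    perm = np.empty(len(unit), dtype=int)
    for u_from, u_to in zip(units, shuffled):
        perm[members[u_from]] = members[u_to]
    return perm

# Labels for 60 observations: 15 time x patch cells of 4 panels, 30 sticks of 2 panels.
obs = np.arange(60)
cell = obs // 4           # time x patch combination of each panel
stick = obs // 2          # stick of each panel (two sticks per cell)

rng = np.random.default_rng(1)
perm_sticks_test = permute_within_cells(cell, rng)    # for the test of sticks
perm_upper_test = permute_whole_units(stick, rng)     # for time, patch and time x patch
```

The first scheme never moves a panel out of its time × patch cell; the second moves pairs of panels from the same stick together, so the upper-level terms are assessed against variability among sticks.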
The nested factor of sticks was not significant in the
multivariate analysis, but the time × patch size
interaction was significant (Table 4). Individual pair-wise
comparisons of effects of patch size for each time were
nearly identical to the one-way tests given in Anderson
(1998) using analysis of similarities (ANOSIM, Clarke
1993). Assemblages were significantly different on the
smallest patches compared to the other sized patches
after 3, 6, 9 or 12 months. Assemblages on the two
larger sized patches did not differ significantly from one
another except after 12 months. After 18 months,
assemblages were similar on all patch sizes.
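Table 4 notes that each of these pair-wise tests had only 35 possible permutations. That figure can be checked by counting the distinct ways of splitting the pooled observations into two equal groups, assuming 4 panels per patch size at each time (2 sticks × 2 panels, with sticks ignored) and that swapping the two group labels leaves the statistic unchanged. The function name is hypothetical.

```python
from math import comb

def n_distinct_relabellings(n_per_group):
    """Distinct splits of 2n observations into two unlabelled groups of size n."""
    return comb(2 * n_per_group, n_per_group) // 2

n_perms = n_distinct_relabellings(4)    # 35
smallest_p = 1 / n_perms                # smallest attainable P-value
```

With so few permutations the smallest attainable P-value is 1/35, about 0.029, so these pair-wise tests can only just reach the 0.05 level.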
Non-metric MDS plots helped to interpret these
results. Two separate ordinations were done on these
data, as the stress value for the non-metric MDS
plot that included all observations was too high for a
reasonable interpretation. The effect of different patch
sizes appeared to be fairly consistent (in terms of
its magnitude and direction) after 3, 6 or 9 months
(Fig. 9a). After 12 or 18 months, the observations
become more scattered within and across the groups
and the effects of patch size become less clear (Fig. 9b).
Increased dispersion (variability in assemblages) after
these longer periods of time, compared to earlier
periods, is seen clearly in the two-factor plot of stick
centroids (Fig. 9c, which includes all data). As noted
earlier, the tests are sensitive to such differences in dis-
persion. Overall, although the two factors did interact,
the effect of time (i.e. succession) was relatively more
important in distinguishing assemblages than the size
of the patch (compare their mean squares in Table 4),
and effects of patch size decreased through time for
these assemblages (see Anderson 1998 for further
details).
DISCUSSION
Natural temporal and spatial variability is intrinsic to
ecological systems. Indeed, variability might be considered
the currency of ecological scientific work. It is
for this reason that statistical analysis plays such an
important role in the development of ecology as a
science. In Design of Experiments, R. A. Fisher (1935,
p. 4) wrote:
We may at once admit that any inference from the
particular to the general must be attended with
some degree of uncertainty, but this is not the
same as to admit that such inference cannot be
absolutely rigorous, for the nature and degree of
uncertainty may itself be capable of rigorous
expression.
Quantitative statistical inference is indeed what is
needed for the rigorous interpretation of mensurative
or manipulative ecological experiments. Although our
conclusions may be uncertain, they are still rigorous in
the sense that the degree of uncertainty can be
expressed in terms of mathematical probability. In uni-
variate analysis, W. S. Gosset (Student 1908) made
this possible for comparisons of two treatments, while
R. A. Fisher made this possible for many treatments
and experimental factors. In a complex and intrinsi-
cally variable world, ANOVA allows us to identify simul-
taneous effects and interactions of more than one
factor, and to identify the uncertainty of our inferences
with rigour (e.g. Underwood 1981, 1997).
An important advance in the analysis of multivariate
data in ecology was the development of non-parametric
methods for testing hypotheses concerning whole
communities (e.g. Clarke 1988, 1993; Smith et al.
1990; Biondini et al. 1991). Some parallel advances
were made in the context of tests for significant clusters
in cluster analysis (e.g. Good 1982; Gordon 1994).
Before these applications, particularly ANOSIM (Clarke
1993), became widely available, most multivariate
analyses in ecology focused on the reduction of dimen-
sionality to produce and interpret patterns (ordination
methods) and the use of numerical strategies for placing
observations into natural groups (clustering). These
methods, although extremely useful towards their
purpose, do not rigorously express the nature and
degree of uncertainty concerning a priori hypotheses.
Methods like Mantel's test (Mantel 1967), ANOSIM
(Clarke 1993) and multiresponse permutation pro-
cedures (Mielke et al. 1976) allow such rigorous
probabilistic statements to be made for multivariate
ecological data.
The drawback to such non-parametric tests is that
they cannot easily be extended to the multifactorial
designs so common now in ecological studies. Two
sticking points prevented this: (i) the lack of a gener-
alized statistic for partitioning variation, and (ii) the lack
of appropriate permutation methods (e.g. Clarke 1993;
Legendre & Anderson 1999). Although traditional
test-statistics used for MANOVA allow partitioning,
their restrictive assumptions have prevented their
effective use in ecology. The method of distance-based
redundancy analysis (Legendre & Anderson
1999) largely solved these issues, but it has many
rather complicated steps and involves the use of a
correction constant to distances. Although this
correction does not adversely affect the test (generally
making it more conservative, if anything), accurate
P-values are not given by this method in the case of
anything other than a one-factor design (McArdle &
Anderson, in press).

Table 4. Non-parametric MANOVA on Bray–Curtis distances for assemblages
of organisms colonizing wooden settlement panels of three different sizes
after 3, 6, 9, 12 or 18 months on an intertidal oyster farm

Source                   d.f.   SS          MS        F       P
Time                       4    30 305.71   7576.43   20.50   0.0002
Patch size                 2     6414.99    3207.49    8.68   0.0002
Time × patch               8     6224.03     778.004   2.10   0.0062
Sticks (time × patch)     15     5544.66     369.64    1.28   0.3384
Residual                  30     8697.09     289.90
Total                     59    57 186.48

Comparison             3 months   6 months   9 months   12 months   18 months
Small versus medium    2.24*      1.86*      1.48*      2.00*       1.38
Small versus large     2.30*      2.87*      2.38*      2.54*       3.02*
Medium versus large    1.49       1.47       1.69       1.75*       1.43

*P < 0.05; pair-wise a posteriori tests among patch sizes
within each time using the t-statistic. Sticks were ignored in
the pair-wise tests. There were 35 possible permutations for
each.

Fig. 9. Two-factor non-metric MDS plots of assemblages
colonizing intertidal wooden settlement panels of three
different sizes (small, medium and large) for (a) periods of
3, 6 or 9 months and (b) periods of 12 or 18 months for raw
data and (c) where centroids were plotted corresponding to
each stick across all combinations of time × patch size. The
points corresponding to assemblages after particular times of
submersion (numbers indicate the period in months) have been
outlined for clarity.
The method presented here has, in some sense,
combined the best of both worlds. Like the traditional
test-statistics, it can partition variation according to
any ANOVA design. Like the most flexible non-parametric
methods, it can be based on any symmetric
dissimilarity or distance measure (or their
ranks) and provides a P-value using appropriate
permutation methods. That is, one can still choose a
relevant transformation and an appropriate distance
measure (or use ranks of distances), consistent with the
method of ordination used to visualize patterns. By
using permutations, the test requires no specific
assumption concerning the number of variables or the
nature of their individual distributions or correlations.
The statistic used is analogous to Fisher's F-ratio and
is constructed from sums of squared distances (or
dissimilarities) within and between groups. Another
feature of this statistic is that it is equal to Fisher's
original F-ratio in the case of one variable and when
Euclidean distances are used.
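The equivalence with Fisher's F-ratio for a single variable can be checked numerically. The sketch below computes the classical one-way F from group means and the distance-based version from squared Euclidean interpoint distances, assuming balanced groups; the function names and the toy data are hypothetical.

```python
import numpy as np

def classical_f(x, labels):
    """Fisher's one-way F-ratio computed from group means."""
    x, labels = np.asarray(x, dtype=float), np.asarray(labels)
    groups = np.unique(labels)
    N, a = len(x), len(groups)
    grand_mean = x.mean()
    ss_among = sum(
        np.sum(labels == g) * (x[labels == g].mean() - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(((x[labels == g] - x[labels == g].mean()) ** 2).sum() for g in groups)
    return (ss_among / (a - 1)) / (ss_within / (N - a))

def distance_f(x, labels):
    """Pseudo F-ratio from squared Euclidean interpoint distances (balanced groups)."""
    x, labels = np.asarray(x, dtype=float), np.asarray(labels)
    N = len(x)
    a = len(np.unique(labels))
    n = N // a
    i, j = np.triu_indices(N, k=1)
    d2 = (x[i] - x[j]) ** 2
    ss_total = d2.sum() / N
    ss_within = d2[labels[i] == labels[j]].sum() / n
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (N - a))

# Illustrative data: one variable, three groups of three observations.
x = np.array([1., 2., 3., 4., 6., 8., 10., 11., 15.])
labels = np.repeat([0, 1, 2], 3)
f_classical = classical_f(x, labels)
f_distance = distance_f(x, labels)
```

The two values agree to floating-point precision, because the sum of squared differences among all pairs in a group, divided by the group size, equals the sum of squared deviations from the group mean.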
It is perhaps important to point out that the
Bray–Curtis measure of dissimilarity may or may not
be the most appropriate measure to use in any given
situation. A point still commonly ignored is that
Bray–Curtis and related measures, such as Kulczynski's
coefficient, will tend to under-estimate true ecological
distances when distances become large (i.e. when
observations have very few species in common),
as determined by simulations (Faith et al. 1987;
Belbin 1991). The Bray–Curtis measure may therefore
only be useful insofar as it will produce reasonable
ecological ordinations, through the ranks used for
MDS. In light of this issue, Faith (1990) proposed
doing MANOVA on ordination scores obtained from
MDS.
An analysis based only on ordination scores will omit
some portion of the ecological information contained
in the original distance matrix and will depend on the
number of dimensions chosen for the MDS (Clarke
1993). The severity of this potential problem will
obviously increase with increases in the stress value
associated with an MDS plot. The actual amount
and the kind of information lost in reducing dimensions
using MDS are generally unknown and will depend
on the particular data set. Also, any subsequent
statistical inferences on ordination scores (using
traditional MANOVA, as described in Faith 1990; or
based on permutation tests) would be limited to
points in the ordination plots, rather than being applic-
able to the original observations.
It is not possible to identify a single best distance
measure for ecological data. The method described
here may be used with any distance measure chosen
(or on ranks of distances). It is useful to have the
flexibility to choose a distance measure appropriate for the
data and hypothesis being tested. Although the
Bray–Curtis measure has now become commonplace
in ecological studies (perhaps due to its availability in
the PRIMER computer program, or due to its intuitive
interpretation as percentage difference, or due to the
results presented in Faith et al. 1987), there are still
many rivals. Over 60 measures of similarity or dissim-
ilarity have been described, with very few actual com-
parisons of their performance with different kinds
of ecological data (see Lamont & Grant 1979;
Legendre & Legendre 1998). This is undoubtedly an
area needing further research.
The approach advocated here is that multifactorial
analysis of variance, as successfully applied to univariate
data in ecology (e.g. Underwood 1981, 1997), can and
should also be applied to multivariate data for testing
hypotheses in a logical and rigorous way. It stands to
reason that the developments in experimental design
for ecology that require multifactorial ANOVA, in order
to, for example, avoid pseudo-replication (Hurlbert
1984), test for generality (Beck 1997) or test for envir-
onmental impact (e.g. Green 1979, 1993; Underwood
1993; Glasby 1997), should be incorporated into
multivariate analysis. The method described here
allows that to happen, but within a framework that is
general enough to suit our need for few assumptions
and flexibility in the multivariate analysis of ecological
data.
ACKNOWLEDGEMENTS
I am indebted to those who have worked on and devel-
oped multivariate methods and permutation tests that
have led to the ideas in this manuscript, including
K. R. Clarke, E. S. Edgington, R. A. Fisher,
P. Legendre, B. F. J. Manly, N. Mantel, B. H. McArdle,
L. Orlóci, V. D. P. Pillar, E. P. Smith, and C. J. F. ter
Braak. I also owe a great deal to A. J. Underwood for
his work in the statistical analysis of ecological experi-
ments using ANOVA, which inspired my pursuit of this
topic for multivariate analysis. T. Glasby kindly
provided data for the two-way factorial example. The
PRIMER computer program was provided courtesy of
M. R. Carr and K. R. Clarke, Plymouth Marine
Laboratories, UK. My colleagues at the Special
Research Centre for Ecological Impacts of Coastal
Cities provided logistic support, tested out the com-
puter program and commented on earlier versions of
the manuscript. The computer program NPMANOVA is
available from the author. This research was supported
by a U2000 Post-doctoral Fellowship at the University
of Sydney.
REFERENCES
Anderson M. J. (1998) Effects of patch size on colonisation in
estuaries: revisiting the species-area relationship. Oecologia
118, 87–98.
Anderson M. J. & Clements A. (in press) Resolving environmental
disputes: a statistical method for choosing among competing
cluster models. Ecol. Applic.
Anderson M. J. & Legendre P. (1999) An empirical comparison
of permutation methods for tests of partial regression
coefficients in a linear model. J. Stat. Comput. Simul. 62,
271–303.
Anderson M. J. & Underwood A. J. (1994) Effects of substratum
on the recruitment and development of an intertidal estuarine
fouling assemblage. J. Exp. Mar. Biol. Ecol. 184, 217–36.
Anderson M. J. & Underwood A. J. (1997) Effects of gastropod
grazers on recruitment and succession of an estuarine
assemblage: a multivariate and univariate approach. Oecologia 109,
442–53.
Bartlett M. S. (1939) A note on tests of significance in
multivariate analysis. Proc. Camb. Philos. Soc. 35, 180–5.
Beck M. W. (1997) Inference and generality in ecology: current
problems and an experimental solution. Oikos 78, 265–73.
Belbin L. (1991) Semi-strong hybrid scaling, a new ordination
algorithm. J. Veg. Sci. 2, 491–6.
Biondini M. E., Mielke P. W. & Redente E. F. (1991) Permutation
techniques based on Euclidean analysis spaces: a new and
powerful statistical method for ecological research. In:
Computer Assisted Vegetation Analysis (eds E. Feoli & L.
Orlóci) pp. 221–40. Kluwer Academic Publishers,
Dordrecht.
Boik R. J. (1987) The Fisher–Pitman permutation test: a
non-robust alternative to the normal theory F test when variances
are heterogeneous. Br. J. Math. Stat. Psychol. 40, 26–42.
Bray J. R. & Curtis J. T. (1957) An ordination of the upland forest
communities of southern Wisconsin. Ecol. Monogr. 27,
325–49.
Caliński T. & Harabasz J. (1974) A dendrite method for cluster
analysis. Commun. Stat. 3, 1–27.
Chapman M. G., Underwood A. J. & Skilleter G. A. (1995)
Variability at different spatial scales between a subtidal
assemblage exposed to discharge of sewage and two control sites.
J. Exp. Mar. Biol. Ecol. 189, 103–22.
Clarke K. R. (1988) Detecting change in benthic community
structure. In: Proceedings XIVth International Biometric
Conference, Namur: Invited Papers, pp. 131–42. Société
Adolphe Quetelet, Gembloux.
Clarke K. R. (1993) Non-parametric multivariate analyses of
changes in community structure. Aust. J. Ecol. 18, 117–43.
Clarke K. R. & Green R. H. (1988) Statistical design and analysis
for a biological effects study. Mar. Ecol. Prog. Ser. 46,
213–26.
Crowley P. H. (1992) Resampling methods for
computation-intensive data analysis in ecology and evolution. Ann. Rev.
Ecol. Syst. 23, 405–47.
Day R. W. & Quinn G. P. (1989) Comparison of treatments after
an analysis of variance. Ecol. Monogr. 59, 433–63.
Edgington E. S. (1995) Randomization Tests, 3rd edn. Marcel
Dekker, New York.
Excoffier L., Smouse P. E. & Quattro J. M. (1992) Analysis of
molecular variance inferred from metric distances among
DNA haplotypes: application to human mitochondrial DNA
restriction data. Genetics 131, 479–91.
Faith D. P. (1990) Multivariate methods for biological monitoring
based on community structure. In: The Australian Society of
Limnology 29th Congress, p. 17 (Abstract). Alligator Rivers
Region Research Institute.
Faith D. P., Dostine P. L. & Humphrey C. L. (1995) Detection
of mining impacts on aquatic macroinvertebrate
communities: results of a disturbance experiment and the design of a
multivariate BACIP monitoring program at Coronation Hill,
N. T. Aust. J. Ecol. 20, 167–80.
Faith D. P., Minchin P. R. & Belbin L. (1987) Compositional
dissimilarity as a robust measure of ecological distance.
Vegetatio 69, 57–68.
Fisher R. A. (1935) Design of Experiments. Oliver & Boyd,
Edinburgh.
Fisher R. A. (1936) The use of multiple measurements in
taxonomic problems. Ann. Eugen. 7, 179–88.
Fisher R. A. (1955) Statistical methods and scientific induction.
J. Roy. Stat. Soc. 17, 69–78.
Freedman D. & Lane D. (1983) A nonstochastic interpretation
of reported significance levels. J. Bus. Econ. Stat. 1, 292–8.
Gaston K. J. & McArdle B. H. (1994) The temporal variability
of animal abundances: measures, methods and patterns. Phil.
Trans. Roy. Soc. Lond. 345, 335–58.
Glasby T. M. (1997) Analysing data from post-impact studies
using asymmetrical analyses of variance: a case study of
epibiota on marinas. Aust. J. Ecol. 22, 448–59.
Glasby T. M. (1999) Interactive effects of shading and proximity
to the seafloor on the development of subtidal epibiotic
assemblages. Mar. Ecol. Prog. Ser. 190, 113–24.
Gonzalez L. & Manly B. F. J. (1998) Analysis of variance
by randomization with small data sets. Environmetrics 9,
53–65.
Good I. J. (1982) An index of separateness of clusters and a
permutation test for its significance. J. Stat. Comput. Simul.
15, 81–4.
Gordon A. D. (1994) Identifying genuine clusters in a
classification. Comput. Stat. Data Anal. 18, 561–81.
Gower J. C. (1966) Some distance properties of latent root and
vector methods used in multivariate analysis. Biometrika 53,
325–38.
Gray J. S., Clarke K. R., Warwick R. M. & Hobbs G. (1990)
Detection of initial effects of pollution on marine benthos:
an example from the Ekofisk and Eldfisk oilfields, North Sea.
Mar. Ecol. Prog. Ser. 66, 285–99.
Green R. H. (1979) Sampling Design and Statistical Methods for
Environmental Biologists. Wiley, New York.
Green R. H. (1993) Application of repeated measures designs in
environmental impact and monitoring studies. Aust. J. Ecol.
18, 81–98.
Hajdu L. J. (1981) Graphical comparison of resemblance
measures in phytosociology. Vegetatio 48, 47–59.
Hayes A. F. (1996) Permutation test is not distribution free.
Psychol. Methods 1, 184–98.
Hope A. C. A. (1968) A simplified Monte Carlo significance test
procedure. J. Roy. Stat. Soc. 30, 582–98.
Hotelling H. (1931) The generalization of Student's ratio. Ann.
Math. Stat. 2, 360–78.
Hubert L. & Schultz J. (1976) Quadratic assignment as a
general data analysis strategy. Br. J. Math. Stat. Psychol. 29,
190–241.
Hurlbert S. H. (1984) Pseudoreplication and the design of
ecological field experiments. Ecol. Monogr. 54, 187–211.
Johnson C. R. & Field C. A. (1993) Using fixed-effects model
multivariate analysis of variance in marine biology and
ecology. Oceanogr. Mar. Biol. Ann. Rev. 31, 177–221.
Kelaher B. P., Chapman M. G. & Underwood A. J. (1998)
Changes in benthic assemblages near boardwalks in
temperate urban mangrove forests. J. Exp. Mar. Biol. Ecol.
228, 291–307.
Kendall M. G. & Stuart A. (1963) The Advanced Theory of
Statistics, Vol. 1, 2nd edn. Charles Griffin, London.
Kruskal J. B. & Wish M. (1978) Multidimensional Scaling. Sage
Publications, Beverly Hills.
Krzanowski W. J. (1993) Permutational tests for correlation
matrices. Statistics and Computing 3, 37–44.
Kulczynski S. (1928) Die Pflanzenassoziationen der Pieninen.
Bull. Int. Acad. Pol. Sci. Lett. Cl. Sci. Math. Nat. Ser. B,
(Suppl. II) 1927, 57–203.
Lamont B. B. & Grant K. J. (1979) A comparison of twenty-one
measures of site dissimilarity. In: Multivariate Methods in
Ecological Work (eds L. Orlóci, C. R. Rao & W. M. Stiteler)
pp. 101–26. International Co-operative Publishing House,
Fairland.
Lawley D. N. (1939) A generalization of Fisher's z test.
Biometrika 30, 180–7 (Corrections in Biometrika 30, 467–9).
Legendre P. & Anderson M. J. (1999) Distance-based redundancy
analysis: testing multispecies responses in multifactorial
ecological experiments. Ecol. Monogr. 69, 1–24.
Legendre P. & Legendre L. (1998) Numerical Ecology, 2nd English
edn. Elsevier Science, Amsterdam.
McArdle B. H. & Anderson M. J. (in press) Fitting multivariate
models to community data: a comment on distance-based
redundancy analysis. Ecology.
Manly B. F. J. (1997) Randomization, Bootstrap and Monte Carlo
Methods in Biology, 2nd edn. Chapman & Hall, London.
Mantel N. (1967) The detection of disease clustering and a
generalized regression approach. Cancer Res. 27, 209–20.
Mantel N. & Valand R. S. (1970) A technique of nonparametric
multivariate analysis. Biometrics 26, 547–58.
Mardia K. V. (1971) The effect of non-normality on some
multivariate tests and robustness to nonnormality in the linear
model. Biometrika 58, 105–21.
Mardia K. V., Kent J. T. & Bibby J. M. (1979) Multivariate
Analysis. Academic Press, London.
Mielke P. W., Berry K. J. & Johnson E. S. (1976) Multi-response
permutation procedures for a priori classifications. Commun.
Stat. Theory Methods 5 (14), 1409–24.
Odum E. P. (1950) Bird populations of the Highlands (North
Carolina) Plateau in relation to plant succession and avian
invasion. Ecology 31, 587–605.
Oliver I. & Beattie A. J. (1996) Designing a cost-effective
invertebrate survey: a test of methods for rapid assessment of
biodiversity. Ecol. Applic. 6, 594–607.
Olson C. L. (1974) Comparative robustness of six tests in
multivariate analysis of variance. J. Am. Stat. Assoc. 69, 894–908.
Pillai K. C. S. (1955) Some new test criteria in multivariate
analysis. Ann. Math. Stat. 26, 117–21.
Pillar V. D. P. & Orlóci L. (1996) On randomization testing in
vegetation science: multifactor comparisons of relevé groups.
J. Veg. Sci. 7, 585–92.
Quinn G. P., Lake P. S. & Schreiber S. G. (1996) Littoral benthos
of a Victorian lake and its outlet stream: spatial and
temporal variation. Aust. J. Ecol. 21, 292–301.
Seber G. A. F. (1984) Multivariate Observations. John Wiley and
Sons, New York.
Skilleter G. A. (1996) Validation of rapid assessment of damage
in urban mangrove forests and relationships with Molluscan
assemblages. J. Mar. Biol. Ass. UK 76, 701–16.
Smith E. P., Pontasch K. W. & Cairns J. (1990) Community
similarity and the analysis of multispecies environmental
data: a unified statistical approach. Water Res. 24, 507–14.
Student. (1908) The probable error of a mean. Biometrika 6,
1–25.
ter Braak C. J. F. (1992) Permutation versus bootstrap
significance tests in multiple regression and ANOVA. In:
Bootstrapping and Related Techniques (eds K. H. Jöckel,
G. Rothe & W. Sendler) pp. 79–86. Springer-Verlag,
Berlin.
Underwood A. J. (1981) Techniques of analysis of variance in
experimental marine biology and ecology. Oceanogr. Mar.
Biol. Ann. Rev. 19, 513605.
Underwood A. J. (1993) The mechanics of spatially replicated
sampling programmes to detect environmental impacts in a
variable world. Aust. J. Ecol. 18, 99116.
Underwood A. J. (1997) Experiments in Ecology: Their Logical
Design and Interpretation Using Analysis of Variance.
Cambridge University Press, Cambridge.
Underwood A. J. & Chapman M. G. (1998) A method for
analysing spatial scales of variation in composition of assem-
blages. Oecologia 107, 5708.
Warwick R. M., Carr M. R., Clarke K. R., Gee J. M. & Green
R. H. (1988) A mesocosm experiment on the effects of
hydrocarbon and copper pollution on a sublittoral soft-
sediment meiobenthic community. Mar. Ecol. Prog. Ser. 46,
18191.
Warwick R. M. & Clarke K. R. (1993) Increased variability as a
symptom of stress in marine communities. J. Exp. Mar. Biol.
Ecol. 172, 21526.
Wilks S. S. (1932) Certain generalizations in the analysis of
variance. Biometrika 24, 47194.
Winer B. J., Broan D. R. & Michels K. M. (1991) Statistical
Principles in Experimental Design, 3rd edn. McGraw-Hill,
Sydney.
FORUM
An Entomologist Guide to Demystify Pseudoreplication: Data Analysis
of Field Studies With Design Constraints
LUIS FERNANDO CHAVES¹,²
J. Med. Entomol. 47(3): 291–298 (2010); DOI: 10.1603/ME09250
ABSTRACT Lack of independence, or pseudoreplication, in samples from ecological studies of
insects reflects the complexity of working with living organisms: the finite and limited input of
individuals, their relatedness (ecological and/or genetic), and the need to group organisms into
functional experimental units to estimate population parameters (e.g., cohort replicates). Several
decades ago, when the issue of pseudoreplication was first recognized, it was highlighted that
mainstream statistical tools were unable to account for the lack of independence. For example, the
variability as a result of differences across individuals would be confounded with that of the exper-
imental units where they were observed (e.g., pans for mosquito larvae), whereas both sources of
variability now can be separated using modern statistical techniques, such as the linear mixed effects
model, that explicitly consider the different scales of variability in a dataset (e.g., mosquitoes and pans).
However, the perception of pseudoreplication as a problem without solution remains. This study
presents concepts to critically appraise pseudoreplication and the linear mixed effects model as a
statistical solution for analyzing data with pseudoreplication, by separating the different sources of
variability and thereby generating correct inferences from data gathered in studies with constraints
in randomization.
KEY WORDS linear mixed effects model, Culex quinquefasciatus, Anopheles nuneztovari, bootstrap,
data analysis
Pseudoreplication is probably one of the most widely
cited and misunderstood concepts in the statistical
analysis of ecological studies on insects and other
organisms. Pseudoreplication is defined as "the use of
inferential statistics to test for treatment effects with
data from experiments where either treatments are
not replicated . . . or replicates are not statistically
independent . . ." (Hurlbert 1984). This concept has been
very influential and pervasive, to the extent that pseu-
doreplication is widely cited as a major flaw of most
field studies (Heffner et al. 1996). Hurlbert's major
claim was correct, and he basically showed that main-
stream statistical tools at that time (e.g., analysis of
variance) were not suitable for the analysis of most
experimental designs. However, the uncritical ap-
praisal of his study has been a major barrier for the
publication of results and, therefore, the advancement
of ecology (Oksanen 2001). Hurlbert's study did not
prevent the unsuitable analysis of valuable datasets or
the proliferation of unsound experimental practices,
such as the movement of sampling units to control for
spatial/temporal variability (Alto and Juliano 2001a,
2001b; Reiskind and Wilson 2004). As thoughtfully
presented by Oksanen (2001), the goal of ecological
studies is not the application of statistical analysis to
ecological data per se, but rather its application to the
understanding of ongoing ecological phenomena from
variation of individual phenotypic traits to the assem-
blages of organisms in populations, communities, and
ecosystems. Unlike physics or chemistry, in which the
supply of individual objects of study is practically
unlimited, the objects of study for an entomologist (or
more generally a naturalist) are finite and constrained.
Thus, limitations in randomization will likely arise, and
the science of statistics has developed new solutions to
correctly analyze the lack of independence in field
data since Hurlbert's study (Millar and Anderson
2004). The current forum article presents the follow-
ing: 1) key concepts of experimental design to criti-
cally appraise and demystify the concept of pseu-
doreplication; 2) linear mixed effects models
(LMEMs) as powerful tools to analyze data originated
from constrained designs or to produce more general
inferences from classical randomized designs (e.g.,
blocks); and 3) how these tools can be used to further
gain insights from the data that can strengthen our
understanding of insect ecology. In developing point
2, equations and a guide to interpret them as models
used to analyze common entomological data are pre-
sented, as well as the implementation of this type of
analysis in the open source software R.
¹ Corresponding author: Department of Environmental Studies,
Emory University, 400 Dowman Drive, Suite E510, Atlanta GA 30322
(e-mail: lfchave@emory.edu).
² Laboratorio de Biología Teórica, Instituto de Zoología y Ecología
Tropical, Facultad de Ciencias, Universidad Central de Venezuela,
Caracas, Venezuela.
0022-2585/10/0291–0298$04.00/0 © 2010 Entomological Society of America
Field Studies, Experiments, and Statistical Data
Analysis
Experiments are one of the major tools for hypoth-
esis testing (Fisher 1935). In general, the idea is to
subject individual units of observation to varying de-
grees of independent and/or controllable factor(s),
and to determine how the levels of variation explain a
given pattern (Box 1980). Field studies focus on the
impacts of natural (or controlled) variation in envi-
ronmental factors on individual units of observation.
Hypothesis testing has been central to the develop-
ment of modern science, to the point that hypothesis-
driven experiments or field studies are one of the most
prominent requirements for project support by fund-
ing agencies. One of the major reasons for the wide-
spread appeal of hypothesis-driven experiments and
field studies has been their close association with tools
for data analysis to determine the impact of different
independent variables. The best example of a statis-
tical tool guiding experimental design is the use of the
linear model (LM). This model assumes that variabil-
ity across a set of individual units of observation ($y_i$)
is explained by a series of n independent variables ($x_1$,
$x_2$, . . . , $x_n$) and by a unique source of unexplained
variability, normally referred to as error ($\varepsilon$). These
models are linear, because the parameters enter lin-
early into the equation that relates the independent
variables to the outcome (Faraway 2006, Chaves and
Pascual 2007). A major constraint of these models is
that they assume total independence among the sub-
jects of study, i.e., individual observation units are
unrelated at least within strata (i.e., after accounting
for the explanatory variables), which is the formal
definition of replication. When there is a lack of
independence across objects of study (i.e., pseudorep-
lication), the use of LMs with a unique source of
variability is inappropriate, because the variability is
modeled incorrectly and can lead to spurious infer-
ences. For example, if mosquitoes are reared in pans
(or kissing bugs in jars) to measure body size of emerg-
ing adults from different experimental conditions, the
lack of independence that arises from the aggregation
into pans (or jars) will inflate the error value (a.k.a.
residual variance) of the LM, in some cases leading to
incorrect inferences when the LM is compared with a
model that explicitly models the lack of independence
because of the aggregation into a functional experi-
mental unit (i.e., the pan or jar). More than 20 yr ago,
because of the limited statistical toolbox in ecology,
this issue was a major problem for the correct analysis
of datasets from studies with design constraints (Hurl-
bert 1984). However, strategies to handle the problem
of pseudoreplication were around at the time. For
example, in evolutionary ecology, individuals have
different degrees of common descent, and this vari-
ability by itself is often a subject of study. In the 1980s,
it was common to use nested half-sibling designs to
estimate the variance of families and individuals be-
longing to those families (Conner and Hartl 2004).
Also, the use of defined designs such as Graeco-Latin
squares, Latin squares, and fractional factorials was
well established in the field of engineering and process
control (Montgomery 2005). Some of these balanced
designs were even used in studies of medically im-
portant insects (Carpenter 1982, Chesson 1984). In
fact, sophistication in randomization, when possible,
can be very useful to evaluate the impact of strategies
to control human-vector contact (Kirby et al. 2008).
For example, randomized control trials have been
used to demonstrate the importance of mosquito
screening in reducing the risk of malaria transmission
(Kirby et al. 2009). However, one of the major limi-
tations of designs that handle pseudoreplication is the
need for balanced designs, i.e., an equal number of
replicates per treatment. The inability to analyze un-
balanced designs with unequal number of replicates
per treatment has been overcome with the develop-
ment of maximum likelihood methods, especially the
restricted maximum likelihood method and its appli-
cation to estimate LMEM (Pinheiro and Bates 2000).
LMEM has the same fundamental assumptions of the
LM; it tries to explain the sources of variability across
a set of individual units of observation ($y_i$) as a function
of a series of n independent variables ($x_1$, $x_2$, . . . , $x_n$),
referred to as fixed factors, but it can incorporate
additional sources of variability (the random factors),
besides the error ($\varepsilon$). The random factors can accom-
modate the lack of independence among the individ-
ual units of observation as a result of spatial, temporal,
genetic, or any exogenous environmental factor that is
not fully randomized. For example, the variance of
functional experimental units such as pans for mos-
quitoes or jars for kissing bugs can be explicitly mod-
eled, thus allowing the proper estimation of the
error variance, and thus limiting the chances of
committing a type I error, i.e., rejecting the null
hypothesis when true. Therefore, LMEMs allow the
statistical analysis of pseudoreplicated data. The
next section will provide a series of examples illus-
trating the use of LMEMs and how they compare
with similar LMs. The data used in the examples and
code to perform the analyses using the open source
statistical software R are included as supplementary
online material (http://www.envs.emory.edu/research/
Chaves_SOM_Pseudoreplication.html).
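The consequence of ignoring the pan as a functional experimental unit can be illustrated with a small simulation. The sketch below is not part of the supplementary material; it is a minimal Python illustration under assumed, hypothetical settings (two treatments with no true effect, four pans per treatment, 20 adults per pan, and equal pan and individual standard deviations). It compares a naive analysis that treats every mosquito as a replicate with an analysis of pan means, which for a balanced design is used here as a simple stand-in for a full LMEM fit. The critical values are the two-sided 5% points of the t distribution for 158 and 6 degrees of freedom.

```python
import random
import statistics


def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal-variance) standard error."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate_once(rng, pans=4, per_pan=20, sd_pan=1.0, sd_ind=1.0):
    """Simulate one null experiment (hypothetical settings, no treatment effect)."""
    groups = []
    for _ in range(2):  # two treatments
        pan_data = []
        for _ in range(pans):
            pan_effect = rng.gauss(0, sd_pan)  # shared by all adults in a pan
            pan_data.append([pan_effect + rng.gauss(0, sd_ind)
                             for _ in range(per_pan)])
        groups.append(pan_data)
    # Naive analysis: every mosquito is treated as an independent replicate.
    flat = [[y for pan in g for y in pan] for g in groups]
    t_naive = pooled_t(flat[0], flat[1])
    # Pan-level analysis: one mean per pan, so pans are the replicates.
    means = [[statistics.mean(pan) for pan in g] for g in groups]
    t_pan = pooled_t(means[0], means[1])
    return t_naive, t_pan


def false_positive_rates(n_sim=2000, seed=1):
    """Proportion of null experiments declared significant by each analysis."""
    rng = random.Random(seed)
    crit_naive = 1.975  # two-sided 5% critical value, t with 158 df
    crit_pan = 2.447    # two-sided 5% critical value, t with 6 df
    hits_naive = hits_pan = 0
    for _ in range(n_sim):
        t_naive, t_pan = simulate_once(rng)
        hits_naive += abs(t_naive) > crit_naive
        hits_pan += abs(t_pan) > crit_pan
    return hits_naive / n_sim, hits_pan / n_sim


if __name__ == "__main__":
    naive, pan = false_positive_rates()
    print(f"naive analysis: {naive:.3f}; pan-level analysis: {pan:.3f}")
```

Under these assumptions, the naive analysis should reject the true null hypothesis far more often than the nominal 5%, whereas the pan-level analysis should stay close to it.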
LMs Versus LMEMs
Factorial Designs. To illustrate the most basic dif-
ferences between LMs and LMEMs, I reference a field
experiment designed to study oviposition by Culex
quinquefasciatus Say in Atlanta, GA (Chaves et al.
2009). Cx. quinquefasciatus larvae are normally absent
from lotic systems, such as rivers and creeks. However,
several cities have relic sewage treatment systems
where runoff water and sewage are combined in the
same system, and after large rainfall events the com-
bined sewage effluent can overflow into urban water
bodies (Chaves et al. 2009). In this experiment, the
effects of combined sewage overflow water and nu-
trient addition on oviposition site selection by this
mosquito species were studied using 10 experimental
pools (water containers) at four sites in a forest patch.
For this example, data will be used from the experi-
ment when egg rafts were removed daily. Data use has
been restricted to a randomly extracted subsample
from the original data (only four pools per site) to
have a dataset similar to that of a balanced design. In
this experiment, oviposition (y) was measured by
counting the total number of egg rafts oviposited over
5 d. The experiment has three independent variables
(i.e., n = 3): 1) $x_1$ = water quality (with two levels:
combined sewage overflow water and tap water as
control); 2) $x_2$ = nutrient addition (added or absent
as control); and 3) $x_3$ = sites (four in total). Only $x_1$
and $x_2$ are factors (each with two levels), because $x_3$
is an independent variable considered to test the
block effects of forest site on oviposition. Because
all treatments were present at each site, this is a
randomized block 2 × 2 factorial design, which is
randomized because both factors were present in all
four sites (blocks in the model), and 2 × 2 factorial
because each factor has two levels. The goal of a
factorial experiment is to test whether the factors
interact, which can be expressed by the following
LM:
$$y_{il} = \mu + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_{3l} + \varepsilon_{il} \qquad [1]$$
where $\mu$ is the average value of the observations;
$\beta_1$, $\beta_2$, and $\beta_4$ quantify the impact of each independent
variable; $\beta_3$ the interaction of water quality and nu-
trient addition; and $\varepsilon$ is the error, which is assumed to
be normally distributed. The subscript l denotes block
(site within the forest patch), and i is for individual
pools within a block (containers in a forest site).
Therefore, $y_{il}$ is the total number of rafts from a
given pool and block (i.e., container in a site). Table
1 shows the results of the analysis of variance for the
data using the model presented in (1). The natural
logarithm transformation ln(y + 1) is done to
normalize the data and fulfill model assumptions.
Table 1 shows that neither block nor the interaction
between water quality and nutrient addition was
significant (P > 0.05).
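The F values in Table 1 follow directly from the tabulated sums of squares and degrees of freedom: each mean square is divided by the error mean square. The sketch below is a minimal Python illustration of that arithmetic and of the ln(y + 1) transformation; the function names are hypothetical, and the numbers are copied from Table 1 rather than recomputed from the raw data.

```python
import math

# Sums of squares and degrees of freedom copied from Table 1.
TABLE1_EFFECTS = {
    "water": (0.653, 1),
    "nutrient": (50.634, 1),
    "water_x_nutrient": (0.013, 1),
    "block": (0.467, 3),
}
ERROR_SS, ERROR_DF = 0.58, 9


def log_transform(counts):
    """Apply ln(y + 1) to egg raft counts; zero counts map to zero."""
    return [math.log1p(y) for y in counts]


def f_ratios(effects, error_ss, error_df):
    """F ratio for each term: (SS / df) divided by the error mean square."""
    ms_error = error_ss / error_df
    return {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}


if __name__ == "__main__":
    for name, f_value in f_ratios(TABLE1_EFFECTS, ERROR_SS, ERROR_DF).items():
        print(f"{name}: F = {f_value:.3f}")
```

The ratios agree with the F values in Table 1 up to the rounding of the tabulated sums of squares.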
The influence of water quality and nutrients on Cx.
quinquefasciatus oviposition can be reanalyzed using a
LMEM. By contrast with the LM analysis, in which
inferences are done over blocks, the LMEM assumes
that blocks are random samples from a larger pop-
ulation and models block variability as a random
factor (Fig. 1). The equivalent to equation 1 for a
LMEM is:
$$y_{il} = \mu + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \gamma_l + \varepsilon_{il} \qquad [2]$$
where $\mu$, the $\beta$s, and y and $\varepsilon$ have the same interpre-
tation as in equation 1, and $\gamma$ quantifies the variability
across the blocks, which is assumed to be normally
distributed. The significance of factors can be tested
using F tests, which work well when designs are bal-
anced (equal number of samples per treatment and
block) and parameters can be estimated using maxi-
mum likelihood. More generally, significance may be
tested using parametric bootstraps, for balanced or
unbalanced designs where parameters are estimated
by restricted maximum likelihood (Pinheiro and Bates
2000, Faraway 2006). Table 2 shows the results of an
analysis of deviance, with parameters of equation 2
estimated using restricted maximum likelihood and
inference based on 1000 replications of parametric
bootstrap (an analysis in which datasets are simulated
and the results of likelihood ratio tests for studied
factors are compared with those of the true data to
compute the significance of factors). In this example,
inference about the impact of water quality and nu-
trient addition is qualitatively similar using LM or
LMEM. However, LMEM provides additional insight;
it indicates oviposition is finely grained. The variance
at the individual container level is larger than at the
block level (Table 2: $\hat{\sigma}^2_{\varepsilon} = 0.059$ and $\hat{\sigma}^2_{\gamma} = 0.024$), a
pattern also observed in the original study for the full
dataset (Chaves et al. 2009). The data can be observed
in Fig. 2.
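For a balanced randomized block design, the comparison between container-level and block-level variance can be approximated from the analysis of variance alone. The sketch below is a minimal Python illustration using the classical method-of-moments estimator, in which the expected block mean square equals the error variance plus the number of observations per block times the block variance. This is not the restricted maximum likelihood fit used for Table 2; the function name is hypothetical, and the inputs are the rounded mean squares from Table 1 with four pools per block.

```python
def anova_variance_components(ms_block, ms_error, obs_per_block):
    """Method-of-moments variance components for a balanced block design.

    Assumes E[MS_block] = var_error + obs_per_block * var_block.
    A negative block estimate is truncated at zero.
    """
    var_error = ms_error
    var_block = max(0.0, (ms_block - ms_error) / obs_per_block)
    return var_block, var_error


if __name__ == "__main__":
    # Mean squares from Table 1; four pools per site in the balanced subsample.
    var_block, var_error = anova_variance_components(0.156, 0.064, 4)
    share_block = var_block / (var_block + var_error)
    print(f"block variance: {var_block:.3f}")
    print(f"error variance: {var_error:.3f}")
    print(f"share of variance among blocks: {share_block:.2f}")
```

The resulting estimates are close to, although not identical with, the values reported in Table 2, and they lead to the same qualitative conclusion: most of the unexplained variance lies among individual containers rather than among blocks.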
Constrained Designs. One of the major limitations
of the LM is that it is not suited to analyze datasets with
constraints in randomization. For example, all repli-
cates from a treatment should be present in all blocks.
This is a frequent limitation in field studies in which
features of a given landscape cannot be altered, an
underlying motivation behind split-plot designs, in
which some treatments do not vary across blocks.
Split-plots are widely used in agriculture (Faraway
2006) and economic entomology (Blumberg et al.
1997, Haile et al. 2000, Oyediran et al. 2007). However,
constraints in randomization can also arise as a prod-
uct of other trade-offs in experimentation or by the
nature of the questions asked. For example, the orig-
inal design of Chaves et al. (2009) was unbalanced in
the sense that each block had an unequal number of
replicates for each one of the treatments. However, for
each block the amount of total nutrients and water
quality was constant; one of the questions was to
determine the grain of mosquito perception for ovi-
position choices. The largest variance was among the
individual oviposition containers, indicating a finely
grained perception, in contrast to a scenario of
coarsely grained mosquito perception, in which the
largest variance would be expected for the blocks or
sites. To illustrate the analysis of constrained designs,
the original data of Chaves et al. (2009) are sampled
in such a way that all experimental pools with nutri-
ents added and combined sewage overflow water
came from the same block (Fig. 2), and therefore the
design becomes unbalanced (Fig. 3). Thus, the LM
from equation 1 cannot be employed to analyze the
data, but the LMEM from equation 2 is suitable for
such an analysis. Results are presented in Table 3 and
are similar to those presented in Table 2, showing a
decreased variability in the blocks and a larger error.
In summary, LMEMs can uncover the same variance
pattern in data under pseudoreplication, a major ad-
vantage over LMs.
Table 1. Analysis of variance for the effects of water quality
and nutrient addition on the ln of total number of egg rafts + 1
oviposited over 5 d by Culex quinquefasciatus in Atlanta, GA
(Chaves et al. 2009)
Factor                          df   Sum square   Mean square   F value    Pr(>F)
Water ($\beta_1$)               1    0.653        0.653         10.1401    0.01111*
Nutrient ($\beta_2$)            1    50.634       50.634        786.0312   4.54E-10*
Water × nutrient ($\beta_3$)    1    0.013        0.013         0.2055     0.66105
Block ($\beta_4$)               3    0.467        0.156         2.4186     0.13339
Error ($\varepsilon$)           9    0.58         0.064
This design was balanced. Original data and R code for analysis are
in the supplementary online material (http://www.envs.emory.edu/
research/Chaves_SOM_Pseudoreplication.html).
*Statistically significant (P < 0.05).
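The likelihood ratio statistics in the analysis of deviance tables can be checked by hand, and the parametric bootstrap P value has a simple form once null statistics are available. The sketch below is a minimal Python illustration with two loud assumptions. First, the log likelihood column of Table 3 is read as negative values (the signs appear to have been lost), with the value in the interaction row serving as the reference model; this reading is inferred from the arithmetic, not stated in the table. Second, the null statistics are stand-in draws from a chi-square distribution with 1 df (squared standard normals), whereas the actual procedure refits the LMEM to each simulated dataset. All function names are hypothetical.

```python
import random


def lrt(loglik_larger, loglik_smaller):
    """Likelihood ratio statistic: twice the difference in log likelihood."""
    return 2.0 * (loglik_larger - loglik_smaller)


def bootstrap_p_value(observed, simulated):
    """Proportion of simulated null statistics at least as large as observed.

    The +1 terms count the observed dataset as one of the replicates.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


def standin_null_statistics(n_sim, seed=0):
    """Stand-in null draws (chi-square, 1 df); NOT refitted LMEM statistics."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) ** 2 for _ in range(n_sim)]


if __name__ == "__main__":
    # Assumed signs for the Table 3 log likelihoods (see lead-in).
    reference = -7.711
    lrt_water = lrt(reference, -10.45)
    lrt_nutrient = lrt(reference, -29.03)
    print(f"LRT water: {lrt_water:.3f}; LRT nutrient: {lrt_nutrient:.3f}")

    null_draws = standin_null_statistics(10000)
    print(f"stand-in P for water: {bootstrap_p_value(lrt_water, null_draws):.3f}")
```

Under the assumed signs, the computed statistics reproduce the LRT column of Table 3 for water and nutrient to within rounding. The stand-in P value is only indicative, because the bootstrap in the article simulates from the fitted model rather than from a reference distribution.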
Spatial Variability
Organisms can be clustered in space, for example,
the larvae of mosquitoes can be associated with only
certain habitats where eggs are oviposited, thus mak-
ing their abundance autocorrelated in space (Pitcairn
et al. 1994). Several statistical tools can accommodate
the lack of spatial independence in data from field
studies (Fortin and Dale 2005), and they have been
widely used with insects of medical importance
(Koenraadt et al. 2007, 2008; Vazquez-Prokopec et al.
2008). However, their description is outside the scope
of this article. LMEM can also be used to consider
spatial variability. LMEM are especially suitable for
cases when spatial scales are nested. For example,
mosquito larval samples coming from containers in
several houses that belong to the same neighborhood
are hierarchically nested. Several recent studies on
medically important insects have used this approach
(Harrington et al. 2008, Chaves et al. 2009,
Gürtler et al. 2009). The LMEM fitting procedure is
temporal independence in longitudinal studies.
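The cost of treating nested samples as independent can be gauged with the standard design effect for clustered samples of equal size, 1 + (m - 1)ρ, where m is the number of samples per cluster and ρ is the intraclass correlation. The sketch below is a minimal Python illustration; the formula is a general sampling result rather than a method from this article, and the numbers (10 houses, 8 containers per house, ρ = 0.3) are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent samples carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    houses, containers_per_house, icc = 10, 8, 0.3  # hypothetical values
    n_total = houses * containers_per_house
    n_eff = effective_sample_size(n_total, containers_per_house, icc)
    print(f"{n_total} containers behave like about {n_eff:.1f} independent samples")
```

In this hypothetical case, containers from the same house share enough variation that the nominal sample size overstates the available information roughly threefold, which is the redundancy a random factor for house would account for.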
Fig. 1. Blocks: fixed or random? The left panel shows the case for blocks as a fixed factor in LM, in which the assumption
is that inferences are exclusive for the observed blocks (within squares). The right panel shows the case for blocks as a random
factor in LMEMs, in which the observed blocks (within squares) come from a larger population (i.e., blocks inside and outside
squares). (Online figure in color.)
Table 2. Analysis of deviance for the effects of water quality
Fixed                           df   Log likelihood   LRT      P
Water ($\beta_1$)               1    7.241            5.874    0.009*
Nutrient ($\beta_2$)            1    31.089           53.569   0.000*
Water × nutrient ($\beta_3$)    1    4.304            8.612    0.664
Random                          Mean square (variance)
Blocks ($\gamma$)               0.024
Error ($\varepsilon$)           0.059
This design was balanced. Original data and R code for analysis are
in the supplementary online material (http://www.envs.emory.edu/
research/Chaves_SOM_Pseudoreplication.html). LRT, likelihood ra-
tio test.
P values were obtained with a parametric bootstrap.
Longitudinal Studies
The fact that observations are repeated through
time in the same place (or from the same organisms)
can lead to data that are not independent and are
autocorrelated in time. One approach to this problem
is to use repeated measurements analysis (Faraway
2006) and time series analysis techniques (Shumway
and Stoffer 2000, Chaves and Pascual 2007). Time
series techniques have been used for longitudinal
studies of some vectors (Hayes and Downs 1980,
Strickman 1988, Feliciangeli and Rabinovich 1998,
Scott et al. 2000, Salomon et al. 2004). Alternatively,
LMEM can model the lack of temporal independence
as a random factor, which is one of the many methods
for repeated measurements analysis (Faraway 2006).
Modeling the lack of temporal independence will be
illustrated by examining data from a study on the
biting and resting behavior of anophelines using ex-
perimental huts in three villages of western Venezuela
(Rubio-Palis and Curtis 1992). Mosquitoes were col-
lected during two nights per month by catching
the landing mosquitoes on the legs of two catchers
between 1900 and 0700 hours, inside and outside ex-
perimental huts. Although several species were found,
only data for Anopheles nuneztovari Gabaldón from
Guaquitas collected between August 1988 and Octo-
ber 1989 will be analyzed in this study (Fig. 4). In this
case, the response or dependent variable (y) is the
total number of landings for all huts, as presented in
the original study (Rubio-Palis and Curtis 1992). The
fixed factors are as follows: 1) $x_1$, the site with two
levels, inside and outside the hut; 2) $x_2$, the landing time
with 12 levels corresponding to the hours between 1900
and 0700 hours; and 3) $x_3$, the rainfall season with two
levels: dry (December–May) and wet (June–Novem-
ber). The random factors consider the different scales of
temporal variability: 1) $\gamma_l$, the year l; 2) $\delta_{kl}$, the month k
within a given year l; 3) $\eta_{jkl}$, the sampling day j within a
given month k and year l; and 4) $\varepsilon_{ijkl}$, which is the error
for observation i (i.e., an observation belonging to day j within a
given month k and year l). All random factors are as-
sumed to be independent and normally distributed. The
model equation is as follows:
$$y_{ijkl} = \mu + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_2 x_3 + \gamma_l + \delta_{kl} + \eta_{jkl} + \varepsilon_{ijkl} \qquad [3]$$
In this model, $\mu$ represents the mean value of all
observations; $\beta_1$, $\beta_2$, and $\beta_3$ quantify the impact of each
independent variable on the number of landings; and
$\beta_4$ the interaction of season and landing time. Note
that choice of factors is dictated by the study objective:
quantification of the seasonal nocturnal biting pattern
outside and inside the experimental huts. Such quanti-
fication requires landing time, site, and season to be
treated as fixed independent variables. The other tem-
poral variables need to be random factors accounting
for the lack of independence that arises from the re-
peated measurements through time.
Fig. 2. Boxplots (median and quartiles) for the natural
logarithm of the number of Cx. quinquefasciatus egg rafts + 1: (A) in
the balanced (Tables 1 and 2) and (B) unbalanced (Table 3)
block designs to study the effects of water quality and nutrient
enrichment on oviposition. Tap indicates tap water and CSO
indicates combined sewage overflow water. Y stands for nutri-
ent addition, and N for no additional nutrients. Data extracted
from Chaves et al. (2009). Original data are available in the
supplementary online material (http://www.envs.emory.edu/
research/Chaves_SOM_Pseudoreplication.html).
Fig. 3. Unbalanced design. Note that one block contains
all samples with added nutrients and combined sewage over-
flow water, and other treatments have unequal numbers
across other blocks. (Online figure in color.)
Table 3. Analysis of deviance for the effects of water quality
Fixed                           df   Log likelihood   LRT      P
Water ($\beta_1$)               1    10.45            5.477    0.012*
Nutrient ($\beta_2$)            1    29.03            42.638   0.000*
Water × nutrient ($\beta_3$)    1    7.711            0.0409   0.801
Random                          Mean square (variance)
Blocks ($\gamma$)               5.0405e-14
Error ($\varepsilon$)           0.125
This design was unbalanced (because of the sampling from the full
dataset). Original data and R code for analysis are in the supplemen-
tary online material (http://www.envs.emory.edu/research/Chaves_
SOM_Pseudoreplication.html). LRT, likelihood ratio test.
There was no variability because of the year of the observation ($\hat{\sigma}^2_{\gamma} \approx 0$; see
supplementary online material http://www.envs.emory.edu/research/Chaves_SOM_Pseudoreplication.html),
and a simpler model, without a parameter for the annual
variability, was fit, as follows:
$$y_{ijk} = \mu + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_2 x_3 + \delta_k + \eta_{jk} + \varepsilon_{ijk} \qquad [4]$$
Results for this model are presented in Table 4. The
interaction between season and time and the main
effects of site and season are statistically significant
(P < 0.05). Daily observations for each month were
more homogeneous, having a lower variance than
those observations across months. For comparison
purposes, a LM was also fit, as follows:
$$y_i = \mu + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_2 x_3 + \varepsilon_i \qquad [5]$$
Table 5 shows the results for the analysis with equa-
tion 5. All factors are significant with this model (P <
0.05). However, when compared with the model with
random effects, the main effect for landing time is not
significant when the lack of independence in the data
is properly modeled with a LMEM (Table 4). These
analyses illustrate one of the problems of incorrectly
modeling the lack of independence across observa-
tions: the LM rejects a null hypothesis that is true
(type I error) by saying that landing time by itself is
significant (Table 5), when in reality it is only signif-
icant when considered in conjunction with the season
(Table 4).
Fig. 4. Boxplots (median and quartiles) for the hourly number of Anopheles nuneztovari landings: (A) outside the house,
dry season; (B) inside the house, dry season; (C) outside the house, wet season; (D) inside the house, wet season. Data
extracted from Rubio-Palis and Curtis (1992). Original data are available in the supplementary online material (http://
www.envs.emory.edu/research/Chaves_SOM_Pseudoreplication.html).
Table 4. Analysis of deviance for the effects of site, landing
time, and season on Anopheles nuneztovari abundance in Guaqui-
tas, Venezuela (Rubio-Palis and Curtis 1992)
Factor                               df   Log likelihood   LRT     P
Site ($\beta_1$)                     1    3550             15.1    0.000*
Landing time ($\beta_2$)             11   3619             154.2   0.073
Season ($\beta_3$)                   1    3548             11.2    0.000*
Landing time × season ($\beta_4$)    11   3542             124.4   0.000*
Random                               Mean square (variance)
Month ($\delta$)                     751.46
Day ($\eta$)                         182.68
Error ($\varepsilon$)                1669.42
Original data and R code for analysis are in the supplementary
online material (http://www.envs.emory.edu/research/Chaves_
SOM_Pseudoreplication.html). LRT, likelihood ratio test.
Table 5. Analysis of variance for the effects of site, landing
time, and season on Anopheles nuneztovari abundance in Guaqui-
tas, Venezuela (Rubio-Palis and Curtis 1992)
Factor                               df    Sum square   Mean square   F value    Pr(>F)
Site ($\beta_1$)                     1     17,914       17,914        6.727      0.009704*
Landing time ($\beta_2$)             11    162,525      14,775        5.5483     1.55E-08*
Season ($\beta_3$)                   1     286,446      286,446       107.5666   <2.2e-16*
Landing time × season ($\beta_4$)    11    77,851       7,077         2.6577     0.002432*
Error ($\varepsilon$)                671   1,786,849    2,663
Original data and R code for analysis are in the supplementary
online material (http://www.envs.emory.edu/research/Chaves_
SOM_Pseudoreplication.html).
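The variance components in Table 4 can be turned into shares of the unexplained variance and into the correlation implied between observations that share a month or a sampling day. The sketch below is a minimal Python illustration of that arithmetic; the function names are hypothetical, the inputs are the three random-effect variances copied from Table 4, and the calculation assumes the components are independent and additive, as stated for the model.

```python
# Variance components copied from Table 4 (month, day, residual error).
TABLE4_COMPONENTS = {"month": 751.46, "day": 182.68, "error": 1669.42}


def variance_shares(components):
    """Fraction of the total unexplained variance due to each component."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


def implied_correlation(components, shared):
    """Correlation between two observations sharing the listed components."""
    total = sum(components.values())
    return sum(components[name] for name in shared) / total


if __name__ == "__main__":
    total = sum(TABLE4_COMPONENTS.values())
    print(f"total unexplained variance: {total:.2f}")
    for name, share in variance_shares(TABLE4_COMPONENTS).items():
        print(f"{name}: {share:.3f}")
    same_day = implied_correlation(TABLE4_COMPONENTS, ["month", "day"])
    same_month = implied_correlation(TABLE4_COMPONENTS, ["month"])
    print(f"same day: {same_day:.3f}; same month, different day: {same_month:.3f}")
```

The three components sum to a value close to the single error mean square of the LM in Table 5, which suggests that the LM pools month, day, and residual variability into one error term. Observations from the same month or day are therefore positively correlated, which is the lack of independence that the random factors in equation 4 are meant to capture.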
Pseudoreplication: an Issue of the Past
As shown in this forum, pseudoreplication no longer
is an issue preventing the statistical analysis of exper-
iments and field studies. Current statistical tools such
as LMEM can model the lack of independence in field
observations. However, pseudoreplication will most
likely always be present in any ecological study, be-
cause of the complexity of working with living organ-
isms that constrains full randomization or limits the
number of replicates. Although other objects of study,
like molecules or atoms, are numerous and wide-
spread, samples of living organisms are comparatively
few and organisms always are evolutionarily and eco-
logically related at some scale. Although this forum has
been focused on demystifying statistical concepts and
presents how to use LMEMs to address the lack
of independence in datasets, the ingenuity of statisti-
cians is laudable because many other techniques out-
side the scope of this article have been developed over
recent years. A prime example is the extension of
LMEM to accommodate non-normal observations in
generalized LMEMs (Bolker et al. 2009). Other tools
that do not consider the individual variability of ob-
servations, but rather the average across all samples,
like the generalized estimating equations (Faraway
2006), can address the lack of independence in ob-
servations, and have been used in the study of med-
ically important insects (Lindblade et al. 2000, Gure-
vitz et al. 2009). A third line of new computer-based
tools, including neural networks, trees (Olden et al.
2008), and random forests (Ruiz et al. 2010), does not
have assumptions on data independence, and has been
successfully used to study insects of public health
importance (Hu et al. 2006, Ruiz et al. 2010). Thus,
pseudoreplication should no longer be considered as
a major flaw that impairs the statistical analysis of
experiments and field studies. Independence con-
straints in the manipulation and observation of organ-
isms are adequately handled by many available statis-
tical tools, thus enabling valid inferences from
valuable entomological data.
Acknowledgments
I am thankful to Yasmin Rubio-Palis for sharing her orig-
inal data on Anopheles nuneztovari from Guaquitas, Venezu-
ela. This work was funded by a Gorgas Research Award from
the American Society of Tropical Medicine and Hygiene and
Emory University. This work also benefited from comments
by the editor, anonymous reviewers, Jorge Rabinovich,
Nicole Gottdenker, and Greg Decker, and helpful discussions
from a National Institutes of Health-Research and Policy on
Infectious Disease Dynamics (NIH-RAPIDD) study group
on mosquito-borne diseases.
References Cited
Alto, B. W., and S. A. Juliano. 2001a. Precipitation and tem-
perature effects on populations of Aedes albopictus
(Diptera: Culicidae): implications for range expansion.
J. Med. Entomol. 38: 646656.
Alto, B. W., andS. A. Juliano. 2001b. Temperatureeffects on
the dynamics of Aedes albopictus (Diptera: Culicidae)
populations in the laboratory. J. Med. Entomol. 38: 548
556.
Blumberg, A.J.Y., P. F. Hendrix, and D. A. Crossley. 1997.
Effects of nitrogen source on arthropod biomass in no-tillage
and conventional tillage grain sorghum agroecosystems.
Environ. Entomol. 26: 31–37.
Bolker, B. M., M. E. Brooks, C. J. Clark, S. W. Geange, J. R.
Poulsen, M. H. Stevens, and J. S. White. 2009. General-
ized linear mixed models: a practical guide for ecology
and evolution. Trends Ecol. Evol. 24: 127–135.
Box, J. F. 1980. R. A. Fisher and the design of experiments,
1922–1926. Am. Stat. 34: 1–7.
Carpenter, S. R. 1982. Stemflow chemistry: effects on population
dynamics of detritivorous mosquitoes in tree-hole
ecosystems. Oecologia 53: 1–6.
Chaves, L. F., and M. Pascual. 2007. Comparing models for
early warning systems of neglected tropical diseases.
PLoS Negl. Trop. Dis. 1: e33.
Chaves, L. F., C. L. Keogh, G. M. Vazquez-Prokopec, and
U. D. Kitron. 2009. Combined sewage overflow enhances
oviposition of Culex quinquefasciatus (Diptera:
Culicidae) in urban areas. J. Med. Entomol. 46: 220–226.
Chesson, J. 1984. Effect of notonectids (Hemiptera, Notonectidae)
on mosquitos (Diptera, Culicidae): predation
or selective oviposition. Environ. Entomol. 13: 531–538.
Conner, J. K., and D. L. Hartl. 2004. A primer of ecological
genetics. Sinauer, Sunderland, MA.
Faraway, J. J. 2006. Extending the linear model with R: gen-
eralized linear, mixed effects and nonparametric regres-
sion models. CRC, Boca Raton, FL.
Feliciangeli, M. D., and J. Rabinovich. 1998. Abundance of
Lutzomyia ovallesi but not Lu. gomezi (Diptera: Psychodidae)
correlated with cutaneous leishmaniasis incidence
in north-central Venezuela. Med. Vet. Entomol. 12: 121–131.
Fisher, R. A. 1935. The design of experiments. Oliver &
Boyd, Edinburgh, United Kingdom.
Fortin, M. J., and M.R.T. Dale. 2005. Spatial analysis: a guide
for ecologists. Cambridge University Press, Cambridge,
United Kingdom.
Gurevitz, J. M., U. Kitron, and R. E. Gurtler. 2009. Temporal
dynamics of flight muscle development in Triatoma infestans
(Hemiptera: Reduviidae). J. Med. Entomol. 46:
1021–1024.
Gurtler, R. E., F. M. Garelli, and H. D. Coto. 2009. Effects
of a five-year citywide intervention program to control
Aedes aegypti and prevent dengue outbreaks in northern
Argentina. PLoS Negl. Trop. Dis. 3: e427.
Haile, F. J., D. L. Kerns, J. M. Richardson, and L. G. Higley.
2000. Impact of insecticides and surfactant on lettuce
physiology and yield. J. Econ. Entomol. 93: 788–794.
Harrington, L. C., A. Ponlawat, J. D. Edman, T. W. Scott, and
F. Vermeylen. 2008. Influence of container size, location,
and time of day on oviposition patterns of the dengue
vector, Aedes aegypti, in Thailand. Vector Borne Zoonotic
Dis. 8: 415–423.
Hayes, J., and T. D. Downs. 1980. Seasonal changes in an
isolated population of Culex pipiens quinquefasciatus
(Diptera: Culicidae): a time series analysis. J. Med. Entomol.
17: 63–69.
Heffner, R. A., M. J. Butler, and C. K. Reilly. 1996. Pseudoreplication
revisited. Ecology 77: 2558–2562.
Hu, W., S. Tong, K. Mengersen, B. Oldenburg, and P. Dale.
2006. Mosquito species (Diptera: Culicidae) and the
transmission of Ross River virus in Brisbane, Australia.
J. Med. Entomol. 43: 375–381.
Hurlbert, S. H. 1984. Pseudoreplication and the design of
ecological field experiments. Ecol. Monogr. 54: 187–211.
Kirby, M., P. Milligan, D. Conway, and S. Lindsay. 2008.
Study protocol for a three-armed randomized controlled
trial to assess whether house screening can reduce exposure
to malaria vectors and reduce malaria transmission
in The Gambia. Trials 9: 33.
Kirby, M. J., D. Ameh, C. Bottomley, C. Green, M. Jawara,
P. J. Milligan, P. C. Snell, D. J. Conway, and S. W. Lindsay.
2009. Effect of two different house screening interven-
tions on exposure to malaria vectors and on anemia in
children in The Gambia: a randomized controlled trial.
Lancet 374: 998–1009.
Koenraadt, C.J.M., J. Aldstadt, U. Kijchalao, A. Kengluecha,
J. W. Jones, and T. W. Scott. 2007. Spatial and temporal
patterns in the recovery of Aedes aegypti (Diptera: Cu-
licidae) populations after insecticide treatment. J. Med.
Entomol. 44: 65–71.
Koenraadt, C.J.M., J. Aldstadt, U. Kijchalao, R. Sithiprasasna,
A. Getis, J. W. Jones, and T. W. Scott. 2008. Spatial and
temporal patterns in pupal and adult production of the
dengue vector Aedes aegypti in Kamphaeng Phet, Thailand.
Am. J. Trop. Med. Hyg. 79: 230–238.
Lindblade, K. A., E. D. Walker, A. W. Onapa, J. Katungu, and
M. L. Wilson. 2000. Land use change alters malaria
transmission parameters by modifying temperature in a
highland area of Uganda. Trop. Med. Int. Health 5: 263–274.
Millar, R. B., and M. J. Anderson. 2004. Remedies for pseudoreplication.
Fisheries Res. 70: 397–407.
Montgomery, D. C. 2005. Design and analysis of experi-
ments. Wiley, New York, NY.
Oksanen, L. 2001. Logic of experiments in ecology: is pseudoreplication
a pseudoissue? Oikos 94: 27–38.
Olden, J. D., J. J. Lawler, and N. LeRoy-Poff. 2008. Machine
learning methods without tears: a primer for ecologists. Q.
Rev. Biol. 83: 171–193.
Oyediran, I. O., M. L. Higdon, T. L. Clark, and B. E. Hibbard.
2007. Interactions of alternate hosts, postemergence
grass control, and rootworm-resistant transgenic corn on
western corn rootworm (Coleoptera: Chrysomelidae)
damage and adult emergence. J. Econ. Entomol. 100:
557–565.
Pinheiro, J. C., and D. M. Bates. 2000. Mixed effects models
in S and S-plus. Springer, New York, NY.
Pitcairn, M. J., L. T. Wilson, R. K. Washino, and E. Rejman-
kova. 1994. Spatial patterns of Anopheles freeborni and
Culex tarsalis (Diptera: Culicidae) larvae in California
rice fields. J. Med. Entomol. 31: 545–553.
Reiskind, M. H., and M. L. Wilson. 2004. Culex restuans
(Diptera: Culicidae) oviposition behavior determined by
larval habitat quality and quantity in southeastern Michigan.
J. Med. Entomol. 41: 179–186.
Rubio-Palis, Y., and C. F. Curtis. 1992. Biting and resting
behavior of anophelines in western Venezuela and im-
plications for control of malaria transmission. Med. Vet.
Entomol. 6: 325–334.
Ruiz, M. O., L. F. Chaves, G. L. Hamer, T. Sun, W. M. Brown,
E. D. Walker, L. Haramis, T. L. Goldberg, and U. D.
Kitron. 2010. Local impact of temperature and precipi-
tation on West Nile virus in Culex species mosquitoes in
northeast Illinois, U.S.A. Parasit. Vectors 3: 19.
Salomon, O. D., M. L. Wilson, L. E. Munstermann, and B. L.
Travi. 2004. Spatial and temporal patterns of phlebotomine
sand flies (Diptera: Psychodidae) in a cutaneous
leishmaniasis focus in northern Argentina. J. Med. Entomol.
41: 33–39.
Scott, T. W., A. C. Morrison, L. H. Lorenz, G. G. Clark, D.
Strickman, P. Kittayapong, H. Zhou, and J. D. Edman.
2000. Longitudinal studies of Aedes aegypti (Diptera: Culicidae)
in Thailand and Puerto Rico: population dynamics.
J. Med. Entomol. 37: 77–88.
Shumway, R. H., and D. S. Stoffer. 2000. Time series analysis
and its applications. Springer, New York, NY.
Strickman, D. 1988. Rate of oviposition by Culex quinque-
fasciatus in San Antonio, Texas, during three years. J. Am.
Mosq. Control Assoc. 4: 339–344.
Vazquez-Prokopec, G. M., M. C. Cecere, U. Kitron, and R. E.
Gurtler. 2008. Environmental and demographic factors
determining the spatial distribution of Triatoma guasay-
ana in peridomestic and semi-sylvatic habitats of rural
northwestern Argentina. Med. Vet. Entomol. 22: 273–282.
Received 12 October 2009; accepted 20 January 2010.

