
Null Model Analyses of Presence-Absence Data in Ecology:
Combining Generalized Linear Models
and Monte Carlo Testing for the
Detection of Non-Random Patterns
Applied Statistics 2007 International Conference
Ribno, Slovenia
Jorge Navarro-Alberto
UADY, Yucatán, México

Bryan F. J. Manly
WEST, Inc. Wyoming, USA
Outline
Why do ecologists use null models?
Statistical approaches for null model analyses
of presence-absence data (e.g. species
occurrences on locations)
Generalized linear models as a tool
Testing non-random patterns via Monte Carlo
Properties of the combined GLM-Monte Carlo
method
Comparison to other approaches
Conclusions
From an ecological point of view, null models (NMs) are useful for
pattern detection.
How? Compare a realization of an observed
ecological process with an associated model that
aims to eliminate the effect of that particular
ecological process: the null model (Harvey et al.
1983).
Tools for analysis of data generated by non-
experimental procedures, typical of community and
biogeographical studies, in an experimental setting.
The usual terminology and interpretation of
hypothesis testing of experimental data, like Type I
and Type II errors, are applicable.
To compare observed data with models that assume
random patterns, ecologists resort to so-called
null models.
However, the use of null models has been
surrounded by controversy.

Presence and absence of species in systems
of patches provide basic information in
ecology and biogeography. For the purpose
of comparing the observed data with
assumed random patterns, ecologists and
biogeographers have resorted to using null
models via simulation. However, recognition
of patterns is challenging: there may be
different plausible null hypotheses
associated with different randomization
protocols.
Null model analyses of presence/absence
data in ecology (e.g., occurrences of species
at particular locations) can be classified
into two broad categories: those where the
simulation protocols keep row (species) and
column (location) totals fixed in the null
matrices, and those where row and/or column
totals are allowed to vary. In contrast to the
research devoted to the first type of null
models, relatively little research has been
done to study the properties of the latter.
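As an illustration of the first category, the following is a minimal sketch of a swap-type randomization that keeps row and column totals fixed. It is not necessarily one of the algorithms compared in this talk; the function name `swap_randomize` and the example matrix are hypothetical.

```python
import random

def swap_randomize(matrix, n_attempts, rng):
    """Randomize a 0/1 matrix by checkerboard swaps (row/column totals fixed).

    Hypothetical helper: each attempt picks two rows and two columns and,
    if the 2x2 submatrix is a checkerboard ([[1,0],[0,1]] or [[0,1],[1,0]]),
    flips it. Every flip leaves all row and column totals unchanged.
    """
    m = [row[:] for row in matrix]          # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_attempts):
        i, k = rng.sample(range(n_rows), 2)
        j, l = rng.sample(range(n_cols), 2)
        is_checkerboard = (
            m[i][j] == m[k][l] and m[i][l] == m[k][j] and m[i][j] != m[i][l]
        )
        if is_checkerboard:
            m[i][j], m[i][l] = m[i][l], m[i][j]
            m[k][j], m[k][l] = m[k][l], m[k][j]
    return m

# Hypothetical species-by-location matrix (rows = species, columns = locations).
observed = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
null_matrix = swap_randomize(observed, n_attempts=1000, rng=random.Random(1))
```

In the second category, by contrast, each simulated matrix is free to have totals that differ from the observed ones.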
We describe a strategy for null model
construction by means of generalized linear
models for presence-absence data.
Assumptions for the generalized linear
models (GLMs) are that (1) occurrences are
independent of each other; (2) species and
location effects are the only explanatory
variables for each observation in the matrix;
and (3) the relationship between the
occurrence of each species-location
combination and the species and location
effects is non-linear.
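A minimal sketch of such a model, assuming a logit link (the slides only state that the relationship is non-linear, so the link is an assumption here). It fits logit(p_ij) = a_i + b_j by plain gradient ascent on the Bernoulli log-likelihood; the function names and the example matrix are hypothetical, and a real analysis would normally use a GLM routine from a statistics package.

```python
import math

def cell_probs(a, b):
    """Inverse logit of species effect a[i] plus location effect b[j]."""
    return [[1.0 / (1.0 + math.exp(-(ai + bj))) for bj in b] for ai in a]

def fit_logit_null_model(matrix, n_iter=5000, step=0.5):
    """Fit logit(p_ij) = a_i + b_j to a 0/1 matrix by gradient ascent.

    Assumes the matrix is not degenerate (finite maximum-likelihood
    estimates exist); otherwise the effects drift towards infinity.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    a = [0.0] * n_rows   # species (row) effects
    b = [0.0] * n_cols   # location (column) effects
    for _ in range(n_iter):
        p = cell_probs(a, b)
        # The score for each effect is observed minus expected total.
        for i in range(n_rows):
            a[i] += step * sum(matrix[i][j] - p[i][j] for j in range(n_cols)) / n_cols
        for j in range(n_cols):
            b[j] += step * sum(matrix[i][j] - p[i][j] for i in range(n_rows)) / n_rows
    return cell_probs(a, b)

# Hypothetical species-by-location matrix.
observed = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
fitted = fit_logit_null_model(observed)
```

With the logit link, the fitted probabilities in each row and column sum to the observed totals, so the totals are matched in expectation rather than held fixed in every simulated matrix.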
For model definition, observable presence-absence
data are related to unobservable
hypothetical distributions of the number of
elements (called the "quasiabundance") of
each species-location combination; these
distributions are interpreted as different
scenarios of species occurrences, from which
the best-fitting model is selected among a
range of competing null models.
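As a worked illustration, suppose (this particular distribution is an assumption, not taken from the slides) that the quasiabundance N_ij of species i at location j is Poisson with a log-linear mean. A presence is observed whenever N_ij is at least 1, which leads to a complementary log-log link:

```latex
\begin{align*}
N_{ij} &\sim \mathrm{Poisson}(\mu_{ij}), \qquad \log \mu_{ij} = \alpha_i + \beta_j, \\
p_{ij} &= \Pr(N_{ij} \ge 1) = 1 - e^{-\mu_{ij}}, \\
\log\bigl(-\log(1 - p_{ij})\bigr) &= \alpha_i + \beta_j .
\end{align*}
```

Here \alpha_i and \beta_j denote the species and location effects. Other hypothetical quasiabundance distributions would imply other link functions, which is what makes them competing null models.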
The method produces fitted cell probabilities,
which are subsequently used for the
detection of non-random patterns in the
observed matrices, via parametric bootstrap.
As a consequence, the simulation protocol
allows both row and column totals to vary
from one simulation to the other.
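The simulation step can be sketched as follows, assuming the fitted cell probabilities are already available. Each cell is drawn as an independent Bernoulli variable; the probabilities below are made-up values for illustration, and `simulate_null_matrix` is a hypothetical name.

```python
import random

def simulate_null_matrix(probs, rng):
    """Draw one 0/1 matrix with independent Bernoulli(p_ij) cells."""
    return [[1 if rng.random() < p else 0 for p in row] for row in probs]

# Made-up fitted probabilities (rows = species, columns = locations).
fitted = [
    [0.8, 0.6, 0.3],
    [0.5, 0.5, 0.5],
    [0.2, 0.4, 0.7],
]
rng = random.Random(2007)
simulated = [simulate_null_matrix(fitted, rng) for _ in range(200)]

# Collect the distinct vectors of row totals seen across simulations.
row_total_patterns = {tuple(sum(row) for row in m) for m in simulated}
print(len(row_total_patterns), "distinct row-total vectors in 200 simulations")
```

Because the cells are drawn independently, the row and column totals differ from one simulated matrix to the next.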
Monte Carlo tests applied to suitable metrics
for the observed and simulated matrices are
then used to evaluate the adequacy of
species and location effects for the prediction
of each species-location combination.
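A minimal sketch of the Monte Carlo test, assuming an upper-tail test and using the C-score (mean number of checkerboard units per species pair) as one possible metric. The slides do not say which metrics were used, so both the metric and the function names are illustrative choices.

```python
from itertools import combinations

def c_score(matrix):
    """Mean checkerboard units over all pairs of species (rows)."""
    pairs = list(combinations(matrix, 2))
    total = 0
    for row_a, row_b in pairs:
        shared = sum(x and y for x, y in zip(row_a, row_b))  # co-occurrences
        total += (sum(row_a) - shared) * (sum(row_b) - shared)
    return total / len(pairs)

def monte_carlo_p_value(observed_stat, simulated_stats):
    """Upper-tail Monte Carlo p-value; the observed value counts as one draw."""
    n_extreme = sum(1 for s in simulated_stats if s >= observed_stat)
    return (1 + n_extreme) / (1 + len(simulated_stats))
```

In use, `c_score` would be evaluated on the observed matrix and on each simulated null matrix, and the resulting values passed to `monte_carlo_p_value`. A small p-value suggests that species and location effects alone do not account for the observed pattern.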
Properties of the observed data matrices (e.g.,
sparseness and degeneracy) and constraints in the
simulation protocols are also evaluated.
Finally, using as a statistic the estimated proportion of
allocated presences in each cell of randomly
generated matrices, it is shown that the set of null
matrices in the GLM approach can be different from
the set of null matrices obtained with three algorithms
keeping row and column totals fixed.
It is also confirmed that there may be differences in
the null universe of matrices produced by different
versions of this latter simulation protocol.