Richards 2005

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 43, NO. 3, MARCH 2005
Analysis of Remotely Sensed Data: The Formative Decades and the Future
John A. Richards, Fellow, IEEE
Abstract: Developments in the field of image understanding
in remote sensing over the past four decades are reviewed, with
an emphasis, initially, on the contributions of David Landgrebe
and his colleagues at the Laboratory for Applications of Remote
Sensing, Purdue University. The differences in approach required
for multispectral, hyperspectral and radar image data are emphasised, culminating with a commentary on methods commonly
adopted for multisource image analysis. The treatment concludes
by examining the requirements of an operational multisource
thematic mapping process, in which it is suggested that the most
practical approach is to analyze each data type separately, by
techniques optimized to that data's characteristics, and then to
fuse at the label level.
Index Terms: Fusion, multisensor, multisource, thematic
mapping.
I. INTRODUCTION
David Landgrebe and his coworkers from the Laboratory
for Applications of Remote Sensing (LARS) at Purdue
University were seminal in devising many of the procedures
that are now commonplace in thematic mapping from remotely
sensed image data. Because of early data limitations, many of
the analytical procedures developed had to appeal to simple data
models and techniques taken directly from signal processing
(such as maximum-likelihood estimation). Nevertheless, many
of those methods have endured and set the benchmarks against
which newer developments are often assessed.
In many ways, the field had its genesis in the signal processing
methods of the 1950s and 1960s and their extension to handling
image data. It received special impetus with interest in lunar and
planetary mapping missions, which drove not only the development of imaging technologies, but also many of the image processing and enhancement procedures that sit alongside methods
for thematic mapping. A good account of this can be seen in the
first edition of Castleman's book [1].
We are now at the stage where we have available a range
of successful and widely used analytical procedures, some of
which are tailored to particular data types. We also have an abundance of data, quite contrary to the situation at the start of the
spaceborne era, so that the analyst is now challenged to choose
from among a number of coincident datasets when undertaking
thematic mapping in an operational setting.
Manuscript received February 15, 2004; revised July 26, 2004.
The author is with the Research School of Information Sciences and Engineering, The Australian National University, Canberra ACT 0200, Australia
(e-mail: John.Richards@anu.edu.au).
Digital Object Identifier 10.1109/TGRS.2004.837326
But perhaps the greatest contemporary challenge is to derive
practical labeling methodologies for effective thematic mapping
from mixed data types to allow the inherent information contents of complementary datasets to be brought together. Indeed,
much of the current research agenda is focused on finding viable
fusion methods. It is difficult, though, to envisage operational
image understanding with a continuation of methods that seek to
fuse data. At minimum, the fact that each given data type is usually best handled by its own (matched) analytical methods belies data fusion as an operational method for multisource image
analysis. Instead, individual data types may be best analyzed
separately, with combination occurring at the label level through
some form of symbolic processing.
After reviewing the significant developments in thematic
mapping, concentrating particularly on the work of Landgrebe
and colleagues, the problem of operational multisource classification is considered from the perspective of label-level fusion.
II. PROBLEM DOMAIN
The essential problem in thematic mapping is to specify the
data or information that needs to be gathered about a pixel
to allow a label to be attached to the pixel consistent with an
application of interest. In a modern operational setting, the
application requirements would be specified by a client, and
a consultant would have the task of choosing the datasets and
analytical methods for generating an acceptable thematic map.
That is the ideal. In reality, and certainly in the early years
of remote sensing, the problem statement is more like: given
the data or information actually available, how effectively can
a pixel be labeled into a class of interest?
The process is a mapping from available data (and sometimes
other information) to a label. We express this mapping as
$$\mathbf{x} \rightarrow \omega$$
where
$$\mathbf{x} = [x_1, x_2, \ldots, x_N]^T$$
in which $\mathbf{x}$ is the data description of the pixel, and $\omega$ is the
associated class of interest. As indicated, $\mathbf{x}$ generally consists
of a (column) pixel vector of the measurements.
We need to find procedures that allow us to extract meaningful information ($\omega$) from the data ($\mathbf{x}$) and then move to an
understanding of the scene being imaged. In other words, we
need to move along the chain
data → information → understanding
All relevant, coregistered spatial measurements can contribute
to the pixel data description and thus aid in scene understanding.
A more complete data description of a pixel would then be
$$\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_S\}$$
where the $\mathbf{x}_s$ represent the different data types. Recognition of
this more complete pixel descriptor was the forerunner of data
fusion methods, and paralleled also the evolution of geographic
information systems (GIS).
Fig. 1. Deriving the most appropriate label for a pixel based on spatial,
contextual, and prior sources of data or information.
It is not just pixel-specific measurements that tell us something useful about what the pixel represents; its context in relation to other pixels is also significant. That context could be
associations among near neighboring pixels (spatial context) or
could be represented by the texture of the region in which the
pixel resides. So, the data description more generally could be
expressed
$$\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_S, \mathbf{c}\}$$
where $\mathbf{c}$ is a representation of context.
Further, we may have some preexisting information available
about the pixel, from the knowledge of some expert or from the
application domain, so that the pixel description most generally
is now a data and information description of the form
$$\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_S, \mathbf{c}, \mathcal{I}\} \qquad (1)$$
where $\mathcal{I}$ represents sources of available ancillary information.
Notwithstanding the vast range of procedures, over the past
three to four decades research has focused on devising mapping techniques for labeling pixels based on differing degrees
of complexity in the data/information description of (1), and as
depicted in Fig. 1.
III. DRIVERS
Early algorithm development was based on multispectral
visible and infrared (optical) data with about 12 to 18 channels.¹ With so few samples of reflectance spectra it was not
possible to understand the corresponding ground cover type
for a pixel by relying on scientific (spectroscopic) knowledge.
Nevertheless, the available samples were sufficient to differentiate fundamental cover types and to study vegetation type and
condition. Building on earlier work with learning machines
[2] and statistical pattern recognition [3], the logical analytical
method to separate apparent classes in the recorded data was
to use discriminant analysis of one form or another.
¹Even though there had been aircraft radar missions since the early 1960s,
the single data channel available (even with the spaceborne missions of the
late 1970s and early 1980s) meant that automated interpretation techniques
were not relevant (as against image processing and enhancement tools to
make the data more interpretable by a human analyst).
What has happened since? Three broad trends in sensor and
information system developments have driven the evolution of
machine assisted interpretation algorithms. They are:
• improvement in spectral measurements in the optical domain, both in numbers of channels and spectral resolution;
• availability of multidimensional (wavelength, polarization, and incidence angle) radar data;
• ready availability of geolocated spatial data types in geographical information systems.
The many and varied methods developed by different research
groups have been in response largely to one of these three evolutions. We now highlight those developments concentrating,
where appropriate, on the major contributions that grew out
of the Laboratory for Applications of Remote Sensing (LARS)
under the leadership of David Landgrebe.²
IV. MULTISPECTRAL AND HYPERSPECTRAL DATA METHODS
A. Fundamentals
Although not readily available,³ a LARS technical report
in 1966 [5] by Landgrebe first explored the use of pattern
recognition procedures for analyzing remotely sensed data.
Several related reports followed in 1966 and 1967, leading
to the first general publication of the signal-processing-based
pattern recognition approach to pixel labeling in 1969 [6].
Gaussian maximum-likelihood classification was advanced as
the most reasonable way to undertake pixel labeling with the
small numbers of spectral measurements per pixel then available,
thus introducing the analytical process that has become the
standard thematic mapping procedure in remote sensing ever
since. Essentially, it seeks to find the class $\omega_i$ for a pixel that
maximizes the posterior probability $p(\omega_i \mid \mathbf{x})$. By using Bayes'
rule, that is equivalent to maximizing
$$p(\mathbf{x} \mid \omega_i)\, p(\omega_i) \qquad (2)$$
in which $p(\omega_i)$ is the prior probability of class membership of
the pixel.
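As an illustrative sketch only (it is not taken from the LARS reports), the Gaussian maximum-likelihood rule can be written in a few lines of Python with NumPy. The class-conditional densities are assumed to be multivariate normal; the function names, variable names, and the synthetic two-class data are hypothetical. The discriminant is the logarithm of the product of class-conditional density and prior, with the class-independent constant dropped.

```python
import numpy as np

def train_gaussian_ml(X, y):
    """Estimate mean, covariance and prior for each class from labelled pixel vectors."""
    stats = {}
    for c in np.unique(y):
        Xc = X[y == c]
        stats[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return stats

def classify_gaussian_ml(X, stats):
    """Label each row of X with the class maximising ln p(x|w_i) + ln p(w_i)."""
    classes = list(stats)
    scores = []
    for c in classes:
        mean, cov, prior = stats[c]
        diff = X - mean
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        scores.append(np.log(prior) - 0.5 * (logdet + maha))
    return np.array(classes)[np.argmax(scores, axis=0)]

# Synthetic training data (an assumption for illustration): two well-separated classes.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal([0, 0], 1.0, (200, 2)),
                     rng.normal([6, 6], 1.0, (200, 2))])
y_train = np.array([0] * 200 + [1] * 200)

stats = train_gaussian_ml(X_train, y_train)
labels = classify_gaussian_ml(np.array([[0.2, -0.1], [5.8, 6.3]]), stats)
print(labels)
```

Working with log discriminants avoids numerical underflow when the number of spectral channels grows, which is one reason the rule is usually implemented this way.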
Many of the other processes now regarded as standard components in pixel labeling methodologies were also devised at
LARS and applied to aircraft multispectral data over the four
years 1968–1971. These include clustering for unsupervised
classification [7], the use of feature selection [8], [9] and the use
of the minimum-distance algorithm for supervised classification [10]. Commonly used separability measures for feature
reduction were also investigated and employed around that
same period [11], [12].
Thus, effectively before the launch of Landsat-1, LARS had
developed the suite of techniques regularly applied to multispectral data from the spaceborne era. Based on this experience, the
first book dealing with quantitative techniques for analyzing remotely sensed data was produced from LARS in 1978 [13].
²Landgrebe was not the inaugural Director of LARS. The Laboratory
commenced in 1966 under Ralph Shay, then Head of Purdue University's
Department of Botany and Plant Pathology, who in 1961 had chaired a National
Academy of Science committee on "Aerial survey methods in agriculture" [4].
Landgrebe took over the Directorship in 1969 and continued in that role until
1981.
³Many of the LARS technical reports will be found in downloadable
form at http://www.lars.purdue.edu/home/References.html.
A 1975 report by Fleming et al. [14] demonstrated how clustering can be used to resolve data multimodality in preparation
for the use of Gaussian maximum-likelihood supervised classification.⁴ Fleming's hybrid methodology has remained one
of the analytical mainstays for thematic mapping from optical
image data. More recently, it has been seen that the method can
be generalized, rendering it amenable to hyperspectral datasets
[15].
During the same period, there were a number of key developments occurring outside LARS that have had a continuing influence on remote sensing image analysis. Haralick et al. [16]
in 1973 provided measures of image texture that are still in use
today [17]. In 1976, Hord and Brooner [18] looked at measures
of thematic map accuracy and thus commenced a significant period of continuing research concerned with sampling methods
for accuracy assessment [19], [20].
So, by the end of the first decade after the formation of LARS,
all the foundations for digital image analysis in remote sensing
had been established. It was the principal decade upon which
later work was to elaborate and expand.
In the latter part of the 1970s, there were three early, important developments to do with image transformation. In 1976, Eppler used the class-sensitive transform associated with canonical
analysis to effect feature reduction [21], while Jensen and Waltz
[22], in a celebrated short paper in 1979, applied the principal
components transformation for feature reduction and display
purposes. Soha and Schwartz [23] showed how the transformation could lead to a decorrelated image display procedure. Kauth
and Thomas in 1976 [24] demonstrated that application-specific
transformations could be devised, and proposed their tasseled
cap model. In 1979, it was shown that the principal components
transformation could be used to monitor changes in land cover
between images of the same region taken on different dates [25].
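The principal components transformation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited paper: the function name is hypothetical and the two correlated "bands" are synthetic. The transformation rotates the data onto the eigenvectors of the band covariance matrix, so that the transformed components are decorrelated and ordered by variance.

```python
import numpy as np

def principal_components(X):
    """Rotate pixel vectors (rows of X) onto the eigenvectors of their covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return (X - mean) @ eigvecs, eigvals

# Two strongly correlated synthetic bands (assumed data, for illustration only).
rng = np.random.default_rng(1)
band1 = rng.normal(0.0, 1.0, 500)
X = np.column_stack([band1, 0.9 * band1 + rng.normal(0.0, 0.1, 500)])

Y, variances = principal_components(X)
print(variances / variances.sum())   # fraction of total variance per component
```

For feature reduction only the leading components would be retained; for display, the first three are commonly mapped to the color primaries.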
B. Classification in Context
The next significant era in remote sensing thematic mapping
commenced with the realization that classification should be
carried out sensitive to the spatial neighborhood of a pixel and
by incorporating information from other available spatial data
sources.
The significance of spatial correlation among pixels at satellite spatial resolutions was recognized early on by LARS staff
[26]–[28] leading to the well-known ECHO classifier described
in detail in 1976 [29]. ECHO, which is still used 25 years after
its introduction,⁵ is based on region growing to find homogeneous sets of pixels that can be characterized by group means
and covariances. All the pixels of a given group are labeled in
one step by assessing group similarity to each of the training
classes. Pixels that do not naturally occur in groups are handled individually by regular point classification methods (such
as maximum likelihood).
⁴Mixture models have also been employed to resolve multimodality; see
Section IV-F.
⁵Available in the MultiSpec software package.
Hoffer and colleagues at LARS were also among the first to
experiment with the incorporation of data, other than spectral,
into a multispectral classification [30], [31], while Swain, who
received his Purdue Ph.D. in 1970 when working at LARS, pioneered statistical approaches to the incorporation of context
[32].
Building on the work of Rosenfeld and colleagues, from the
then Computer Vision Laboratory at the University of Maryland, for developing consistency in scene labeling by the use of
label relaxation [33], Richards, Landgrebe, and Swain showed
that relaxation methods could be used to incorporate both spatial context and the effect of ancillary (topographic) data into a
classification [34]–[36].
The problem of context has been handled more recently using
Markov (or Gibbs) Random Fields. One of the earliest treatments was that of Jeon and Landgrebe [37]. Solberg et al. [38]
have employed Markov random field (MRF) models to incorporate spatial and temporal context. Jung and Swain [39] have
shown how the use of better estimates for class statistics, along
with MRF for including spatial context, can lead to improved
classification accuracy. Sarkar et al. [40] have devised an unsupervised procedure based on MRF.
Interestingly, the MRF approach essentially uses the prior
probability term in the posterior probability of (2) to embed
the effect of context via the Gibbs distribution energy term. Although not an MRF approach, Strahler [41], as early as 1980,
also used the prior probability term to incorporate contextual
information into a classification.
C. The Hyperspectral Challenge
Although they had always been aware of the so-called Hughes
phenomenon, it was in the 1980s that Landgrebe and colleagues
took a greater interest in problems to do with training maximum-likelihood classifiers with limited training samples [42]–[45].
Initially their focus was on feature selection, but with the advent
of hyperspectral datasets, they turned their attention to the difficulty of obtaining reliable estimates of class covariance matrices
in the face of limited training data [46]. The approach has been
to use an approximation to the class sample covariance matrix
along the general line
$$\hat{\Sigma}_i = (1 - \alpha)\,\Sigma_A + \alpha\,\Sigma_B, \qquad 0 \le \alpha \le 1$$
where $\Sigma_A$, $\Sigma_B$ are two different estimates of the sample covariance
and $\alpha$ is a weighting parameter. Typically, $\Sigma_A$ would be the (poor) class covariance
estimate obtained from the available training samples, and $\Sigma_B$
would be the (global) covariance matrix computed over all the
training samples treated as a single group. Sometimes the principal diagonal matrix of the covariance matrix, or a diagonal
matrix with the trace of the matrix as its elements, is used instead of $\Sigma_A$, $\Sigma_B$ as appropriate.
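A minimal sketch of this kind of covariance approximation is given below. It assumes a simple convex blend of the class estimate and the global estimate with a single weight `alpha`; that weighting scheme, the function name, and the synthetic data are assumptions for illustration and are not claimed to be the specific estimators of [46]. The example shows the practical benefit: with fewer training samples than dimensions the class sample covariance is singular, whereas the blended estimate is invertible.

```python
import numpy as np

def blended_covariance(X_class, X_all, alpha):
    """Blend the class sample covariance with the global covariance (0 <= alpha <= 1)."""
    class_cov = np.cov(X_class, rowvar=False)    # poor estimate from few samples
    global_cov = np.cov(X_all, rowvar=False)     # all training samples as one group
    return (1.0 - alpha) * class_cov + alpha * global_cov

# Synthetic 10-band data (assumed): 300 training pixels overall, only 6 in this class.
rng = np.random.default_rng(2)
X_all = rng.normal(size=(300, 10))
X_class = X_all[:6]

raw = np.cov(X_class, rowvar=False)
blended = blended_covariance(X_class, X_all, alpha=0.5)

rank_raw = np.linalg.matrix_rank(raw)          # at most 5, so raw cannot be inverted
rank_blended = np.linalg.matrix_rank(blended)  # full rank
print(rank_raw, rank_blended)
```

In practice the weight would be chosen from the data, for example by cross-validation on the training samples, rather than fixed in advance.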
This work developed into a series of major contributions
by Landgrebe and colleagues through the 1990s and the
early 2000s, all concerned with the problem of reliable maximum-likelihood classification with high-dimensionality image
datasets [47]–[56]. Much of this work has appeared in Ph.D.
theses supervised by Landgrebe and is summarized in his recent
book [57].
An alternative method for handling covariance matrix estimation in the face of limited training samples has been to simplify
the structure of the matrix. By recognizing that interband correlations are generally strongest in the region of the diagonal, a
block diagonal simplification of the matrix can be adopted that
relaxes considerably the need to gather large numbers of training
samples per class [58].
Algorithm development for thematic mapping may well have
taken a different path had technological limitations not restricted
the spectral range and resolution of the first sensors. If hyperspectral measurements had been available from the start, it is
possible that pixel labeling might have been based on spectroscopic principles because the samples could then have been
composed into a discrete version of the real reflectance spectrum of the earth cover type.
With upward of 200 spectral channels now available the sampled pixel spectrum contains enough detail to allow spectroscopic principles to be applied for image understanding. Rather
than doing that ab initio each time a pixel has to be identified,
the approach normally taken is to compile reference libraries of
prerecorded spectra against which newly measured data can be
compared for identification. This work actually commenced in
the mid 1980s [59]. Early research concentrated on coding the
spectral data to make full-spectrum library searching feasible
[60]. More recently, though, matching procedures have been
developed around only those spectral features known to be diagnostic [61]. They are generally absorption features that can
be characterized by measurements of their position, depth, and
width. Expert system methods are used to assist in the feature
selection process.
D. Multisource Methods
In the mid to late 1980s, LARS researchers and their associates looked systematically at statistical and evidential
procedures for thematic mapping from several data sources
[62]–[64]. Generally, statistical approaches have been based on
maximizing the multisource posterior probability
$$p(\omega_i \mid \mathbf{x}_1, \ldots, \mathbf{x}_S)$$
or, having used Bayes' rule, maximizing
$$p(\mathbf{x}_1, \ldots, \mathbf{x}_S \mid \omega_i)\, p(\omega_i) \qquad (3)$$
To render the problem tractable the different sources are generally assumed to be independent so that (3) can be written as
the product of individual distribution functions. That also allows
indices, $\alpha_s$, to be added to reflect confidence in the different
sources that contribute to the joint decision:
$$p(\omega_i) \prod_{s=1}^{S} p(\mathbf{x}_s \mid \omega_i)^{\alpha_s}$$
This work was generalized by Benediktsson and Swain in devising the consensus theoretic approach to multisource classification [65].
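The effect of the reliability indices can be illustrated with a small sketch. It assumes the independent-source form in which each source contributes a class-conditional likelihood raised to its reliability exponent, evaluated in the log domain; the function name, class names, and likelihood values are hypothetical and chosen only to show how down-weighting one source can change the joint decision.

```python
import math

def fuse_sources(priors, source_likelihoods, reliabilities):
    """Return the class maximising ln p(w) + sum_s a_s * ln p(x_s | w), and all scores."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for likelihoods, a in zip(source_likelihoods, reliabilities):
            score += a * math.log(likelihoods[label])
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical per-source likelihoods for one pixel.
priors = {"forest": 0.5, "water": 0.5}
optical = {"forest": 0.6, "water": 0.4}
radar = {"forest": 0.2, "water": 0.8}

label_equal, _ = fuse_sources(priors, [optical, radar], [1.0, 1.0])
label_down, _ = fuse_sources(priors, [optical, radar], [1.0, 0.1])
print(label_equal, label_down)
```

With equal confidence the radar evidence dominates and the pixel is labeled water; when the radar source is given little weight the optical evidence decides the label instead.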
Evidential methods for handling multisource data derive from
the treatment of Shafer [66]. These entail allocating measures
of evidential mass or belief to each of the possible labels for
a pixel (the higher the mass the more likely the label). Mass
can also be allocated to joint labeling possibilities (where we
have reason to believe a pixel may be a mixture of cover types)
and, explicitly, to any uncertainty in the labeling process. The
masses allocated over all single and joint labeling propositions,
and uncertainty, sum to unity.
Allocation of mass can be derived from any reasonable approach. Some authors use the posterior probabilities from maximum-likelihood classification for that purpose, with the mass
given to uncertainty based on the confidence one has in the classified product.
The real strength of evidential reasoning for multisource
classification is that the mass distributions derived from several
sources can be combined through the so-called orthogonal
sum [62], [67]. The outcome generally concentrates evidential
mass on preferred label(s), and reduces uncertainty. Evidential
reasoning has also been combined with Markov Random Field
methods [68].
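The orthogonal sum can be sketched as below, assuming the standard Dempster rule of combination: masses of intersecting propositions are multiplied and accumulated on the intersection, mass falling on empty intersections is treated as conflict, and the result is renormalized. The function name, the class labels, and the mass values are hypothetical; uncertainty is represented here as mass on the full set of labels.

```python
from itertools import product

def orthogonal_sum(m1, m2):
    """Combine two mass distributions keyed by frozensets of labels (Dempster's rule)."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b          # contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Mass on THETA (all labels) expresses uncertainty in the labeling.
THETA = frozenset({"forest", "crop", "water"})
m_optical = {frozenset({"forest"}): 0.6, frozenset({"crop"}): 0.2, THETA: 0.2}
m_radar = {frozenset({"forest"}): 0.5, frozenset({"forest", "crop"}): 0.2, THETA: 0.3}

fused = orthogonal_sum(m_optical, m_radar)
for proposition, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(proposition), round(mass, 4))
```

In this example the fused mass on forest exceeds that given by either source alone, and the mass left on uncertainty is smaller than in either input, which is the behavior described in the text.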
Decision trees have long been regarded as valuable means
for handling difficult classification problems, including those
involving several sources, because they allow different feature
sets to be used at each decision node. The application of decision
trees was largely pioneered at LARS [69]–[71].
E. Neural Networks and Support Vector Machines
The earliest pattern classifiers sought to label data into two
different classes by attempting to place a linear separating hyperplane between them in multidimensional (feature) space [2].
Multiple classes were handled, usually, by adopting binary decision trees. Because accuracy depended on the classes being
linearly separable, the so-called perceptron approaches of the
1960s did not find great application in remote sensing. There
were some notable exceptions in which piecewise linear decision surfaces were derived [72], [73] but, in general, thematic
mapping of remote sensing data by perceptron-like classifiers
did not become viable before the advent of the (artificial) neural
network.
The great breakthrough the neural network offered was the
availability of feasible training techniques for nonlinearly separable data, but at the expense of algorithmic and training complexity. Benediktsson et al. [74], [75] appear to have been the
first to demonstrate the use of neural networks (using backpropagation) for remote sensing image labeling.
Perhaps the most interesting development in classification
based on linear methods in recent years has been the support
vector machine (SVM) and the use of kernels to transform inseparable data into a feature space where linear separability can
be exploited [76].
In the simple perceptron, the linear separating hyperplane is
found by iterating an initial guess into a final position such that
the pixels from each of the training classes are finally on their
correct side of the hyperplane. That iterative process uses all the
training data to find the hyperplane. Since the objective is simply
to find a surface which separates two classes of data, it is really
only those pixels nearest the hyperplane that need be used to
derive the surface. Moreover, there is an optimal surface (which
is missed in perceptron methods) given by the hyperplane orientation that maximizes class separation as depicted in Fig. 2.
The hyperplane is found by a constrained optimization process
that maximizes the separation between the classes (the margin
in Fig. 2) subject to the condition that the training classes must
be on their correct side [77]–[79].
Fig. 2. Separation of two classes of pixel by the optimal hyperplane of a
support vector machine and by maximum-likelihood classification. A typical
hyperplane as implemented by a simple linear perceptron is also depicted.
The real advance with the SVM comes with the realization
that the pixel vectors in the optimization and decision rule formulas always appear in pairs related through a scalar product.
It is possible therefore to replace those products by higher order
functions of the pairs of vectors, essentially projecting the pixel
vectors into a higher dimensional space where the data become
linearly separable. Such a projection is carried out by the use of
so-called kernels, the two most common of which are polynomial and radial basis function transformations. Recent studies
of the value of SVM in remote sensing image analysis will be
found in [80] and [81].
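The replacement of scalar products by kernels can be checked numerically with a small sketch. It assumes a homogeneous polynomial kernel of degree two on two-dimensional vectors, for which an explicit three-dimensional feature map is known; the function names and test vectors are hypothetical. The kernel evaluated in the original space equals the scalar product in the higher dimensional space, so the projection never has to be computed explicitly.

```python
import math

def poly_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def explicit_map(x):
    """Feature map for 2-D inputs whose scalar product reproduces poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel; gamma is an assumed width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

u, v = (1.0, 2.0), (3.0, -1.0)
k_direct = poly_kernel(u, v)
k_mapped = sum(a * b for a, b in zip(explicit_map(u), explicit_map(v)))
print(k_direct, k_mapped, rbf_kernel(u, v))
```

The radial basis function kernel corresponds to an infinite dimensional feature space, so for it only the kernel form is practical.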
F. Mixtures, Fuzzy Memberships, and Bayesian Methods
While clustering is often used as a means for resolving multimodality in information classes, as noted in Section IV-A, mixture analysis has been used for this purpose by a number of
authors, including Landgrebe for the case of hyperspectral data.
The probability of a pixel belonging to a given class is represented as a linear combination of $K$ Gaussian distributions
[57]
$$p(\mathbf{x} \mid \omega_i) = \sum_{k=1}^{K} \alpha_k\, \mathcal{N}(\mathbf{x};\, \mathbf{m}_k, \Sigma_k)$$
in which the $\alpha_k$ are a set of mixing proportions. If such a
model is to be used successfully, the number of components
$K$, the mixing parameters $\alpha_k$, and the means $\mathbf{m}_k$ and covariances $\Sigma_k$
for each of the individual distributions all have to be found.
Kuo and Landgrebe [82] use expectation-maximization (EM)
and a range of goodness of fit measures to apply the model to
hyperspectral data. They use feature selection based on nonparametric weighted feature extraction (NWFE) to render the
approach tractable [57]. Dundar and Landgrebe [56] essentially
examine the same problem, but use a range of regularization
methods for estimating the class covariances while generating
the model.
Unsupervised mixture models can be devised through the
AutoClass procedure based on Bayesian classification assumptions [83]. The computational demand however can be high,
especially when a significant number of classes needs to be used
to reveal the underlying data structure, sometimes requiring
parallel algorithms to render the computation viable. Bayesian
methods have also been applied to data mining and multisource
inference problems [84] and content-based image retrieval [85].
Most classification procedures allow the user to get some feel
for the relative likelihoods of a pixel belonging to each of the
available classes. In the maximum-likelihood rule, a set of posterior probabilities is produced by the algorithm, from which the
preferred label is generally chosen by maximum selection. The
last step is not essential and, if desired, the user can accept the
posterior probabilities as indications of the likelihoods of class
membership. Likewise, evidential methods produce a belief distribution that suggests relative likelihoods. So, it is possible to
avoid maximum selection steps and produce instead maps of
likelihoods or, in some cases, abundances. In the case of optical
data, this is often done explicitly through endmember analysis,
in which it is assumed that a given pixel is composed of a linear
combination of pure cover types. Spectral unmixing methods
can then be used to assess the relative abundances present in a
given pixel [86].
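A minimal sketch of linear spectral unmixing follows. It assumes a noise-free pixel and an unconstrained least-squares solution; operational methods usually also enforce non-negative abundances that sum to one, which is omitted here. The endmember spectra (four bands, three cover types) and the abundances are invented for illustration.

```python
import numpy as np

# Columns are hypothetical endmember spectra: vegetation, soil, water (4 bands each).
E = np.array([[0.05, 0.30, 0.10],
              [0.08, 0.35, 0.08],
              [0.45, 0.40, 0.05],
              [0.50, 0.45, 0.03]])

true_abund = np.array([0.6, 0.3, 0.1])
pixel = E @ true_abund                  # mixed pixel under the linear mixing assumption

# Unconstrained least-squares estimate of the abundances.
abund, *_ = np.linalg.lstsq(E, pixel, rcond=None)
print(np.round(abund, 3))
```

With measurement noise the estimate would no longer be exact, and the conditioning of the endmember matrix then governs how strongly the noise is amplified.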
Fuzzy classification procedures can also be used to assign a
set of membership possibilities to a pixel instead of providing
the label of the most likely class. Schowengerdt [87] provides
a simple summary of the approach, for both supervised and unsupervised labeling, while Zhang and Foody [88] demonstrate
fuzzy classification of a suburban region, in which fuzzy ground
truth data was also used.
V. RADAR DATA METHODS
In principle, the procedures devised for thematic mapping
from visible and infrared spectral data can be applied to the analysis of radar imagery. However, there are several fundamental
differences in the data types that suggest that radar should be
handled differently.
The multiplicative speckle noise inherent in radar data gives
an effective signal to noise ratio of 0 dB in its raw captured
form and (so-called multilook) averaging is needed to reduce
the noise to a level where meaningful image analysis can
be carried out. Even then, simple point classifiers, such as
the maximum-likelihood rule and neural networks, will not
perform well unless further filtering is performed, or so-called
per field classification is carried out [89]. Averaging over time
has also been used to reduce speckle noise in preparation for
classification [90].
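The effect of multilook averaging can be illustrated by simulation. The sketch assumes fully developed speckle over a homogeneous target, for which single-look intensity is commonly modeled as exponentially distributed, so that its standard deviation equals its mean (a signal-to-noise ratio of 0 dB). The function name and sample sizes are hypothetical.

```python
import numpy as np

def multilook(intensity, looks):
    """Average non-overlapping groups of `looks` samples along the last axis."""
    n = (intensity.shape[-1] // looks) * looks
    grouped = intensity[..., :n].reshape(*intensity.shape[:-1], -1, looks)
    return grouped.mean(axis=-1)

# Simulated single-look intensity over a uniform target of unit mean backscatter.
rng = np.random.default_rng(3)
single_look = rng.exponential(scale=1.0, size=200_000)
four_look = multilook(single_look, 4)

# Ratio of standard deviation to mean: about 1 for one look, about 1/sqrt(L) for L looks.
cv_single = single_look.std() / single_look.mean()
cv_four = four_look.std() / four_look.mean()
print(round(cv_single, 3), round(cv_four, 3))
```

The reduction in speckle is bought at the cost of spatial resolution, since each multilook value replaces several original samples.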
The nature of the energy-matter interaction process is another
major difference between radar and optical imagery. Whereas
the interaction is largely with the surface elements of the landscape at visible and reflected infrared wavelengths (with some
transmission and multiple reflections for vegetation and water),
at the much longer wavelengths that characterize radar, diffuse
surface scattering is only one of a number of backscattering
mechanisms. Surface scattering can also be specular at long
wavelengths. Additionally, volume scattering can be a major element of backscatter; it occurs from forest canopies, crops, and
sea ice. Subsurface scattering in hyperarid regions at very long
wavelengths is also possible.
There can also be a backscattered component resulting from
double bounce strong reflection involving the right angle formed
by adjacent vertical and horizontal structures. Tree trunks and
the ground, buildings and the ground, and ships and the sea surface are all situations that give rise to this mechanism. On occasions, facets (such as sloped building roofs) can also give a
strong specular return to the radar.
In order to interpret radar imagery effectively, the analyst
needs to be aware of those scattering complexities. It is important to recognize also that there are three dimensions to radar
data: wavelength, polarization, and incidence angle, in contrast
to just wavelength for optical data. Further, the energy is coherent so that the backscattered returns are complex, involving
both amplitude and phase, and can have a different polarization
from that transmitted. With such rich dimensionality one would
expect, therefore, that scattering models could be devised for
the purpose of thematic mapping by relating the received energy
more directly to the biophysical variables of interest [91]–[93],
rather than proceeding via a classification.
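In the simplest cases, these models seek to simulate the behavior of the scattering coefficient which relates backscattered
power density to incident power density
$$\sigma^{0} = \sigma^{0}(P, \lambda, \theta)$$
where $P$ is polarization, $\lambda$ wavelength and $\theta$ the look angle.
Sometimes these models are based on regression relationships.
More complex models simulate the scattering matrix, the elements of which relate the scattered electric field vector to the
incident field for various polarization combinations
$$\begin{bmatrix} E_H^{s} \\ E_V^{s} \end{bmatrix} \propto \begin{bmatrix} S_{HH} & S_{HV} \\ S_{VH} & S_{VV} \end{bmatrix} \begin{bmatrix} E_H^{i} \\ E_V^{i} \end{bmatrix}$$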
Inversion of scattering models, which is often nontrivial, allows the analyst to derive meaningful biophysical information
(biomass, age of sea ice, crop yield, etc.) directly from the
recorded data.
We now look at some of the more significant developments in
backscatter modeling that have found application over the past
two decades.
Moisture content is an important determinant of the strength
of scattering from vegetation. One of the earliest effective,
though simple, radar scattering models was the water-cloud
model for vegetation canopy scattering developed by Attema
and Ulaby [94]. It represents the canopy of a forest, say, as
a cloud of water droplets and then uses results from cloud
physics to avoid having to model the complex canopy physical
structure.
Lang et al. [95]–[97] introduced dielectric disk models for
foliage to give a more realistic canopy model, while Richards
et al. [98] showed how simple separable structural models can
be used, demonstrating the importance of the trunk component
in scattering from mature forests at long wavelengths.
The first comprehensive radar scattering model (MIMICS)
was devised by Ulaby et al. [99], while van Zyl [100] demonstrated how knowledge of the phase changes that occur with the
number of scattering events can be used to provide simple forest
thematic mapping.
Knowledge of scattering behaviors, either from prior experience or derived from image characteristics, has been used to
construct expert system or knowledge-based approaches to the
analysis of radar image data [101]. More recently, case-based
reasoning has been proposed as a knowledge-based method
using predetermined pixel attributes or features, and fuzzy
measures of class membership [102], in the descriptions of
radar resolution cells.
Lee and coworkers [103]–[105] have developed a classifier
for radar that appeals directly to the statistical nature of the
actual data. Based on the knowledge that a vector formed from
the copolarized and cross-polarized elements of the scattering
matrix is distributed in a Gaussian fashion with zero mean,
a distance-based discriminant function can be developed for
classification purposes. When multilook averaging and several wavebands are taken into account, the scattering vector is
based on Wishart statistics, again leading to a distance-derived
discriminant function that performs well in forest thematic mapping. In a very interesting later article, Lee et al. [106] have
combined scattering models and Wishart-based maximum-likelihood classification to devise a new unsupervised classification
procedure for polarimetric radar imagery.
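The distance-derived discriminant for multilook data is usually written d(Z, class) = ln|Σ| + Tr(Σ⁻¹Z), where Z is the pixel's sample covariance matrix and Σ is the class mean covariance. The sketch below is a minimal illustration of that form, assuming equal class priors. To stay self-contained it handles only 2×2 Hermitian matrices (for example one copolarized and one cross-polarized channel); the function names and class values are hypothetical.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix via the closed-form adjugate."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def wishart_distance(z, sigma):
    """ln|Sigma| + Tr(Sigma^-1 Z); a smaller value means a better match."""
    s_inv = inv2(sigma)
    trace = sum(s_inv[i][k] * z[k][i] for i in range(2) for k in range(2))
    # For Hermitian positive-definite inputs both terms are real.
    return math.log(det2(sigma).real) + trace.real


def classify(z, class_covariances):
    """Assign the class whose mean covariance minimizes the distance."""
    return min(class_covariances,
               key=lambda name: wishart_distance(z, class_covariances[name]))


# Hypothetical class statistics, for illustration only.
classes = {
    "forest": [[1.0, 0.2 + 0.1j], [0.2 - 0.1j, 0.8]],
    "water": [[0.05, 0.0], [0.0, 0.01]],
}
print(classify([[0.9, 0.15 + 0.1j], [0.15 - 0.1j, 0.7]], classes))
```

A full polarimetric implementation would use 3×3 (or larger, for several wavebands) covariance matrices and a linear algebra library rather than the closed-form 2×2 inverse used here.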
Unlike passive optical remote sensing systems, radar imaging
offers additional utility because of the coherent nature of the
radar returns. For two radar passes over the same region, the
vector difference of the signals carries information in its phase
and in the correlation of the two complex signals. The latter is
generally expressed as coherence, which can be used as a feature for biophysical parameter estimation [107], [108] and as
a feature in unsupervised classification of interferometric SAR
(InSAR) imagery [91]. As importantly, however, the phase information obtained by interfering the two images allows topographic analysis of the landscape [109]. Topography derived in
this manner has been used to supplement other radar features in
decision tree analysis of radar image data [110].
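The coherence and interferometric phase discussed above are typically estimated over a small window of co-registered complex samples from the two passes. The following is a minimal sketch of the standard sample estimator, written in plain Python; the function name and the windowing are assumptions for illustration, and practical processors also remove the flat-earth and topographic phase before estimation.

```python
import cmath
import math


def coherence_and_phase(s1, s2):
    """Return (coherence magnitude, interferometric phase in radians).

    s1, s2: equal-length sequences of complex samples from the same
            window in the two co-registered images.
    """
    # Complex cross-correlation of the two signals over the window.
    cross = sum(a * b.conjugate() for a, b in zip(s1, s2))
    # Power in each image over the same window.
    p1 = sum(abs(a) ** 2 for a in s1)
    p2 = sum(abs(b) ** 2 for b in s2)
    gamma = abs(cross) / math.sqrt(p1 * p2)
    return gamma, cmath.phase(cross)
```

The magnitude lies between 0 (fully decorrelated) and 1 (identical up to a constant phase), while the phase of the cross-correlation is the quantity used for topographic analysis.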
VI. MULTISOURCE INFERENCE
Pixel labeling from multiple, mixed data sources was recognized as important early in the history of thematic mapping, and
was a feature of LARS research in the late 1970s and early 1980s
[31], [36].
With the emergence of several data gathering nations and
space programs in the mid to late 1980s, and the varieties of
data type available, attention to forming joint inferences about
the most appropriate labels to assign to equivalent ground pixels
has accelerated significantly, to the extent that the IEEE Geoscience and Remote Sensing Society now has a Technical Committee on Data Fusion (http://ewh.ieee.org/soc/grss/), and special issues of journals dealing
with the problem have appeared [111].
Methods such as multisource (joint) statistical modeling, evidential methods and relaxation labeling show promise for data
fusion, as do support vector machines [112] and more general
credit assignment methods [113]. But from an operational viewpoint, they have a number of potentially serious practical limitations, as do all methods that rely on fusing at the data and
decision levels, as discussed in the following.
It is unlikely that a given analyst would be expert in understanding more than one particular data type. Moreover, many
application specialists are unlikely to be expert in techniques
for handling data and, in some cases, might not even care
about the primary datasets that went into deriving products
of importance. While they might find it interesting that the
data originated from satellite sensors, that may not be significant from the point of view of their specific application.
As a simple analogue, those who are interested in weather
forecasts would not normally know (or care) anything about
cloud top temperatures.
In addition, given that the user most likely would now be
served spatial data over a network, there are a number of fundamental requirements that an operational thematic mapping
schema should satisfy. Apart from being able to cope with a
variety of data types, including mixtures of categorical and numerical data, it must do the following.
1) Account for relative data quality and relevance. This has
been well known since the earliest studies of multisource
classification methods. Poor quality data, or that which is
marginally relevant for a particular application, must not
unreasonably influence the outcome.
2) Allow each data source to be analyzed separately in time
and location, and by its own experts. Different data types
are generally recorded by different agencies, and sometimes those data are not all available when an analysis
is first performed. Also, each particular data source has
its own expert analysis community and it is unlikely that
any single agency or person will be expert across the variety of relevant, contemporary data types; spectroscopic
analysis of hyperspectral data and the derivation of biophysical information from multipolarization InSAR data
are an illustration.
3) Allow preexisting thematic maps to be incorporated into
the analysis process, and accommodate thematic map revision.
accommodated. Previously classified data may have value
to a current exercise, as might existing categorical maps,
and should be able to be incorporated. Likewise, map revision is important.
4) Accept that the thematic classes from a combined dataset
might be different from the classes achievable with any
dataset on its own. This is a particularly important reason
as to why data fusion methods are limited when seeking
to form operational joint inferences. Table I provides a
simple illustration: the types of information class relevant
to one particular dataset can be quite different from those
relevant to a different data type; in other words, information classes are often source-specific. The information
classes of interest to the user may be quite different again,
but should be derivable from the source-specific labels.
TABLE I
CLASSES CAN BE DIFFERENT FOR DIFFERENT DATA TYPES
Meeting all these conditions is difficult when using any (fusion)
technique that depends on combining data, since the data
sources would then all have to be available simultaneously
and the class definitions would need to be consistent over the
sources.
The best prospect for an operational methodology, therefore,
seems to be a technique that allows each dataset to be analyzed independently, using whatever procedure is optimally
matched to the characteristics of that particular source, and then
looks for a combination process that permits the thematic classes
identified in the individual analyses to be combined. Not only
does that offer the prospect of getting the best possible performance out of each source/classifier combination, but it also allows each data source to be analyzed where and when available.
That includes the case of a primary data supplier converting
recorded data to geophysical products for distribution to client
agencies.
Spatial context, as appropriate, could be incorporated at the
level of individual data analysis. The labeling of a pixel in spatial
relation to its neighbors is probably most effective when done
in conjunction with the primary form of data analysis.
An effective method for operating on labels to form joint inferences is to adopt some form of symbolic reasoning similar to
that used in an expert system. For example, if analysis of a multispectral dataset reveals a vegetation type, while interpretation of
coincident radar data suggests a smooth specular surface, then
an expert would conclude that the cover type was most likely
grassland (smooth vegetation). As a production rule that would
be expressed
if from spectral data the region is vegetated
and from radar the region is smooth
then the region is probably grassland
Since labels from each individual data source are found using
the most appropriate algorithm for the task, production rules
provide an easy means for forming joint inferences, i.e., for
fusing the labeling outcomes from each data source. In contrast to data-level, feature-level, and decision-level fusion, this
is label-fusion as depicted in Fig. 3, akin to fusion at the decision level, but with the possibility that the final thematic labels
can be different from the source-specific labels.
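The production rule above can be sketched in a few lines of Python. This is only an illustration of label-level fusion under simple assumptions: rules are checked in order, each rule tests source-specific labels for equality, and the first rule that fires gives the joint label. The source names, the second rule, and the `unresolved` default are hypothetical and are not taken from the paper.

```python
# Production rules operating on source-specific labels.
# Only the first rule comes from the text; the second is a hypothetical extra.
RULES = [
    ({"spectral": "vegetated", "radar": "smooth"}, "grassland"),
    ({"spectral": "vegetated", "radar": "rough"}, "forest"),  # hypothetical
]


def fuse_labels(source_labels, rules=RULES, default="unresolved"):
    """Return the joint label of the first rule whose conditions all hold."""
    for conditions, joint_label in rules:
        if all(source_labels.get(src) == lab for src, lab in conditions.items()):
            return joint_label
    # No rule fired, e.g. because one source has not been analyzed yet.
    return default


pixel = {"spectral": "vegetated", "radar": "smooth"}
print(fuse_labels(pixel))  # grassland
```

Note that the joint label ("grassland") need not appear among the labels produced by either source, and that a source which is missing at analysis time simply leaves the pixel unresolved until its labels arrive.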
Fig. 3. Handling mixed data types by fusion at the label level.
A rule-based expert system for handling mixed data analysis
problems in remote sensing was devised by Srinivasan [114],
[115]. While focused on the problem of both single source analysis and label combination, its main interest here is that it provides a ready combination tool. Analysis of single data sources
can be carried out either by a rule-based process (as might be important for radar when using scattering models) or by any of the
more traditional and, where appropriate, numerical and statistical methods. Provided, however, they deliver sets of thematic
labels relevant to the particular data source, a joint inference can
be created if symbolic combination rules are available. Alternative symbolic reasoning procedures are also possible [116].
From an operational perspective, that is only part of the story.
Who will develop and apply the joint inferential symbolic processing techniques? Is it reasonable to expect that the average
user will possess the knowledge and skills to work at that level?
In the early decades of remote sensing, with only a single spatial data type and a few standard thematic mapping procedures
available, it was reasonable to expect that all the analytical work
might be carried out by the skilled end user. Now, however, the
more effective operational approach might involve two broad
phases as follows.
1) Single datasets should be analyzed by experts in those
data types. The resultant products can be generated in response to a specific requirement or might be produced
speculatively and archived for later use.
2) There will need to emerge consultant communities capable of using expert knowledge to convert single source
products to the jointly inferred products required by the
end user. The consultant, having taken instruction from
the user, would select the products generated by single
data suppliers that most appropriately match the user's requirements, and then perform label-level fusion using expert knowledge as depicted in Fig. 4. The consultant can
also account, at least in a qualitative sense, for relative
data quality by the manner in which symbolic reasoning
is applied [115].
Of course, all such methodologies involve compromises. There
will undoubtedly be some combinations of data types that
give better results through data fusion than through
being handled individually. However, in an operational context,
and in an era where the data types and suppliers are many, it is
necessary to adopt a schema that will work over all applications
and that will not require either the analyst or client to develop
expertise across the breadth of the technology sector.
Fig. 4. Operational spatial data analysis framework for handling mixed data types.
VII. CONCLUDING COMMENTS
The contributions to quantitative digital image interpretation in remote sensing by David Landgrebe, his students, and
coworkers established the foundations for the field and have
influenced the evolution of the quantitative approach ever since.
It has been an important journey. From the challenges of the
pre- and early spaceborne days through to the current era of
the hyperspectral dataset, the Purdue group has identified the
problems and proposed the solutions. The array of analytical
techniques now regularly used in image analysis owes much to
those contributions.
The next era will be characterized by the need for scene understanding that better matches the requirements of the client
community. We have done much over the past ten years or so to
attempt accurate pixel labeling with mixed data types by using
techniques that refine the results obtained on one dataset by
using other available data types. It is important that we now
move away from methodologies that require the end user, or
even a specialist in one particular dataset, to have enough
expertise to create meaningful thematic products from the range
of spatial data types available. Instead, we need to accept that
a truly operational spatial data community will need both domain experts and those who can manipulate labels. Symbolic
processing needs to become as much a part of the new era of
image analysis as quantitative extraction of information from
data has been in the past.
By its nature, and in its mission, LARS has always been
applications driven. The analytical techniques devised by Dave
Landgrebe and his collaborators have been in response, either
directly or indirectly, to the perceived needs of communities
as diverse as forestry, agriculture, soils mapping, and land use
generally. Remote sensing, fundamentally, is an applications-driven field, and while there is still room for the development
of further thematic mapping algorithms, the requirements of
the end user must drive the outcomes. Our methodologies must
now be as much about choosing the most relevant primary data
types as doing the actual analysis. It is to consultants, skilled
in translating user requirements into tasks for data source
experts, and expert in combining the single-source outputs
into the product required by the client, that attention in the field must
now be given. Only then will the ground-breaking and fundamental
analytical work pioneered at LARS, and by Dave Landgrebe,
find maturity.
REFERENCES
[1] K. R. Castleman, Digital Image Processing, 1st ed. Englewood Cliffs,
NJ: Prentice-Hall, 1979.
[2] N. J. Nilsson, Learning Machines. New York: McGraw-Hill, 1965.
[3] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[4] D. A. Landgrebe. (1986) A brief history of the Laboratory for
Applications of Remote Sensing (LARS). [Online]. Available:
www.lars.purdue.edu/home/LARSHistory.html
[5] G. Cardillo and D. Landgrebe, "On pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 101 866, 1966.
[6] K. S. Fu, D. A. Landgrebe, and T. L. Phillips, "Information processing of remotely sensed agricultural data," Proc. IEEE, vol. 57, no. 4, pp. 639–653, Apr. 1969.
[7] A. Wacker, "A cluster approach to finding spatial boundaries in multispectral imagery," Purdue Univ., West Lafayette, IN, LARS Tech. Note 122 969, 1969.
[8] K. S. Fu and P. J. Min, "On feature selection in multiclass pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 080 168, 1968.
[9] P. J. Min, D. A. Landgrebe, and K. S. Fu, "Feature selection in multiclass pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 050 170, 1970.
[10] A. Wacker and D. A. Landgrebe, "The minimum distance approach to classification," Ph.D. dissertation, Purdue Univ., School of Elect. Eng., West Lafayette, IN, 1971.
[11] P. H. Swain and A. G. Wacker, "Comparison of the divergence and B-distance in feature selection," Purdue Univ., West Lafayette, IN, LARS Tech. Note 020 871, 1971.
[12] P. H. Swain and R. C. King, "Two effective feature selection criteria for multispectral remote sensing," in Proc. 1st Int. Joint Conf. Pattern Recognition, Washington, DC, Nov. 1973, pp. 536–540.
[13] P. H. Swain and S. M. Davis, Remote Sensing: The Quantitative Approach. New York: McGraw-Hill, 1978.
[14] M. D. Fleming, J. S. Berkebile, and R. M. Hoffer, "Computer-aided analysis of Landsat-1 MSS data: A comparison of three approaches including a modified clustering approach," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, Jun. 3–5, 1975, pp. 54–61.
[15] X. Jia and J. A. Richards, "Cluster space representation for hyperspectral classification," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 3, pp. 593–598, Mar. 2002.
[16] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Texture features for image classification," IEEE Trans. Syst., Man Cybern., vol. SMC-3, no. 6, pp. 610–621, Nov. 1973.
[17] K. R. Castleman, Digital Image Processing, 2nd ed. Upper Saddle
River, NJ: Prentice-Hall, 1996.
[18] R. M. Hord and W. Brooner, "Land-use map accuracy criteria," Photogramm. Eng. Remote Sens., vol. 42, no. 5, pp. 671–677, 1976.
[19] R. G. Congalton and K. Green, Assessing the Accuracy of Remotely
Sensed Data: Practices and Principles. Boca Raton, FL: Lewis, 1993.
[20] G. M. Foody, "Status of land cover classification accuracy assessment," Remote Sens. Environ., vol. 80, pp. 185–201, 2002.
[21] W. G. Eppler, "Canonical analysis for increased classification speed and channel selection," IEEE Trans. Geosci. Electron., vol. GE-14, no. 1, pp. 26–33, Jan. 1976.
[22] S. K. Jensen and F. A. Waltz, "Principal components analysis and canonical analysis in remote sensing," in Proc. Amer. Soc. Photogrammetry 45th Annu. Meeting, 1979, pp. 337–348.
[23] J. M. Soha and A. A. Schwartz, "Multispectral histogram normalization contrast enhancement," in Proc. 5th Can. Symp. Remote Sensing, 1978, pp. 86–93.
[24] R. J. Kauth and G. S. Thomas, "The tasseled cap: A graphic description of the spectral-temporal development of agricultural crops as seen by Landsat," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, 1976.
[25] G. R. Byrne, P. F. Crapper, and K. K. Mayo, "Monitoring land cover changes by principal components analysis of multitemporal Landsat data," Remote Sens. Environ., vol. 10, pp. 175–184, 1980.
[26] T. Huang, "Per field classifier for agricultural applications," Purdue Univ., West Lafayette, IN, LARS Tech. Note 060 569, 1969.
[27] T. V. Robertson, "Extraction and classification of objects in multispectral images," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, 1973, pp. 27–34.
[28] J. N. Gupta, R. L. Kettig, D. A. Landgrebe, and P. A. Wintz, "Machine boundary finding and sample classification of remotely sensed data," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, 1973, pp. 25–35.
[29] R. L. Kettig and D. A. Landgrebe, "Computer classification of remotely sensed multispectral image data by extraction and classification of homogeneous objects," IEEE Trans. Geosci. Electron., vol. GE-14, no. 1, pp. 19–26, Jan. 1976.
[30] R. M. Hoffer, M. D. Fleming, L. A. Bartolucci, S. M. Davis, and R. F. Nelson, "Digital processing of Landsat MSS and topographic data to improve capabilities for computerized mapping of forest cover types," Purdue Univ., West Lafayette, IN, LARS Tech. Note 011 579, 1979.
[31] M. D. Fleming and R. M. Hoffer, "Machine processing of Landsat MSS data and DMA topographic data for forest cover types mapping," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, 1979, pp. 377–390.
[32] J. C. Tilton, P. H. Swain, and S. B. Vardeman, "Context distribution estimation for contextual classification of multispectral image data," Purdue Univ., West Lafayette, IN, LARS Tech. Note 040 280, 1980.
[33] A. Rosenfeld, R. Hummel, and S. Zucker, "Scene labeling by relaxation algorithms," IEEE Trans. Syst., Man Cybern., vol. SMC-6, no. 6, pp. 420–433, Jun. 1976.
[34] J. A. Richards, D. A. Landgrebe, and P. H. Swain, "Pixel labeling by supervised probabilistic relaxation," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-3, no. 2, pp. 188–191, Mar. 1981.
[35] J. A. Richards, D. A. Landgrebe, and P. H. Swain, "On the accuracy of pixel relaxation labeling," IEEE Trans. Syst., Man Cybern., vol. SMC-11, no. 4, pp. 303–309, Apr. 1981.
[36] J. A. Richards, D. A. Landgrebe, and P. H. Swain, "A means for utilizing ancillary information in multispectral classification," Remote Sens. Environ., vol. 12, pp. 463–477, 1982.
[37] B. Jeon and D. A. Landgrebe, "Classification with spatio-temporal interpixel class dependency contexts," IEEE Trans. Geosci. Remote Sens., vol. 30, no. 4, pp. 663–672, Jul. 1992.
[38] A. H. S. Solberg, T. Taxt, and A. K. Jain, "A Markov random field model for classification of multisource satellite imagery," IEEE Trans. Geosci. Remote Sens., vol. 34, no. 1, pp. 100–113, Jan. 1996.
[39] Y. Jung and P. H. Swain, "Bayesian contextual classification based on modified M-estimates and Markov random fields," IEEE Trans. Geosci. Remote Sens., vol. 34, no. 1, pp. 67–75, Jan. 1996.
[40] A. Sarkar, M. K. Biswas, B. Kartikeyan, V. Kumar, K. L. Majumder, and D. K. Pal, "A MRF model-based segmentation approach to classification for multispectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 5, pp. 1102–1113, May 2002.
[41] A. H. Strahler, "The use of prior probabilities in maximum likelihood classification of remotely sensed data," Remote Sens. Environ., vol. 10, pp. 135–163, 1980.
[42] H. M. Kalayeh, M. J. Muasher, and D. A. Landgrebe, "Feature selection with limited training samples," IEEE Trans. Geosci. Remote Sens., vol. GE-21, no. 4, pp. 434–438, Oct. 1983.
[43] M. J. Muasher and D. A. Landgrebe, "A binary tree feature-selection techniques for limited training sample size," in Proc. 8th Int. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, 1982, pp. 130–137.
[44] C. Lee and D. A. Landgrebe, "Decision boundary feature selection for nonparametric classifiers," in Proc. SPIE 44th Annu. Conf., St. Paul, MN, May 1991.
[45] C. Lee and D. A. Landgrebe, "Feature selection based on decision boundaries," in Proc. IGARSS, Espoo, Finland, Jun. 1991.
[46] B. M. Shahshahani and D. A. Landgrebe, "Using partially labeled data for normal mixture identification with application to class definition," in Proc. IGARSS, Houston, TX, May 26–29, 1992, pp. 1603–1605.
[47] C. Lee and D. A. Landgrebe, "Analyzing high-dimensional multispectral data," IEEE Trans. Geosci. Remote Sens., vol. 31, no. 4, pp. 792–800, Jul. 1993.
[48] J. P. Hoffbeck and D. A. Landgrebe, "Covariance matrix estimation and classification with limited training data," IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 7, pp. 763–767, Jul. 1996.
[49] S. Tadjudin and D. A. Landgrebe, "Covariance estimation for limited training samples," in Proc. IGARSS, Seattle, WA, Jul. 6–10, 1998.
[50] P.-F. Hsieh and D. A. Landgrebe, "Statistics enhancement in hyperspectral data analysis using spectral-spatial labeling, the EM algorithm and the leave-one-out covariance estimator," in Proc. SPIE Int. Symp. Optical Science, Engineering, and Instrumentation, San Diego, CA, Jul. 19–24, 1998.
[51] S. Tadjudin and D. A. Landgrebe, "Covariance estimation with limited training samples," IEEE Trans. Geosci. Remote Sens., vol. 37, no. 4, pp. 2113–2118, Jul. 1999.
[52] Q. Jackson and D. A. Landgrebe, "An adaptive classifier design for high dimensional data analysis with a limited training data set," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 12, pp. 2664–2679, Dec. 2001.
[53] B.-C. Kuo and D. A. Landgrebe, "A covariance estimator for small sample size classification problems and its application to feature extraction," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 4, pp. 814–819, Apr. 2002.
[54] Q. Jackson and D. A. Landgrebe, "An adaptive method for combined covariance estimation and classification," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 5, pp. 1082–1087, May 2002.
[55] B.-C. Kuo and D. A. Landgrebe, "Regularized covariance estimators for hyperspectral data classification and its application to feature extraction," in Proc. IGARSS, Toronto, ON, Canada, Jun. 24–28, 2002.
[56] M. M. Dundar and D. A. Landgrebe, "A model-based mixture-supervised classification approach in hyperspectral data analysis," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 12, pp. 2692–2699, Dec. 2002.
[57] D. A. Landgrebe, Signal Theory Methods in Multispectral Remote
Sensing. Hoboken, NJ: Wiley, 2003.
[58] X. Jia and J. A. Richards, "Efficient maximum likelihood classification for imaging spectrometer data sets," IEEE Trans. Geosci. Remote Sens., vol. 32, no. 2, pp. 274–281, Mar. 1994.
[59] M. A. Piech and K. R. Piech, "Symbolic representation of hyperspectral data," Appl. Opt., vol. 26, pp. 4018–4026, 1987.
[60] A. S. Mazer, M. Martin, M. Lee, and J. E. Solomon, "Image processing software for imaging spectrometry data analysis," Remote Sens. Environ., vol. 24, pp. 201–211, 1988.
[61] R. N. Clark, G. A. Swayze, K. E. Livio, R. F. Kokaly, S. J. Sutley, J. B. Dalton, R. R. McDougal, and C. A. Gent, "Imaging spectroscopy: Earth and planetary remote sensing with the USGS tetracorder and expert systems," J. Geophys. Res., vol. 108, no. E12, pp. 5131–5175, Dec. 2003.
[62] T. Lee, J. A. Richards, and P. H. Swain, "Probabilistic and evidential approaches for multisource data analysis," IEEE Trans. Geosci. Remote Sens., vol. GE-25, no. 3, pp. 283–293, May 1987.
[63] H. Kim and P. H. Swain, "Multisource data analysis in remote sensing and geographic information systems based on Shafer's theory of evidence," in Proc. IGARSS, Vancouver, BC, Canada, Jul. 1989, pp. 829–832.
[64] H. Kim and P. H. Swain, "Evidential reasoning approach to multisource-data classification in remote sensing," IEEE Trans. Syst., Man, Cybern., vol. 25, no. 8, pp. 1257–1265, Aug. 1995.
[65] J. A. Benediktsson and P. H. Swain, "Consensus theoretic classification methods," IEEE Trans. Syst., Man, Cybern., vol. 22, no. 4, pp. 688–704, Jul.-Aug. 1992.
[66] G. Shafer, A Mathematical Theory of Evidence. Princeton, NJ:
Princeton Univ. Press, 1976.
[67] J. A. Richards and X. Jia, Remote Sensing Digital Image Analysis, 3rd
ed. Berlin, Germany: Springer-Verlag, 1999.
[68] A. Bendjebbour, Y. Delignon, L. Fouques, V. Samson, and W. Pieczynski, "Multisensor image segmentation using Dempster-Shafer fusion in Markov fields context," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 8, pp. 1789–1798, Aug. 2001.
[69] C. L. Wu, D. A. Landgrebe, and P. H. Swain, "The decision-tree approach to classification," Ph.D. dissertation, Purdue Univ., School of Elect. Eng., West Lafayette, IN.
[70] P. H. Swain and H. Hauska, "The decision-tree classifier: Design and potential," IEEE Trans. Geosci. Electron., vol. GE-15, no. 3, pp. 142–147, Jul. 1977.
[71] S. Safavian and D. A. Landgrebe, "A survey of decision tree classifier methodology," IEEE Trans. Syst., Man, Cybern., vol. 21, no. 3, pp. 660–674, May 1991.
[72] T. Lee and J. A. Richards, "Piecewise linear classification using seniority logic committee methods with application to remote sensing," Pattern Recognit., vol. 17, no. 4, pp. 453–464, 1984.
[73] T. Lee and J. A. Richards, "A low-cost classifier for multitemporal applications," Int. J. Remote Sens., vol. 6, pp. 1405–1417, 1985.
[74] J. A. Benediktsson, P. H. Swain, and O. K. Ersoy, "Neural network approaches versus statistical methods in classification of multisource remote sensing data," IEEE Trans. Geosci. Electron., vol. GE-28, no. 4, pp. 540–552, Jul. 1990.
[75] J. A. Benediktsson and P. H. Swain, "Statistical methods and neural network approaches for classification of data from multiple sources," Ph.D. dissertation, Purdue Univ., School of Elect. Eng., West Lafayette, IN, 1990.
[76] J. A. Gualtieri and R. F. Cromp, "Support vector machines for hyperspectral remote sensing classification," in Proc. SPIE 27th AIPR Workshop Advances in Computer Assisted Recognition, vol. 3584, R. J. Merisko, Ed., 1998, pp. 221–232.
[77] B. Schölkopf and A. Smola, Learning with Kernels. Cambridge, MA: MIT Press, 2002.
[78] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining Knowl. Discov., vol. 2, pp. 121–167, 1998.
[79] C. Huang, L. S. Davis, and J. R. G. Townshend, "An assessment of support vector machines for land cover classification," Int. J. Remote Sens., vol. 23, no. 4, pp. 725–749, 2002.
[80] J. A. Gualtieri and S. Chettri, "Support vector machines for classification of hyperspectral data," in Proc. IGARSS, vol. 2, Honolulu, HI, Jul. 24–28, 2000, pp. 813–815.
[81] F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1796, Aug. 2004.
[82] B.-C. Kuo and D. A. Landgrebe, "A robust classification procedure based on mixture classifiers and nonparametric weighted feature extraction," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 11, pp. 2486–2494, Nov. 2002.
[83] P. Cheeseman and J. Stutz, "Bayesian classification (AutoClass): Theory and results," in Advances in Knowledge Discovery and Data Mining, U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds. Menlo Park, CA: AAAI Press, 1996.
[84] M. Datcu and K. Seidel, "Bayesian methods: Applications in information aggregation and image data mining," in Int. Arch. Photogramm. Remote Sensing, vol. 34 part 7-4-3, Valladolid, Spain, Jun. 3–4, 1999, W6.
[85] M. Schroeder, H. Rehrauer, K. Seidel, and M. Datcu, "Interactive learning and probabilistic retrieval in remote sensing image archives," IEEE Trans. Geosci. Remote Sens., vol. 38, no. 5, pp. 2288–2298, Sep. 2000.
[86] A. Plaza, P. Martinez, R. Perez, and J. Plaza, "A quantitative and comparative analysis of endmember extraction algorithms for hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 3, pp. 650–663, Mar. 2004.
[87] R. A. Schowengerdt, Remote Sensing: Models and Methods for Image Processing. San Diego, CA: Academic, 1997.
[88] J. Zhang and G. M. Foody, "A fuzzy classification of sub-urban land cover from remotely sensed imagery," Int. J. Remote Sens., vol. 19, no. 14, pp. 2721–2738, 1998.
[89] F. del Frate, G. Schiavon, D. Solimini, M. Borgeaud, D. H. Hoekman, and M. A. M. Vissers, "Crop classification using multiconfiguration C-band SAR data," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1611–1619, Jul. 2003.
[90] M. E. Engdahl and J. M. Hyyppa, "Land-cover classification using multitemporal ERS-1/2 InSAR data," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1620–1628, Jul. 2003.
[91] X. Blaes and P. Defourny, "Retrieving crop parameters based on tandem ERS1/2 interferometric coherence images," Remote Sens. Environ., vol. 88, pp. 374–385, 2003.
[92] J. R. Santos, C. F. Freitus, L. S. Aruajo, L. V. Dutra, J. C. Mura, F. F. Gama, L. S. Sola, and S. J. S. Sant'Anna, "Airborne P-band SAR applied to the aboveground biomass studies in the Brazilian tropical rainforest," Remote Sens. Environ., vol. 87, pp. 482–493, 2003.
[93] "Special issue on retrieval of bio- and geophysical parameters from SAR data for land applications," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1539–1710, Jul. 2003.
[94] E. P. W. Attema and F. T. Ulaby, "Vegetation modeled as a water cloud," Radio Sci., vol. 13, pp. 357–364, 1978.
[95] R. H. Lang, "Electromagnetic scattering from sparse distribution of lossy dielectric scatterers," Radio Sci., vol. 16, pp. 15–30, 1981.
[96] R. H. Lang and J. S. Sidhu, "Electromagnetic backscattering from a layer of vegetation: A discrete approach," IEEE Trans. Geosci. Remote Sens., vol. GE-21, no. 1, pp. 62–71, Jan. 1983.
[97] R. H. Lang and H. A. Saleh, "Microwave inversion of leaf area and inclination angle distribution from backscattered data," IEEE Trans. Geosci. Remote Sens., vol. GE-23, no. 5, pp. 685–694, Sep. 1985.
432
[98] J. A. Richards, Q. Sun, and D. S. Simonett, L-band radar backscatter

modeling of forest stands, IEEE Trans. Geosci. Remote Sens., vol.
GE-25, no. 4, pp. 487498, Jul. 1987.
[99] F. T. Ulaby, K. Sarabandi, K. McDonald, M. Whit, and M. C. Dobson,
Michigan microwave canopy scattering model (MIMICS), Int. J. Remote Sens., vol. 12, pp. 12231253, 1990.
[100] J. J. van Zyl, Unsupervised classification of scattering behavior using
radar polarimetry data, IEEE Trans. Geosci. Remote Sens. , vol. 27, no.
1, pp. 3645, Jan. 1989.
[101] C. M. Dobson, L. E. Pierce, and F. T. Ulaby, Knowledge-based landcover classification using ERS-1/JERS-1 SAR composites, IEEE Trans.
Geosci. Remote Sens., vol. 34, no. 1, pp. 8397, Jan. 1996.
[102] X. Li and A. G. Yeh, Multitemporal SAR images for monitoring cultivation systems using case-based reasoning, Remote Sens. Environ., vol.
90, pp. 524534, 2004.
[103] J. S. Lee, M. R. Grunes, and R. Kwok, Classification of multi-look
polarimetric SAR imagery based on complex Wishart distribution, Int.
J. Remote Sens., vol. 15, pp. 22992311, 1994.
[104] J. S. Lee, M. R. Grunes, and G. de Grandi, Polarimetric SAR speckle
filtering and its implication for classification, IEEE Trans. Geosci. Remote Sens. , vol. 37, no. 5, pp. 23632373, Sep. 1999.
[105] L. Ferro-Famil, E. Pottier, and J. S. Lee, Unsupervised classification
of multifrequency and fully polarimetric SAR images based on the
H/A/Alpha-Wishart classifier, IEEE Trans. Geosci. Remote Sens. , vol.
39, no. 11, pp. 23322342, Nov. 2001.
[106] J.-S. Lee, M. R. Grunes, E. Pottier, and L. Ferro-Famil, Unsupervised
terrain classification preserving polarimetric scattering characteristics,
IEEE Trans. Geosci. Remote Sens., vol. 42, no. 4, pp. 722731, Apr.
2004.
[107] B. Aiazzi, L. Alparone, S. Baronti, and A. Garzelli, "Coherence estimation from multilook incoherent SAR imagery," IEEE Trans. Geosci.
Remote Sens., vol. 41, no. 11, pp. 2531–2539, Nov. 2003.
[108] J. Askne, M. Santoro, G. Smith, and J. E. S. Fransson, "Multitemporal
repeat-pass SAR interferometry of boreal forests," IEEE Trans. Geosci.
Remote Sens., vol. 41, no. 7, pp. 1540–1550, Jul. 2003.
[109] S. R. Cloude and K. P. Papathanassiou, "Polarimetric SAR interferometry," IEEE Trans. Geosci. Remote Sens., vol. 36, no. 5, pp. 1551–1565,
Sep. 1998.
[110] M. C. Crawford, S. Kumar, M. R. Ricard, J. C. Gibeaut, and A. Neuenschwander, "Fusion of airborne polarimetric and interferometric SAR
for classification of coastal environments," IEEE Trans. Geosci. Remote
Sens., vol. 37, no. 3, pp. 1306–1315, May 1999.
[111] "Special issue on data fusion," IEEE Trans. Geosci. Remote Sens., pt.
1, vol. 37, no. 3, pp. 1187–1377, May 1999.
[112] G. H. Halldorsson, J. A. Benediktsson, and J. R. Sveinsson, "Support
vector machines in multisource classification," in Proc. IGARSS, vol. 3,
Toulouse, France, Jul. 2003, pp. 2054–2056.
[113] C. M. Bachman, M. H. Bettenhausen, R. A. Fusina, T. F. Donato, A.
L. Russ, J. W. Burke, G. M. Lamela, W. J. Rhea, B. R. Truit, and J.
H. Porter, "A credit assignment approach to fusing classifiers of multiseason hyperspectral imagery," IEEE Trans. Geosci. Remote Sens., vol.
41, no. 11, pp. 2488–2499, Nov. 2003.
[114] A. Srinivasan, "An artificial intelligence approach to the analysis of multiple information sources in remote sensing," Ph.D. thesis, Univ. New
South Wales, School of Elect. Eng., Kensington, Australia, 1991.
[115] A. Srinivasan and J. A. Richards, "Analysis of GIS spatial data using
knowledge-based methods," Int. J. Geograph. Inf. Syst., vol. 7, no. 6,
pp. 479–500, 1993.
[116] J. Lloyd, Logic for Learning, ser. Cognitive Technologies Series. Berlin, Germany: Springer-Verlag, 2003.
John A. Richards (S'68–M'72–SM'83–F'96)
received the B.E. (Hons1) and Ph.D. degrees from
the University of New South Wales, Kensington,
Australia, in 1968 and 1972, respectively.
He is currently Director of the Research School
of Information Sciences and Engineering at the
Australian National University, Canberra, Australia,
and was Deputy Vice-Chancellor and Vice-President
from 1998 to 2003. He previously worked for 11
years at the University College, University of New
South Wales, Australian Defence Force Academy,
where he served as Head of the School of Electrical Engineering from June
1987 to July 1996, becoming Deputy Rector and then Rector in July 1996.
From 1981 to 1987, he was Foundation Director of the Centre for Remote
Sensing at the University of New South Wales. He is the author of the textbook
Remote Sensing Digital Image Analysis (Berlin, Germany: Springer-Verlag,
1986; Revised 1999, with X. Jia).
Dr. Richards is a Fellow of the Australian Academy of Technological Sciences and Engineering.
