IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 43, NO. 3, MARCH 2005
I. INTRODUCTION
from mixed data types to allow the inherent information contents of complementary datasets to be brought together. Indeed,
much of the current research agenda is focused on finding viable
fusion methods. It is difficult, though, to envisage operational
image understanding with a continuation of methods that seek to
fuse data. At minimum, the fact that each given data type is usually best handled by its own (matched) analytical methods belies data fusion as an operational method for multisource image
analysis. Instead, individual data types may be best analyzed
separately, with combination occurring at the label level through
some form of symbolic processing.
After reviewing the significant developments in thematic
mapping, concentrating particularly on the work of Landgrebe
and colleagues, the problem of operational multisource classification is considered from the perspective of label-level fusion.
II. PROBLEM DOMAIN
The essential problem in thematic mapping is to specify the
data or information that needs to be gathered about a pixel
to allow a label to be attached to the pixel consistent with an
application of interest. In a modern operational setting, the
application requirements would be specified by a client, and
a consultant would have the task of choosing the datasets and
analytical methods for generating an acceptable thematic map.
That is the ideal. In reality, and certainly in the early years
of remote sensing, the problem statement is more like: given
the data or information actually available, how effectively can
a pixel be labeled into a class of interest?
The process is a mapping from available data (and sometimes
other information) to a label. We express this mapping as
$$\mathbf{x} \rightarrow \omega$$
in which $\mathbf{x}$ is the data description of the pixel, and $\omega$ is the
associated class of interest. As indicated, $\mathbf{x}$ generally consists
of a (column) pixel vector of the measurements.
We need to find procedures that allow us to extract meaningful information ($\omega$) from the data ($\mathbf{x}$) and then move to an
understanding of the scene being imaged. In other words, we
need to move along the chain
data → information → understanding
Fig. 1. Deriving the most appropriate label for a pixel based on spatial,
contextual, and prior sources of data or information.
first book dealing with quantitative techniques for analyzing remotely sensed data was produced from LARS in 1978 [13].
A 1975 report by Fleming et al. [14] demonstrated how clustering can be used to resolve data multimodality in preparation
for the use of Gaussian maximum-likelihood supervised classification. Fleming's hybrid methodology has remained one
of the analytical mainstays for thematic mapping from optical
image data. More recently, it has been seen that the method can
be generalized, rendering it amenable to hyperspectral datasets
[15].
During the same period, there were a number of key developments occurring outside LARS that have had a continuing influence on remote sensing image analysis. Haralick et al. [16]
in 1973 provided measures of image texture that are still in use
today [17]. In 1976, Hord and Brooner [18] looked at measures
of thematic map accuracy and thus commenced a significant period of continuing research concerned with sampling methods
for accuracy assessment [19], [20].
So, by the end of the first decade after the formation of LARS,
all the foundations for digital image analysis in remote sensing
had been established. It was the principal decade upon which
later work was to elaborate and expand.
In the latter part of the 1970s, there were three early, important developments to do with image transformation. In 1976, Eppler used the class-sensitive transform associated with canonical
analysis to effect feature reduction [21], while Jensen and Waltz
[22], in a celebrated short paper in 1979, applied the principal
components transformation for feature reduction and display
purposes. Soha and Schwartz [23] showed how the transformation could lead to a decorrelated image display procedure. Kauth
and Thomas in 1976 [24] demonstrated that application-specific
transformations could be devised, and proposed their tasseled
cap model. In 1979, it was shown that the principal components
transformation could be used to monitor changes in land cover
between images of the same region taken on different dates [25].
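As a rough illustration of the principal components transformation mentioned above, the following sketch projects pixel vectors onto the eigenvectors of their covariance matrix and keeps the components with the largest variance. The function name `principal_components`, the synthetic four-band data, and the choice of two retained components are assumptions made for the example, not details taken from the cited papers.

```python
import numpy as np

def principal_components(pixels, n_keep):
    """Project pixel vectors (rows) onto their first n_keep principal components."""
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_keep]    # indices of the largest variances
    return centered @ eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(2)
base = rng.normal(size=(500, 1))
# Four strongly correlated synthetic bands: one shared signal plus small noise.
image = np.hstack([base * g for g in (1.0, 0.9, 1.1, 0.8)])
image = image + 0.05 * rng.normal(size=(500, 4))
components, variances = principal_components(image, n_keep=2)
```

Because the synthetic bands are highly correlated, almost all of the variance falls in the first component, which is the property exploited for feature reduction, and the retained components are mutually uncorrelated, which is the property exploited for display.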
B. Classification in Context
The next significant era in remote sensing thematic mapping
commenced with the realization that classification should be
carried out sensitive to the spatial neighborhood of a pixel and
by incorporating information from other available spatial data
sources.
The significance of spatial correlation among pixels at satellite spatial resolutions was recognized early on by LARS staff
[26]–[28], leading to the well-known ECHO classifier described
in detail in 1976 [29]. ECHO, which is still used 25 years after
its introduction, is based on region growing to find homogeneous sets of pixels that can be characterized by group means
and covariances. All the pixels of a given group are labeled in
one step by assessing group similarity to each of the training
classes. Pixels that do not naturally occur in groups are handled individually by regular point classification methods (such
as maximum likelihood).
where the two matrices are different estimates of the sample covariance. Typically, the first would be the (poor) class covariance
estimate obtained from the available training samples, and the second
would be the (global) covariance matrix computed over all the
training samples treated as a single group. Sometimes the principal diagonal matrix of the covariance matrix, or a diagonal
matrix with the trace of the matrix as its elements, is used instead of either, as appropriate.
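The idea of combining a poor class covariance estimate with a global one can be sketched as follows. The convex-combination form, the mixing weight `alpha`, and the function name `blended_covariance` are assumptions chosen for illustration; the estimator used in the cited work is not reproduced here.

```python
import numpy as np

def blended_covariance(class_samples, all_samples, alpha=0.5, use_diagonal=False):
    """Blend a class covariance with a pooled (global) covariance.

    alpha weights the class estimate and (1 - alpha) the pooled estimate.
    The weighting scheme is an illustrative assumption.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    class_cov = np.cov(class_samples, rowvar=False)   # rows are pixels, columns are bands
    pooled_cov = np.cov(all_samples, rowvar=False)
    if use_diagonal:
        # Optionally keep only the principal diagonal of the pooled estimate.
        pooled_cov = np.diag(np.diag(pooled_cov))
    return alpha * class_cov + (1.0 - alpha) * pooled_cov

rng = np.random.default_rng(0)
all_pixels = rng.normal(size=(200, 6))   # 200 synthetic training pixels, 6 bands
few_pixels = all_pixels[:5]              # only 5 samples available for one class
raw = np.cov(few_pixels, rowvar=False)
blended = blended_covariance(few_pixels, all_pixels, alpha=0.3)
rank_raw = np.linalg.matrix_rank(raw)
rank_blended = np.linalg.matrix_rank(blended)
```

With five samples in six bands the raw class covariance is singular and cannot be inverted for maximum-likelihood classification, whereas the blended matrix is full rank.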
This work developed into a series of major contributions
by Landgrebe and colleagues through the 1990s and the
early 2000s, all concerned with the problem of reliable maximum-likelihood classification with high-dimensionality image
datasets [47]–[56]. Much of this work has appeared in Ph.D.
theses supervised by Landgrebe and is summarized in his recent
book [57].
An alternative method for handling covariance matrix estimation in the face of limited training samples has been to simplify
the structure of the matrix. By recognizing that interband correlations are generally strongest in the region of the diagonal, a
block diagonal simplification of the matrix can be adopted that
relaxes considerably the need to gather large numbers of training
samples per class [58].
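A minimal sketch of the block diagonal simplification is given below. The block sizes and the helper name `block_diagonal_covariance` are hypothetical; in practice the block boundaries would be chosen from the observed interband correlation structure.

```python
import numpy as np

def block_diagonal_covariance(cov, block_sizes):
    """Zero all covariances between bands that fall in different blocks."""
    if sum(block_sizes) != cov.shape[0]:
        raise ValueError("block sizes must sum to the number of bands")
    simplified = np.zeros_like(cov)
    start = 0
    for size in block_sizes:
        stop = start + size
        # Copy the within-block covariances and leave everything else at zero.
        simplified[start:stop, start:stop] = cov[start:stop, start:stop]
        start = stop
    return simplified

rng = np.random.default_rng(1)
samples = rng.normal(size=(50, 6))       # synthetic training pixels, 6 bands
full_cov = np.cov(samples, rowvar=False)
block_cov = block_diagonal_covariance(full_cov, block_sizes=[2, 3, 1])
```

Each block can then be estimated and inverted on its own, so the number of training samples required is plausibly governed by the largest block rather than by the full number of bands.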
Algorithm development for thematic mapping may well have
taken a different path had technological limitations not restricted
the spectral range and resolution of the first sensors. If hyperspectral measurements had been available from the start, it is
possible that pixel labeling might have been based on spectroscopic principles because the samples could then have been
composed into a discrete version of the real reflectance spectrum of the earth cover type.
With upward of 200 spectral channels now available, the sampled pixel spectrum contains enough detail to allow spectroscopic principles to be applied for image understanding. Rather
than doing that ab initio each time a pixel has to be identified,
the approach normally taken is to compile reference libraries of
prerecorded spectra against which newly measured data can be
compared for identification. This work actually commenced in
the mid-1980s [59]. Early research concentrated on coding the
spectral data to make full-spectrum library searching feasible
[60]. More recently, though, matching procedures have been
developed around only those spectral features known to be diagnostic [61]. They are generally absorption features that can
be characterized by measurements of their position, depth, and
width. Expert system methods are used to assist in the feature
selection process.
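To make the notion of characterizing an absorption feature by position, depth, and width concrete, the sketch below measures those three quantities on a short spectral segment. It assumes a straight-line continuum between the first and last samples, a single feature whose sides are monotonic, and width taken at half depth; these choices, the function name `absorption_feature`, and the sample values are all illustrative assumptions rather than the procedure of the cited work.

```python
import numpy as np

def absorption_feature(wavelengths, reflectance):
    """Return (position, depth, width) of one absorption feature.

    Assumes the first and last samples are the feature shoulders and that
    the continuum-removed spectrum is monotonic on each side of the minimum.
    """
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    continuum = np.interp(w, [w[0], w[-1]], [r[0], r[-1]])
    removed = r / continuum                  # 1.0 at the shoulders, below 1 inside the feature
    i_min = int(np.argmin(removed))
    position = w[i_min]
    depth = 1.0 - removed[i_min]
    half_level = 1.0 - depth / 2.0
    # np.interp needs increasing x values, so the left side is reversed.
    left = np.interp(half_level, removed[:i_min + 1][::-1], w[:i_min + 1][::-1])
    right = np.interp(half_level, removed[i_min:], w[i_min:])
    return position, depth, right - left

# Hypothetical five-sample segment (wavelengths in micrometres).
position, depth, width = absorption_feature(
    [2.0, 2.1, 2.2, 2.3, 2.4], [0.5, 0.4, 0.3, 0.4, 0.5]
)
```

A library search could then compare these three numbers for each diagnostic feature instead of matching the full spectrum.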
D. Multisource Methods
In the mid to late 1980s, LARS researchers and their associates looked systematically at statistical and evidential
procedures for thematic mapping from several data sources
[62]–[64]. Generally, statistical approaches have been based on
maximizing the multisource posterior probability.
This work was generalized by Benediktsson and Swain in devising the consensus theoretic approach to multisource classification [65].
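One way to picture a statistical multisource decision is to combine per-source class posteriors with source-specific weights and pick the class with the largest combined score. The sketch below uses a weighted logarithmic opinion pool, which is one common consensus rule and is assumed here for illustration; the class names, posterior values, weights, and the function name `log_opinion_pool` are hypothetical.

```python
import math

def log_opinion_pool(source_posteriors, weights):
    """Combine per-source class posteriors with a weighted logarithmic pool."""
    classes = list(source_posteriors[0].keys())
    scores = {}
    for c in classes:
        # Weighted sum of log posteriors; the floor avoids log(0).
        scores[c] = sum(w * math.log(max(p[c], 1e-12))
                        for p, w in zip(source_posteriors, weights))
    # Normalise back to a probability distribution over the classes.
    peak = max(scores.values())
    unnorm = {c: math.exp(s - peak) for c, s in scores.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

optical = {"forest": 0.7, "water": 0.1, "crop": 0.2}
radar = {"forest": 0.4, "water": 0.5, "crop": 0.1}
combined = log_opinion_pool([optical, radar], weights=[1.0, 0.5])
label = max(combined, key=combined.get)
```

The weights play the role of source reliabilities: setting a weight to zero removes that source's influence on the decision entirely.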
Evidential methods for handling multisource data derive from
the treatment of Shafer [66]. These entail allocating measures
of evidential mass or belief to each of the possible labels for
a pixel (the higher the mass the more likely the label). Mass
can also be allocated to joint labeling possibilities (where we
in which the coefficients are a set of mixing proportions. If such a
model is to be used successfully, the number of components,
the mixing parameters, and the means and covariances
for each of the individual distributions all have to be found.
Kuo and Landgrebe [82] use expectation-maximization (EM)
and a range of goodness of fit measures to apply the model to
hyperspectral data. They use feature selection based on nonparametric weighted feature extraction (NWFE) to render the
approach tractable [57]. Dundar and Landgrebe [56] essentially
examine the same problem, but use a range of regularization
where the arguments are polarization, wavelength, and the look angle.
Sometimes these models are based on regression relationships.
More complex models simulate the scattering matrix, the elements of which relate the scattered electric field vector to the
incident field for various polarization combinations.
Inversion of scattering models, which is often nontrivial, allows the analyst to derive meaningful biophysical information
(biomass, age of sea ice, crop yield, etc.) directly from the
recorded data.
We now look at some of the more significant developments in
backscatter modeling that have found application over the past
two decades.
Moisture content is an important determinant of the strength
of scattering from vegetation. One of the earliest effective,
though simple, radar scattering models was the water-cloud
model for vegetation canopy scattering developed by Attema
and Ulaby [94]. It represents the canopy of a forest, say, as
a cloud of water droplets and then uses results from cloud
physics to avoid having to model the complex canopy physical
structure.
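The flavor of the water-cloud model can be conveyed with a short sketch. It uses a widely quoted parameterization in which canopy scattering and two-way attenuation both depend on a single vegetation water descriptor; that form, the coefficients `a` and `b`, and all numerical values are assumptions for illustration and are not taken from [94].

```python
import math

def water_cloud_backscatter(veg_water, soil_sigma0, theta_deg, a=0.1, b=0.5):
    """Total backscatter (linear power units) from a water-cloud canopy over soil.

    a and b are empirical coefficients; the defaults are hypothetical.
    """
    cos_t = math.cos(math.radians(theta_deg))
    tau2 = math.exp(-2.0 * b * veg_water / cos_t)       # two-way canopy attenuation
    sigma_veg = a * veg_water * cos_t * (1.0 - tau2)    # canopy volume scattering
    return sigma_veg + tau2 * soil_sigma0               # attenuated soil term added

bare = water_cloud_backscatter(0.0, soil_sigma0=0.05, theta_deg=30.0)
dense = water_cloud_backscatter(50.0, soil_sigma0=0.05, theta_deg=30.0)
```

With no vegetation the model returns the soil backscatter unchanged, and for a very dense canopy the soil term is attenuated away so that the canopy dominates. Inverting such a model amounts to solving for `veg_water` given a measured backscatter.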
Lang et al. [95]–[97] introduced dielectric disk models for
foliage to give a more realistic canopy model, while Richards
et al. [98] showed how simple separable structural models can
Methods such as multisource (joint) statistical modeling, evidential methods and relaxation labeling show promise for data
fusion, as do support vector machines [112] and more general
credit assignment methods [113]. But from an operational viewpoint, they have a number of potentially serious practical limitations, as do all methods that rely on fusing at the data and
decision levels, as discussed in the following.
It is unlikely that a given analyst would be expert in understanding more than one particular data type. Moreover, many
application specialists are unlikely to be expert in techniques
for handling data and, in some cases, might not even care
about the primary datasets that went into deriving products
of importance. While they might find it interesting that the
data originated from satellite sensors, that may not be significant from the point of view of their specific application.
As a simple analogue, those who are interested in weather
forecasts would not normally know (or care) anything about
cloud top temperatures.
In addition, given that the user most likely would now be
served spatial data over a network, there are a number of fundamental requirements that an operational thematic mapping
schema should satisfy. Apart from being able to cope with a
variety of data types, including mixtures of categorical and numerical data, it must do the following.
1) Account for relative data quality and relevance. This has
been well known since the earliest studies of multisource
classification methods. Poor quality data, or that which is
marginally relevant for a particular application, must not
unreasonably influence the outcome.
2) Allow each data source to be analyzed separately in time
and location, and by its own experts. Different data types
are generally recorded by different agencies, and sometimes those data are not all available when an analysis
is first performed. Also, each particular data source has
its own expert analysis community and it is unlikely that
any single agency or person will be expert across the variety of relevant, contemporary data types; spectroscopic
analysis of hyperspectral data and the derivation of biophysical information from multipolarization InSAR data
is an illustration.
3) Allow preexisting thematic maps to be incorporated into
the analysis process, and thematic map revision must be
accommodated. Previously classified data may have value
to a current exercise, as might existing categorical maps,
and should be able to be incorporated. Likewise, map revision is important.
4) Accept that the thematic classes from a combined dataset
might be different from the classes achievable with any
dataset on its own. This is a particularly important reason
as to why data fusion methods are limited when seeking
to form operational joint inferences. Table I provides a
simple illustration: the types of information class relevant
to one particular dataset can be quite different from those
relevant to a different data type; in other words, information classes are often source-specific. The information
classes of interest to the user may be quite different again,
but should be derivable from the source-specific labels.
TABLE I
CLASSES CAN BE DIFFERENT FOR DIFFERENT DATA TYPES
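The requirements listed above point toward combining labels rather than data. A minimal sketch of that idea is a rule table that maps pairs of source-specific labels to a user information class, with a pre-existing thematic map used as a fallback. Every label, rule, and name below (`RULES`, `fuse_labels`) is hypothetical and chosen only to illustrate the mechanism.

```python
# Hypothetical source-specific labels and user classes, for illustration only.
RULES = {
    ("vegetation", "high biomass"): "forest",
    ("vegetation", "low biomass"): "cropland",
    ("bare soil", "low biomass"): "fallow",
}

def fuse_labels(optical_label, radar_label, existing_label=None):
    """Derive a user information class from labels produced by separate analyses."""
    fused = RULES.get((optical_label, radar_label))
    if fused is not None:
        return fused
    # No rule fired: fall back on a pre-existing thematic map label if available.
    return existing_label if existing_label is not None else "unresolved"

forest_pixel = fuse_labels("vegetation", "high biomass")
mapped_pixel = fuse_labels("water", "low biomass", existing_label="lake")
unknown_pixel = fuse_labels("water", "low biomass")
```

Because the rules operate on labels only, each source can be analyzed at a different time and place by its own experts, and the user classes need not coincide with the classes of any single source.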
Fig. 3.
Fig. 4.
types.
generally. Remote sensing, fundamentally, is an applications-driven field, and while there is still room for the development
of further thematic mapping algorithms, the requirements of
the end user must drive the outcomes. Our methodologies must
now be as much about choosing the most relevant primary data
types as doing the actual analysis. It is to consultants, skilled
in translating user requirements into tasks for data source
experts, and expert in combining the single-source outputs
into the product required by the client, that the field must
be given. Only then will the ground-breaking and fundamental
analytical work pioneered at LARS, and by Dave Landgrebe,
find maturity.
REFERENCES
[1] K. R. Castleman, Digital Image Processing, 1st ed. Englewood Cliffs, NJ: Prentice-Hall, 1979.
[2] N. J. Nilsson, Learning Machines. New York: McGraw-Hill, 1965.
[3] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[4] D. A. Landgrebe. (1986) A brief history of the Laboratory for Applications of Remote Sensing (LARS). [Online]. Available: www.lars.purdue.edu/home/LARSHistory.html
[5] G. Cardillo and D. Landgrebe, "On pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 101 866, 1966.
[6] K. S. Fu, D. A. Landgrebe, and T. L. Phillips, "Information processing of remotely sensed agricultural data," Proc. IEEE, vol. 57, no. 4, pp. 639–653, Apr. 1969.
[7] A. Wacker, "A cluster approach to finding spatial boundaries in multispectral imagery," Purdue Univ., West Lafayette, IN, LARS Tech. Note 122 969, 1969.
[8] K. S. Fu and P. J. Min, "On feature selection in multiclass pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 080 168, 1968.
[9] P. J. Min, D. A. Landgrebe, and K. S. Fu, "Feature selection in multiclass pattern recognition," Purdue Univ., West Lafayette, IN, LARS Tech. Note 050 170, 1970.
[10] A. Wacker and D. A. Landgrebe, "The minimum distance approach to classification," Ph.D. dissertation, Purdue Univ., School of Elect. Eng., West Lafayette, IN, 1971.
[11] P. H. Swain and A. G. Wacker, "Comparison of the divergence and B-distance in feature selection," Purdue Univ., West Lafayette, IN, LARS Tech. Note 020 871, 1971.
[12] P. H. Swain and R. C. King, "Two effective feature selection criteria for multispectral remote sensing," in Proc. 1st Int. Joint Conf. Pattern Recognition, Washington, DC, Nov. 1973, pp. 536–540.
[13] P. H. Swain and S. M. Davis, Remote Sensing: The Quantitative Approach. New York: McGraw-Hill, 1978.
[14] M. D. Fleming, J. S. Berkebile, and R. M. Hoffer, "Computer-aided analysis of Landsat-1 MSS data: A comparison of three approaches including a modified clustering approach," in Proc. Symp. Machine Processing of Remotely Sensed Data, West Lafayette, IN, Jun. 3–5, 1975, pp. 54–61.
[15] X. Jia and J. A. Richards, "Cluster space representation for hyperspectral classification," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 3, pp. 593–598, Mar. 2002.
[16] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Texture features for image classification," IEEE Trans. Syst., Man, Cybern., vol. SMC-3, no. 6, pp. 610–621, Nov. 1973.
[17] K. R. Castleman, Digital Image Processing, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1996.
[18] R. M. Hord and W. Brooner, "Land-use map accuracy criteria," Photogramm. Eng. Remote Sens., vol. 42, no. 5, pp. 671–677, 1976.
[19] R. G. Congalton and K. Green, Assessing the Accuracy of Remotely Sensed Data: Practices and Principles. Boca Raton, FL: Lewis, 1993.
[20] G. M. Foody, "Status of land cover classification accuracy assessment," Remote Sens. Environ., vol. 80, pp. 185–201, 2002.
[21] W. G. Eppler, "Canonical analysis for increased classification speed and channel selection," IEEE Trans. Geosci. Electron., vol. GE-14, no. 1, pp. 26–33, Jan. 1976.
[22] S. K. Jensen and F. A. Waltz, "Principal components analysis and canonical analysis in remote sensing," in Proc. Amer. Soc. Photogrammetry 45th Annu. Meeting, 1979, pp. 337–348.
[73] ---, "A low-cost classifier for multitemporal applications," Int. J. Remote Sens., vol. 6, pp. 1405–1417, 1985.
[74] J. A. Benediktsson, P. H. Swain, and O. K. Ersoy, "Neural network approaches versus statistical methods in classification of multisource remote sensing data," IEEE Trans. Geosci. Electron., vol. GE-28, no. 4, pp. 540–552, Jul. 1990.
[75] J. A. Benediktsson and P. H. Swain, "Statistical methods and neural network approaches for classification of data from multiple sources," Ph.D. dissertation, Purdue Univ., School of Elect. Eng., West Lafayette, IN, 1990.
[76] J. A. Gualtieri and R. F. Cromp, "Support vector machines for hyperspectral remote sensing classification," in Proc. SPIE 27th AIPR Workshop: Advances in Computer Assisted Recognition, vol. 3584, R. J. Merisko, Ed., 1998, pp. 221–232.
[77] B. Schölkopf and A. Smola, Learning with Kernels. Cambridge, MA: MIT Press, 2002.
[78] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining Knowl. Discov., vol. 2, pp. 121–167, 1998.
[79] C. Huang, L. S. Davis, and J. R. G. Townshend, "An assessment of support vector machines for land cover classification," Int. J. Remote Sens., vol. 23, no. 4, pp. 725–749, 2002.
[80] J. A. Gualtieri and S. Chettri, "Support vector machines for classification of hyperspectral data," in Proc. IGARSS, vol. 2, Honolulu, HI, Jul. 24–28, 2000, pp. 813–815.
[81] F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1796, Aug. 2004.
[82] B.-C. Kuo and D. A. Landgrebe, "A robust classification procedure based on mixture classifiers and nonparametric weighted feature extraction," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 11, pp. 2486–2494, Nov. 2002.
[83] P. Cheeseman and J. Stutz, "Bayesian classification (AutoClass): Theory and results," in Advances in Knowledge Discovery and Data Mining, U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds. Menlo Park, CA: AAAI Press, 1996.
[84] M. Datcu and K. Seidel, "Bayesian methods: Applications in information aggregation and image data mining," in Int. Arch. Photogramm. Remote Sensing, vol. 34, part 7-4-3, Valladolid, Spain, Jun. 3–4, 1999, W6.
[85] M. Schroeder, H. Rehrauer, K. Seidel, and M. Datcu, "Interactive learning and probabilistic retrieval in remote sensing image archives," IEEE Trans. Geosci. Remote Sens., vol. 38, no. 5, pp. 2288–2298, Sep. 2000.
[86] A. Plaza, P. Martinez, R. Perez, and J. Plaza, "A quantitative and comparative analysis of endmember extraction algorithms for hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 3, pp. 650–663, Mar. 2004.
[87] R. A. Schowengerdt, Remote Sensing: Models and Methods for Image Processing. San Diego, CA: Academic, 1997.
[88] J. Zhang and G. M. Foody, "A fuzzy classification of sub-urban land cover from remotely sensed imagery," Int. J. Remote Sens., vol. 19, no. 14, pp. 2721–2738, 1998.
[89] F. del Frate, G. Schiavon, D. Solimini, M. Borgeaud, D. H. Hoekman, and M. A. M. Vissers, "Crop classification using multiconfiguration C-band SAR data," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1611–1619, Jul. 2003.
[90] M. E. Engdahl and J. M. Hyyppa, "Land-cover classification using multitemporal ERS-1/2 InSAR data," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1620–1628, Jul. 2003.
[91] X. Blaes and P. Defourny, "Retrieving crop parameters based on tandem ERS1/2 interferometric coherence images," Remote Sens. Environ., vol. 88, pp. 374–385, 2003.
[92] J. R. Santos, C. F. Freitas, L. S. Araujo, L. V. Dutra, J. C. Mura, F. F. Gama, L. S. Sola, and S. J. S. Sant'Anna, "Airborne P-band SAR applied to the aboveground biomass studies in the Brazilian tropical rainforest," Remote Sens. Environ., vol. 87, pp. 482–493, 2003.
[93] "Special issue on retrieval of bio- and geophysical parameters from SAR data for land applications," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 7, pp. 1539–1710, Jul. 2003.
[94] E. P. W. Attema and F. T. Ulaby, "Vegetation modeled as a water cloud," Radio Sci., vol. 13, pp. 357–364, 1978.
[95] R. H. Lang, "Electromagnetic scattering from sparse distribution of lossy dielectric scatterers," Radio Sci., vol. 16, pp. 15–30, 1981.
[96] R. H. Lang and J. S. Sidhu, "Electromagnetic backscattering from a layer of vegetation: A discrete approach," IEEE Trans. Geosci. Remote Sens., vol. GE-21, no. 1, pp. 62–71, Jan. 1983.
[97] R. H. Lang and H. A. Saleh, "Microwave inversion of leaf area and inclination angle distribution from backscattered data," IEEE Trans. Geosci. Remote Sens., vol. GE-23, no. 5, pp. 685–694, Sep. 1985.
[111] "Special issue on data fusion," IEEE Trans. Geosci. Remote Sens., pt. 1, vol. 37, no. 3, pp. 1187–1377, May 1999.
[112] G. H. Halldorsson, J. A. Benediktsson, and J. R. Sveinsson, "Support vector machines in multisource classification," in Proc. IGARSS, vol. 3, Toulouse, France, Jul. 2003, pp. 2054–2056.
[113] C. M. Bachman, M. H. Bettenhausen, R. A. Fusina, T. F. Donato, A. L. Russ, J. W. Burke, G. M. Lamela, W. J. Rhea, B. R. Truit, and J. H. Porter, "A credit assignment approach to fusing classifiers of multiseason hyperspectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 11, pp. 2488–2499, Nov. 2003.
[114] A. Srinivasan, "An artificial intelligence approach to the analysis of multiple information sources in remote sensing," Ph.D. thesis, Univ. New South Wales, School of Elect. Eng., Kensington, Australia, 1991.
[115] A. Srinivasan and J. A. Richards, "Analysis of GIS spatial data using knowledge-based methods," Int. J. Geograph. Inf. Syst., vol. 7, no. 6, pp. 479–500, 1993.
[116] J. Lloyd, Logic for Learning, ser. Cognitive Technologies Series. Berlin, Germany: Springer-Verlag, 2003.