The use of Neural Networks for the estimation of oceanic constituents based
on the MERIS instrument
All content following this page was uploaded by Eon O'mongain on 07 June 2014.
SEÁN DANAHER
FIES, Leeds Metropolitan University, Leeds LS1 3HE, England, UK
Abstract. Artificial Neural Networks (NNs) are used to estimate oceanic
constituents from simulated data for the Medium Resolution Imaging
Spectrometer (MERIS) instrument system for Case II water applications. The
simulation includes the effects of oceanic substances such as algal-related
chlorophyll, non-chlorophyllous suspended matter and DOM (dissolved organic
matter). It is shown here that NNs can be used to estimate oceanic constituents
based on simulated data which include the effects of realistic noise and
variability models. The advantage of NNs is that they not only achieve higher
retrieval accuracy than more traditional techniques such as band ratio
algorithms, but they also allow the inclusion of usually superfluous or unused
information, such as geometric parameters and atmospheric visibility.
1. Introduction
A number of methods have been proposed for the estimation of oceanic
constituents from both real and simulated remotely sensed data (Gordon et al.
1983, Parslow 1991, Danaher and O'Mongain 1992, Danaher et al. 1992, O'Mongain
et al. 1993). Recent attempts at maximizing the inversion capabilities have
concentrated on the use of Neural Networks (NNs) to allow inversion with the
inclusion of parameters such as Sun angle and viewing geometry (Benediktsson
et al. 1993, Buckton et al. 1995).
The simulations developed by University College Dublin and Leeds Metropolitan
University allow the calculation of the expected signal at satellite level,
including the effects of atmospheric interaction and instrument noise
(O'Mongain et al. 1993, Buckton et al. 1995). Any realistic simulation requires
noise and variability models so as to provide information about the true
ability to retrieve the concentration of oceanic constituents.
† e-mail: Eon.Omongain@ucd.ie
Figure 1. General structure of the simulation and estimation of the oceanic constituents
utilizing oceanic, atmospheric and satellite models.
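Table 1. Model input ranges.

Parameter              Minimum   Maximum   Units
$C_{chl}$              0         30        $\mu$g l$^{-1}$
$C_{sed}$              0         1         mg l$^{-1}$
$C_{yel}$              0         1         $a(440)$, m$^{-1}$
$\theta_S$             23        53        deg.
$\theta_v$             0         47        deg.
$\phi$                 0         180       deg.
$\tau_p(\lambda_0)$    0.1       0.15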
MERIS 1843
$$R(\lambda, z) = 0.331\,\frac{b_b(\lambda, z)}{a(\lambda, z)}. \qquad (1)$$
For this the absorption $a$ and backscatter $b_b$ are calculated by the simple
addition of their specific components. Hence the total absorption and total
backscatter are given by the equations

$$a = a_w + a_c + a_s + a_y, \qquad b_b = b_w + b_c + b_s + b_y \qquad (2)$$

where the subscripts $w$, $c$, $s$ and $y$ denote water, chlorophyll-a indexed
algae, sediments and DOM. Morel's model for oceanic reflectance for Case I
waters is used to calculate the absorption and backscatter coefficients
attributable to algae and is then modified to the Case II situation by the
inclusion of the effects of sediments and DOM.
The pure water measurements used are taken from Smith and Baker (1981), while
the sediment's specific absorption and backscatter are taken from measurements
made by Doerffer (1992). The values for the absorption and scatter due to
sediments are very much dependent on the sediment type chosen and hence are
specific to a particular body of water. The absorption coefficient due to DOM
can be modelled by the equation

$$a_y(\lambda) = a_y(\lambda_0) \exp[-0.014(\lambda - \lambda_0)] \qquad (3)$$

where $\lambda_0$ is taken to be 440 nm, which is our reference point for the
concentration of the DOM. The backscatter due to DOM is assumed to be
negligible and can be dropped from equation (2).
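
As a minimal sketch of how equations (1) to (3) combine, the following Python fragment evaluates the DOM absorption and the subsurface reflectance at a single wavelength. The function names and all numerical component values are hypothetical placeholders chosen for illustration; they are not the Smith and Baker, Morel or Doerffer data used in the simulation.

```python
import math

def dom_absorption(wavelength_nm, a_ref, ref_nm=440.0, slope=0.014):
    """Equation (3): DOM absorption decays exponentially away from the 440 nm reference."""
    return a_ref * math.exp(-slope * (wavelength_nm - ref_nm))

def subsurface_reflectance(absorptions, backscatters, factor=0.331):
    """Equations (1)-(2): sum the component coefficients, then take the ratio."""
    a_total = sum(absorptions)
    bb_total = sum(backscatters)
    return factor * bb_total / a_total

# Placeholder component values at 490 nm (illustrative only, not measured data).
a_w, a_c, a_s = 0.02, 0.03, 0.01      # water, algae, sediment absorption (m^-1)
b_w, b_c, b_s = 0.002, 0.001, 0.004   # water, algae, sediment backscatter (m^-1)
a_y = dom_absorption(490.0, a_ref=0.5)

# DOM backscatter is neglected, so only three backscatter terms are summed.
reflectance = subsurface_reflectance([a_w, a_c, a_s, a_y], [b_w, b_c, b_s])
print(f"a_y(490) = {a_y:.4f} m^-1, R = {reflectance:.4f}")
```

In a full simulation the same two functions would be evaluated per band, with wavelength-dependent component spectra in place of the scalar placeholders.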
The atmospheric model requires as input the surface reflectance, where the
surface is assumed to be a Lambertian reflector. The above-surface reflectance
is calculated from the subsurface reflectance by the constant factor of a half.
This simulation of the air–water interface will require further work to improve
its realism. The subsurface reflectances due to increasing concentrations of
the constituents, individually, are shown in figure 2.
reflectance decreases in the blue as the chlorophyll concentration increases;
for sediment the reflectance increases as the concentration increases; and for
yellow substance the reflectance in the blue decreases as the concentration
increases.
the simulation having a range 0.1 to 0.15, where $\tau_p(\lambda_0) = 0.132$
corresponds to 23 km horizontal visibility.
Figure 3. The signal-to-noise ratio (SNR) for the MERIS instrument for a typical dataset,
including photonic and instrument noise. Instrument and noise models used were
those available at the time of the simulation and may differ from current models.
every neuron on the next layer. The input $p$ and the output $a$ of a single
neuron are related by a weight $w$, bias $b$ and a transfer function $f$, as
represented by figure 4 and the equation

$$a = f(wp + b) \qquad (4)$$

The transfer function of the neuron can vary from binary to linear to complex
functions; log-sigmoid transfer functions are the type used here (with the
exception of the output layer, which is linear). The output of a single
log-sigmoid neuron can be represented by the equation

$$a = \frac{1}{1 + \exp[-(wp + b)]} \qquad (5)$$
The output of the neuron would either feed through into the next layer or form one
of the outputs of the network (if the neuron is in the last layer).
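
The single-neuron relation can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and the log-sigmoid is written in its standard form $1/(1 + \exp(-n))$ with $n = wp + b$.

```python
import math

def logsig(n):
    """Log-sigmoid transfer function, mapping any real n into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def linear(n):
    """Linear transfer function, as used in the output layer."""
    return n

def neuron(p, w, b, transfer=logsig):
    """Equation (4): a = f(w p + b) for a single scalar input."""
    return transfer(w * p + b)

print(neuron(0.5, w=2.0, b=-1.0))                    # prints 0.5
print(neuron(0.5, w=2.0, b=-1.0, transfer=linear))   # prints 0.0
```

With these values $wp + b = 0$, so the sigmoid neuron sits at its midpoint of 0.5 while the linear neuron returns 0.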
The overall structure is graphically represented in figure 5, where the input
values to the complete network are specified in the input vector $P$, with the
final output vector $I$ being determined by the combination of the input with
the weights and biases. The network may be represented by some function $F$
such that we may write

$$I = F(P, w_1, \ldots, w_l, b_1, \ldots, b_l) \qquad (6)$$
Figure 4. A single neuron consisting of an input value $p$, a weight $w$, a bias $b$ and a
transfer function $f$.
1846 D. Buckton et al.
Figure 5. A two-layer network consisting of three inputs, a four-neuron sigmoid layer and a
linear output layer.
where $w_i$ and $b_i$ represent vectors containing the weights and biases of
the $i$th layer, $l$ being the total number of layers in the network.
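
A forward pass through a network of the shape shown in figure 5 (three inputs, a four-neuron sigmoid layer and a linear output layer) might be sketched as follows. The weights and biases are arbitrary illustrative numbers, not trained values, and the helper names are assumptions of this sketch.

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def layer(inputs, weights, biases, transfer):
    """One layer: each row of `weights` holds the weights of one neuron."""
    return [transfer(sum(w * p for w, p in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(P, w1, b1, w2, b2):
    """I = F(P, w1, w2, b1, b2): sigmoid hidden layer followed by a linear output layer."""
    hidden = layer(P, w1, b1, logsig)
    return layer(hidden, w2, b2, lambda n: n)

# Arbitrary illustrative weights: 3 inputs -> 4 hidden neurons -> 1 output.
w1 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5], [0.2, 0.2, 0.2], [-0.3, 0.1, 0.4]]
b1 = [0.0, 0.1, -0.1, 0.2]
w2 = [[0.5, -0.5, 0.25, 0.75]]
b2 = [0.1]

print(forward([1.0, 2.0, 3.0], w1, b1, w2, b2))
```

Adding further output neurons only requires extra rows in `w2` and extra entries in `b2`.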
The purpose of NNs in this context is effectively to perform a function
approximation. If, for example, we have a function $G$ that produces, based on
some physical parameters $x$ and $y$, the observation $O$ given by

$$O = G(x, y) \qquad (7)$$

then it is possible, using derived values of the weights and biases, to produce
a NN which can approximate either the function $G$ or its inverse $G^{-1}$.
Here the simulation performs the function of $G$, with $x$ being the
constituents, $y$ the viewing geometries and $O$ the observations. We desire
the network to perform effectively an inversion of this, such that from the
observational data and the viewing information we can robustly estimate the
constituent concentrations.
For the network to approximate a function, the appropriate values for the
weights and biases have to be determined. This is done in the training stage,
which uses a set of inputs $P$ and a known corresponding calibrating or
training set $T$ that span the input range over which the user wishes the
network to operate. An error-minimizing function is presented with the network,
along with random initial guesses of the weights and biases and the set of
inputs $P$ with the corresponding set of training parameters $T$. The training
algorithm initially calculates the output vector or matrix $I$ from $P$. This
is denoted $I_0$, the subscript 0 denoting the zeroth epoch or stage in the
error minimization process. The training algorithm calculates the sum-squared
error between $I_0$ and $T$ and modifies the weights and biases according to
the error minimization algorithm, which often follows the direction of steepest
descent of the error surface. This is repeated until the error between $T$ and
$I$ reaches a satisfactory level. Once training has been completed it is then
possible to use the network to estimate, from a new set of observables, the
corresponding input values (here, the constituent concentrations).
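
The training loop described above can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used in the paper: it uses simple batch steepest descent on the sum-squared error, a one-input, one-output network, and a toy target $T = P^2$; the learning rate, epoch count and network size are arbitrary choices.

```python
import math
import random

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def forward(p, w1, b1, w2, b2):
    """Scalar input -> sigmoid hidden layer -> single linear output."""
    hidden = [logsig(w * p + b) for w, b in zip(w1, b1)]
    return hidden, sum(v * h for v, h in zip(w2, hidden)) + b2

def train(P, T, n_hidden=4, rate=0.005, epochs=2000, seed=0):
    rng = random.Random(seed)  # random initial guesses of weights and biases
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b2 = rng.uniform(-1.0, 1.0)
    history = []  # sum-squared error at the start of each epoch
    for _ in range(epochs):
        g_w1, g_b1, g_w2 = [0.0] * n_hidden, [0.0] * n_hidden, [0.0] * n_hidden
        g_b2, sse = 0.0, 0.0
        for p, t in zip(P, T):
            hidden, out = forward(p, w1, b1, w2, b2)
            err = out - t
            sse += err * err
            g_b2 += 2.0 * err
            for j, h in enumerate(hidden):
                g_w2[j] += 2.0 * err * h
                back = 2.0 * err * w2[j] * h * (1.0 - h)  # chain rule through the sigmoid
                g_b1[j] += back
                g_w1[j] += back * p
        history.append(sse)
        # Step against the gradient of the sum-squared error.
        w1 = [w - rate * g for w, g in zip(w1, g_w1)]
        b1 = [b - rate * g for b, g in zip(b1, g_b1)]
        w2 = [v - rate * g for v, g in zip(w2, g_w2)]
        b2 -= rate * g_b2
    return (w1, b1, w2, b2), history

P = [i / 20.0 for i in range(21)]
T = [p ** 2 for p in P]
params, history = train(P, T)
print(f"initial error {history[0]:.4f}, final error {history[-1]:.6f}")
```

Here `history[0]` plays the role of the error at the zeroth epoch, and training would in practice be halted once the error reaches a satisfactory level rather than after a fixed number of epochs.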
The initial requirements for the implementation of a neural network lie in
decisions about: the number of layers to use in the network; the number of
neurons to use in each layer; the transfer function of the neurons in each
layer; and the method of presenting the data to the network. The mechanisms for
training and implementation are well documented (Mathworks Inc. 1994). The
principal training algorithm used here is that attributable to
Levenberg–Marquardt which, while requiring more memory than other techniques
for its implementation, typically converges quickly to a minimum of the error
function. It should also be noted that the final implementation of a NN and the
training technique are quite independent processes. While the training process
tends to be numerically intensive, the final implementation requires
comparatively little computation.
spanned by the signal space associated with the constituents, we can select the
first $k$ columns of the $W$ matrix as input to the network, which we denote by
a prime, giving $W'$. This reduces the size of the input matrix to the network.
Typically the value of $k$ used is between eight and ten, depending on the
relative levels of signal and noise. The geometric parameters are usually
appended (columnwise) to this to generate the input to the network $P$, given by

$$P = [\,W' \;\; \theta_S \;\; \theta_v \;\; \phi\,]. \qquad (11)$$

The NN is trained for this with the associated training matrix $T$; this
contains the oceanic constituent concentrations as specified in equation (9).
Once the weights and biases have been established by training, the constituent
concentrations for subsequent datasets can be estimated by presenting the
network with a new set of data. A new input matrix $\tilde{P}$ can be generated
from the new observation matrix $\tilde{O}$ by making use of the unitary
orthogonal properties of the $V$ matrix, hence

$$\tilde{P} = [\,\tilde{O} V' \Lambda'^{-1} \;\; \tilde{\theta}_S \;\; \tilde{\theta}_v \;\; \tilde{\phi}\,] \qquad (12)$$

where again $V'$ and $\Lambda'$ are reduced matrices containing only the first
$k$ columns and diagonal elements respectively, with the tilde signifying that
the parameters used belong to a set independent of the training set.
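
The construction of the network input can be sketched with NumPy, assuming the observation matrix is factorized by a singular value decomposition $O = W \Lambda V^T$ (consistent with equation (12)). The matrix sizes, the random stand-in data and the function names are assumptions made purely for illustration.

```python
import numpy as np

def fit_compression(O, k):
    """Decompose O = W Lambda V^T and keep the k leading components."""
    W, singular_values, Vt = np.linalg.svd(O, full_matrices=False)
    return W[:, :k], Vt.T[:, :k], singular_values[:k]

def network_input(O_new, V_k, lambda_k, theta_s, theta_v, phi):
    """Equation (12): project new observations, then append the geometry columns."""
    scores = (O_new @ V_k) / lambda_k   # right-multiplication by the inverse diagonal matrix
    return np.column_stack([scores, theta_s, theta_v, phi])

rng = np.random.default_rng(0)
O = rng.normal(size=(300, 15))      # 300 pixels x 15 bands, synthetic stand-in data
theta_s = rng.uniform(23, 53, 300)  # geometry ranges taken from table 1
theta_v = rng.uniform(0, 47, 300)
phi = rng.uniform(0, 180, 300)

W_k, V_k, lambda_k = fit_compression(O, k=10)
P = network_input(O, V_k, lambda_k, theta_s, theta_v, phi)
print(P.shape)  # (300, 13)
```

Projecting the training observations themselves reproduces the first $k$ columns of $W$, which is why the same projection can be applied to an independent dataset without recomputing the decomposition.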
The training phase is deemed to be complete when the error on a second training
set decreases negligibly or starts to increase. Continued training is unlikely
to improve the inversion accuracy and, in the case of an overdetermined
network, is likely to cause overtraining, a situation where the network fits to
specific data points instead of generalizing to the overall scene. At this
stage inversion accuracy is tested on a third simulation set to provide a truly
independent estimate of the accuracy.
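
The stopping rule on the second dataset can be expressed as a small helper. This is a hedged sketch: the function name and the absolute tolerance used to define a "negligible" decrease are assumptions, not values from the paper.

```python
def stopping_epoch(validation_errors, tolerance=1e-3):
    """Return the first epoch at which the second-set error improves by less than
    `tolerance` (or rises); fall back to the last epoch if that never happens."""
    for epoch in range(1, len(validation_errors)):
        improvement = validation_errors[epoch - 1] - validation_errors[epoch]
        if improvement < tolerance:
            return epoch
    return len(validation_errors) - 1

errors = [4.0, 2.0, 1.2, 0.9, 0.8995, 0.95]
print(stopping_epoch(errors))  # prints 4
```

In the example the error still falls at epoch 4, but only by 0.0005, so training would stop there, before the rise at epoch 5 that signals overtraining.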
Whereas the weights and biases are calculated within the training algorithm,
human intervention is used to determine both the number of layers and the
number of neurons in each layer. With too simple an architecture (too few
neurons and/or layers) the sum-squared error will not reduce sufficiently even
with the training data. With too complex an architecture the network will yield
a very low error on the first training set but the performance will degrade on
the second dataset (overtraining).
relationship between the estimated and the true value, and a correlation of
zero indicates no relationship between the two measures. It should be noted
that the correlation measure ignores any offset (bias) and scaling errors
between the two sets. Hence the correlation measure is not a complete measure
of accuracy. The RMS error calculates the normed difference between the two
measures and expresses it in decibel format. Assuming a normal distribution of
error between the estimate and the simulated value, the error can then be
described as being within 30% for −10 dB, 10% for −20 dB and 3% for −30 dB.
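
The two accuracy measures can be compared with a short sketch. The decibel definition used here, $20 \log_{10}(\lVert \text{estimate} - \text{truth} \rVert / \lVert \text{truth} \rVert)$, is an assumption chosen because it reproduces the quoted correspondence between decibels and percentages; the function names and sample values are illustrative.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rms_error_db(estimate, truth):
    """Relative normed difference in decibels (assumed definition)."""
    diff = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)))
    norm = math.sqrt(sum(t ** 2 for t in truth))
    return 20.0 * math.log10(diff / norm)

def db_to_percent(db):
    return 100.0 * 10.0 ** (db / 20.0)

truth = [1.0, 2.0, 3.0, 4.0]
biased = [2.0 * t + 3.0 for t in truth]   # scaled and offset estimate

print(correlation(truth, biased))         # close to 1 despite the bias and scaling
print(rms_error_db(biased, truth))        # positive: the error exceeds the signal norm
print([round(db_to_percent(d)) for d in (-10, -20, -30)])  # [32, 10, 3]
```

The biased estimate is perfectly correlated with the truth yet has a large RMS error, which illustrates why both measures are reported.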
4. Results
The model input in the form of simulated concentrations and geometry is shown
in table 1. The general procedure for the selection of a suitably sized NN is
to start with an undersized network which is trained until little improvement
per epoch is attained. The size of the network is gradually increased until no
significant improvement with size is attained or the presence of overtraining
is observed. The number of floating point operations (flops) for the inversion
process is dependent on the size of the network and increases significantly
with the number of nodes in each layer.
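
How the operation count scales with network size can be sketched with a simple counter. The counting convention (one multiply and one add per weight, transfer functions and pre-processing ignored) and the hidden-layer sizes are assumptions for illustration; the 13 inputs correspond to $k = 10$ components plus three geometric parameters, and the three outputs to the three constituents.

```python
def mlp_flops(layer_sizes):
    """Multiply and add count for one forward pass, ignoring transfer functions."""
    return sum(2 * n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

for hidden in (4, 8, 16):
    print(hidden, mlp_flops([13, hidden, 3]))
# prints:
# 4 128
# 8 256
# 16 512
```

For a single hidden layer the count grows linearly with the number of hidden neurons; with two or more hidden layers the product of adjacent layer sizes makes it grow faster.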
Figure 6. Simulated chlorophyll concentration versus estimated concentration for 300 data
points. The left plot corresponds to simulated data without noise, obtaining a retrieval
accuracy of 3% with a correlation of 0.994. The right plot shows retrieval abilities in
the presence of photonic and instrument noise, obtaining a retrieval accuracy of 27%
with a correlation of 0.85. Both plots include unknown atmospheric optical depth.
effect on the inversion accuracy, the results of which are shown in table 3. A
scatter plot of the simulated chlorophyll concentration versus the retrieved
chlorophyll concentration is shown in figure 6. For the results presented here
$k = 10$; the slight reduction in the retrieval accuracy due to this is offset
by the reduction in the computational requirements for inversion and training.
These results show that the use of NNs in the inversion of oceanic constituents
in Case II waters is likely to be able to calculate the chlorophyll-a and
sediment concentration to an accuracy of between 10% and 30% over the range
specified.
5. Conclusion
In the absence of noise we have shown how it is possible, using a reasonable
number of pixels with ground truth (approximately 300 data points), to invert
simulated satellite observations to calculate oceanic constituent
concentrations. However, the performance of any retrieval technique will be
adversely affected by the presence of noise sources and variability as outlined
in §4.2. With the inclusion of instrument and photonic noise, variations due to
the atmosphere (in a limited way) and geometric parameters, the inversion
accuracy is reduced but still performs within an acceptable accuracy for Case
II waters. The required level of computation to perform such an inversion
(including pre-processing) is approximately 1130 flops per pixel for the
network used here. This number may be reduced by the use of look-up tables.
Further work is required to examine the effects of additional noise,
variabilities and instabilities as described. It is also desirable to include
atmospheric correction schemes, whether NN based (O'Mongain et al. 1993) or
using traditional techniques, so that the inversion mechanism can be trained on
estimated above-water reflectances.
References
Benediktsson, J., Swain, P., and Ersoy, K., 1993, Conjugate-gradient neural networks in
classification of multisource and very high dimensional remote sensing data.
International Journal of Remote Sensing, 14, 2883–2903.
Buckton, D., Danaher, S., and O'Mongain, E., 1995, Simulation of the MERIS instrument
and constituent estimation. Proceedings SPIE Global Process Monitoring and Remote
Sensing of the Ocean and Sea Ice, 2586, 2–13.
Carling, A., 1992, Introducing Neural Networks (Sigma Press, UK). ISBN: 1-85058-174-6.
Danaher, S., and O'Mongain, E., 1992, Singular value decomposition in multispectral
radiometry. International Journal of Remote Sensing, 13, 1771–1777.
Danaher, S., O'Mongain, E., and Walsh, J., 1992, A new cross-correlation algorithm and
the detection of rhodamine-B dye in sea water. International Journal of Remote Sensing,
13, 1743–1755.
Demuth, H., and Beale, M., 1994, The Neural Network Toolbox: User Guide, The Mathworks
Inc., Natick, MA, USA.
Deschamps, P. Y., Herman, M., Tanré, D., Rouquet, M. C., and Durpaire, J. P., 1982,
Effets atmosphériques et évaluation du signal pour des instruments optiques de
télédétection. ESA Journal, 6, 233–246.
Doerffer, R., 1992, Imaging spectroscopy for detection of chlorophyll and suspended matter,
GKSS, ISSN 0344-9629; F. Toselli and J. Bodechtel (eds.), Imaging Spectroscopy:
Fundamentals and Prospective Applications, 215–257.
Gordon, H., Brown, O. B., and Jacobs, M. M., 1975, Computed relationships between
inherent and apparent optical properties of a flat homogeneous ocean. Applied Optics,
14(2), 417–427.
Gordon, H., Clark, D. K., Brown, O. B., Evans, R. H., and Broenkow, W. W., 1983,
Phytoplankton pigment concentrations in the Middle Atlantic Bight: comparison of
ship determinations and CZCS estimates. Applied Optics, 22(1), 20–36.
Morel, A., 1988, Optical modeling of the upper ocean in relation to its biogenous matter
content (Case I waters). Journal of Geophysical Research, 93, 10 749–10 768.
O'Mongain, E., Danaher, S., Buckton, D., and Bezy, J. L., 1993, Definition of the
calibration requirements for an imaging spectrometer system. Proceedings SPIE Recent
Advances in Sensors, Radiometric Calibration, and Processing of Remotely Sensed Data,
1938, 88–99.
Parslow, J., 1991, An efficient algorithm for estimating chlorophyll from CZCS data.
International Journal of Remote Sensing, 12, 2065–2072.
Smith, R., and Baker, K., 1981, Optical properties of the clearest natural waters (200–800 nm).
Applied Optics, 20, 177–184.