
Support vector regression methodology for storm surge predictions

S. Rajasekaran (a,*), S. Gayathri (a), T.-L. Lee (b)

(a) Department of Civil Engineering, PSG College of Technology, Coimbatore 641004, Tamil Nadu, India
(b) Department of Construction and Facility Management, Leader University, Taiwan, ROC
Article info
Article history:
Received 11 March 2008
Accepted 8 August 2008
Available online 14 August 2008
Keywords:
Prediction
Storm surge
Support vector regression
Kernel function
Abstract
To avoid property loss and reduce the risk caused by typhoon surges, accurate prediction of surge deviation is an important task. Many conventional numerical and experimental methods for typhoon surge forecasting have been investigated, but it remains a complex ocean engineering problem. In this paper, support vector regression (SVR), an emerging artificial intelligence tool, is applied to forecasting storm surges. The original data of the Longdong station in Taiwan, which was invaded directly by Typhoon Aere, are considered to verify the present model. Comparisons with numerical methods and a neural network indicate that storm surges and surge deviations can be efficiently predicted using SVR.
© 2008 Elsevier Ltd. All rights reserved.
1. Introduction
Storm surge is caused primarily by high winds pushing on the ocean's surface. The wind causes water to pile up higher than the ordinary sea level. Low pressure at the center of a weather system also has a small secondary effect, as can the seabed bathymetry. It is this combined effect of low pressure and persistent wind over a shallow water body that is the most common cause of storm surge flooding problems. Storm surges are particularly dangerous when a high tide occurs together with the surge. In the United States, the greatest recorded storm surge was generated by 2005's Hurricane Katrina, which produced a storm surge 9 m (30 ft) high in the town of Bay St. Louis, Mississippi. While high-impact events will undoubtedly occur in the future, the advent and further improvement of storm surge predictions may greatly reduce the loss of lives and potentially reduce property damage.
Storm surge models were developed in the 1950s. Hansen (1956) proposed a fluid dynamic model to describe the storm surge phenomenon in the North Sea. Coarse- and fine-grid cases were considered by Jelesnianski (1965) to calculate storm surges. A special program to list the amplitudes of surges from hurricanes (SPLASH) was developed by Jelesnianski (1972). FEMA (1988) developed a storm surge forecast model using the finite-difference method. Holland (1980) used wind forcing to derive storm surge models. Kawahara et al. (1982) applied a two-step explicit finite-element method for storm surge propagation analysis. Hubbert et al. (1991) proposed an operational typhoon storm surge forecast model for the Australian region. Flather (1991) applied the typhoon model of Holland (1980) to simulate typhoon-induced storm surges in the northern Bay of Bengal. The sea, lake, and overland surges from hurricanes (SLOSH) model (Jelesnianski and Shaffer, 1992) followed SPLASH and considered wetting and drying processes. Emergency managers use SLOSH data to determine which areas must be evacuated to avoid storm surge consequences. Storm surge forecast systems are based on two-dimensional shallow-water equation models (Vested et al., 1995; Bode and Hardy, 1997). Hsu et al. (1999) developed a storm surge model for Taiwan and arrived at analytical expressions considering both the gradient wind and the radius. The MIKE 21 hydrodynamic model is a general numerical modeling system for the simulation of unsteady two-dimensional flow, developed at the Danish Hydraulic Institute of Water and Environment (DHI, 2002). Xie et al. (2004) introduced a Princeton Ocean Model storm surge scheme that considers flooding and drying, and further considers nonlinear effects in the surge process. Huang et al. (2005) applied the finite-volume method (FVM), a numerical method, to simulate typhoon surges in the northern part of Taiwan. Despite close attention from the engineering community, the storm surge problem is still far from solved, because there are too many unknown parameters to account for, such as the central typhoon pressure, the typhoon speed, rainfall, and the influence of local topography.
Artificial neural networks (ANNs) are being widely applied in various areas to overcome the problem of complex nonlinear relationships. The back-propagation neural network (BPN) developed by Rumelhart et al. (1986) is the most representative learning model for the ANN. The procedure of the BPN repeatedly adjusts the weights of the connections in the network so as to minimize the measure of difference between the actual output vector of the net and the desired output vector.

ARTICLE IN PRESS
Contents lists available at ScienceDirect
Journal homepage: www.elsevier.com/locate/oceaneng
Ocean Engineering 35 (2008) 1578–1587
0029-8018/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.oceaneng.2008.08.004
* Corresponding author. Tel.: +91 422 2574754; fax: +91 422 2573833.
E-mail address: drrajasekaran@gmail.com (S. Rajasekaran).

The BPN is widely applied in a variety of scientific areas, especially in applications involving diagnosis and forecasting.
ANN has been widely applied in various areas (Daniell, 1991;
French et al., 1992; Karunanithi et al., 1994; Grubert, 1995; Minns,
1998; Tsai and Lee, 1999; Lee et al., 2002; Nagy et al., 2002; Lee
and Jeng, 2002; Coppola et al., 2003; Jeng et al., 2004; Lee, 2004,
2006; Makarynskyy, 2005; Bateni et al., 2007). Recently, Lee
(2008) forecast a storm surge using neural networks (NNs).
However, the estimated results still need to be improved.
In fluid dynamic models, the accuracy of storm surge computations depends on the fineness of the grid. The storm surge forecast model using the finite-difference method involves the solution of a large number of equations. Explicit finite-element methods for storm surge propagation analysis are not very accurate compared to implicit methods. The storm surge forecast system cannot be applied to deep-water equation models. In FVMs, similar to finite-difference methods, values are evaluated at discrete places on a meshed geometry. In this method, the typhoon model is always represented in terms of a relatively simple parametric model of the distribution of wind velocity and pressure. These methods also involve five equations and many boundary conditions, and the method is conservative. This work was carried out by the third author, and the computation required 9.6 h on an Intel Pentium IV 2.6 GHz personal computer. Since there are too many unknown parameters to account for, predictions of storm surges by NN and support vector machine (SVM) will be useful. A BPN model was adopted by the third author to predict storm surges using one hidden layer with three hidden neurons, a learning rate of 0.07, and a momentum factor of 0.9; the network was trained for 10,000 epochs. The training time was 50 s for every case on an Intel Pentium IV personal computer with a 2.6 GHz CPU and 1 GB DRAM. Hence, for a BPN one needs to determine the number of hidden layers, the number of neurons in a hidden layer, the momentum factor, the learning rate, the sigmoidal gain, etc., whereas in SVM, once a proper kernel is selected, only two parameters, C and ε, are required to achieve the desired accuracy. Therefore, the support vector regression (SVR) model is considered for forecasting storm surges and surge deviations in this paper.
2. Numerical storm surge model
In general, numerical methods such as the FVM and the finite-difference method are often applied to establish an efficient storm surge forecasting system. Many different analytical models have been proposed in the literature for generating realistic air pressure and surface wind distributions (Dube et al., 1994). Common to all these analytical models is that they require information in terms of the following typhoon characteristics: (i) maximum wind speed, (ii) central pressure, (iii) radius of maximum wind, and (iv) parameters describing the shape of the pressure and wind distributions. Often some of these parameters are related using empirical formulae derived from historical typhoon data. The analytical storm surge models are based on different formulations of the surface wind and pressure fields.
By comparing the measured results with the simulated results of the SVR model for storm surge, the most suitable storm surge model can be selected for Taiwan. In this storm surge model, two-dimensional shallow-water equation models based on the three-dimensional Navier–Stokes equations are used as the hydrodynamic model (Vested et al., 1995). The factors affecting the fluid field, such as the tide, wind, Coriolis force, bottom friction, fluid shear stress, and topographical boundary conditions, must be considered in this model. Natural typhoon wind fields are usually intense, spatially inhomogeneous, and directionally varying. The large gradients in wind speed and the rapidly varying wind directions of the typhoon vortex can generate very complex ocean wave fields, but for practical applications the wind fields are always represented in terms of relatively simple parametric models.
In this paper, the air pressure distribution is assumed to follow an exponential relation that can be expressed as (Hubbert et al., 1991)

P(r) = P_c + (P_n − P_c) exp[ −(R_max / r)^B ]   (1)

where P_c is the central pressure, P_n the ambient (environmental) pressure, r the radius (distance from the typhoon center), R_max the radius to the point of maximum wind speed, and B a shape parameter. The value of the shape parameter B can be assumed to lie in the range 1–2.5 for the most favorable results (Hsu et al., 1999).
The wind speed distribution is usually derived using a gradient wind model or simply by using an empirical expression. The storm surge model, in which analytical expressions of both the gradient wind and the radius are derived from the governing momentum equation using the air pressure distribution in Eq. (1), gives

V_g = { [(P_n − P_c) / ρ_a] B (R_max / r)^B exp[ −(R_max / r)^B ] + (Ω r / 2)² }^{1/2} − Ω r / 2   (2)

in which Ω is the Coriolis parameter, ρ_a the air density, and the shape parameter B is 1 in this paper.
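As an illustration of Eqs. (1) and (2), the sketch below (plain Python, not part of the original paper) evaluates the pressure profile and the gradient wind at r = R_max. The parameter values (P_c = 955 hPa, P_n = 1010 hPa, R_max = 40 km, ρ_a = 1.15 kg/m³, Ω = 5 × 10⁻⁵ s⁻¹) are illustrative assumptions, not data from the paper.

```python
import math

def holland_pressure(r, p_c, p_n, r_max, b):
    """Air pressure at radius r, Eq. (1): P(r) = Pc + (Pn - Pc) exp[-(Rmax/r)^B]."""
    return p_c + (p_n - p_c) * math.exp(-(r_max / r) ** b)

def gradient_wind(r, p_c, p_n, r_max, b, rho_a=1.15, omega=5.0e-5):
    """Gradient wind speed, Eq. (2); pressures in hPa are converted to Pa."""
    dp = (p_n - p_c) * 100.0  # hPa -> Pa
    term = dp / rho_a * b * (r_max / r) ** b * math.exp(-(r_max / r) ** b)
    return math.sqrt(term + (omega * r / 2.0) ** 2) - omega * r / 2.0

# Hypothetical typhoon, evaluated at the radius of maximum wind (r = Rmax)
p = holland_pressure(40e3, 955.0, 1010.0, 40e3, 1.0)   # pressure in hPa
v = gradient_wind(40e3, 955.0, 1010.0, 40e3, 1.0)      # wind speed in m/s
```

At r = R_max the exponential factor is e⁻¹, so the pressure lies between the central and ambient values and the wind speed is near its maximum.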
Notations

B  shape parameter (= 1)
C  pre-specified value
L  Lagrange function
H  hidden layer
I  input layer
l  no. of training samples
O  output layer
O_k  output value
P_c  central pressure, hPa
P_n  ambient (environmental) pressure, hPa
r  distance from typhoon center
R  the set of real numbers
R_max  radius to the point of maximum wind speed
R(α)  risk
R_emp(α)  empirical risk
T_k  target value
W  objective function
K  kernel function
n  dimension of input space
N  the set of natural numbers
X  input space (space of observable states)
Y  output space (space of hidden states)
α*, α  coefficients
ρ  scale parameter
ρ_a  air density
σ  standard deviation
Ω  Coriolis parameter
ξ*, ξ  slack variables
3. Neural networks
The ANN is an information-processing system that mimics the biological NN of the brain by interconnecting many artificial neurons. A neuron accepts inputs from a single source or multiple sources and produces outputs by simple calculation with a predetermined nonlinear function. Since the principle of the ANN is well documented in the literature, only a brief description is given in this section.
A typical three-layered network with an input layer (I), a hidden layer (H), and an output layer (O; see Fig. 1) is adopted in this study. Each layer consists of several neurons, and the layers are interconnected by sets of correlation weights. The neurons receive inputs from the initial inputs or the interconnections and produce outputs by transformation using an adequate nonlinear transfer function. A common transfer function is the sigmoid function, expressed by f(x) = (1 + e^{−x})^{−1}; it has the characteristic that df/dx = f(x)[1 − f(x)]. The training of an NN is essentially executed through a series of patterns. In the learning process, the interconnection weights are adjusted to fit the input and output values.
The BPN is the most representative learning model for the ANN. The BPN procedure repeatedly adjusts the weights of the connections in the network so as to minimize the error (the difference between the actual output vector of the net and the desired output vector). The gradient descent method is utilized to calculate the weights of the network and to adjust the weights of the interconnections to minimize the output error. The error function at the output neurons is defined as

E = (1/2) Σ_k (T_k − O_k)²   (3)

where T_k and O_k denote the target and output values, respectively. Further details of the BPN algorithm can be found in Rumelhart et al. (1986).
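The sigmoid transfer function and the error measure of Eq. (3) can be sketched as follows (a minimal Python illustration, not the authors' code); the derivative identity df/dx = f(x)[1 − f(x)] is checked numerically with a central difference.

```python
import math

def sigmoid(x):
    """Sigmoid transfer function f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(x):
    """Uses the identity df/dx = f(x) * (1 - f(x))."""
    f = sigmoid(x)
    return f * (1.0 - f)

def output_error(targets, outputs):
    """BPN error function, Eq. (3): E = 1/2 * sum_k (T_k - O_k)^2."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))

# Numerical check of the derivative identity at x = 0.7
x, h = 0.7, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
```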
4. Support vector machine
4.1. Introductory remarks on support vector machine
SVM (Burges, 1998) is an emerging technique that has proven successful in modeling problems in engineering applications. An interesting property of this approach is that it is an approximate implementation of the structural risk minimization (SRM) principle and is based mainly on Vapnik's statistical learning theory (Vapnik, 1995). SVM is gaining popularity due to many attractive features and promising empirical performance. It can generally be thought of as an alternative training technique for multi-layer perceptron (MLP) and radial basis function (Rbf) classifiers, in which the weights of the network are found by solving a quadratic programming (QP) problem with linear inequality and equality constraints, rather than by solving a non-convex, unconstrained minimization problem, as in standard NN training techniques. Their formulation embodies the SRM principle, which has been shown to be superior to the traditional empirical risk minimization (ERM) principle employed by many of the other modeling techniques (Osuna et al., 1997; Gunn, 1998). ERM minimizes only the error on the training data, whereas SRM minimizes an upper bound on the generalization error. It is this difference that equips SVM with a greater ability to generalize, which is the goal in statistical learning. SVMs were first developed to solve the classification problem, but recently they have been extended to the domain of regression problems. Zhang et al. (2006) applied SVM to online health monitoring problems. Yu et al. (2006) used the SVM regression method for real-time flood storage forecasting, whereas Lee et al. (2007) applied the technique for predicting concrete strength. Chapelle et al. (2002) determined multiple parameters for SVM, and Cherkassky and Ma (2004) elaborated a method for practical selection of SVM parameters and noise estimation for SVM regression; model induction with SVM was carried out by Dibike et al. (2000).
4.2. Basis of support vector machine
This section reviews a few concepts from the theory of statistical learning that was developed by Vapnik (1995) and others and that is necessary to appreciate the support vector (SV) learning algorithm. For the case of two-class pattern recognition, the task of learning from examples can be formulated in the following way.
Given a set of functions

{f_α : α ∈ Λ},  f_α : R^N → {±1}   (4a)

where Λ is a set of parameters, and a set of examples, i.e. pairs of patterns x_i and labels y_i,

(x_1, y_1), …, (x_l, y_l) ∈ R^N × {±1}   (4b)

each one of them is generated from an unknown probability distribution P(x, y) containing the underlying dependency. It is now required to learn a function f_α that provides the smallest possible value for the average error committed on independent examples randomly drawn from the same distribution P, called the risk:

R(α) = ∫ (1/2) |f_α(x) − y| dP(x, y)   (5a)
The problem is that R(α) is unknown, since P(x, y) is unknown. Therefore an induction principle for risk minimization is necessary. The straightforward approach of minimizing the empirical risk

R_emp(α) = (1/l) Σ_{i=1}^{l} |f_α(x_i) − y_i|   (5b)

does not guarantee a small actual risk if the number l of training examples is limited. In other words, a smaller error on the training set does not necessarily imply higher generalization ability (i.e. a smaller error on an independent test set). In order to make use of the limited amount of data, a novel statistical technique called structural risk minimization has been developed. The theory of uniform convergence in probability developed by Vapnik and Chervonenkis (VC) provides bounds on the deviation of the empirical risk from the expected risk. For all α and l > h, a typical uniform VC bound, which holds with probability 1 − η, has the
Fig. 1. Structure of storm surge model for BPN.
following form:

R(α) ≤ R_emp(α) + sqrt( [h (log(2l/h) + 1) − log(η/4)] / l )   (6)

The parameter h is called the Vapnik–Chervonenkis (VC) dimension (Cortes and Vapnik, 1995) of a set of functions; it describes the capacity of a set of functions. The VC dimension is a measure of the complexity of the classifier, and it is often proportional to the number of free parameters of the classifier f_α. The SVM algorithm achieves the goal of minimizing a bound on the VC dimension and the number of training errors at the same time.
4.3. Support vector regression
SVMs (Dibike et al., 2000; Cortes and Vapnik, 1995) can also be applied to regression problems by the introduction of an alternative loss function that is modified to include a distance measure. Consider the problem of approximating the set of data (x_1, y_1), …, (x_l, y_l), x ∈ R^N, y ∈ R, with a linear function

f(x, α) = (w · x) + b   (7)

The optimal regression function is obtained by minimizing the empirical risk

R_emp(w, b) = (1/l) Σ_i |y_i − f(x_i, α)|   (8)
With the most general loss function with an ε-insensitive zone, described as

|y − f(x, α)|_ε = 0,                  if |y − f(x, α)| ≤ ε
|y − f(x, α)|_ε = |y − f(x, α)| − ε,  otherwise   (9)

the goal is to find a function f(x, α) that has at most ε deviation from the actually observed targets y_i for all the training data and, at the same time, is as flat as possible. This is equivalent to minimizing the functional

Φ(w, ξ*, ξ) = ||w||²/2 + C Σ_i (ξ*_i + ξ_i)   (10)

where C is a pre-specified value and ξ*, ξ (see Fig. 2) are slack variables representing upper and lower constraints on the outputs of the system as follows:

y_i − (w · x_i) − b ≤ ε + ξ_i,   i = 1, 2, …, l
(w · x_i) + b − y_i ≤ ε + ξ*_i,  i = 1, 2, …, l   (11)
ξ*_i ≥ 0 and ξ_i ≥ 0,            i = 1, 2, …, l
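The ε-insensitive loss of Eq. (9) can be sketched in a few lines (illustrative Python, with an assumed tolerance ε = 0.1); deviations inside the ε-tube incur no penalty, and deviations outside it are penalized linearly.

```python
def eps_insensitive(y, f_x, eps=0.1):
    """Eq. (9): |y - f(x)|_eps = 0 if |y - f(x)| <= eps, else |y - f(x)| - eps."""
    err = abs(y - f_x)
    return 0.0 if err <= eps else err - eps

inside = eps_insensitive(1.0, 1.05)   # deviation 0.05 <= eps -> no loss
outside = eps_insensitive(1.0, 1.30)  # deviation 0.30 -> loss 0.20
```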
Now the Lagrange function is constructed from both the objective function and the corresponding constraints by introducing a dual set of variables as follows:

L = ||w||²/2 + C Σ_i (ξ_i + ξ*_i) − Σ_i α_i (ε + ξ_i − y_i + (w · x_i) + b)
    − Σ_i α*_i (ε + ξ*_i + y_i − (w · x_i) − b) − Σ_i (η_i ξ_i + η*_i ξ*_i)   (12)

where η_i, η*_i are the Lagrange multipliers of the constraints ξ_i, ξ*_i ≥ 0. It follows from the saddle point condition that the partial derivatives of L with respect to the primal variables (w, b, ξ_i, ξ*_i) have to vanish for optimality. Substituting the results of this derivation into Eq. (12) yields the dual optimization problem
W(α*, α) = −ε Σ_i (α*_i + α_i) + Σ_i y_i (α*_i − α_i)
           − (1/2) Σ_i Σ_j (α*_i − α_i)(α*_j − α_j)(x_i · x_j)   (13a)

which has to be maximized subject to the constraints

Σ_i α*_i = Σ_i α_i,  0 ≤ α*_i ≤ C, and 0 ≤ α_i ≤ C for i = 1, 2, …, l   (13b)
Once the coefficients α*_i and α_i are determined from the dual problem of Eqs. (13a) and (13b), the desired vectors can be found as

w_0 = Σ_i (α*_i − α_i) x_i,  and therefore

f(x) = Σ_i (α*_i − α_i)(x_i · x) + b_0   (13c)
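Once the multipliers are known, the prediction of Eq. (13c) is a weighted sum of inner products with the support vectors. The sketch below (plain Python; the multiplier values, support vectors, and bias are hypothetical, chosen only for illustration) evaluates that sum:

```python
def svr_predict(x, support, alphas, alphas_star, b):
    """Eq. (13c): f(x) = sum_i (alpha*_i - alpha_i) <x_i, x> + b0."""
    return sum((a_s - a) * sum(xi * xj for xi, xj in zip(sv, x))
               for sv, a, a_s in zip(support, alphas, alphas_star)) + b

# Hypothetical support vectors and multipliers (for illustration only)
support = [[1.0, 0.0], [0.0, 1.0]]
alphas = [0.0, 0.3]
alphas_star = [0.5, 0.0]

f = svr_predict([2.0, 2.0], support, alphas, alphas_star, b=0.1)
# f = (0.5 - 0)*2 + (0 - 0.3)*2 + 0.1 = 0.5
```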
Similarly, in the nonlinear SVR approach, a nonlinear mapping kernel can be used to map the data into a higher dimensional feature space in which linear regression is performed. The quadratic form to be maximized can then be rewritten as

W(α*, α) = −ε Σ_i (α*_i + α_i) + Σ_i y_i (α*_i − α_i)
           − (1/2) Σ_i Σ_j (α*_i − α_i)(α*_j − α_j) K(x_i, x_j)   (14)

and the regression function is given by

f(x) = (w_0 · x) + b_0   (15)
where

w_0 · x = Σ_i (α⁰_i − α⁰*_i) K(x_i, x)  and
b_0 = −(1/2) Σ_i (α⁰_i − α⁰*_i) [K(x_r, x_i) + K(x_s, x_i)]   (16)

The SVR model diagram is shown in Fig. 2.
Fig. 2. Pre-specified accuracy ε and slack variable ξ in SV regression (adapted from Scholkopf, 1997).
5. Kernel functions
In the machine learning context, consider a vector

X^T = ⟨X_1  X_2  X_3⟩   (17)

The inner product or dot product is defined as

⟨X, Y⟩ = X_1 Y_1 + X_2 Y_2 + X_3 Y_3   (18)
Consider an example of a projection

Φ : I → F,  (X_1, X_2) → (X_1², X_1 X_2, X_2²)   (19)

The inner product of two vectors X and Y projected into the space F is given by

(X_1², X_1 X_2, X_2²) · (Y_1², Y_1 Y_2, Y_2²) = X_1² Y_1² + X_1 X_2 Y_1 Y_2 + X_2² Y_2²   (20)
A kernel function is a function that represents the inner product of one space in another space. In the case of nonlinearly separable data, SVM maps the input into a richer, higher dimensional feature space by using a nonlinear mapping function such that the data become linearly separable.
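The identity of Eqs. (19) and (20) is easy to verify numerically; the sketch below (plain Python, not from the paper) compares the inner product of the projected vectors with the expanded right-hand side for a pair of sample points.

```python
def phi(v):
    """Projection of Eq. (19): (x1, x2) -> (x1^2, x1*x2, x2^2)."""
    x1, x2 = v
    return (x1 * x1, x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

X, Y = (0.3, 0.7), (0.9, 0.9)
lhs = dot(phi(X), phi(Y))                                   # inner product in F
rhs = X[0]**2 * Y[0]**2 + X[0]*X[1]*Y[0]*Y[1] + X[1]**2 * Y[1]**2  # Eq. (20)
```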
6. Kernel trick
In machine learning, the kernel trick is a method for using a linear classifier algorithm to solve a nonlinear problem by mapping the original nonlinear observations into a higher dimensional space where the linear classifier is subsequently used. This makes a linear classification in the new space equivalent to a nonlinear classification in the original space.
This is done using Mercer's theorem (Vapnik, 1995), which states that any continuous, symmetric positive semi-definite kernel function K(X, Y) can be expressed as a dot product in a high-dimensional space. More specifically, if the arguments to the kernel are in a measurable space X and if the kernel is positive semi-definite,

Σ_{i,j} K(X_i, X_j) C_i C_j ≥ 0   (21)

for any finite subset {X_1, …, X_n} of X and any set {C_1, …, C_n} of objects (typically real numbers), then there exists a function φ(X) whose range is in an inner product space of possibly higher dimension, such that

K(X, Y) = φ(X) · φ(Y)   (22)
The kernel trick transforms any algorithm that depends solely on the dot product between two vectors. Kernel functions enable the operations to be performed in the input space, so there is no need to evaluate inner products in the feature space. The computations depend on the number of training patterns, and a large training set is required for a high-dimensional problem. The most important benefit of using kernels is that the dimensionality of the feature space does not affect the computation. Some commonly used kernels in SVM are as follows.
6.1. Linear
A linear mapping is a simple method for SVM modeling.
K(x, x_i) = ⟨x, x_i⟩   (23)

Assume that there are four training samples and two inputs:

A = x_i =
  [0.1  0.42]
  [0.3  0.74]
  [0.7  0.10]
  [0.9  0.90]   (24)

K(x_i, x_j) = A A^T =
  [0.19  0.34  0.11  0.47]
  [0.34  0.64  0.28  0.94]
  [0.11  0.28  0.50  0.72]
  [0.47  0.94  0.72  1.62]   (25)
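As a check on Eqs. (24) and (25), the Gram matrix K = AA^T can be computed directly; the sketch below (plain Python, not from the paper) reproduces the tabulated entries to two decimal places.

```python
# Training samples from Eq. (24): four samples, two inputs each
A = [[0.1, 0.42], [0.3, 0.74], [0.7, 0.1], [0.9, 0.9]]

def gram(rows):
    """Linear-kernel Gram matrix: K[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]

K = gram(A)
```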
6.2. Polynomial
A polynomial mapping is a popular method for nonlinear
modeling:
K(x, x_i) = ⟨x, x_i⟩^d  or  K(x, x_i) = (⟨x, x_i⟩ + b)^d   (26)

where d is called the order and b is a constant that brings in the lower-order terms.
If a second-order polynomial kernel is applied to the vectors given for the linear kernel, assuming b = 1 and d = 2, the kernel function simplifies to

K(x_i, x_j) =
  [(1.19)²  (1.34)²  (1.11)²  (1.47)²]
  [(1.34)²  (1.64)²  (1.28)²  (1.94)²]
  [(1.11)²  (1.28)²  (1.50)²  (1.72)²]
  [(1.47)²  (1.94)²  (1.72)²  (2.62)²]
=
  [1.41  1.80  1.24  2.16]
  [1.80  2.68  1.65  3.75]
  [1.24  1.65  2.25  2.96]
  [2.16  3.75  2.96  6.86]   (27)
The second kernel is usually preferable as it avoids problems
with the Hessian becoming zero.
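Running the same samples through the second form of Eq. (26) with b = 1 and d = 2 reproduces Eq. (27); a plain Python sketch (not from the paper):

```python
# Same training samples as in Eq. (24)
A = [[0.1, 0.42], [0.3, 0.74], [0.7, 0.1], [0.9, 0.9]]

def poly_kernel(u, v, b=1.0, d=2):
    """Polynomial kernel of Eq. (26): K(u, v) = (<u, v> + b)^d."""
    return (sum(a * c for a, c in zip(u, v)) + b) ** d

Kp = [[poly_kernel(u, v) for v in A] for u in A]
```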
6.3. Gaussian radial basis function (Rbf)
An Rbf is a real-valued function whose value depends only on the distance from the origin, so that

φ(x) = φ(||x||)   (28)

or, alternatively, on the distance from some other point C:

φ(x) = φ(||x − C||)   (29)

Any function that satisfies φ(x) = φ(||x||) is an Rbf. Rbfs are typically used to build up functional approximations of the form

Y(x) = Σ_{i=1}^{N} W_i φ(||x − C_i||)   (30)

Rbfs have received significant attention, most commonly with a Gaussian of the form

K(x, x_i) = exp( −||x − x_i||² / (2σ²) )   (31)
Classical techniques utilizing Rbfs employ some method of determining a subset of centers; typically, a clustering method is first employed to select a subset of centers. An attractive feature of the SVM is that this selection is implicit, with each SV contributing one local Gaussian function, centered at that data point. By further considerations it is possible to select the global basis function width, σ, using the SRM principle.
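A minimal implementation of the Gaussian kernel of Eq. (31) (illustrative Python; the sample points and σ values below are assumptions, not data from the paper):

```python
import math

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian Rbf kernel, Eq. (31): exp(-||u - v||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

same = rbf_kernel([1.0, 2.0], [1.0, 2.0])           # identical points -> 1
far = rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=2.0)  # ||u - v|| = 5
```

The kernel equals 1 for identical points and decays monotonically with distance; σ controls how quickly the influence of each support vector falls off.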
6.4. Exponential radial basis function
An Rbf of the form

K(x, x_i) = exp( −||x − x_i|| / (2σ²) )   (32)

produces a piecewise linear solution, which can be attractive when discontinuities are acceptable.
6.5. Multi-layer perceptron (MLP; sigmoidal function)
A sigmoidal function is a mathematical function that produces a sigmoidal, i.e. S-shaped, curve. Often, "sigmoid function" refers to the special case of the logistic function defined by the formula

P(t) = 1 / (1 + e^{−t})   (33)

The long-established MLP, with a single hidden layer, also has a valid kernel representation,

K(x, x_i) = tanh( ρ ⟨x, x_i⟩ + θ )   (34)

for certain values of the scale ρ and offset parameter θ. Here the SVs correspond to the first layer and the Lagrange multipliers to the weights of the MLP. Another function that is often adopted is the Gompertz function, given by

Y(t) = a e^{b e^{ct}}   (35)

where a is the upper asymptote, c the growth rate, b and c are negative numbers, and e is the base of the natural logarithm.
6.6. Fourier series
A Fourier series can be considered an expansion in the following (2N + 1)-dimensional feature space. The kernel is defined on the interval [−π/2, π/2]:

K(x, x_i) = sin( (N + 1/2)(x − x_i) ) / sin( (1/2)(x − x_i) )   (36)

However, this kernel is probably not a good choice because its regularization capability is poor, which is evident on consideration of its Fourier transform.
6.7. B Splines
B splines are another popular spline formulation. The kernel is defined on the interval [−1, 1] and has the attractive closed form

K(x, x_i) = B_{2N+1}(x − x_i)   (37)

where x and x_i are the training and test patterns, respectively, d is the dimension of the input vector, and σ is the global basis function width.
6.8. Splines
Splines are a popular choice for modeling due to their flexibility. A finite spline of order k, with N knots located at t_s, is given by

K(x, x_i) = Σ_{r=0}^{k} x^r x_i^r + Σ_{s=1}^{N} (x − t_s)_+^k (x_i − t_s)_+^k   (38)

where (·)_+ denotes the positive part. An infinite spline is defined on the interval [0, 1] by

K(x, x_i) = Σ_{r=0}^{k} x^r x_i^r + ∫_0^1 (x − t)_+^k (x_i − t)_+^k dt   (39)

In the case when k = 1 (S_1^∞), the kernel is given by

K(x, x_i) = 1 + ⟨x, x_i⟩ + (1/2) ⟨x, x_i⟩ min(x, x_i) − (1/6) min(x, x_i)³   (40)

where the solution is a piecewise cubic.
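The closed form of Eq. (40) is straightforward to implement for scalar inputs; a plain Python sketch (not from the paper, with arbitrary sample arguments):

```python
def spline_kernel(x, xi):
    """Eq. (40), k = 1, scalar inputs:
    K(x, xi) = 1 + x*xi + (x*xi)*min(x, xi)/2 - min(x, xi)^3 / 6."""
    m = min(x, xi)
    return 1.0 + x * xi + 0.5 * x * xi * m - m ** 3 / 6.0

# K(0.5, 0.3) = 1 + 0.15 + 0.0225 - 0.0045 = 1.168
k = spline_kernel(0.5, 0.3)
```

Like any valid kernel, it is symmetric in its arguments.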
7. Case study
In this paper, two cases demonstrating the applicability of the SVM regression model to the storm surge levels at Longdong station, Taiwan, are considered. Table 1 contains the information about the parts of the time series used to train the implemented SVR and the parts used for SVR validation. The Longdong station is situated at 25°05′46″N, 121°55′24″E (Fig. 3), Taiwan, and the data
Table 1
Two cases used to train the SVR

Case  Training data  Testing data
A1    1/3            2/3
A2    1/2            1/2
[Fig. 3 shows a map of Taiwan with the Taiwan Strait to the west, the Pacific Ocean to the east, and the location of Longdong marked.]
Fig. 3. Location of Longdong harbour, Taiwan.
collected during 08/23/2004–08/26/2004 were used to test the accuracy of the present SVR forecasting model. The path of Typhoon Aere over Taiwan is shown in Fig. 4. In total, 72 data points (one per hour) were available for storm surge level prediction.
The root mean square error (RMSE) and the correlation coefficient (CC) were used to estimate the accuracy of the present model:

RMSE = sqrt( Σ_{k=1}^{n} (y_act − y_pre)² / n )   (41)

CC = Σ_{k=1}^{n} (y_act − ȳ_act)(y_pre − ȳ_pre) / sqrt( Σ_{k=1}^{n} (y_act − ȳ_act)² Σ_{k=1}^{n} (y_pre − ȳ_pre)² )   (42)

in which y_act is the observed value and y_pre is the predicted value.
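Eqs. (41) and (42) can be sketched directly (plain Python; the observation and prediction series below are invented toy values, not the Longdong data):

```python
import math

def rmse(y_act, y_pre):
    """Root mean square error, Eq. (41)."""
    n = len(y_act)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_act, y_pre)) / n)

def corr_coeff(y_act, y_pre):
    """Correlation coefficient, Eq. (42)."""
    ma = sum(y_act) / len(y_act)
    mp = sum(y_pre) / len(y_pre)
    num = sum((a - ma) * (p - mp) for a, p in zip(y_act, y_pre))
    den = math.sqrt(sum((a - ma) ** 2 for a in y_act) *
                    sum((p - mp) ** 2 for p in y_pre))
    return num / den

obs = [0.1, 0.4, 0.8, 1.1]   # toy "observed" surge levels (m)
pred = [0.2, 0.5, 0.7, 1.0]  # toy "predicted" surge levels (m)
```

Every prediction here is off by exactly 0.1 m, so the RMSE is 0.1 while the CC remains close to 1.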
First, all the input data (pressure, wind velocity, wind direction, estimated astronomical tide) and output data (storm surge level) are normalized within the program (steps 1 and 2). Then training patterns are selected and transformed into the feature space using the kernel function K (step 3). The mapped training patterns and outputs are optimized by the QP, resulting in the Lagrange multipliers α and α* (step 4). After the test patterns are transformed, the normalized surge level is calculated and recovered.
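The exact normalization scheme is not spelled out in the paper; the sketch below assumes a simple min–max scaling into [0, 1] and shows the corresponding recovery (denormalization) of the surge level, which is one common way to implement steps 1–2 and the final recovery step.

```python
def normalize(series, lo=0.0, hi=1.0):
    """Min-max scale a series into [lo, hi] (assumed scheme, not stated in the paper)."""
    s_min, s_max = min(series), max(series)
    return [lo + (hi - lo) * (s - s_min) / (s_max - s_min) for s in series]

def denormalize(scaled, s_min, s_max, lo=0.0, hi=1.0):
    """Invert the min-max scaling to recover physical surge levels."""
    return [s_min + (s_max - s_min) * (v - lo) / (hi - lo) for v in scaled]

surge = [0.05, 0.40, 1.10, 0.80]   # illustrative surge levels (m)
norm = normalize(surge)
back = denormalize(norm, min(surge), max(surge))
```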
A problem regarding the choice of the two parameters (C and ε) for SVR was studied by Cherkassky and Ma (2004). The parameter C controls the smoothness or flatness of the approximating function. A large value of C indicates that the objective is only to minimize the empirical risk, which makes the learning machine more complex. On the other hand, a smaller C may cause learning errors with poor approximation (Yu et al., 2006).
In each case all the kernels discussed in Section 6 were used to train the network with different C values. The kernel parameters for the best prediction (based on RMSE) are shown in Tables 2 and 3. The RMSE was calculated for each combination of all the cases, as shown in Tables 2 and 3. Graphs of RMSE against C were drawn for the different kernels in order to find the best C value for each kernel type, as shown in Fig. 5. It is observed from Fig. 5 and Tables 2 and 3 that for case A1 the best kernel is linear with C = 2 and a minimum RMSE of 0.135, and for case A2 the best kernel is Rbf with C = 2 and a minimum RMSE of 0.0994.
Fig. 6 illustrates the prediction results for case A1, in which one third of the observations (1/3) was employed to forecast the storm surge over the following two thirds of the data (2/3) using the records from Typhoon Aere. Fig. 7 illustrates the prediction results for case A2, in which half of the observations (1/2) was employed to forecast
[Fig. 4 shows the track of Typhoon Aere from 8/22/2004 to 8/26/2004 on a latitude–longitude map covering Taiwan and the Chinese coast.]
Fig. 4. The moving path of Typhoon Aere.
Table 2
Optimum C value for different kernels: case A1

Sl. no.  Kernel  C value  Min. RMSE
1        Linear  2        0.1350
2        Rbf     12       0.170

Table 3
Optimum C value for different kernels: case A2

Sl. no.  Kernel  C value  Min. RMSE
1        Rbf     2        0.0994
[Fig. 5 plots RMSE against the C value for (a) the linear kernel in case A1, (b) the spline kernel in case A1, and (c) the Rbf kernel in case A2.]
Fig. 5. Variation of root mean square error against C value: (a) case A1, linear; (b) case A1, spline; and (c) case A2, Rbf.
the storm surge over the following half of the data (1/2) using the records from Typhoon Aere. Fig. 8 illustrates the prediction results when the full data set is used for both training and testing for Typhoon Aere. When the same data are used for training and for testing, the RMSE is zero. Hence it is seen that a large training set is necessary to minimize the error.
Comparing case A1 with case A2, the important observation is that the forecasting performance of SVR improves with an increase in the number of training patterns, which is confirmed by the low RMSE (0.0994) obtained in case A2. The comparison between the observations and the BPN, FVM, and SVR models for the storm surge level in cases A1 and A2 is shown in Figs. 9 and 10, respectively. Fig. 11 shows the RMSE values obtained with BPN, FVM, and SVR, and it is seen that the error is minimum for case A2 using SVM. Figs. 12 and 13 depict the surge deviation comparison between the observations and the BPN, FVM, and SVR models for cases A1 and A2.
Figs. 14 and 15 show the correlation between the observed and predicted data for cases A1 and A2. The results of the two cases are also shown in Table 4. The above discussion shows that case A2 gives the best result in forecasting storm surge levels. When comparing SVR (solid line with circles) with FVM (dashed line with triangles) and the BPN model (solid line with diamonds), the prediction of storm surge using SVR is almost the same as
[Fig. 6 plots the actual and predicted storm surge level (m) against time (h) for the 48 test data of case A1.]
Fig. 6. The storm surge prediction, case A1, for Longdong station.
Fig. 7. The storm surge prediction case A2 for Longdong station. [Plot of actual and predicted storm surge level in m against time in hrs; 36 test data.]
Fig. 8. Storm surge predictions. [Plot of actual and predicted storm surge level in m against time in hrs.]
Fig. 9. Comparison of storm surge levels predicted by SVR with observations, BPN, and FVM (Longdong, case A1). [Plot of storm surge level from astronomical level in m against test pattern number.]
Fig. 10. Comparison of storm surge levels predicted by SVR with observations, BPN, and FVM (Longdong, case A2). [Plot of storm surge level from astronomical level in m against test pattern number.]
observation (solid line; refer to Fig. 12), and SVR yields a low RMSE of 0.0994 compared to BPN with an RMSE of 0.1432 and FVM with 0.27023 for case A2, whereas for case A1, SVR, BPN, and FVM yield RMSEs of 0.1350, 0.1601, and 0.269, respectively. Compared to case A1, the error of SVR for case A2 is smaller, which confirms that the more data are taken for training, the smaller the error. The computer time taken by SVR is 4 s, compared to 50 s for BPN and 9.6 h for FVM.
From the results, the variation of surge deviation shows the same trend between the measured data and the present SVR solution, and the SVR forecasts are more accurate than the BPN and numerical results. In other words, the proposed SVR model with the Rbf kernel is capable of forecasting storm tidal levels and surge deviations.
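The paper's SVR runs were carried out with Gunn's MATLAB toolbox; an equivalent lag-based surge forecast can be sketched with scikit-learn's SVR. The synthetic tidal series, the window length of 6 h, and the train/test split below are assumptions for illustration only; the C and epsilon values follow Table 4's case A2 settings.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic hourly "surge" series standing in for tide-gauge records
# (a semidiurnal oscillation plus noise; not the paper's Longdong data).
rng = np.random.default_rng(0)
t = np.arange(96)
series = 0.5 + 0.4 * np.sin(2 * np.pi * t / 12.42) + 0.02 * rng.standard_normal(96)

# Build lagged patterns: the previous `window` hours -> the next hour's level.
window = 6
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

# Train on the first half, test on the second half (a case A1-style split).
half = len(X) // 2
model = SVR(kernel="rbf", C=2.0, epsilon=1e-6)  # Table 4 values for case A2
model.fit(X[:half], y[:half])

pred = model.predict(X[half:])
test_rmse = float(np.sqrt(np.mean((pred - y[half:]) ** 2)))
print(f"test RMSE: {test_rmse:.4f}")
```

Increasing the training fraction, as in the paper's case A2, typically lowers the held-out RMSE, mirroring the trend reported above.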
The above results can be explained as follows. SVR is based on the structural risk minimization (SRM) principle, which minimizes an upper bound on the generalization error, whereas BPN employs the traditional empirical risk minimization (ERM) principle, which minimizes only the training error. SVR training yields a globally optimal solution, while BPN can become trapped in local minima. Furthermore, SVR is less complex than BPN in choosing parameters: the values of the SVR parameters (e.g., the kernel function, the radial basis function width σ, C, and ε) can be determined easily (Cherkassky and Ma, 2004; Yu et al., 2006). In BPN, however, the results depend on many parameters (e.g., the number of hidden layers and neurons, the transfer function, the number of training iterations, and the initial weight values), and therefore it
Fig. 11. Comparison of RMSE by different methods:

            BPN      FVM     SVR
Case A1     0.1601   0.269   0.135
Case A2     0.1432   0.27    0.0994
Fig. 12. Comparison of surge deviations predicted by SVR with observations, BPN, and FVM (Longdong, case A1). [Plot of surge deviation from astronomical level in m against time in hrs.]
Fig. 13. Comparison of surge deviations predicted by SVR with observations, BPN, and FVM (Longdong, case A2). [Plot of surge deviation from astronomical level in m against time in hrs.]
Fig. 14. Correlation of predicted and observed values, case A1 (CC = 0.939072). [Scatter plot of predicted against observed storm surge level in m.]
Fig. 15. Correlation of predicted and observed values, case A2 (CC = 0.938795). [Scatter plot of predicted against observed storm surge level in m.]
Table 4
Results of relative parameters for the two cases

Description                      Case A1    Case A2
Best kernel                      Linear     Rbf
Best C value                     2          2
Insensitive loss function (ε)    1E-6       1E-6
RMSE                             0.135      0.0994
Correlation coefficient          0.93907    0.93879
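The two summary statistics in Table 4, RMSE and the correlation coefficient, can be reproduced from any observed/predicted pair with a few lines. This is a stand-alone sketch; the sample arrays below are invented for illustration, not the Longdong series.

```python
import math

def rmse(obs, pred):
    """Root mean square error, as reported in Table 4 and Fig. 11."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def correlation(obs, pred):
    """Pearson correlation coefficient, as reported in Figs. 14 and 15."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

obs = [0.1, 0.3, 0.6, 0.9, 0.7, 0.4]      # illustrative observed levels in m
pred = [0.12, 0.28, 0.55, 0.95, 0.66, 0.41]  # illustrative predicted levels

print(f"RMSE = {rmse(obs, pred):.4f}, CC = {correlation(obs, pred):.4f}")
```

Note that the two metrics answer different questions: RMSE measures the absolute size of the errors in metres, while the correlation coefficient measures only how well the predicted series tracks the shape of the observed one.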
requires considerable effort to construct a good NN model.
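The Cherkassky and Ma (2004) selection cited above prescribes analytic starting values for the SVR parameters: C = max(|ȳ + 3σ_y|, |ȳ − 3σ_y|) from the spread of the training targets, and ε = 3σ·sqrt(ln n / n) from an estimate σ of the noise standard deviation. A minimal sketch follows; the sample targets and the assumed 2 cm gauge-noise level are illustrative only.

```python
import math

def cherkassky_ma(y, noise_std):
    """Analytic SVR parameter choices after Cherkassky and Ma (2004).

    C is set from the mean and spread of the training targets y;
    epsilon from the noise standard deviation and the sample size.
    """
    n = len(y)
    mean = sum(y) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    C = max(abs(mean + 3 * std), abs(mean - 3 * std))
    epsilon = 3 * noise_std * math.sqrt(math.log(n) / n)
    return C, epsilon

# Illustrative surge targets in m and an assumed gauge noise of 2 cm.
y = [0.1, 0.2, 0.45, 0.8, 1.1, 0.9, 0.5, 0.2]
C, eps = cherkassky_ma(y, noise_std=0.02)
print(f"C = {C:.3f}, epsilon = {eps:.4f}")
```

Such closed-form choices are one reason SVR parameter tuning takes far less effort than searching over the architecture and training settings of a neural network.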
8. Conclusions

This paper presents a promising support vector regression (SVR) technique for storm surge predictions. The performance of the proposed method is verified by comparing the predicted storm surge levels (with different kernel functions) with actual values. The estimates produced by SVR show remarkably smaller errors than those of NNs. Moreover, SVR takes less computer time than NNs for the two cases. From the results it can be concluded that the SVR method can predict storm surges with higher estimation accuracy and shorter computation time. It is seen that from 36 h of data, storm surge levels for the next 36 h can be predicted by SVR. Hence SVR can be used for online prediction of surge levels or surge deviations that will occur several hours ahead. It is expected that prediction of storm surges by the SVR method will play an important role in future surge level predictions as well. Moreover, the SVR method can be utilized by field engineers, since it requires neither a procedure to determine the explicit functional form, as in regression analysis, nor any knowledge of the network architecture, as in NN techniques.
Acknowledgements

The first two authors thank the management and Dr. R. Rudramoorthy, Principal, PSG College of Technology, for providing the necessary facilities to carry out the research work reported in this paper. The SVR models for this study were developed using the MATLAB SVM toolbox of Dr. Gunn. The authors also thank Dr. Gunn for making the SVM toolbox freely available for academic purposes.
References

Bateni, S.M., Jeng, D.S., Melville, B.W., 2007. Bayesian neural networks for prediction of equilibrium and time-dependent scour depth around bridge piers. Advances in Engineering Software 38 (2), 102–111.
Bode, L., Hardy, T.A., 1997. Progress and recent developments in storm surge modeling. Journal of Hydraulic Engineering 123 (4), 315–331.
Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, vol. 2. Kluwer Academic Publishers, Boston, pp. 121–167.
Chapelle, O., Vapnik, V., Bousquet, O., Mukerjee, S., 2002. Choosing multiple parameters for support vector machines. Machine Learning 46, 131–159.
Cherkassky, V., Ma, Y., 2004. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Networks 17, 113–126.
Coppola Jr., E., Szidarovszky, F., Poulton, M., Charles, E., 2003. Artificial neural network approach for predicting transient water levels in a multilayered groundwater system under variable state, pumping, and climate conditions. Journal of Hydrologic Engineering 8 (6), 348–360.
Cortes, C., Vapnik, V., 1995. Support vector networks. Machine Learning 20, 273–297.
Daniell, T.M., 1991. Neural networks: application in hydrology and water resources engineering. In: Proceedings of the International Hydrology and Water Resources Conference, Perth, pp. 791–802.
Danish Hydraulic Institute, 2002. MIKE 21: a modelling system for estuaries, coastal waters and seas. Danish Hydraulic Institute, pp. 10–200.
Dibike, Y.B., Velickov, S., Solomatine, D., 2000. Support vector machines: review and applications in civil engineering. In: Proceedings of the Second Joint Workshop on Application of AI in Civil Engineering, Cottbus, Germany, March 2000.
Dube, S.K., Rao, A.D., Sinha, P.C., Chittibabu, P., 1994. A real time storm surge prediction system: an application to the east coast of India. Proceedings of the Indian National Science Academy 60, 157–170.
Federal Emergency Management Agency (FEMA), 1988. Coastal Flooding Hurricane Storm Surge Model, vols. I–III. Federal Emergency Management Agency, Washington, DC.
Flather, R.A., 1991. A storm surge prediction model for the Northern Bay of Bengal with application to the cyclone disaster. Journal of Physical Oceanography, 172–190.
French, M.N., Krajewski, W.F., Cuykendall, R.R., 1992. Rainfall forecasting in space and time using a neural network. Journal of Hydrology 137, 1–31.
Grubert, J.P., 1995. Application of neural networks in stratified flow stability analysis. Journal of Hydraulic Engineering 121 (7), 523–532.
Gunn, S.R., 1998. Support vector machines for classification and regression. Technical Report ISIS-1-98, University of Southampton.
Hansen, W., 1956. Theorie zur Errechnung des Wasserstandes und der Strömungen in Randmeeren. Tellus 8, 287–300.
Holland, G.J., 1980. An analytical model of the wind and pressure profiles in hurricanes. Monthly Weather Review 108, 1212–1218.
Hsu, T.W., Liao, C.M., Lee, Z.X., 1999. Finite element method to calculate the surge deviation for the northeast coast of Taiwan. Journal of the Chinese Institute of Civil and Hydraulic Engineering 11 (4), 849–857 (in Chinese).
Huang, W.P., Hsu, C.A., Huang, C.G., Kuang, C.S., Lin, J.H., 2005. Numerical studies on typhoon surges in the northern part of Taiwan. In: Proceedings of the 27th Conference on Ocean Engineering in ROC, pp. 275–282.
Hubbert, G.D., Holland, G.J., Leslie, L.M., Manton, M.J., 1991. A real-time system for forecasting tropical cyclone storm surges. Weather and Forecasting 6, 86–97.
Jelesnianski, C.P., Shaffer, J.W.A., 1992. SLOSH (Sea, Lake, and Overland Surges from Hurricanes). NOAA Technical Report NWS 48.
Jelesnianski, C.P., 1965. A numerical calculation of storm tides induced by a tropical storm on a continental shelf. Monthly Weather Review 93, 343–358.
Jelesnianski, C.P., 1972. SPLASH (Special Program to List the Amplitudes of Surges from Hurricanes): I. Landfall storms. NOAA Technical Memorandum NWS TDL-46.
Jeng, D.S., Cha, D.H., Blumenstein, M., 2004. Neural network model for the prediction of the wave-induced liquefaction potential in a porous seabed. Ocean Engineering 31 (17–18), 2073–2086.
Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. Journal of Computing in Civil Engineering 8 (2), 201–220.
Kawahara, M., Hirano, H., Tsubota, K., Inagaki, K., 1982. Selective lumping finite element method for shallow water flow. International Journal for Numerical Methods in Fluids 2, 89–112.
Lee, T.L., 2004. Back-propagation neural network for the long-term tidal predictions. Ocean Engineering 31 (2), 225–238.
Lee, T.L., 2006. Neural network prediction of a storm surge. Ocean Engineering 33 (3–4), 483–494.
Lee, T.L., 2008. Back-propagation neural network for the prediction of the short-term storm surge in Taichung harbor, Taiwan. Engineering Applications of Artificial Intelligence 21, 63–72.
Lee, T.L., Jeng, D.S., 2002. Application of artificial neural networks in tide forecasting. Ocean Engineering 29 (9), 1003–1022.
Lee, T.L., Jeng, D.S., Lin, C., Shieh, S.J., 2002. Assessment of earthquake-induced liquefaction by artificial neural networks. In: Proceedings of the Ninth International Conference on Computing in Civil and Building Engineering, Taipei, Taiwan, ROC, pp. 55–60.
Lee, J.J., Kie Kim, D., Chang, S.K., Lee, J.H., 2007. Application of SVR for prediction of concrete strength. Computers and Concrete 4 (4).
Makarynskyy, O., 2005. Artificial neural networks for wave tracking, retrieval and prediction. Pacific Oceanography 3 (1), 1–30.
Minns, A.W., 1998. Artificial Neural Networks as Sub-Symbolic Process Descriptors. Balkema, Rotterdam, The Netherlands.
Nagy, H.M., Watanabe, K., Hirano, M., 2002. Prediction of sediment load concentration in rivers using artificial neural network model. Journal of Hydraulic Engineering 128 (6), 588–595.
Osuna, E., Freund, R., Girosi, F., 1997. An improved training algorithm for support vector machines. In: Proceedings of the IEEE Workshop on Neural Networks for Signal Processing VII, New York, pp. 276–285.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323, 533–536.
Scholkopf, B., 1997. Support Vector Learning. R. Oldenbourg, Munich.
Tsai, C.P., Lee, T.L., 1999. Back-propagation neural network in tidal-level forecasting. Journal of Waterway, Port, Coastal and Ocean Engineering, ASCE 125 (4), 195–202.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Vested, H.J., Nielsen, H.R., Jensen, H.R., Kristensen, K.B., 1995. Skill assessment of an operational hydrodynamic forecast system for the North Sea and Danish Belts. Coastal and Estuarine Studies, vol. 47. American Geophysical Union, Washington, DC, pp. 373–396.
Xie, L., Pietrafesa, L.J., Peng, M., 2004. An integrated storm surge and inundation modeling system for lakes, estuaries and coastal ocean. Journal of Coastal Research 20, 1209–1223.
Yu, P.S., Chen, S.T., Chang, I.F., 2006. Support vector regression for real-time flood stage forecasting. Journal of Hydrology 328 (3–4), 704–716.
Zhang, J., Sato, T., Iai, S., 2006. Support vector regression for online health monitoring of large-scale structures. Structural Safety 28 (4), 392–406.
Further reading

⟨http://www.support-vector.net⟩
⟨http://www.kernelfunctions.org⟩
⟨http://www.isis.ecs.soton.ac.uk/resources/svminfo/⟩