18 views

Uploaded by call2000

save

- 1131.pdf
- Numerical Methods (Read-Only)
- Arithmetic Coding
- Formula Book Multimedia Internet
- Answers to Notes Graph Translate Absolute Value Functions 1-31-13
- Dorrie Heinrich Problems
- Svd Inequalities
- 05_Kali_Dasar_5sd10
- F6_MS_0708_ME
- A COMPARISON STUDY FOR INTRUSION DATABASE.pdf
- Slides Lecture5
- תכן לוגי מתקדם- הרצאה 3 | הוספת שקופיות ב-2011
- Vissim Training - 12. Dynamic Routes
- Research--Trace Ability With LSI
- fence_xen
- Add to Cart Not Go to Cart
- HV 3 Arthit TU
- ReadME v1.01.txt
- DualPorosity Model
- School
- Martyn Hadoop Aws
- 4. Mathematics - Ijmcar - A Study on Fixed Point Theorem - Prashant Chauhan
- Checklist: Check a Cloud Computing provider for authenticity
- Msc
- Caem1 2013 Aug
- sammy.js
- Solution2_1_18
- hasil prepost cerlef
- Robotics1_Midterm_Test_2016-17_16.11.18
- Module 4
- Técnicas de Modelagem de Processos
- Premio Henri Nestle_2011
- Portaria Anvisa No 326.97
- Cartilha_Dessalinizador_Solar_Ecoengenho

**www.elsevier.com/locate/cor
**

Neural network and regression spline value function

approximations for stochastic dynamic programming

Cristiano Cervellera

a

, Aihong Wen

b

, Victoria C.P. Chen

b,∗,1

a

Institute of Intelligent Systems for Automation-ISSIA-CNR, National Research Council of Italy,

Via De Marini 6, 16149 Genova, Italy

b

Department of Industrial & Manufacturing Systems Engineering, The University of Texas at Arlington,

Campus Box 19017, Arlington, TX 76019-0017, USA

Available online 30 June 2005

Abstract

Dynamic programming is a multi-stage optimization method that is applicable to many problems in engineering.

A statistical perspective of value function approximation in high-dimensional, continuous-state stochastic dynamic

programming (SDP) was ﬁrst presented using orthogonal array (OA) experimental designs and multivariate adap-

tive regression splines (MARS). Given the popularity of artiﬁcial neural networks (ANNs) for high-dimensional

modeling in engineering, this paper presents an implementation of ANNs as an alternative to MARS. Comparisons

consider the differences in methodological objectives, computational complexity, model accuracy, and numerical

SDP solutions. Two applications are presented: a nine-dimensional inventory forecasting problem and an eight-

dimensional water reservoir problem. Both OAs and OA-based Latin hypercube experimental designs are explored,

and OA space-ﬁlling quality is considered.

᭧ 2005 Elsevier Ltd. All rights reserved.

Keywords: Design of experiments; Statistical modeling; Markov decision process; Orthogonal array; Latin hypercube;

Inventory forecasting; Water reservoir management

∗

Corresponding author. Tel.: +1 817 272 2342; fax: +1 817 272 3406.

E-mail addresses: cervellera@ge.issia.cnr.it (C.Cervellera), axw0932@omega.uta.edu (A.Wen),

vchen@uta.edu (V.C.P. Chen).

1

Supported by NSF Grants #INT 0098009 and #DMI 0100123 and a Technology for Sustainable Environment (TSE) grant

under the US Environmental Protection Agency’s Science to Achieve Results (STAR) program (Contract #R-82820701-0).

0305-0548/$ - see front matter ᭧ 2005 Elsevier Ltd. All rights reserved.

doi:10.1016/j.cor.2005.02.043

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 71

1. Introduction

The objective of dynamic programming (DP) is to minimize a “cost” subject to certain constraints over

several stages, where the relevant “cost” is deﬁned for a speciﬁc problem. For an inventory problem, the

“cost” is the actual cost involved in holding inventory, having to backorder items, etc., and the constraints

involve capacities for holding and ordering items. An equivalent objective is to maximize a “beneﬁt”,

such as the robustness of a wastewater treatment system against extreme conditions. The state variables

track the state of the systemas it moves through the stages, and a decision is made in each stage to achieve

the objective.

Recursive properties of the DP formulation permit a solution via the fundamental recurrence equation

[1]. In stochastic dynamic programming (SDP), uncertainty is modeled in the formof randomrealizations

of stochastic system variables, and the estimated expected “cost” (“beneﬁt”) is minimized (maximized).

In particular, Markov decision processes may be modeled by a SDP formulation through which the state

variables resulting fromeach decision comprise a Markov process. DP has been applied to many problems

in engineering, such as water resources, revenue management, pest control, and wastewater treatment.

For general background on DP, see Bellman [1], Puterman [2] and Bertsekas [3]. For some applications,

see Gal [4], Shoemaker [5], White [6,7], Foufoula-Georgiou and Kitanidis [8], Culver and Shoemaker

[9], Chen et al. [10], and Tsai et al. [11]. In high dimensions, traditional DP solution methods can become

computationally intractable and attempts have been made to reduce the computational burden [12–14].

For continuous-state SDP, the statistical perspective of Chen et al. [15] and Chen [16] enabled the ﬁrst

truly practical numerical solution approach for high dimensions. Their method utilized orthogonal array

(OA, [17]) experimental designs and multivariate adaptive regression splines (MARS, [18]). Although

MARS has been applied in several areas (e.g., [19–22]), artiﬁcial neural network (ANN) models are much

more prevalent in all areas of engineering (e.g., [23–30]). Given their popularity, an ANN-based SDP

solution method would be more accessible to engineers. In this paper, we present:

• The statistical modeling approach to solving continuous-state SDP.

• Implementation of ANN models in continuous-state SDP as an alternative to MARS.

• Implementation of OA-based Latin hypercube (OA–LH) designs [31] as an alternative to pure OAs.

• Consideration of the “space-ﬁlling” quality of generated experimental designs. The designs are in-

tended to distribute discretization points so that they “ﬁll” the state space; however, generated designs

of the same type do not necessarily have equivalent space-ﬁlling quality.

• Comparisons between ANN and MARS value function approximations, including methodological

perspectives, computational considerations, and numerical SDP solution quality on two applications.

The novelty of our study is to simultaneously consider how the SDP solution is affected by different

approximation methods and designs of different types and different space-ﬁlling qualities.

In the statistical perspective of continuous-state SDP, the two basic requirements are an experimental

design to discretize the continuous state space and a modeling method to approximate the value function

An appropriate design should have good space-ﬁlling quality, and an appropriate modeling method should

be ﬂexible in high dimensions, provide a smooth approximation, and be reasonably efﬁcient. A review

of several experimental designs and modeling methods is provided by Chen et al. [32] in the context of

computer experiments. Of the possible modeling choices given in this review paper, only MARS, neural

networks, and kriging satisfy the ﬁrst two criteria. Some preliminary exploration of the kriging model

72 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

yielded positive results on a ten-dimensional wastewater treatment application [33], but solving for the

maximum likelihood estimates of the kriging parameters continues to be a computational challenge. For

this application, kriging required 40 times more computational effort than MARS. In addition, kriging

software is even harder to obtain than MARS software, which is counter to our motivation to employ

ANNs.

Comparisons betweenMARSandANNs for empirical modelinghave beenconductedbyDeVeauxet al.

[34] and Pham and Peat [35], and in both studies MARS performed better overall. Although it is obvious

that ANN models can be employed to approximate functions, the existing literature does not provide

sufﬁcient studies to predict how they will compare to MARS in the SDP setting. In continuous-state SDP,

value functions are typically well-behaved (e.g., convex) and the SDP data involve little noise. Previous

comparisons involved statistical data with considerable noise. It is expected that ANNmodels will perform

better in low noise situations. However, ANN models can add extraneous curvature, a disadvantage

when a smooth convex approximation is desired. MARS, on the other hand, constructs a parsimonious

approximation, but its limited structure, although quite ﬂexible, could affect its approximation ability,

which would be most noticeable in low noise situations. Finally, computational speed is an important

aspect of SDP, and the ANN model ﬁtting process has a reputation of being slow.

Studies on different experimental designs for metamodeling have been conducted [36–38]. However,

none have considered OA–LH designs. Given the success of OA designs in continuous-state SDP, we

chose the OA–LH design as a variation that might work well with ANN models. Points of an OA design

are limited to a fairly sparse grid, while the OA–LH design allows points to move off the grid. The grid

restriction was convenient for the MARS structure, but provides no advantage for ANN models.

Section 2 describes SDP in the context of the two applications that are later used to test our methods.

The SDP statistical modeling approach is given in Section 3. Section 4 provides background on OA and

OA–LH designs. Section 5 describes ANNs, and Section 6 presents the comparisons between the ANNs

and MARS, including the implementation of OA–LH designs. Conclusions are presented in Section 7.

2. SDP applications

In continuous-state DP, the systemstate variables are continuous. Chen et al. [15] and Chen [16] solved

inventory forecasting test problems with four-, six-, and nine-state variables modeled over three stages.

Application to ten- and seven-dimensional water reservoir SDP problems appears in Baglietto et al. [39]

and Baglietto et al. [40], and application to a 20-dimensional wastewater treatment systemappears in Tsai

et al. [11]. Here, we consider the nine-dimensional inventory forecasting problem, which is known to

have signiﬁcant interaction effects between inventory levels and forecasts due to the capacity constraints,

and an eight-dimensional water reservoir problem, which has simpler structure and is more typical of

continuous-state SDP problems.

2.1. Inventory forecasting problem

For an inventory problem, the inventory levels are inherently discrete since items are counted, but the

range of inventory levels for an item is too large to be practical for a discrete DP solution. Thus, one can

relax the discreteness and use a continuous range of inventory levels. The inventory forecasting problem

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 73

uses forecasts of customer demand and a probability distribution on the forecasting updates to model the

demand more accurately than traditional inventory models (see [41]).

2.1.1. State and decision variables

The state variables for inventory forecasting consist of the inventory levels and demand forecasts for

each item. For our nine-dimensional problem, we have three items and two forecasts for each. Suppose

we are currently at the beginning of stage t. Let I

(i)

t

be the inventory level for item i at the beginning of

the current stage. Let D

(i)

(t,t )

be the forecast made at the beginning of the current stage for the demand of

item i occurring over the stage. Similarly, let D

(i)

(t,t +1)

be the forecast made at the beginning of the current

stage for the demand of item i occurring over the next stage (t +1). Then the state vector at the beginning

of stage t is

x

t

=

I

(1)

t

, I

(2)

t

, I

(3)

t

, D

(1)

(t,t )

, D

(2)

(t,t )

, D

(3)

(t,t )

, D

(1)

(t,t +1)

, D

(2)

(t,t +1)

, D

(3)

(t,t +1)

T

.

The decision variables control the order quantities. Let u

(i)

t

be the amount of item i ordered at the

beginning of stage t. Then the decision vector is u

t

=

u

(1)

t

, u

(2)

t

, u

(3)

t

T

, where u

(i)

t

is the amount of item

i ordered in stage t. For simplicity, it is assumed that orders arrive instantly.

2.1.2. Transition function

The transition function for stage t speciﬁes the new state vector to be x

t +1

=f

t

(x

t

, u

t

,

t

), where x

t

is

the state of the system(inventory levels and forecasts) at the beginning of stage t, u

t

is the amount ordered

in stage t, and

t

is the stochasticity associated with updates in the demand forecasts. The evolution of

these forecasts over time is modeled using the multiplicative Martingale model of forecast evolution, for

which the stochastic updates are applied multiplicatively and assumed to have a mean of one, so that the

sequence of future forecasts for stage t + k forms a martingale [42]. See Chen [16] for details on the

stochastic modeling.

2.1.3. Objectives and constraints

The constraints for inventory forecasting are placed on the amount ordered, in the form of capacity

constraints, and on the state variables, in the formof state space bounds. For this test problem, the capacity

constraints are chosen to be restrictive, forcing interactions between the state variables, but are otherwise

arbitrary. The bounds placed on the state variables in each stage are necessary to ensure accurate modeling

over the state space.

The objective function is a cost function involving inventory holding costs and backorder costs. The

typical V-shaped cost function common in inventory modeling is

c

v

(x

t

, u

t

) =

n

I

¸

i=1

h

i

[I

(i)

t +1

]

+

+ ¬

i

[−I

(i)

t +1

]

+

, (1)

where [q]

+

=max{0, q}, h

i

is the holding cost parameter for itemi, and ¬

i

is the backorder cost parameter

for item i. The smoothed version of Eq. (1), described in Chen [16], is employed here.

74 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

Fig. 1. Reservior network.

2.2. Water reservoir problem

Optimal operation of water reservoir networks has been studied extensively in the literature (e.g.,

Archibald et al. [43], Foufoula-Georgiou and Kitanidis [8], Gal [4], Lamond and Sobel [44], Turgeon

[45] and Yalpwotz [46] provides an excellent survey). The water reservoir network we consider consists

of four reservoirs (see Fig. 1 for the conﬁguration). In typical models, the inputs to the nodes are water

released from upstream reservoirs and stochastic inﬂows from external sources (e.g., rivers and rain),

while the outputs correspond to the amount of water to be released during a given time stage (e.g., a

month).

2.2.1. State and decision variables

Ina realistic representationof inﬂowdynamics, the amount of (random) water ﬂowingintothe reservoirs

from external sources during a given stage t depends on the amounts of past stages. In this paper, we

follow the realistic modeling of Gal [4] and use an autoregressive linear model of order one. Thus, the

ith component of the random vector

t

, representing the stochastic external ﬂow into reservoir i, has the

form

c

(i)

t

= a

(i)

t

c

(i)

t −1

+ b

(i)

t

+ c

(i)

t

.

(i)

t

, (2)

where

t

is a random correction that follows a standard normal distribution. The coefﬁcients of the

linear combinations (a

t

, b

t

, c

t

) actually depend on t, so that it is possible to model proper external inﬂow

behaviors for different months. The actual values of the coefﬁcients used for the model were based on

the one-reservoir real model described in Gal [4] and extended to our four-reservoir network.

Given Eq. (2), the state vector consists of the water level and stochastic external inﬂow for each

reservoir:

x

t

=

w

(1)

t

, w

(2)

t

, w

(3)

t

, w

(4)

t

, c

(1)

t

, c

(2)

t

, c

(3)

t

, c

(4)

t

T

,

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 75

where w

(i)

t

is the amount of water in reservoir i (i = 1, 2, 3, 4) at the beginning of stage t, and c

(i)

t

the

stochastic external inﬂowinto reservoir i during stage t. The decision vector is u

t

=

u

(1)

t

, u

(2)

t

, u

(3)

t

, u

(4)

t

T

,

where u

(i)

t

be the amount of water released from reservoir i during stage t.

2.2.2. Transition function

The transition function for stage t updates the water levels and external inﬂows. The amount of water

at stage t + 1 reﬂects the ﬂow balance between the water that enters (upstream releases and stochastic

external inﬂows) and the water that is released. In order to deal with unexpected peaks from stochastic

external inﬂows, each reservoir has a ﬂoodway, so that the amount of water never exceeds the maximum

value W

(i)

. Thus, the new water level is

w

(i)

t +1

= min

⎧

⎨

⎩

w

(i)

t

+

¸

j∈U

(i)

u

(j)

t

− u

(i)

t

+ c

(i)

t

, W

(i)

⎫

⎬

⎭

,

where U

(i)

the set of indexes corresponding to the reservoirs which release water into reservoir i. For the

stochastic external inﬂows, we can update them using Eq. (2). Like the inventory forecasting model, the

two equations above can be grouped in the compact transition function x

t +1

= f

t

(x

t

, u

t

,

t

).

2.2.3. Objectives and constraints

The constraints for water reservoir management are placed on the water releases u

(i)

t

in the form:

0u

(i)

t

min

⎧

⎨

⎩

w

(i)

t

+

¸

j∈U

i

u

(j)

t

, R

(i)

⎫

⎬

⎭

,

where R

(i)

the maximum pumpage capability for reservoir i. This implies nonnegativity of w

i

t +1

. The ob-

jective function is intended to maintain a reservoir’s water level close to a target value ˜ w

(i)

t

and maximize

beneﬁt represented by a concave function g of the water releases and/or water levels (e.g., power genera-

tion, irrigation, etc.). For maintaining targets, the intuitive cost would be |w

(i)

t

− ˜ w

(i)

t

|, which is aV-shaped

function. To facilitate numerical minimization, we utilized a smoothed version, denoted by l

w

(i)

t

, ˜ w

(i)

t

,

that is analogous to the smoothed cost function for the inventory forecasting problem. For the modeling

beneﬁts, we consider a nonlinear concave function g of the following form (for z ∈ [0, +∞)):

g(z, j, [) =

⎧

⎪

⎪

⎪

⎪

⎪

⎨

⎪

⎪

⎪

⎪

⎪

⎩

([ + 1)z, 0z

1

3

j,

[z +

1

2

j −

45

j

2

z −

2

3

j

3

−

153

2j

2

z −

2

3

j

4

,

1

3

jz

2

3

j,

[z +

1

2

j, z

2

3

j.

Such a function models a beneﬁt which becomes relevant for “large” values of the water releases, de-

pending on suitable parameters j and [. In our example, we used R

(i)

for j and a small number 0.1

76 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

for [. Then, the total cost function at stage t can be written as

c

t

(x

t

, u

t

,

t

) =

4

¸

i=1

l

w

(i)

t

, ˜ w

(i)

t

−

4

¸

i=1

p

(i)

g

u

(i)

t

, j

(i)

, [

,

where the p

(i)

, i = 1, 2, 3, 4 are weights chosen to balance the beneﬁts against the costs.

3. SDP statistical modeling solution approach

3.1. Future value function

For a given stage, the future value function provides the optimal cost to operate the system from stage t

through T as a function of the state vector x

t

. Following the notation in Section 2, this is written recursively

as

F

t

(x

t

) = min

u

t

E{c(x

t

, u

t

,

t

) + F

t +1

(x

t +1

)}

s.t. x

t +1

= f

t

(x

t

, u

t

,

t

),

(x

t

, u

t

) ∈ I

t

. (3)

If we have the future value functions for all the stages, then given the initial state of the system, we

can track the optimal solution through the value functions. Thus, given a ﬁnite time horizon T, the basic

SDP problem is to ﬁnd F

1

(·), . . . , F

T

(·). The recursive formulation permits the DP backward solution

algorithm [1], ﬁrst by assuming F

T +1

(x

T +1

) ≡ 0, so that F

T

(x

T

) =c

T

(x

T

) (where c

T

(x

T

) is the cost of

the ﬁnal stage depending only on x

T

), then solving in order for F

T

(·) through F

1

(·). See the description

in Chen [16] or Chen et al. [47] for further details.

3.2. Numerical solution algorithm

Given the state of the systemat the beginning of the current stage, the corresponding optimal cost yields

one point on that stage’s value function. The complete future value function is obtained by solving this

optimization problem for all possible beginning states. For continuous states, this is an inﬁnite process.

Thus, except in simple cases where an analytical solution is possible, a numerical SDP solution requires

a computationally tractable approximation method for the future value function.

Froma statistical perspective, let the state vector be the vector of predictor variables. We seek to estimate

the unknown relationship between the state vector and the future value function. Response observations

for the future value function may be collected via the minimization in Eq. (3). Thus, the basic procedure

uses an experimental design to choose Npoints in the n-dimensional state space so that one may accurately

approximate the future value function; then constructs an approximation.

Chen et al. [15] introduced this statistical perspective using strength three OAs and MARS (smoothed

with quintic functions) applied to the inventory forecasting test problems. MARS was chosen due to its

ease of implementation in high dimensions, ability to identify important interactions, and easy derivative

calculations. The choice of the strength three OAs was motivated by the desire to model interactions,

but univariate modeling can suffer due to the small number of distinct levels in each dimension (e.g.,

p = 13 was the highest number of levels explored). Chen [16] additionally explored strength two OAs

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 77

with equivalent N to the strength three OAs, but with lower order balance permitting higher p; however,

their performance was disappointing.

4. Discretizations based on OAs

In continuous-state SDP, the classical solution method discretizes the state space with a full grid,

then uses multi-linear interpolation to estimate the future value function. This full grid is equivalent to a

full factorial experimental design, for which the number of points grows exponentially as the dimension

increases. To avoid this, a fractional factorial design, i.e., taking a subset of the full grid, may be employed.

OAs are special forms of fractional factorial designs. Afull factorial design for n variables, each at p levels,

contains all p

n

level combinations. An OA of strength d (d <n) for n variables, each at p levels, contains

all possible level combinations in any subset of d dimensions, with the same frequency, z. Therefore, when

projected down onto any d dimensions, a full factorial grid of p

d

points replicated z times is represented.

4.1. Measuring space-ﬁlling quality

There are a variety of ways to generate OAs (e.g., [48–53]). However, there is no way to guarantee their

space-ﬁlling quality. In practice, one can generate many OAs, calculate various space-ﬁlling measures,

then choose the best of the bunch. Chen [17] has conducted the only study of how this quality affects an

approximation, and only OAs of strength three were studied.

Looking at the two conﬁgurations of points in Fig. 2, the one on the left is seen to space the points

better. There are two basic ways to measure space-ﬁlling quality: minimax and maximin (refer to [32]).

Let D be the n-dimensional closed and bounded set of the continuous state space, and P be the set of n

design points in D. Minimax criteria seek to minimize the maximumdistance between (nondesign) points

in D and the nearest-neighbor design point in P. Using the same distance measure as above, deﬁne:

MAXDIST(D, P) = sup

x∈D

¸

min

y∈P

d(x, y)

¸

.

Since the supremum over the inﬁnite set of points in D is difﬁcult to calculate, this measure is usually

calculated over a ﬁnite sample of points in D, using a full factorial grid or some type of sampling scheme.

However, this measure is generally impractical to calculate in high dimensions. A design that represents

D well will have lower MAXDIST(D, P) than a design that does not.

•

•

•

•

•

•

•

•

x

1

0 1 4

x

2

• •

0

2

4

1

3

x

1

0

2

4

1

3

2 3 0 1 4

x

2

2 3

Fig. 2. Two conﬁgurations of design points.

78 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

Maximin criteria seek to maximize the minimum distance between any pair of points in P. Let d(x, y)

be a distance measure (commonly Euclidean distance) over R. Then deﬁne:

MINDIST(P) = min

x∈P

⎧

⎨

⎩

min

y∈P

y=x

d(x, y)

⎫

⎬

⎭

.

Higher MINDIST(P) should correspond to a more uniform scattering of design points. For the OA

designs in this paper, their space-ﬁlling quality was judged using measures of both types and a measure

introduced by Chen [17].

4.2. OA-based LH designs

LH designs are equivalent to OA designs of strength one. In other words, if we project the N points

of a LH design onto any single dimension, the points will correspond to N distinct levels in that di-

mension. However, only OAs of strength two and higher are technically orthogonal, so the LH design

does not maintain this property. A hybrid OA–LH design, for which an LH design is (randomly) cre-

ated from an OA structure, was introduced by Tang [31] to address the lack of orthogonality in LH

designs. Although OA–LH designs only satisfy LH properties, the underlying OA provides some bal-

ance. With regard to approximating a function, the LH design provides more information along a single

dimension (for univariate terms), while the OA design provides better coverage in higher dimensions (for

interactions).

In this paper, we utilize the strength three OAs described in Chen [17], and the OA–LH designs are

based on these strength three OAs. Speciﬁcally, we employed strength three OAs with p = 11 levels

(N = 1331) and p = 13 levels (N = 2197) in each dimension, and two realizations of OA–LH designs

were generated for each OA. For each size (N = 1331 and 2197), one “good”, one “neutral”, and one

“poor” OA was selected based on space-ﬁlling measures described above. Thus, a total of 6 OAs and

12 OA–LH designs were tested. For the inventory forecasting problem, nine-dimensional designs were

generated, and for the water reservoir problem, eight-dimensional designs were generated.

5. Artiﬁcial neural networks

5.1. Model description

ANNs are mathematical modeling tools widely used in many different ﬁelds (for general background

and applications see Lippmann [54], Haykin [55], Bertsekas and Tsitsiklis [14], Cherkassky and Mulier

[56]; for statistical perspectives see White [57], Barron et al. [58], Ripley [59], or Cheng and Titterington

[60]; for approximation capabilities see Barron [61]). From a purely “analytical” perspective, an ANN

model is a nonlinear parameterized function that computes a mapping of n input variables into (typically)

one output variable, i.e., n predictors and one response. The basic architecture of ANNs is an interconnec-

tion of neural nodes that are organized in layers. The input layer consists of n nodes, one for each input

variable, and the output layer consists of one node for the output variable. In between there are hidden

layers which induce ﬂexibility into the modeling. Activation functions deﬁne transformations between

layers, the simplest one being a linear function. Connections between nodes can “feed back” to previous

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 79

layers, but for function approximation, the typical ANN is feedforward only with one hidden layer and

uses “sigmoidal” activation functions (monotonic and bounded). A comprehensive description of various

ANN forms may be found in [55].

In our inventory forecasting SDP application, we utilized a feedforward one-hidden-layer ANN

(FF1ANN) with a “hyperbolic tangent” activation function: Tanh(x) = (e

x

− e

−2x

)/(e

x

+ e

−x

). De-

ﬁne n as the number of input variables, H as the number of hidden nodes, v

jh

as weights linking input

nodes j to hidden nodes h, w

h

as weights linking hidden nodes h to the output node, and 0

h

and ¸ as

constant terms called “bias” nodes (like intercept terms). Then our ANN model form is

ˆ g(x; w, v, , ¸) = Tanh

H

¸

h=1

w

h

Z

h

+ ¸

,

where for each hidden node h

Z

h

= Tanh

n

¸

i=1

v

ih

x

i

+ 0

h

.

The parameters v

ih

, w

h

, 0

h

, and ¸ are estimated using a nonlinear least-squares approach called training.

For simplicity, we will refer to the entire vector of parameters as w in the following sections.

5.2. Model ﬁtting

Our training employed a batch processing method (as opposed to the more common on-line processing)

that minimized the squared error loss lack-of-ﬁt (LOF) criterion:

LOF( ˆ g) =

N

¸

j=1

[y

j

− ˆ g(x

j

, w)]

2

,

where N is the number of data points on which the network is being trained, a.k.a., the training data set.

Nonlinear optimization techniques used to conduct the minimization are typically gradient descent-based

methods. We used the backpropagation method (see [62,63]), which computes the gradient of the output

with respect to the outer parameters, then “backpropagates” the partial derivatives through the hidden

layers, in order to compute the gradient with respect to the entire set of parameters. This is completed in

two phases: (i) a forward phase in which an input x

j

is presented to the network in order to generate the

various outputs (i.e., the output of the network and those of the inner activation functions) corresponding

to the current value of the parameters, and (ii) a backward phase in which the gradient is actually computed

on the basis of the values obtained in the forward phase. After all N samples are presented to the network

and the corresponding gradients are computed, the descent algorithm step is completed by upgrading the

vector of parameters w using the steepest-descent method:

w

k+1

= w

k

− :

k

∇LOF(w

k

),

where ∇LOF(w

k

) is the gradient of the LOF function with respect to the vector w

k

. Implementation of

this approach is very simple, since the computation of the gradient requires only products, due to the

80 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

mathematical structure of the network. Unfortunately, backpropagation presents many well-known dis-

advantages, such as entrapment in local minima. As a consequence, the choice of the stepsize :

k

is crucial

for acceptable convergence, and very different stepsizes may be appropriate for different applications.

The best training algorithmfor FF1ANNs, in terms of rate of convergence, is the Levenberg–Marquardt

method (ﬁrst introduced in [64]), which is an approximation of the classic second-order (i.e., it makes

use of the Hessian matrix) Newton method. Since our LOF criterion is quadratic, we can approximate its

Hessian matrix via backpropagation as

J[e(w)]

T

J[e(w)],

where

e(w) = [y

1

− ˆ g(x

1

, w), . . . , y

N

− ˆ g(x

N

, w)]

T

and J[e(w)] is the Jacobian matrix of e(w). Each row of the matrix contains the gradient of the jth single

error term y

j

− ˆ g(x

j

, w) with respect to the set of parameters w.

The update rule for the parameter vector at each step is

w

k+1

= w

k

− [J[e(w

k

)]

T

J[e(w

k

)] + zI]

−1

J[e(w

k

)]

T

e(w

k

)

T

.

When z is large, the method is close to gradient descent with a small stepsize. When z = 0 we have a

standard Newton method with an approximated Hessian matrix (Gauss–Newton method), which is more

accurate near the minima. After each successful step (i.e., when the LOF decreases) z is decreased, while

it is increased when the step degrades the LOF.

5.3. Model implementation

Other than the selection of a model ﬁtting approach, the most important decision in implementing an

ANN model is the choice of architecture. For a FF1ANN, the number of nodes in the hidden layer, H,

determines the architecture. Unfortunately, the appropriate number of hidden nodes is never clear-cut.

Existing theoretical bounds on the accuracy of the approximation are non-constructive or too loose to

be useful in practice. Thus, it is not possible to deﬁne “a priori” the optimal number of hidden nodes,

since it depends on the number of samples available and on the smoothness of the function we want to

approximate. Too few nodes would result in a network with poor approximation properties, while too

many would “overﬁt” the training data (i.e., the network merely “memorizes” the data at the expense

of generalization capability). Therefore, a tradeoff between these two possibilities is needed. There are

many rules of thumb and methods for model selection and validation in the literature, as well as methods

to avoid or reduce overﬁtting (see [55]), but the choice of a good architecture requires several modeling

attempts and is always very dependent on the particular application.

For our applications, we identiﬁed the appropriate number of hidden nodes by exploring many ar-

chitectures and checking the average cost using a test data set from the simulation described in Section

6.2. The best results for the nine-dimensional inventory forecasting problem were obtained for values of

H = 40 for N = 1331 and H = 60 for N = 2197. Similarly, the best results for the eight-dimensional

water reservoir problem were obtained using H = 10 for both sample sizes. For larger values of H the

solutions worsened.

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 81

6. ANNs and MARS comparisons

ANNs have become very popular in many areas of science and engineering, due to their proven ability

to ﬁt any data set and to approximate any real function, together with the existence of accessible software,

such as Matlab’s (www.mathworks.com) Neural Network Toolbox, which was used in our study.

The primary disadvantages are (1) the difﬁculty in selecting the number of hidden nodes H and (2)

potential overﬁtting of the training data. The second disadvantage occurs because it is tempting to try

to ﬁt the data as closely as possible even when it is not appropriate. Both disadvantages can be handled

by proper use of a test data set, but this process greatly increases the computational effort required.

From a statistical perspective, ANNs have the additional disadvantage of an approximation that does not

provide interpretable information on the relationships between the input (predictor) variables and output

(response) variable.

MARS [18] is based on traditional statistical thinking with a forward stepwise algorithm to select

terms in a linear model followed by a backward procedure to prune the model. The approximation bends

to model curvature at “knot” locations, and one of the objectives of the forward stepwise algorithm

is to select appropriate knots. The search for new basis functions can be restricted to interactions of a

maximumorder. For the inventoryforecastingproblems, MARSwas restrictedtoexplore uptothree-factor

interations. Chen [16] also considered MARS limited to two-factor interactions, but the approximation

degraded. However, for the water reservoir problem, two-factor interactions were sufﬁcient. In addition,

the backwardprocedure was not utilizedsince preliminarytestingdemonstrateda verylarge computational

burden with little change on the approximation. After selection of the basis functions is completed,

smoothness to achieve a certain degree of continuity may be applied. In our case, quintic functions were

used to achieve second derivative continuity (see [16]). The primary disadvantage of MARS is that the

user must specify the maximum number of basis functions to be added, a parameter called M

max

. This

is very similar to the difﬁculty in selecting H for an ANN, and, like ANNs, a test data set can be used

to select M

max

. MARS is both ﬂexible and easily implemented while providing useful information on

signiﬁcant relationships between the predictor variables and the response; however, commercial MARS

software has only recently become available from Salford Systems (www.salford-systems.com).

For our study, our own C code was employed.

From the perspective of SDP value function approximation, MARS and ANNs are both excellent for

modeling in high dimensions when noise is small relative to the signal, as is the case with SDP. Their

main difference is the interpretability of the approximation, but this is not a critical issue for accurate

SDP solutions. An issue that may arise in other SDP applications is the selection of the key parameter,

M

max

for MARS and H for ANNs. For the inventory forecasting and water reservoir problems, the value

functions have similar structure fromstage to stage; thus, the same M

max

or Hfor all stages is appropriate.

When different M

max

or H is preferred for different stages, then the SDP algorithm would greatly beneﬁt

from automatic selection of M

max

or H. This is an issue for future research.

6.1. Memory and computational requirements

The memory and computational requirements for a FF1ANN with one output variable depend on the

number of input (predictor) variables n, the number of hidden nodes H, and the size of the training data set

N. For MARS, the key quantities are n, N, the number of eligible knots K, and the maximum number of

basis functions M

max

. In this section, we will explore how the requirements might grow with increasing n

82 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

(the dimension of an SDPproblem). Chen et al. [15] set N=O(n

3

), which assumes Nis based on a strength

three OA, K=O(n), which assumes knots can only placed at the p distinct levels of the strength three OA,

and O(n)M

max

O(n

3

). In this paper, the addition of the OA–LH designs permits the placement of a

higher number of knots. However, given an adequate K, it should not be expected to increase with n since

the larger K primarily impacts the quality of univariate modeling. Based on computational experience

with a variety of value function forms, K50 is more than adequate, and a more reasonable upper limit

on M

max

is O(N

2/3

) = O(n

2

). For ANNs, H depends on n, N, and the complexity of the underlying

function. A reasonable range on H (in terms of growth) matches that of O(n)HO(N

2/3

) = O(n

2

).

One should keep in mind that the above estimates are intended to be adequate for approximating SDP

value functions, which are convex and, consequently, generally well behaved. The inventory forecasting

value function, however, does involve a steep curve due to high backorder costs.

For memory requirements, a FF1ANN requires storage of V = [(n + 1)H + H + 1] parameters.

Assuming each of these is a double-precision real number, the total number of bytes required to store a

FF1ANN grows as

O(n

2

)S

FF1ANN

= 8V O(n

3

).

The maximum total number of bytes required to store one smoothed MARS approximation with up to

three-factor interactions grows as [15]

O(n)S

MARS

= 8(6M

∗

max

+ 2n + 3Kn + 1)O(n

2

).

Thus, it appears that the memory requirement for FF1ANNs grows faster than that for MARS. However,

the largest FF1ANN tested in the paper required 5288 bytes and the largest MARS required 36,800 bytes,

so for n = 9, FF1ANNs are still considerably smaller.

Another aspect of memory requirements is memory required while constructing the approximation.

We utilized Levenberg–Marquardt (LM) backpropagation to train the FF1ANN, and a disadvantage of

the LM method lies in its memory requirements. Each step requires the storage of a (N×V) matrix and

the (V × V) Hessian approximation, so the memory needed grows as

O(n

5

)S2

FF1ANN

= O(NnH) + O(n

2

H

2

)O(n

6

).

However, if the memory at disposal is sufﬁcient, LM backpropagation is generally preferred over other

training methods because of its faster convergence. MARS construction stores up to a (N×M

max

) matrix,

resulting in a growth of

O(n

4

)S2

MARS

= O(NM

max

)O(n

5

).

The computational requirements for one iteration of our SDP statistical modeling solution approach

consist of: (i) solving the optimization in Eq. (3) for one stage t at every experimental design point and

(ii) constructing the spline approximation of the future value function. The optimization evaluates a value

function approximation and its ﬁrst derivative several times in each stage, except the last. For FF1ANNs,

(i) is performed via optimization that requires the evaluation of the output of an ANN and of its ﬁrst

derivative. The time required for these evaluations is O(nH), so we can write the total time required

to solve the SDP equation for all N points as O(WnNH), where W is the work required to solve the

minimization.

For what concerns (ii), we consider LM backpropagation, which is the one we actually used for

our FF1ANNs. The time for the computation of LM backpropagation is dominated by the Hessian

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 83

Table 1

Actual average run times (in minutes) for one SDP iteration using FF1ANN and MARS models

Model N Inventory forecasting Water reservoir

450 MHz P2 550 MHz P2 933 MHz P2 3.06 GHz IX

ANN 1331 65 min (H = 40) 90 min (H = 40) 13 min (H = 10)

MARS 1331 61 min (M

max

= 300) 43 min (M

max

= 100)

ANN 2197 185 min (H = 60) 87 min (H = 60) 18 min (H = 10)

MARS 2197 112 min (M

max

= 400) 67 min (M

max

= 100)

The FF1ANN runs used W

LM

= 125. For the inventory forecasting problem, Matlab was used to conduct the FF1ANN runs

on two Pentium II (P2) machines, and a C code executed the MARS runs on a different P2 machine. For the water reservoir

problem, all runs were conducted in Matlab on an Intel Xeon (IX) workstation; however, calling the MARS C code in Matlab

resulted in signiﬁcant slow down.

approximation and inversion. The (V×V) Hessian approximation is the product of two (N×V) matrices

that are obtained by backpropagation. Here, for each row of the matrix (which corresponds to a single

experimental design point x

j

), the computation is basically reduced to [H + 1 + (n + 1)H] products,

so the time required to obtain the gradient for all N experimental design points is O(NnH). The matrix

product requires NV

2

real valued products, so this operation (which is O(Nn

2

H

2

)) occupies the relevant

part of the time required to compute the approximation. For the inversion of a (V ×V) matrix, an efﬁcient

algorithm can provide a computational time that is O(V

2

) = O(n

2

H

2

). Therefore, the time needed to

compute one step of LMbackpropagation is dominated by the matrix product: O(Nn

2

H

2

). Deﬁne W

LM

to

be the number of steps required to attain the backpropagation goal. (Typically, W

LM

never needs to exceed

300.) Then the time required to train the network by LMbackpropagation grows as O(W

LM

Nn

2

H

2

). Thus,

the total run time for one iteration of the FF1ANN-based SDP solution grows as

O(n

7

)C

FF1ANN

= O(WNnH) + [O(W

LM

n

2

NH

2

)]O(n

9

).

The total run time for one iteration of the MARS-based SDP solution grows as [15]

O(n

7

)C

MARS

= O(WNM

max

) + O(nNM

3

max

)O(n

10

),

where the time to construct the MARS approximation (second term) dominates.

Actual average run times for one SDP iteration are shown in Table 1. For the inventory forecasting

problem, the MARS and ANN runs were conducted on different machines in different countries. Despite

this, it can been seen that the run times for the same N are comparable. In order to conduct a fairer

computational comparison on the same programming platform, considerable effort was expended to

develop a Matlab code that could access our MARS C code. Unfortunately, the ability to call C code

from Matlab is still ﬂawed, and the MARS runs suffered signiﬁcant slow down. For the water reservoir

problem, all runs were conducted in the same place; however, the slow down experienced by calling the

MARS C code from Matlab is obvious. Unfortunately, at this time, a completely fair comparison that

uses the same programming platform is not readily implementable. Overall, our experience demonstrates

that the ANN model is clearly easier to implement.

84 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

••

••

•

•

•

•

•

•

•

•

•

•

•

ANN

OA3G

p11

H40

MARS

OA3G

p11

200

MARS

OA3G

p11

300

ANN

OA3N

p11

H40

MARS

OA3N

p11

200

MARS

OA3N

p11

300

ANN

OA3P

p11

H40

MARS

OA3P

p11

200

MARS

OA3P

p11

300

0

4

8

1

5

9

A

b

s

o

l

u

t

e

E

r

r

o

r

2

3

6

7

11

12

13

14

15

10

Fig. 3. Inventory forecasting: Boxplots illustrate the distribution of absolute error for all 9 SDP solutions using strength 3 OAs

with p = 11 (N = 1331). Below each boxplot, the ﬁrst line indicates the use of ANNs or MARS. The second line indicates the

type of OA, speciﬁcally: OA3G = “good” strength 3 OA, OA3N = “neutral” strength 3 OA, and OA3P = “poor” strength 3 OA.

The third line speciﬁes p = 11, and the fourth line provides M

max

for MARS and the number of hidden nodes (H) for ANNs.

•

•

••

••

•

•

•

•

•

•

••

•

••

•••

•

•

•

••

•

•

•

•

•

•

•

•

••

•

•••

•

•

••

•

••

•

•

•••

•

•

•

•

•

••

•

•

•

ANN

OA3G

p13

H60

MARS

OA3G

p13

300

MARS

OA3G

p13

400

ANN

OA3N

p13

H60

MARS

OA3N

p13

300

MARS

OA3N

p13

400

ANN

OA3P

p13

H60

MARS

OA3P

p13

300

MARS

OA3P

p13

400

0

4

8

1

5

9

A

b

s

o

l

u

t

e

E

r

r

o

r

2

3

6

7

11

10

12

13

14

15

Fig. 4. Inventory forecasting: Boxplots illustrate the distribution of absolute error for all 9 SDP solutions using strength 3 OAs

with p = 13 (N = 2197). Labeling below the boxplots uses the same notation as that in Fig. 3.

6.2. SDP solutions

Both the nine-dimensional inventory forecasting problem and the eight-dimensional water reservoir

problem were solved over three stages (T = 3). Although several FF1ANN architectures were tested,

only one FF1ANN architecture for each design was employed to generate the plots in Figs. 3–8 For the

inventory forecasting problem, H = 40 for N = 1331 and H = 60 for N = 2197, and for the water

reservoir problem, H = 10 for both N. Several MARS approximations were constructed for each design

by varying M

max

and K. For the inventory forecasting problem, results with different M

max

are shown,

and for the water reservoir problem, M

max

= 100. For the OAs we set K = 9 for p = 11 and K = 11

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 85

•

••

••

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

••

•

•

•

•

•

•••

••

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•

•••

•

•

•

ANN

OA3P

p11

H40

ANN

LH1G

N1331

H40

MARS

OA3P

p11

200

MARS

LH2G

N1331

200

K9

MARS

LH2N

N1331

200

K9

ANN

OA3N

p11

H40

ANN

LH1N

N1331

H40

MARS

OA3P

p11

300

MARS

LH2G

N1331

300

K35

MARS

LH2P

N1331

300

K17

A

b

s

o

l

u

t

e

E

r

r

o

r

0

4

8

1

5

9

2

3

6

7

11

10

12

13

14

15

Fig. 5. Inventory forecasting: Boxplots illustrate the distribution of absolute error for 10 SDP solutions using N =1331. Included

fromFig. 3 are the best (ANN/OA3P; MARS/OA3P/M

max

=200) and worst (ANN/OA3N; MARS/OA3P/M

max

=300) solutions.

Beloweach boxplot, the ﬁrst line indicates the use of ANNs or MARS. The second line indicates the type of experimental design,

speciﬁcally: OA3 notation as in Fig. 3, LH1G = ﬁrst LH randomization based on a “good” strength 3 OA, LH2G = second LH

randomization based on a “good” strength 3 OA, and similarly for LH1N, LH2N, LH1P, and LH2P. The third line indicates the

size of the design, specifying p = 11 for the OAs and N = 1331 for the OA–LH designs. The fourth line provides M

max

for

MARS and the number of hidden nodes (H) for ANNs. Finally, a ﬁfth line appears for the OA–LH/MARS solutions indicating

the number of eligible knots (K) utilized in the MARS algorithm. The three best OA–LH-based solutions are shown beside the

best OA-based solutions (in the left half), and similarly for the worst solutions (in the right half).

•

•

••

••

•

•

•

•

•

••

•

•

••

•

••

•••

•

•

•

•

•••

••

•

•

•

•

•

•

•

•

•

••

•

•

•

••

•

•

•

•

•

•

•

•

••

•

•

•

•

••

•

•

•

•

•

ANN

OA3G

p13

H60

ANN

LH2G

N2197

H60

MARS

OA3G

p13

400

MARS

LH2G

N2197

400

K11

MARS

LH2N

N2197

300

K17

ANN

OA3N

p13

H60

ANN

LH1P

N2197

H60

MARS

OA3P

p13

400

MARS

LH2P

N2197

400

K11

MARS

LH1P

N2197

400

K17

A

b

s

o

l

u

t

e

E

r

r

o

r

0

4

8

1

5

9

2

3

6

7

11

10

12

13

14

15

Fig. 6. Inventory forecasting: Boxplots illustrate the distribution of absolute error for 10 SDP solutions using N = 2197 exper-

imental design points. Included from Fig. 4 are the best (ANN/OA3G, MARS/OA3G/M

max

= 400) and worst (ANN/OA3N,

MARS/OA3P/M

max

= 400 solutions. Labeling below the boxplots uses the same notation as that in Fig. 5.

for p = 13 (the maximum allowable in both cases). For the OA–LH designs K was chosen as high as

50. Simulations using the “reoptimizing policy” described in Chen [16] were conducted to determine the

optimal decisions and to test the SDP solutions.

86 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

A

b

s

o

l

u

t

e

E

r

r

o

r

ANN

OA3G

p11

H10

MARS

OA3G

p11

100

ANN

OA3N

p11

H10

MARS

OA3N

p11

100

ANN

OA3P

p11

H10

MARS

OA3P

p11

100

ANN

OA3G

p13

H10

MARS

OA3G

p13

100

ANN

OA3N

p13

H10

MARS

OA3N

p13

100

ANN

OA3P

p13

H10

MARS

OA3P

p13

100

0

4

8

1

5

9

2

3

6

7

10

Fig. 7. Water reservoir: Boxplots illustrate the distribution of absolute error for all 12 SDP solutions using strength 3 OAs with

p = 11 (N = 1331) and p = 12 (N = 2197). Labeling below the boxplots uses the same notation as that in Fig. 3.

A

b

s

o

l

u

t

e

E

r

r

o

r

ANN

OA3N

p11

H10

ANN

LH1G

N1331

H10

MARS

OA3N

p13

100

MARS

LH1G

N1331

100

K50

MARS

LH1G

N2197

100

K50

ANN

LH1N

N2197

H10

ANN

OA3P

p11

H10

ANN

LH1P

N1331

H10

MARS

OA3P

p13

100

MARS

LH1P

N1331

100

K50

MARS

LH1N

N2197

100

K50

ANN

LH1G

N2197

H10

0

4

8

1

5

9

2

3

6

7

10

Fig. 8. Water reservoir: Boxplots illustrate the distribution of absolute error for 12 SDP solutions. Included from Fig. 7 are

the best (ANN/OA3N/p11, MARS/OA3N/p13) and worst (ANN/OA3P/p11, MARS/OA3P/p13) solutions. Labeling below the

boxplots uses the same notation as that in Fig. 5.

6.2.1. Inventory forecasting problem

For each of 100 randomly chosen initial state vectors, a simulation of 1000 demand forecast sequences

was conducted for the SDP solutions. The simulation output provided the 100 mean costs (corresponding

to the 100 initial state vectors), each over 1000 sequences. For each initial state vector, the smallest mean

cost achieved by a set of 90 SDP solutions was used as the “true” mean cost for computing the “errors”

in the mean costs. Boxplots in Figs. 3–6 display the distributions (over the 100 initial state vectors) of

these errors. The average over the 100 “true” mean costs was approximately 62.

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 87

Figs. 3 and 4 display all 18 of the OA-based SDP solutions. They are organized left to right by the

“goodness” of the OAdesign. The OA/MARS solutions with p=11 (N=1331) are all fairly close, except

for the “worst” one, which is “MARS OA3P p11 300”, i.e., the one that used the poor OAand M

max

=300.

The other MARS OA3P solution (with M

max

=200) is used to represent the “best” OA/MARS solution,

although it is essentially equivalent to the other good OA/MARSsolutions. For the OA/FF1ANNsolutions

with p = 11 (N = 1331), the “worst” one is “ANN OA3N p11 H40”, i.e., the one that used the neutral

OA3 (there is actually another outlier above 15). The “best” one is “ANNOA3P p11 H40”, using the poor

OA. The OA/FF1ANN solution with the good OA is not terrible, but is worst than most of the OA/MARS

solutions because of its higher median. Ideally, we would want good OAs to result in better solutions

than poor OAs; however, sometimes it does not happen this way because the poor qualities are located in

regions of the state space that are not critical to ﬁnding a good solution.

For the MARS solutions with p = 13 (N = 2197), the “best” solution is “MARS OA3G p13 400”,

which used the good OA and M

max

=400; while the “worst” is “MARS OA3P p13 400”, which used the

poor OA and M

max

= 400. For the FF1ANN solutions with p = 13 (N = 2197), the “best” solution is

with the good OA, while the “worst” is with the neutral OA. It should be noted in Figs. 3 and 4, as was

illustrated in Chen [16,17], that good OAs result in better SDP solutions on average.

Figs. 5and6displaythe best andworst solutions for eachof the four types of SDPsolutions, withthe best

solutions on the left and worst on the right. The OA/FF1ANNand OA/MARS solutions are repeated from

Figs. 3 and 4. Note the solution for “ANN LH1N N1331 H40”, which used the ﬁrst OA–LH realization

from the neutral OA. This is the “worst” solution among the FF1ANNs using OA–LH designs, but it is

actually a good solution. The interesting observation here is that the FF1ANNs performed better with the

OA–LH designs. The best solution overall is “MARS LH2G N2197 400 K11”, which used the second

OA–LH realization from the good OA, M

max

= 400, and K = 11. However, there is one disconcerting

solution, and that is “MARS LH2GN1331 300 K35”. It is troubling because it uses a LHdesign generated

from the good OA, but it is visibly one of the “worst” solutions. To some extent the goodness measures

are no longer valid for the OA–LH designs, because the randomization used in generating them may have

resulted in a poorly spaced realization.

6.2.2. Water reservoir problem

For each of 50 randomly chosen initial state vectors, a simulation of 100 demand forecast sequences

was conducted for the SDP solutions. The simulation output provided the 50 mean costs (corresponding

to the 50 initial state vectors), each over 100 sequences. The “errors” in mean costs were computed in the

same manner as the inventory forecasting problem. Boxplots in Figs. 7 and 8 display the distributions (over

the 50 initial state vectors) of these errors. The average over the 50 “true” mean costs was approximately

−47.

Fig. 7 displays all the OA-based SDP solutions. They are organized left to right by OA design size

and by the space-ﬁlling quality of the OA design. Among the SDP solutions with p =11 (N =1331) the

“worst” one is clearly “ANN OA3P p11 H=10”. The “best” one is not as easy to identify, but “ANN

OA3N p11 H=10” is seen to have the lowest median. Among the SDP solutions with p=13 (N =2197)

the “worst” one, having a signiﬁcantly longer box, is “MARS OA3P p13 100”. All the other solutions

for this size design are somewhat comparable, but “MARS OA3N p13 100” is seen to have the lowest

median. For both sizes, the worst solutions employed the poor OA, indicating that poor solutions can be

avoided by not using a poor OA. This agrees with the ﬁndings in the previous section.

88 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

Fig. 8 displays the best and worst solutions fromFig. 7 along with the best and worst OA–LH/FF1ANN

and OA–LH/MARS solutions. The ﬁgure is organized with the best solutions on the left and worst on the

right. Some improvement is seen using the OA–LHdesigns. Unlike the inventory forecasting problem, the

water reservoir problem should not require many interaction terms in the value function approximation;

thus, the increased univariate information from LH designs should be an advantage. Like the ﬁndings in

the previous section, the worst OA–LH/FF1ANN solutions are not bad, supporting the expectation that

ANN models perform well with OA–LH designs.

7. Concluding remarks

The statistical modeling approach for numerically solving continuous-state SDP was used to simultane-

ously study OA–LHdesigns and anANN-based implementation as alternatives to the OA/MARS method

of Chen et al. [15] and Chen [16]. In addition, the space-ﬁlling quality of the OA designs was considered.

For the SDP solutions using OA designs, good OAs reduced the possibility of a poor SDP solution. For

the inventory forecasting problem, involving signiﬁcant interaction terms, MARS performed similarly

using both OAand OA–LHdesigns, while FF1ANNsolutions noticeably improved with OA–LHdesigns.

For the water reservoir problem, overall, good solutions could be achieved using MARS models with

either good OA or OA–LH designs and FF1ANN models with good OA–LH designs. With regard to

both computational SDP requirements and quality of SDP solutions, we judged FF1ANNs and MARS

to be very competitive with each other overall. A small advantage to MARS is the information on the

relationships between the future value function and the state variables that may be gleaned from the

ﬁtted MARS model. Current research is developing a method to automatically select M

max

for MARS,

so as to permit ﬂexibility in the approximation across different SDP stages. Unfortunately, a method for

automatically selecting H for ANNs does not seem possible.

References

[1] Bellman RE. Dynamic programming. Princeton: Princeton University Press; 1957.

[2] Puterman ML. Markov decision processes. NewYork, NY: Wiley; 1994.

[3] Bertsekas DP, Tsitsiklis JN. Dynamic programming and optimal control. Belmont, MA: Athena Scientiﬁc; 2000.

[4] Gal S. Optimal management of a multireservoir water supply system. Water Resources Research 1979;15:737–48.

[5] Shoemaker CA. Optimal integrated control of univoltine pest populations with age structure. Operations Research

1982;30:40–61.

[6] White DJ. Real applications of Markov decision processes. Interfaces 1985;15(6):73–83.

[7] White DJ. Further real applications of Markov decision processes. Interfaces 1988;18(5):55–61.

[8] Foufoula-Georgiou E, Kitanidis PK. Gradient dynamic programming for stochastic optimal control of multidimensional

water resources systems. Water Resources Research 1988;24:1345–59.

[9] Culver TB, Shoemaker CA. Dynamic optimal ground-water reclamation with treatment capital costs. Journal of Water

Resources Planning and Management 1997;123:23–9.

[10] ChenVCP, Günther D, Johnson EL. Solving for an optimal airline yield management policy via statistical learning. Journal

of the Royal Statistical Society, Series C 2003;52(1):1–12.

[11] Tsai JCC, Chen VCP, Chen J, Beck MB. Stochastic dynamic programming formulation for a wastewater treatment

decision-making framework. Annals of Operations Research, Special Issue on Applied Optimization Under Uncertainty

2004;132:207–21.

[12] Johnson SA, Stedinger JR, Shoemaker CA, Li Y, Tejada-Guibert JA. Numerical solution of continuous-state dynamic

programs using linear and spline interpolation. Operations Research 1993;41:484–500.

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90 89

[13] Eschenbach EA, Shoemaker CA, Caffey H. Parallel processing of stochastic dynamic programming for continuous state

systems with linear interpolation. ORSA Journal on Computing 1995;7:386–401.

[14] Bertsekas DP, Tsitsiklis JN. Neuro-dynamic programming. Belmont, MA: Athena Scientiﬁc; 1996.

[15] Chen VCP, Ruppert D, Shoemaker CA. Applying experimental design and regression splines to high-dimensional

continuous-state stochastic dynamic programming. Operations Research 1999;47:38–53.

[16] Chen VCP. Application of MARS and orthogonal arrays to inventory forecasting stochastic dynamic programs.

Computational Statistics and Data Analysis 1999;30:317–41.

[17] ChenVCP. Measuring the goodness of orthogonal array discretizations for stochastic programming and stochastic dynamic

programming. SIAM Journal of Optimizations 2001;12:322–44.

[18] Friedman JH. Multivariate adaptive regression splines (with discussion). Annals of Statistics 1991;19:1–141.

[19] Carey WP, Yee SS. Calibration of nonlinear solid-state sensor arrays using multivariate regression techniques. Sensors and

Actuators, B: Chemical 1992;B9(2):113–22.

[20] Grifﬁn WL, Fisher NI, Friedman JH, Ryan CG. Statistical techniques for the classiﬁcation of chromites in diamond

exploration samples. Journal of Geochemical Exploration 1997;59(3):233–49.

[21] Kuhnert PM, Do K-A, McClure R. Combining non-parametric models with logistic regression: an application to motor

vehicle injury data. Computational Statistics and Data Analysis 2000;34(3):371–86.

[22] Shepherd KD, Walsh MG. Development of reﬂectance spectral libraries for characterization of soil properties. Soil Science

Society of America Journal 2002;66(3):988–98.

[23] Smets HMG, Bogaerts WFL. Deriving corrosion knowledge from case histories: the neural network approach. Materials

& Design 1992;13(3):149–53.

[24] Labossiere P, Turkkan N. Failure prediction of ﬁbre-reinforced materials with neural networks. Journal of Reinforced

Plastics and Composites 1993;12(12):1270–80.

[25] AbdelazizAY, Irving MR, Mansour MM, El-ArabatyAM, NosseirAI. Neural network-based adaptive out-of-step protection

strategy for electrical power systems. International Journal of Engineering Intelligent Systems for Electrical Engineering

and Communications 1997;5(1):35–42.

[26] Chen VCP, Rollins DK. Issues regarding artiﬁcial beural network modeling for reactors and fermenters. Bioprocess

Engineering 2000;22:85–93.

[27] Li QS, Liu DK, Fang JQ, JearyAP, Wong CK. Damping in buildings: its neural network model andARmodel. Engineering

Structures 2000;22(9):1216–23.

[28] Liu J. Prediction of the molecular weight for a vinylidene chloride/vinyl chloride resin during shear-related thermal

degradation in air by using a back-propagation artiﬁcial neural network. Industrial and Engineering Chemistry Research

2001;40(24):5719–23.

[29] Pigram GM, Macdonald TR. Use of neural network models to predict industrial bioreactor efﬂuent quality. Environmental

Science and Technology 2001;35(1):157–62.

[30] Kottapalli S. Neural network based representation of UH-60A pilot and hub accelerations. Journal of the American

Helicopter Society 2002;47(1):33–41.

[31] Tang B. Orthogonal array-based Latin hypercubes. Journal of the American Statistical Association 1993;88:1392–7.

[32] Chen VCP, Tsui K-L, Barton RR, Allen JK. A review of design and modeling in computer experiments. In: Khattree R,

Rao CR, editors. Handbook of statistics: statistics in industry, vol. 22. Amsterdam: Elsevier Science; 2003. p. 231–61.

[33] Chen VCP, Welch WJ. Statistical methods for deterministic biomathematical models. In: Proceedings of the 53rd session

of the international statistical institute, Seoul, Korea. August 2001.

[34] De Veaux RD, Psichogios DC, Ungar LH. Comparison of two nonparametric estimation schemes: MARS and neural

networks. Computers & Chemical Engineering 1993;17(8):819–37.

[35] Pham DT, Peat BJ. Automatic learning using neural networks and adaptive regression. Measurement and Control

1999;32(9):270–4.

[36] Palmer KD. Data collection plans and meta models for chemical process ﬂowsheet simulators. PhD dissertation, School

of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA, 1998.

[37] Simpson TW, Lin DKJ, Chen W. Sampling strategies for computer experiments: design and analysis. International Journal

of Reliability and Applications 2001;2(3):209–40.

[38] Allen T, Bernshteyn MA, Kabiri K. Constructing meta-models for computer experiments. Journal of Quality Technology

2003;35:264–74.

90 C. Cervellera et al. / Computers & Operations Research 34 (2007) 70–90

[39] Baglietto M, Cervellera C, Parisini T, Sanguineti M, Zoppoli R. Approximating networks for the solution of T-stage

stochastic optimal control problems. In: Proceedings of the IFAC workshop on adaptation and learning in control and

signal processing, 2001.

[40] Baglietto M, Cervellera C, Sanguineti M, Zoppoli R. Water reservoirs management under uncertainty by approximating

networks and learning from data. In: Topics on participatory planning and managing water systems. Amsterdam: Elsevier

Science Ltd., 2005, to appear.

[41] Hadley G, Whitin TM. Analysis of inventory systems. Englewood Cliffs, NJ: Prentice-Hall; 1963.

[42] Heath DC, Jackson PL. Modelling the evolution of demand forecasts with application to safety stock analysis in

production/distribution systems. Technical report #989, School of Operations Research & Industrial Engineering, Cornell

University, Ithaca, NY, 1991.

[43] Archibald TW, McKinnon KIM, Thomas LC. An aggregate stochastic dynamic programming model of multireservoir

Systems. Water Resources Research 1997;33:333–40.

[44] Lamond BF, Sobel MJ. Exact and approximate solutions of afﬁne reservoirs models. Operations Research 1995;43:

771–80.

[45] Turgeon A. A decomposition method for the long-term scheduling of reservoirs in series. Water Resources Research

1981;17:1565–70.

[46] Yakowitz S. Dynamic programming applications in water resources. Water Resources Research 1982;18:673–96.

[47] Chen VCP, Chen J, Beck MB. Statistical learning within a decision-making framework for more sustainable urban

environments. In: Proceedings of the joint research conference on statistics in quality, industry, and technology, Seattle,

WA. June 2000.

[48] Rao CR. Hypercubes of strength ‘d’ leading to confounded designs in factorial experiments. Bulletin of the Calcutta

Mathematical Society 1946;38:67–78.

[49] Bose RC, Bush KA. Orthogonal arrays of strength two and three. Annals of Mathematical Statistics 1952;23:508–24.

[50] Raghavarao D. Constructions and combinatorial problems in design of experiments. NewYork: Wiley; 1971.

[51] Hedayat AS, Wallis WD. Hadamard matrices and their applications. Annals of Statistics 1978;6:1184–238.

[52] Dey A. Orthogonal fractional factorial designs. NewYork: Wiley; 1985.

[53] Hedayat AS, Stufken J, Su G. On difference schemes and orthogonal arrays of strength t . Journal of Statistical Planning

and Inference 1996;56:307–24.

[54] Lippmann RP. An introduction to computing with neural nets. IEEE ASSP Magazine (1987) 4–22.

[55] Haykin SS. Neural networks: a comprehensive foundation. 2nd ed., Upper Saddle River, NJ: Prentice-Hall; 1999.

[56] Cherkassky V, Mulier F. Learning from data: concepts theory, and methods. NewYork: Wiley; 1998.

[57] White H. Learning in neural networks: a statistical perspective. Neural Computation 1989;1:425–64.

[58] Barron AR, Barron RL, Wegman EJ. Statistical learning networks: a unifying view. Computer science and statistics. In:

Proceedings of the 20th symposium on the interface. 1992. p. 192–203.

[59] Ripley BD. Statistical aspects of neural networks. In: Barndorff-Nielsen OE, Jensen JL, Kendall WS, editors. Networks

and chaos—statistical and probabilistic aspects. NewYork: Chapman & Hall; 1993. p. 40–123.

[60] Cheng B, Titterington DM. Neural networks: a review from a statistical perspective (with discussion). Statistical science

1994;9:2–54.

[61] BarronAR. Universal approximation bounds for superpositions of a sigmoidal function. IEEETransactions on Information

Theory 1993;39:930–45.

[62] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE,

McClelland JL, editors. Parallel distributed processing: explorations in the microstructure of cognition. Cambridge, MA:

MIT Press; 1986. p. 318–62.

[63] Werbos PK. The roots of backpropagation: from ordered derivatives to neural networks and political forecasting. New

York: Wiley; 1994.

[64] HaganMT, Menhaj M. Trainingfeedforwardnetworks withthe Marquardt algorithm. IEEETransactiononNeural Networks

1994;5:989–93.

C. Cervellera et al. / Computers & Operations Research 34 (2007) 70 – 90

71

1. Introduction The objective of dynamic programming (DP) is to minimize a “cost” subject to certain constraints over several stages, where the relevant “cost” is deﬁned for a speciﬁc problem. For an inventory problem, the “cost” is the actual cost involved in holding inventory, having to backorder items, etc., and the constraints involve capacities for holding and ordering items. An equivalent objective is to maximize a “beneﬁt”, such as the robustness of a wastewater treatment system against extreme conditions. The state variables track the state of the system as it moves through the stages, and a decision is made in each stage to achieve the objective. Recursive properties of the DP formulation permit a solution via the fundamental recurrence equation [1]. In stochastic dynamic programming (SDP), uncertainty is modeled in the form of random realizations of stochastic system variables, and the estimated expected “cost” (“beneﬁt”) is minimized (maximized). In particular, Markov decision processes may be modeled by a SDP formulation through which the state variables resulting from each decision comprise a Markov process. DP has been applied to many problems in engineering, such as water resources, revenue management, pest control, and wastewater treatment. For general background on DP, see Bellman [1], Puterman [2] and Bertsekas [3]. For some applications, see Gal [4], Shoemaker [5], White [6,7], Foufoula-Georgiou and Kitanidis [8], Culver and Shoemaker [9], Chen et al. [10], and Tsai et al. [11]. In high dimensions, traditional DP solution methods can become computationally intractable and attempts have been made to reduce the computational burden [12–14]. For continuous-state SDP, the statistical perspective of Chen et al. [15] and Chen [16] enabled the ﬁrst truly practical numerical solution approach for high dimensions. Their method utilized orthogonal array (OA, [17]) experimental designs and multivariate adaptive regression splines (MARS, [18]). Although MARS has been applied in several areas (e.g., [19–22]), artiﬁcial neural network (ANN) models are much more prevalent in all areas of engineering (e.g., [23–30]). Given their popularity, an ANN-based SDP solution method would be more accessible to engineers. In this paper, we present: • • • • The statistical modeling approach to solving continuous-state SDP. Implementation of ANN models in continuous-state SDP as an alternative to MARS. Implementation of OA-based Latin hypercube (OA–LH) designs [31] as an alternative to pure OAs. Consideration of the “space-ﬁlling” quality of generated experimental designs. The designs are intended to distribute discretization points so that they “ﬁll” the state space; however, generated designs of the same type do not necessarily have equivalent space-ﬁlling quality. • Comparisons between ANN and MARS value function approximations, including methodological perspectives, computational considerations, and numerical SDP solution quality on two applications.

The novelty of our study is to simultaneously consider how the SDP solution is affected by different approximation methods and designs of different types and different space-ﬁlling qualities. In the statistical perspective of continuous-state SDP, the two basic requirements are an experimental design to discretize the continuous state space and a modeling method to approximate the value function An appropriate design should have good space-ﬁlling quality, and an appropriate modeling method should be ﬂexible in high dimensions, provide a smooth approximation, and be reasonably efﬁcient. A review of several experimental designs and modeling methods is provided by Chen et al. [32] in the context of computer experiments. Of the possible modeling choices given in this review paper, only MARS, neural networks, and kriging satisfy the ﬁrst two criteria. Some preliminary exploration of the kriging model

and an eight-dimensional water reservoir problem. a disadvantage when a smooth convex approximation is desired. The grid restriction was convenient for the MARS structure. the inventory levels are inherently discrete since items are counted. the system state variables are continuous. but the range of inventory levels for an item is too large to be practical for a discrete DP solution. value functions are typically well-behaved (e. ANN models can add extraneous curvature. Thus. In continuous-state SDP. 2.1. It is expected that ANN models will perform better in low noise situations. and in both studies MARS performed better overall. but solving for the maximum likelihood estimates of the kriging parameters continues to be a computational challenge. kriging software is even harder to obtain than MARS software. which would be most noticeable in low noise situations. including the implementation of OA–LH designs. but its limited structure. Cervellera et al. Finally. which has simpler structure and is more typical of continuous-state SDP problems. [40]. and the ANN model ﬁtting process has a reputation of being slow. The SDP statistical modeling approach is given in Section 3. Chen et al.and seven-dimensional water reservoir SDP problems appears in Baglietto et al. Points of an OA design are limited to a fairly sparse grid. Comparisons between MARS andANNs for empirical modeling have been conducted by DeVeaux et al. on the other hand. / Computers & Operations Research 34 (2007) 70 – 90 yielded positive results on a ten-dimensional wastewater treatment application [33]. Previous comparisons involved statistical data with considerable noise. [15] and Chen [16] solved inventory forecasting test problems with four-. Inventory forecasting problem For an inventory problem. MARS.. Studies on different experimental designs for metamodeling have been conducted [36–38]. Section 2 describes SDP in the context of the two applications that are later used to test our methods. For this application. However. Given the success of OA designs in continuous-state SDP. we consider the nine-dimensional inventory forecasting problem.g. Section 5 describes ANNs. the existing literature does not provide sufﬁcient studies to predict how they will compare to MARS in the SDP setting. although quite ﬂexible. [34] and Pham and Peat [35]. and Section 6 presents the comparisons between the ANNs and MARS. Although it is obvious that ANN models can be employed to approximate functions. The inventory forecasting problem . which is known to have signiﬁcant interaction effects between inventory levels and forecasts due to the capacity constraints. and application to a 20-dimensional wastewater treatment system appears in Tsai et al. However. Here. computational speed is an important aspect of SDP. could affect its approximation ability. six-. we chose the OA–LH design as a variation that might work well with ANN models. none have considered OA–LH designs. kriging required 40 times more computational effort than MARS. [11]. 2. Conclusions are presented in Section 7. SDP applications In continuous-state DP.72 C. which is counter to our motivation to employ ANNs. but provides no advantage for ANN models. and nine-state variables modeled over three stages. Section 4 provides background on OA and OA–LH designs. one can relax the discreteness and use a continuous range of inventory levels. convex) and the SDP data involve little noise. [39] and Baglietto et al. Application to ten. In addition. constructs a parsimonious approximation. while the OA–LH design allows points to move off the grid.

1. and i is the backorder cost parameter for item i. It . ut is the amount ordered in stage t.1. q}. For this test problem.t) .2. described in Chen [16]. t ). Then the decision vector is ut = i ordered in stage t. Transition function The transition function for stage t speciﬁes the new state vector to be xt+1 = ft (xt . D(t. D(t. let D(t. where ut 2. Cervellera et al. the capacity constraints are chosen to be restrictive. (1) (2) (3) T (i) ut . Let D(t. The typical V-shaped cost function common in inventory modeling is nI cv (xt .t+1) . Let ut (i) be the amount of item i ordered at the is the amount of item beginning of stage t. It .t) . we have three items and two forecasts for each. for which the stochastic updates are applied multiplicatively and assumed to have a mean of one. (1). The bounds placed on the state variables in each stage are necessary to ensure accurate modeling over the state space.t+1) be the forecast made at the beginning of the current stage for the demand of item i occurring over the next stage (t + 1). and t is the stochasticity associated with updates in the demand forecasts. D(t.t) be the forecast made at the beginning of the current stage for the demand of item i occurring over the stage. and on the state variables. Suppose (i) we are currently at the beginning of stage t. but are otherwise arbitrary.1. The smoothed version of Eq. . D(t. / Computers & Operations Research 34 (2007) 70 – 90 73 uses forecasts of customer demand and a probability distribution on the forecasting updates to model the demand more accurately than traditional inventory models (see [41]). is employed here. forcing interactions between the state variables. For simplicity. 2. ut .t+1) (1) (2) (3) (1) (2) (3) (1) (2) (3) T (i) . ut . in the form of capacity constraints. D(t. (1) where [q]+ =max{0. Similarly. in the form of state space bounds. The objective function is a cost function involving inventory holding costs and backorder costs.C. The evolution of these forecasts over time is modeled using the multiplicative Martingale model of forecast evolution.t+1) . Objectives and constraints The constraints for inventory forecasting are placed on the amount ordered. The decision variables control the order quantities. ut . it is assumed that orders arrive instantly. Then the state vector at the beginning of stage t is xt = It . D(t. where xt is the state of the system (inventory levels and forecasts) at the beginning of stage t.1. Let It be the inventory level for item i at the beginning of (i) the current stage. 2.3. See Chen [16] for details on the stochastic modeling. State and decision variables The state variables for inventory forecasting consist of the inventory levels and demand forecasts for each item. so that the sequence of future forecasts for stage t + k forms a martingale [42]. hi is the holding cost parameter for item i. ut ) = i=1 hi [It+1 ]+ + (i) (i) i [−It+1 ]+ . For our nine-dimensional problem.t) .

rivers and rain). Turgeon [45] and Yalpwotz [46] provides an excellent survey). Lamond and Sobel [44]. representing the stochastic external ﬂow into reservoir i. Archibald et al. The water reservoir network we consider consists of four reservoirs (see Fig. State and decision variables In a realistic representation of inﬂow dynamics. t (1) (2) (3) (4) (1) (2) (3) (4) T . a month).2. so that it is possible to model proper external inﬂow behaviors for different months.g.. has the form (i) t = at (i) (i) t−1 + bt + c t (i) (i) (i) t . the ith component of the random vector t .g. (2) where t is a random correction that follows a standard normal distribution. 2. / Computers & Operations Research 34 (2007) 70 – 90 Fig. (2). 1 for the conﬁguration). .. ct ) actually depend on t. Reservior network. wt . the inputs to the nodes are water released from upstream reservoirs and stochastic inﬂows from external sources (e.1. Gal [4]. the amount of (random) water ﬂowing into the reservoirs from external sources during a given stage t depends on the amounts of past stages.. The actual values of the coefﬁcients used for the model were based on the one-reservoir real model described in Gal [4] and extended to our four-reservoir network. t . Given Eq. while the outputs correspond to the amount of water to be released during a given time stage (e. In this paper. Foufoula-Georgiou and Kitanidis [8]. 2. 1. bt . we follow the realistic modeling of Gal [4] and use an autoregressive linear model of order one. the state vector consists of the water level and stochastic external inﬂow for each reservoir: xt = wt . The coefﬁcients of the linear combinations (at . wt . Water reservoir problem Optimal operation of water reservoir networks has been studied extensively in the literature (e. Thus. wt . Cervellera et al.g.74 C. In typical models. t .2. [43]. t .

). +∞)): ⎧ 1 ⎪ ( + 1)z. so that the amount of water never exceeds the maximum value W (i) . ) = z+ − 2 z− − 2 z− .g. etc. ut . and stochastic external inﬂow into reservoir i during stage t. the new water level is ⎫ ⎧ ⎬ ⎨ (j ) (i) (i) (i) (i) (i) . 3.3.1 . (2). 2. we used R (i) for and a small number 0. ut . the intuitive cost would be |wt − wt |. 0 z . 2. z . For maintaining targets. For the stochastic external inﬂows. To facilitate numerical minimization. / Computers & Operations Research 34 (2007) 70 – 90 75 where wt (i) is the amount of water in reservoir i (i = 1. ut . . In order to deal with unexpected peaks from stochastic external inﬂows.2. the two equations above can be grouped in the compact transition function xt+1 = ft (xt . z .2. Cervellera et al. Like the inventory forecasting model. power genera(i) (i) ˜ tion. ⎪ ⎪ ⎪ 3 ⎪ ⎨ 2 2 3 153 2 4 1 1 45 g(z. W wt+1 = min wt + ⎭ ⎩ (i) j ∈U where U (i) the set of indexes corresponding to the reservoirs which release water into reservoir i.C. t ). R (i) . depending on suitable parameters and . 2 3 Such a function models a beneﬁt which becomes relevant for “large” values of the water releases. which is a V-shaped (i) (i) ˜ function. ut − u t + t . The amount of water at stage t + 1 reﬂects the ﬂow balance between the water that enters (upstream releases and stochastic external inﬂows) and the water that is released. irrigation. wt .. ⎩ ⎭ i j ∈U i where R (i) the maximum pumpage capability for reservoir i. The ob(i) jective function is intended to maintain a reservoir’s water level close to a target value wt and maximize ˜ beneﬁt represented by a concave function g of the water releases and/or water levels (e. denoted by l wt . we can update them using Eq. that is analogous to the smoothed cost function for the inventory forecasting problem. (i) (i) t the (1) (2) (3) (4) T ut . In our example. we utilized a smoothed version. Objectives and constraints (i) The constraints for water reservoir management are placed on the water releases ut in the form: ⎧ ⎫ ⎨ ⎬ (j ) (i) (i) 0 ut min wt + ut .2. ut . ⎪ 2 3 2 3 3 3 ⎪ ⎪ ⎪ 2 ⎪ ⎩ z+ 1 . Transition function The transition function for stage t updates the water levels and external inﬂows. For the modeling beneﬁts. 4) at the beginning of stage t. Thus. we consider a nonlinear concave function g of the following form (for z ∈ [0. 2. This implies nonnegativity of wt+1 . each reservoir has a ﬂoodway. The decision vector is ut = where ut be the amount of water released from reservoir i during stage t.

SDP statistical modeling solution approach 3. . let the state vector be the vector of predictor variables. 4 are weights chosen to balance the beneﬁts against the costs. then given the initial state of the system. 3. and easy derivative calculations. Chen [16] additionally explored strength two OAs . Thus. .1. t ). the corresponding optimal cost yields one point on that stage’s value function. where the p(i) . the basic SDP problem is to ﬁnd F1 (·). [15] introduced this statistical perspective using strength three OAs and MARS (smoothed with quintic functions) applied to the inventory forecasting test problems. this is written recursively as Ft (xt ) = min E{c(xt . Chen et al. the basic procedure uses an experimental design to choose N points in the n-dimensional state space so that one may accurately approximate the future value function. ut . ﬁrst by assuming FT +1 (xT +1 ) ≡ 0. ut . ut ) ∈ t . ability to identify important interactions. t ) + Ft+1 (xt+1 )} ut s. Following the notation in Section 2. Future value function For a given stage. this is an inﬁnite process. so that FT (xT ) = cT (xT ) (where cT (xT ) is the cost of the ﬁnal stage depending only on xT ). given a ﬁnite time horizon T. . t ) = i=1 l (i) (i) wt . Thus. [47] for further details. but univariate modeling can suffer due to the small number of distinct levels in each dimension (e. . we can track the optimal solution through the value functions.76 C. the future value function provides the optimal cost to operate the system from stage t through T as a function of the state vector xt . xt+1 = ft (xt . then constructs an approximation. The complete future value function is obtained by solving this optimization problem for all possible beginning states. 2. a numerical SDP solution requires a computationally tractable approximation method for the future value function. See the description in Chen [16] or Chen et al. Thus. except in simple cases where an analytical solution is possible. For continuous states. The recursive formulation permits the DP backward solution algorithm [1]. the total cost function at stage t can be written as 4 4 ct (xt . We seek to estimate the unknown relationship between the state vector and the future value function. p = 13 was the highest number of levels explored). / Computers & Operations Research 34 (2007) 70 – 90 for . (xt .2. . The choice of the strength three OAs was motivated by the desire to model interactions. Cervellera et al. (3) If we have the future value functions for all the stages. Response observations for the future value function may be collected via the minimization in Eq. 3.. MARS was chosen due to its ease of implementation in high dimensions. 3. then solving in order for FT (·) through F1 (·). Numerical solution algorithm Given the state of the system at the beginning of the current stage. i = 1. (i) (i) . (3). From a statistical perspective. FT (·).t.g. ut . Then. wt ˜ − i=1 p(i) g ut .

y) . There are two basic ways to measure space-ﬁlling quality: minimax and maximin (refer to [32]). calculate various space-ﬁlling measures. each at p levels. however. In practice..g. An OA of strength d (d < n) for n variables. x∈D y∈P Since the supremum over the inﬁnite set of points in D is difﬁcult to calculate. for which the number of points grows exponentially as the dimension increases.C. each at p levels. Let D be the n-dimensional closed and bounded set of the continuous state space. the one on the left is seen to space the points better. contains all pn level combinations. 2. Using the same distance measure as above. 4 3 x1 2 1 0 • • • • • 0 1 2 x2 3 4 x1 4 3 2 1 0 • • • • • 0 1 2 x2 3 4 Fig. using a full factorial grid or some type of sampling scheme. taking a subset of the full grid. this measure is generally impractical to calculate in high dimensions. OAs are special forms of fractional factorial designs. P) than a design that does not. then uses multi-linear interpolation to estimate the future value function. This full grid is equivalent to a full factorial experimental design. . Therefore. Measuring space-ﬁlling quality There are a variety of ways to generate OAs (e. one can generate many OAs. then choose the best of the bunch. Minimax criteria seek to minimize the maximum distance between (nondesign) points in D and the nearest-neighbor design point in P. i. with the same frequency. To avoid this. their performance was disappointing. but with lower order balance permitting higher p. may be employed. there is no way to guarantee their space-ﬁlling quality. and only OAs of strength three were studied. P) = sup min d(x. and P be the set of n design points in D.e. the classical solution method discretizes the state space with a full grid.. Chen [17] has conducted the only study of how this quality affects an approximation. However. Two conﬁgurations of design points. contains all possible level combinations in any subset of d dimensions. 4. A full factorial design for n variables. A design that represents D well will have lower MAXDIST(D. Cervellera et al. deﬁne: MAXDIST(D.1. this measure is usually calculated over a ﬁnite sample of points in D. when projected down onto any d dimensions. . a fractional factorial design. / Computers & Operations Research 34 (2007) 70 – 90 77 with equivalent N to the strength three OAs. However. a full factorial grid of p d points replicated times is represented. 4. Discretizations based on OAs In continuous-state SDP. [48–53]). Looking at the two conﬁgurations of points in Fig. 2.

Activation functions deﬁne transformations between layers. n predictors and one response. and for the water reservoir problem. their space-ﬁlling quality was judged using measures of both types and a measure introduced by Chen [17]. Model description ANNs are mathematical modeling tools widely used in many different ﬁelds (for general background and applications see Lippmann [54]. i. for approximation capabilities see Barron [61]). for which an LH design is (randomly) created from an OA structure. Thus. In this paper. eight-dimensional designs were generated. The input layer consists of n nodes. Connections between nodes can “feed back” to previous . Ripley [59]. only OAs of strength two and higher are technically orthogonal. we utilize the strength three OAs described in Chen [17]. However. In between there are hidden layers which induce ﬂexibility into the modeling. Cervellera et al. In other words.1. Although OA–LH designs only satisfy LH properties. Let d(x.. and the OA–LH designs are based on these strength three OAs. [58]. For each size (N = 1331 and 2197). Cherkassky and Mulier [56]. one “neutral”. / Computers & Operations Research 34 (2007) 70 – 90 Maximin criteria seek to maximize the minimum distance between any pair of points in P. for statistical perspectives see White [57]. With regard to approximating a function. Artiﬁcial neural networks 5. For the OA designs in this paper. Speciﬁcally. Then deﬁne: ⎧ ⎫ ⎨ ⎬ MINDIST(P) = min min d(x. one “good”. and the output layer consists of one node for the output variable. y) be a distance measure (commonly Euclidean distance) over R.2. was introduced by Tang [31] to address the lack of orthogonality in LH designs. From a purely “analytical” perspective. nine-dimensional designs were generated. if we project the N points of a LH design onto any single dimension. A hybrid OA–LH design. 5. ⎭ x∈P ⎩ y∈P y=x Higher MINDIST(P) should correspond to a more uniform scattering of design points. one for each input variable. 4. the LH design provides more information along a single dimension (for univariate terms). Haykin [55]. so the LH design does not maintain this property. The basic architecture of ANNs is an interconnection of neural nodes that are organized in layers. the underlying OA provides some balance. a total of 6 OAs and 12 OA–LH designs were tested. an ANN model is a nonlinear parameterized function that computes a mapping of n input variables into (typically) one output variable. and one “poor” OA was selected based on space-ﬁlling measures described above. or Cheng and Titterington [60].e. Bertsekas and Tsitsiklis [14]. while the OA design provides better coverage in higher dimensions (for interactions). Barron et al. and two realizations of OA–LH designs were generated for each OA.78 C. we employed strength three OAs with p = 11 levels (N = 1331) and p = 13 levels (N = 2197) in each dimension. y) . For the inventory forecasting problem. the points will correspond to N distinct levels in that dimension. the simplest one being a linear function. OA-based LH designs LH designs are equivalent to OA designs of strength one.

and are estimated using a nonlinear least-squares approach called training. v. A comprehensive description of various ANN forms may be found in [55]. After all N samples are presented to the network and the corresponding gradients are computed.a. a. This is completed in two phases: (i) a forward phase in which an input xj is presented to the network in order to generate the various outputs (i. and (ii) a backward phase in which the gradient is actually computed on the basis of the values obtained in the forward phase. / Computers & Operations Research 34 (2007) 70 – 90 79 layers.k. since the computation of the gradient requires only products. . Implementation of this approach is very simple. the descent algorithm step is completed by upgrading the vector of parameters w using the steepest-descent method: wk+1 = wk − k ∇LOF(wk ). vj h as weights linking input nodes j to hidden nodes h. in order to compute the gradient with respect to the entire set of parameters. Nonlinear optimization techniques used to conduct the minimization are typically gradient descent-based methods. which computes the gradient of the output with respect to the outer parameters. In our inventory forecasting SDP application. the training data set.. wh as weights linking hidden nodes h to the output node. then “backpropagates” the partial derivatives through the hidden layers.. w. ) = Tanh ˆ h=1 wh Z h + . We used the backpropagation method (see [62. H as the number of hidden nodes. Deﬁne n as the number of input variables. and h and as constant terms called “bias” nodes (like intercept terms). wh . we will refer to the entire vector of parameters as w in the following sections. we utilized a feedforward one-hidden-layer ANN (FF1ANN) with a “hyperbolic tangent” activation function: Tanh(x) = (ex − e−2x )/(ex + e−x ). the typical ANN is feedforward only with one hidden layer and uses “sigmoidal” activation functions (monotonic and bounded). due to the . ˆ where N is the number of data points on which the network is being trained. Cervellera et al. w)]2 . where ∇LOF(wk ) is the gradient of the LOF function with respect to the vector wk . the output of the network and those of the inner activation functions) corresponding to the current value of the parameters.C. The parameters vih . 5. For simplicity. where for each hidden node h n Zh = Tanh i=1 vih xi + h . but for function approximation. h .63]).2.e. Then our ANN model form is H g(x. Model ﬁtting Our training employed a batch processing method (as opposed to the more common on-line processing) that minimized the squared error loss lack-of-ﬁt (LOF) criterion: N LOF(g) = ˆ j =1 [yj − g(xj .

determines the architecture. such as entrapment in local minima. the most important decision in implementing an ANN model is the choice of architecture.3. Too few nodes would result in a network with poor approximation properties. Therefore.. when the LOF decreases) is decreased. The best results for the nine-dimensional inventory forecasting problem were obtained for values of H = 40 for N = 1331 and H = 60 for N = 2197. Existing theoretical bounds on the accuracy of the approximation are non-constructive or too loose to be useful in practice.e. The best training algorithm for FF1ANNs. Cervellera et al. ˆ The update rule for the parameter vector at each step is wk+1 = wk − [J [e(wk )]T J [e(wk )] + I ]−1 J [e(wk )]T e(wk )T . For a FF1ANN. the network merely “memorizes” the data at the expense of generalization capability). is the Levenberg–Marquardt method (ﬁrst introduced in [64]). the number of nodes in the hidden layer. the best results for the eight-dimensional water reservoir problem were obtained using H = 10 for both sample sizes.. in terms of rate of convergence. Unfortunately. the method is close to gradient descent with a small stepsize. Thus. but the choice of a good architecture requires several modeling attempts and is always very dependent on the particular application. backpropagation presents many well-known disadvantages. we identiﬁed the appropriate number of hidden nodes by exploring many architectures and checking the average cost using a test data set from the simulation described in Section 6. For our applications.80 C. Each row of the matrix contains the gradient of the jth single error term yj − g(xj . yN − g(xN . which is an approximation of the classic second-order (i. the choice of the stepsize k is crucial for acceptable convergence. H.e. while too many would “overﬁt” the training data (i. .. 5. As a consequence. as well as methods to avoid or reduce overﬁtting (see [55]). w) with respect to the set of parameters w. . Model implementation Other than the selection of a model ﬁtting approach. w)]T ˆ ˆ and J [e(w)] is the Jacobian matrix of e(w). since it depends on the number of samples available and on the smoothness of the function we want to approximate. while it is increased when the step degrades the LOF. w). . . it is not possible to deﬁne “a priori” the optimal number of hidden nodes. Unfortunately. it makes use of the Hessian matrix) Newton method. . When is large. we can approximate its Hessian matrix via backpropagation as J [e(w)]T J [e(w)]. the appropriate number of hidden nodes is never clear-cut. After each successful step (i.2. Similarly. When = 0 we have a standard Newton method with an approximated Hessian matrix (Gauss–Newton method). a tradeoff between these two possibilities is needed.e. where e(w) = [y1 − g(x1 . There are many rules of thumb and methods for model selection and validation in the literature. For larger values of H the solutions worsened. which is more accurate near the minima. / Computers & Operations Research 34 (2007) 70 – 90 mathematical structure of the network. and very different stepsizes may be appropriate for different applications. Since our LOF criterion is quadratic.

like ANNs. MARS [18] is based on traditional statistical thinking with a forward stepwise algorithm to select terms in a linear model followed by a backward procedure to prune the model. For our study. but the approximation degraded. the backward procedure was not utilized since preliminary testing demonstrated a very large computational burden with little change on the approximation. we will explore how the requirements might grow with increasing n . The primary disadvantage of MARS is that the user must specify the maximum number of basis functions to be added. such as Matlab’s (www. and the size of the training data set N. thus. but this is not a critical issue for accurate SDP solutions. The primary disadvantages are (1) the difﬁculty in selecting the number of hidden nodes H and (2) potential overﬁtting of the training data. Mmax for MARS and H for ANNs. MARS was restricted to explore up to three-factor interations. two-factor interactions were sufﬁcient. This is very similar to the difﬁculty in selecting H for an ANN. When different Mmax or H is preferred for different stages. and the maximum number of basis functions Mmax . commercial MARS software has only recently become available from Salford Systems (www. For the inventory forecasting and water reservoir problems. the same Mmax or H for all stages is appropriate. For the inventory forecasting problems. Their main difference is the interpretability of the approximation.salford-systems. the value functions have similar structure from stage to stage. ANNs and MARS comparisons ANNs have become very popular in many areas of science and engineering.com) Neural Network Toolbox. together with the existence of accessible software. due to their proven ability to ﬁt any data set and to approximate any real function. An issue that may arise in other SDP applications is the selection of the key parameter.1. the key quantities are n.com). The second disadvantage occurs because it is tempting to try to ﬁt the data as closely as possible even when it is not appropriate. quintic functions were used to achieve second derivative continuity (see [16]). and one of the objectives of the forward stepwise algorithm is to select appropriate knots. which was used in our study. 6. In our case. however. and. MARS and ANNs are both excellent for modeling in high dimensions when noise is small relative to the signal. From a statistical perspective. Cervellera et al. The search for new basis functions can be restricted to interactions of a maximum order. Memory and computational requirements The memory and computational requirements for a FF1ANN with one output variable depend on the number of input (predictor) variables n. However. a parameter called Mmax . In addition. then the SDP algorithm would greatly beneﬁt from automatic selection of Mmax or H. ANNs have the additional disadvantage of an approximation that does not provide interpretable information on the relationships between the input (predictor) variables and output (response) variable. / Computers & Operations Research 34 (2007) 70 – 90 81 6.mathworks. In this section. our own C code was employed. Chen [16] also considered MARS limited to two-factor interactions.C. as is the case with SDP. This is an issue for future research. for the water reservoir problem. From the perspective of SDP value function approximation. N. After selection of the basis functions is completed. MARS is both ﬂexible and easily implemented while providing useful information on signiﬁcant relationships between the predictor variables and the response. the number of hidden nodes H. the number of eligible knots K. The approximation bends to model curvature at “knot” locations. but this process greatly increases the computational effort required. a test data set can be used to select Mmax . smoothness to achieve a certain degree of continuity may be applied. For MARS. Both disadvantages can be handled by proper use of a test data set.

which assumes N is based on a strength three OA. Thus. In this paper. / Computers & Operations Research 34 (2007) 70 – 90 (the dimension of an SDP problem). For memory requirements. The maximum total number of bytes required to store one smoothed MARS approximation with up to three-factor interactions grows as [15] ∗ O(n) SMARS = 8(6Mmax + 2n + 3Kn + 1) O(n2 ). except the last. We utilized Levenberg–Marquardt (LM) backpropagation to train the FF1ANN. which is the one we actually used for our FF1ANNs. and a more reasonable upper limit on Mmax is O(N 2/3 ) = O(n2 ). Based on computational experience with a variety of value function forms. (i) is performed via optimization that requires the evaluation of the output of an ANN and of its ﬁrst derivative. so the memory needed grows as O(n5 ) S2FF1ANN = O(NnH) + O(n2 H 2 ) O(n6 ). [15] set N =O(n3 ). H depends on n. which are convex and. it should not be expected to increase with n since the larger K primarily impacts the quality of univariate modeling. a FF1ANN requires storage of V = [(n + 1)H + H + 1] parameters. if the memory at disposal is sufﬁcient. The optimization evaluates a value function approximation and its ﬁrst derivative several times in each stage. given an adequate K.800 bytes. However. (3) for one stage t at every experimental design point and (ii) constructing the spline approximation of the future value function. Chen et al. the total number of bytes required to store a FF1ANN grows as O(n2 ) SFF1ANN = 8V O(n3 ). and a disadvantage of the LM method lies in its memory requirements. which assumes knots can only placed at the p distinct levels of the strength three OA. Each step requires the storage of a (N×V ) matrix and the (V × V ) Hessian approximation. The time for the computation of LM backpropagation is dominated by the Hessian . The computational requirements for one iteration of our SDP statistical modeling solution approach consist of: (i) solving the optimization in Eq. However. For what concerns (ii). so for n = 9. however. MARS construction stores up to a (N ×Mmax ) matrix.82 C. K =O(n). A reasonable range on H (in terms of growth) matches that of O(n) H O(N 2/3 ) = O(n2 ). and the complexity of the underlying function. The inventory forecasting value function. so we can write the total time required to solve the SDP equation for all N points as O(WnNH). resulting in a growth of O(n4 ) S2MARS = O(NMmax ) O(n5 ). FF1ANNs are still considerably smaller. The time required for these evaluations is O(nH). For ANNs. it appears that the memory requirement for FF1ANNs grows faster than that for MARS. K 50 is more than adequate. However. Assuming each of these is a double-precision real number. and O(n) Mmax O(n3 ). One should keep in mind that the above estimates are intended to be adequate for approximating SDP value functions. Another aspect of memory requirements is memory required while constructing the approximation. the largest FF1ANN tested in the paper required 5288 bytes and the largest MARS required 36. where W is the work required to solve the minimization. the addition of the OA–LH designs permits the placement of a higher number of knots. we consider LM backpropagation. Cervellera et al. does involve a steep curve due to high backorder costs. LM backpropagation is generally preferred over other training methods because of its faster convergence. For FF1ANNs. generally well behaved. consequently. N.

06 GHz IX 83 13 min (H = 10) 43 min (Mmax = 100) 18 min (H = 10) 67 min (Mmax = 100) The FF1ANN runs used WLM = 125. our experience demonstrates that the ANN model is clearly easier to implement. For the inventory forecasting problem. it can been seen that the run times for the same N are comparable. For the water reservoir problem. WLM never needs to exceed 300. For the inversion of a (V × V ) matrix. however. The total run time for one iteration of the MARS-based SDP solution grows as [15] O(n7 ) CMARS = O(WNMmax ) + O(nNM3 ) O(n10 ). Actual average run times for one SDP iteration are shown in Table 1. however. Cervellera et al. approximation and inversion. Deﬁne WLM to be the number of steps required to attain the backpropagation goal. For the water reservoir problem. calling the MARS C code in Matlab resulted in signiﬁcant slow down. the total run time for one iteration of the FF1ANN-based SDP solution grows as O(n7 ) CFF1ANN = O(WNnH) + [O(WLM n2 NH2 )] O(n9 ). and the MARS runs suffered signiﬁcant slow down.) Then the time required to train the network by LM backpropagation grows as O(WLM Nn2 H 2 ). considerable effort was expended to develop a Matlab code that could access our MARS C code. a completely fair comparison that uses the same programming platform is not readily implementable. Matlab was used to conduct the FF1ANN runs on two Pentium II (P2) machines. the time needed to compute one step of LM backpropagation is dominated by the matrix product: O(Nn2 H 2 ). In order to conduct a fairer computational comparison on the same programming platform. the computation is basically reduced to [H + 1 + (n + 1)H ] products. the slow down experienced by calling the MARS C code from Matlab is obvious. Despite this. The matrix product requires NV 2 real valued products. all runs were conducted in the same place. Overall. for each row of the matrix (which corresponds to a single experimental design point xj ). all runs were conducted in Matlab on an Intel Xeon (IX) workstation. Unfortunately. at this time. the ability to call C code from Matlab is still ﬂawed. Unfortunately. the MARS and ANN runs were conducted on different machines in different countries. and a C code executed the MARS runs on a different P2 machine. Therefore. an efﬁcient algorithm can provide a computational time that is O(V 2 ) = O(n2 H 2 ). . Thus. so this operation (which is O(Nn2 H 2 )) occupies the relevant part of the time required to compute the approximation. so the time required to obtain the gradient for all N experimental design points is O(NnH). Here. The (V ×V ) Hessian approximation is the product of two (N ×V ) matrices that are obtained by backpropagation. / Computers & Operations Research 34 (2007) 70 – 90 Table 1 Actual average run times (in minutes) for one SDP iteration using FF1ANN and MARS models Model N Inventory forecasting 450 MHz P2 ANN MARS ANN MARS 1331 1331 2197 2197 65 min (H = 40) 185 min (H = 60) 550 MHz P2 61 min (Mmax = 300) 112 min (Mmax = 400) 933 MHz P2 90 min (H = 40) 87 min (H = 60) Water reservoir 3. (Typically.C. max where the time to construct the MARS approximation (second term) dominates. For the inventory forecasting problem.

2. speciﬁcally: OA3G = “good” strength 3 OA. 4. and OA3P = “poor” strength 3 OA. Several MARS approximations were constructed for each design by varying Mmax and K. Labeling below the boxplots uses the same notation as that in Fig. results with different Mmax are shown. Mmax = 100. SDP solutions Both the nine-dimensional inventory forecasting problem and the eight-dimensional water reservoir problem were solved over three stages (T = 3). For the inventory forecasting problem. H = 10 for both N. 3. Inventory forecasting: Boxplots illustrate the distribution of absolute error for all 9 SDP solutions using strength 3 OAs with p = 11 (N = 1331). only one FF1ANN architecture for each design was employed to generate the plots in Figs. and for the water reservoir problem. and for the water reservoir problem. Below each boxplot. Although several FF1ANN architectures were tested. 3. For the OAs we set K = 9 for p = 11 and K = 11 . Inventory forecasting: Boxplots illustrate the distribution of absolute error for all 9 SDP solutions using strength 3 OAs with p = 13 (N = 2197). 3–8 For the inventory forecasting problem. 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 • • • • • • • Absolute Error • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • ANN MARS MARS ANN MARS MARS ANN MARS MARS OA3G OA3G OA3G OA3N OA3N OA3N OA3P OA3P OA3P p13 p13 p13 p13 p13 p13 p13 p13 p13 H60 300 400 H60 400 H60 300 400 300 Fig. The second line indicates the type of OA. and the fourth line provides Mmax for MARS and the number of hidden nodes (H) for ANNs. the ﬁrst line indicates the use of ANNs or MARS. 6.84 C. / Computers & Operations Research 34 (2007) 70 – 90 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Absolute Error ANN MARS MARS ANN MARS MARS ANN MARS MARS OA3G OA3G OA3G OA3N OA3N OA3N OA3P OA3P OA3P p11 p11 p11 p11 p11 p11 p11 p11 p11 H40 200 300 H40 200 300 H40 200 300 Fig. Cervellera et al. OA3N = “neutral” strength 3 OA. H = 40 for N = 1331 and H = 60 for N = 2197. The third line speciﬁes p = 11.

specifying p = 11 for the OAs and N = 1331 for the OA–LH designs. LH2N. and similarly for LH1N. Simulations using the “reoptimizing policy” described in Chen [16] were conducted to determine the optimal decisions and to test the SDP solutions. Labeling below the boxplots uses the same notation as that in Fig. LH2G = second LH randomization based on a “good” strength 3 OA. speciﬁcally: OA3 notation as in Fig. MARS/OA3P/Mmax =200) and worst (ANN/OA3N. the ﬁrst line indicates the use of ANNs or MARS. 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 • • • • • • • • • • • • • • • • • • • • • Absolute Error • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • ANN ANN MARS MARS OA3G LH2G OA3G LH2G p13 N2197 p13 N2197 H60 H60 400 400 K11 MARS ANN ANN MARS MARS MARS LH2N OA3N LH1P OA3P LH2P LH1P N2197 p13 N2197 p13 N2197 N2197 300 H60 H60 400 400 400 K17 K11 K17 Fig. a ﬁfth line appears for the OA–LH/MARS solutions indicating the number of eligible knots (K) utilized in the MARS algorithm. 6. The second line indicates the type of experimental design. The three best OA–LH-based solutions are shown beside the best OA-based solutions (in the left half). and similarly for the worst solutions (in the right half). 3. Included from Fig. MARS/OA3G/Mmax = 400) and worst (ANN/OA3N. MARS/OA3P/Mmax = 400 solutions. 5. 3 are the best (ANN/OA3P. LH1G = ﬁrst LH randomization based on a “good” strength 3 OA. 5. for p = 13 (the maximum allowable in both cases). / Computers & Operations Research 34 (2007) 70 – 90 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 85 • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Absolute Error ANN ANN MARS MARS MARS ANN ANN MARS MARS MARS OA3P LH1G OA3P LH2G LH2N OA3N LH1N OA3P LH2G LH2P p11 N1331 p11 N1331 N1331 p11 N1331 p11 N1331 N1331 300 H40 H40 200 200 200 H40 H40 300 300 K35 K17 K9 K9 Fig. The third line indicates the size of the design.C. Inventory forecasting: Boxplots illustrate the distribution of absolute error for 10 SDP solutions using N = 1331. MARS/OA3P/Mmax =300) solutions. and LH2P. The fourth line provides Mmax for MARS and the number of hidden nodes (H) for ANNs. Inventory forecasting: Boxplots illustrate the distribution of absolute error for 10 SDP solutions using N = 2197 experimental design points. Below each boxplot. 4 are the best (ANN/OA3G. Cervellera et al. LH1P. Finally. . Included from Fig. For the OA–LH designs K was chosen as high as 50.

MARS/OA3P/p13) solutions. Inventory forecasting problem For each of 100 randomly chosen initial state vectors. a simulation of 1000 demand forecast sequences was conducted for the SDP solutions. the smallest mean cost achieved by a set of 90 SDP solutions was used as the “true” mean cost for computing the “errors” in the mean costs. The simulation output provided the 100 mean costs (corresponding to the 100 initial state vectors). 6. 7. For each initial state vector. Labeling below the boxplots uses the same notation as that in Fig. 3. each over 1000 sequences. / Computers & Operations Research 34 (2007) 70 – 90 10 9 8 7 Absolute Error 6 5 4 3 2 1 0 ANN MARS ANN MARS ANN MARS ANN MARS ANN MARS ANN MARS OA3G OA3G OA3N OA3N OA3P OA3P OA3G OA3G OA3N OA3N OA3P OA3P p13 p13 p13 p13 p13 p13 p11 p11 p11 p11 p11 p11 H10 100 H10 100 H10 100 H10 H10 100 100 H10 100 Fig. Water reservoir: Boxplots illustrate the distribution of absolute error for 12 SDP solutions. . Included from Fig.86 C. Cervellera et al. 10 9 8 7 Absolute Error 6 5 4 3 2 1 0 ANN ANN MARS OA3N LH1G LH1G p11 N1331 N1331 H10 H10 100 K50 MARS OA3N p13 100 ANN MARS ANN ANN LH1N LH1G OA3P LH1P N2197 N2197 p11 N1331 H10 100 H10 H10 K50 MARS LH1P N1331 100 K50 MARS OA3P p13 100 ANN MARS LH1G LH1N N2197 N2197 H10 100 K50 Fig. MARS/OA3N/p13) and worst (ANN/OA3P/p11. The average over the 100 “true” mean costs was approximately 62.1. Boxplots in Figs. 3–6 display the distributions (over the 100 initial state vectors) of these errors.2. 7 are the best (ANN/OA3N/p11. 8. 5. Water reservoir: Boxplots illustrate the distribution of absolute error for all 12 SDP solutions using strength 3 OAs with p = 11 (N = 1331) and p = 12 (N = 2197). Labeling below the boxplots uses the same notation as that in Fig.

They are organized left to right by the “goodness” of the OA design. the one that used the poor OA and Mmax =300. 7 displays all the OA-based SDP solutions. Water reservoir problem For each of 50 randomly chosen initial state vectors. while the “worst” is with the neutral OA. For the MARS solutions with p = 13 (N = 2197). i. is “MARS OA3P p13 100”. the worst solutions employed the poor OA. the one that used the neutral OA3 (there is actually another outlier above 15). All the other solutions for this size design are somewhat comparable. This is the “worst” solution among the FF1ANNs using OA–LH designs. 6. Cervellera et al.17]. . the “best” solution is “MARS OA3G p13 400”. having a signiﬁcantly longer box. which used the poor OA and Mmax = 400.C. The “best” one is “ANN OA3P p11 H40”. because the randomization used in generating them may have resulted in a poorly spaced realization. each over 100 sequences. but “MARS OA3N p13 100” is seen to have the lowest median.2. The best solution overall is “MARS LH2G N2197 400 K11”. the “worst” one is “ANN OA3N p11 H40”. which used the good OA and Mmax = 400. / Computers & Operations Research 34 (2007) 70 – 90 87 Figs. The average over the 50 “true” mean costs was approximately −47. The “errors” in mean costs were computed in the same manner as the inventory forecasting problem. Figs. 3 and 4. They are organized left to right by OA design size and by the space-ﬁlling quality of the OA design. Among the SDP solutions with p = 11 (N = 1331) the “worst” one is clearly “ANN OA3P p11 H = 10”. For the OA/FF1ANN solutions with p = 11 (N = 1331). Ideally. as was illustrated in Chen [16. except for the “worst” one. but “ANN OA3N p11 H = 10” is seen to have the lowest median. The OA/FF1ANN solution with the good OA is not terrible. the “best” solution is with the good OA. but it is visibly one of the “worst” solutions. using the poor OA. indicating that poor solutions can be avoided by not using a poor OA. 3 and 4. This agrees with the ﬁndings in the previous section. The “best” one is not as easy to identify. It should be noted in Figs. which used the second OA–LH realization from the good OA. sometimes it does not happen this way because the poor qualities are located in regions of the state space that are not critical to ﬁnding a good solution. while the “worst” is “MARS OA3P p13 400”. Fig. Among the SDP solutions with p = 13 (N = 2197) the “worst” one. but is worst than most of the OA/MARS solutions because of its higher median. The OA/MARS solutions with p =11 (N =1331) are all fairly close. 3 and 4 display all 18 of the OA-based SDP solutions.2. a simulation of 100 demand forecast sequences was conducted for the SDP solutions. although it is essentially equivalent to the other good OA/MARS solutions. For both sizes. there is one disconcerting solution. The simulation output provided the 50 mean costs (corresponding to the 50 initial state vectors).. but it is actually a good solution.e. 7 and 8 display the distributions (over the 50 initial state vectors) of these errors. The other MARS OA3P solution (with Mmax = 200) is used to represent the “best” OA/MARS solution. The OA/FF1ANN and OA/MARS solutions are repeated from Figs. For the FF1ANN solutions with p = 13 (N = 2197). which used the ﬁrst OA–LH realization from the neutral OA. 5 and 6 display the best and worst solutions for each of the four types of SDP solutions. It is troubling because it uses a LH design generated from the good OA. and K = 11. Boxplots in Figs. i. To some extent the goodness measures are no longer valid for the OA–LH designs. we would want good OAs to result in better solutions than poor OAs. however. that good OAs result in better SDP solutions on average.. and that is “MARS LH2G N1331 300 K35”.e. with the best solutions on the left and worst on the right. Note the solution for “ANN LH1N N1331 H40”. The interesting observation here is that the FF1ANNs performed better with the OA–LH designs. However. which is “MARS OA3P p11 300”. Mmax = 400.

A small advantage to MARS is the information on the relationships between the future value function and the state variables that may be gleaned from the ﬁtted MARS model.18(5):55–61. Operations Research 1993. Series C 2003. the water reservoir problem should not require many interaction terms in the value function approximation.24:1345–59. Gal S. Optimal integrated control of univoltine pest populations with age structure. good OAs reduced the possibility of a poor SDP solution. Special Issue on Applied Optimization Under Uncertainty 2004. White DJ. Cervellera et al. good solutions could be achieved using MARS models with either good OA or OA–LH designs and FF1ANN models with good OA–LH designs. Further real applications of Markov decision processes. Interfaces 1985. NY: Wiley. Interfaces 1988. thus. overall. Foufoula-Georgiou E. Tsai JCC. 7 along with the best and worst OA–LH/FF1ANN and OA–LH/MARS solutions. Water Resources Research 1979. Beck MB. Unlike the inventory forecasting problem. Shoemaker CA. For the SDP solutions using OA designs. Stochastic dynamic programming formulation for a wastewater treatment decision-making framework. so as to permit ﬂexibility in the approximation across different SDP stages. involving signiﬁcant interaction terms. Chen VCP. 1994. The ﬁgure is organized with the best solutions on the left and worst on the right. Water Resources Research 1988. New York. In addition. Dynamic programming. Bertsekas DP.41:484–500.88 C. Chen VCP. Kitanidis PK. Journal of the Royal Statistical Society. 7. References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] Bellman RE. Stedinger JR. we judged FF1ANNs and MARS to be very competitive with each other overall. Annals of Operations Research. For the water reservoir problem. Belmont. 2000.52(1):1–12. Princeton: Princeton University Press. Markov decision processes. 1957. a method for automatically selecting H for ANNs does not seem possible. Journal of Water Resources Planning and Management 1997. Gradient dynamic programming for stochastic optimal control of multidimensional water resources systems. Li Y.15(6):73–83. White DJ. Some improvement is seen using the OA–LH designs. while FF1ANN solutions noticeably improved with OA–LH designs. the increased univariate information from LH designs should be an advantage. Tsitsiklis JN. 8 displays the best and worst solutions from Fig. MA: Athena Scientiﬁc. Unfortunately. Concluding remarks The statistical modeling approach for numerically solving continuous-state SDP was used to simultaneously study OA–LH designs and an ANN-based implementation as alternatives to the OA/MARS method of Chen et al. Tejada-Guibert JA. Günther D. Johnson SA. / Computers & Operations Research 34 (2007) 70 – 90 Fig. Current research is developing a method to automatically select Mmax for MARS. the worst OA–LH/FF1ANN solutions are not bad. supporting the expectation that ANN models perform well with OA–LH designs. Chen J. Optimal management of a multireservoir water supply system. MARS performed similarly using both OA and OA–LH designs. the space-ﬁlling quality of the OA designs was considered. Culver TB. Puterman ML. Shoemaker CA. Johnson EL. For the inventory forecasting problem. Operations Research 1982. Like the ﬁndings in the previous section.15:737–48. [15] and Chen [16]. Dynamic optimal ground-water reclamation with treatment capital costs. Dynamic programming and optimal control. [12] .123:23–9. Solving for an optimal airline yield management policy via statistical learning. With regard to both computational SDP requirements and quality of SDP solutions. Numerical solution of continuous-state dynamic programs using linear and spline interpolation. Real applications of Markov decision processes.132:207–21. Shoemaker CA.30:40–61.

Data collection plans and meta models for chemical process ﬂowsheet simulators. [38] Allen T. ORSA Journal on Computing 1995. 2003. Comparison of two nonparametric estimation schemes: MARS and neural networks. Belmont. Welch WJ.35:264–74. [27] Li QS. [24] Labossiere P. Psichogios DC. Georgia Institute of Technology.13(3):149–53.22(9):1216–23. 231–61. Do K-A. Barton RR. Rao CR. [14] Bertsekas DP. Handbook of statistics: statistics in industry. Ruppert D. [31] Tang B. [26] Chen VCP. Shoemaker CA. [34] De Veaux RD. Industrial and Engineering Chemistry Research 2001.32(9):270–4. [23] Smets HMG. Ryan CG. p.34(3):371–86. Tsui K-L. SIAM Journal of Optimizations 2001. A review of design and modeling in computer experiments.12(12):1270–80. Sampling strategies for computer experiments: design and analysis. [33] Chen VCP.C.47:38–53. Statistical methods for deterministic biomathematical models. School of Industrial and Systems Engineering.40(24):5719–23. International Journal of Reliability and Applications 2001. Fang JQ. Soil Science Society of America Journal 2002. GA. Computational Statistics and Data Analysis 2000.5(1):35–42. Wong CK. Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. Damping in buildings: its neural network model and AR model. Bioprocess Engineering 2000.35(1):157–62. August 2001. Orthogonal array-based Latin hypercubes. Journal of Reinforced Plastics and Composites 1993. Walsh MG. McClure R. Issues regarding artiﬁcial beural network modeling for reactors and fermenters. Lin DKJ. [28] Liu J. Turkkan N. Cervellera et al. [29] Pigram GM. Liu DK. Journal of Geochemical Exploration 1997. Macdonald TR. Fisher NI. [19] Carey WP. Combining non-parametric models with logistic regression: an application to motor vehicle injury data.59(3):233–49. Computers & Chemical Engineering 1993. Automatic learning using neural networks and adaptive regression. Neuro-dynamic programming. Prediction of the molecular weight for a vinylidene chloride/vinyl chloride resin during shear-related thermal degradation in air by using a back-propagation artiﬁcial neural network. In: Proceedings of the 53rd session of the international statistical institute. [16] Chen VCP. [21] Kuhnert PM. . B: Chemical 1992. [25] AbdelazizAY. [35] Pham DT. Multivariate adaptive regression splines (with discussion). / Computers & Operations Research 34 (2007) 70 – 90 89 [13] Eschenbach EA.47(1):33–41. Materials & Design 1992. 1998. Calibration of nonlinear solid-state sensor arrays using multivariate regression techniques. Journal of the American Statistical Association 1993. [37] Simpson TW.17(8):819–37.19:1–141. Korea. 1996. Application of MARS and orthogonal arrays to inventory forecasting stochastic dynamic programs. Kabiri K. Shoemaker CA.22:85–93. Journal of Quality Technology 2003.B9(2):113–22. Yee SS. Irving MR.12:322–44. Constructing meta-models for computer experiments. Use of neural network models to predict industrial bioreactor efﬂuent quality. Engineering Structures 2000. Atlanta. [22] Shepherd KD. [30] Kottapalli S. Neural network based representation of UH-60A pilot and hub accelerations. MA: Athena Scientiﬁc.2(3):209–40. Bogaerts WFL. Neural network-based adaptive out-of-step protection strategy for electrical power systems. Tsitsiklis JN. Measuring the goodness of orthogonal array discretizations for stochastic programming and stochastic dynamic programming. Parallel processing of stochastic dynamic programming for continuous state systems with linear interpolation. Failure prediction of ﬁbre-reinforced materials with neural networks. International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications 1997. Measurement and Control 1999. vol. Seoul. Amsterdam: Elsevier Science. In: Khattree R. editors.30:317–41. 22. Friedman JH. Rollins DK. [15] Chen VCP. Statistical techniques for the classiﬁcation of chromites in diamond exploration samples. Mansour MM. Computational Statistics and Data Analysis 1999. Journal of the American Helicopter Society 2002. Peat BJ.66(3):988–98. Ungar LH. Caffey H. [36] Palmer KD. Deriving corrosion knowledge from case histories: the neural network approach. [20] Grifﬁn WL. Bernshteyn MA. USA. Annals of Statistics 1991. Jeary AP. [18] Friedman JH.7:386–401. PhD dissertation. Sensors and Actuators. Chen W. Environmental Science and Technology 2001. Operations Research 1999. Development of reﬂectance spectral libraries for characterization of soil properties. [32] Chen VCP. [17] Chen VCP. El-ArabatyAM.88:1392–7. Allen JK. NosseirAI.

[40] Baglietto M.56:307–24.9:2–54. Titterington DM. [42] Heath DC. Cervellera C. Beck MB.23:508–24. 40–123. Cervellera C. 2001. Statistical science 1994. NJ: Prentice-Hall. Jensen JL. The roots of backpropagation: from ordered derivatives to neural networks and political forecasting. Water reservoirs management under uncertainty by approximating networks and learning from data. 1994. Barron RL. Technical report #989. McClelland JL. Sobel MJ. Statistical aspects of neural networks. Kendall WS. Journal of Statistical Planning and Inference 1996. Learning in neural networks: a statistical perspective.. Neural networks: a comprehensive foundation. p. Modelling the evolution of demand forecasts with application to safety stock analysis in production/distribution systems. Wegman EJ. [59] Ripley BD. Jackson PL. New York: Wiley. Wallis WD. 1986. [51] Hedayat AS. 192–203. [44] Lamond BF. Parallel distributed processing: explorations in the microstructure of cognition. An aggregate stochastic dynamic programming model of multireservoir Systems. to appear. New York: Wiley. [58] Barron AR. Englewood Cliffs. 1999.1:425–64. Sanguineti M. 1998. 318–62. [48] Rao CR.43: 771–80. In: Proceedings of the joint research conference on statistics in quality. Networks and chaos—statistical and probabilistic aspects.33:333–40. editors. [46] Yakowitz S. Water Resources Research 1997. [41] Hadley G. Bush KA. In: Topics on participatory planning and managing water systems. Constructions and combinatorial problems in design of experiments. McKinnon KIM. [54] Lippmann RP. Sanguineti M. [49] Bose RC. Orthogonal arrays of strength two and three. Menhaj M. [50] Raghavarao D. p. Hinton GE. p. Universal approximation bounds for superpositions of a sigmoidal function. 2005.17:1565–70. In: Proceedings of the 20th symposium on the interface. and technology. / Computers & Operations Research 34 (2007) 70 – 90 [39] Baglietto M. New York: Chapman & Hall. Annals of Statistics 1978.5:989–93. 1963. Statistical learning within a decision-making framework for more sustainable urban environments. industry. IEEE ASSP Magazine (1987) 4–22.. NY. 1991. School of Operations Research & Industrial Engineering. [61] Barron AR. [62] Rumelhart DE. Operations Research 1995. . Approximating networks for the solution of T-stage stochastic optimal control problems. Seattle. Neural networks: a review from a statistical perspective (with discussion). 1993. Zoppoli R. [45] Turgeon A. [53] Hedayat AS. [63] Werbos PK. [43] Archibald TW. Orthogonal fractional factorial designs. 1992. WA. Parisini T. Bulletin of the Calcutta Mathematical Society 1946.38:67–78. [57] White H. Training feedforward networks with the Marquardt algorithm. New York: Wiley. Dynamic programming applications in water resources. In: Proceedings of the IFAC workshop on adaptation and learning in control and signal processing. Williams RJ. Computer science and statistics. Hadamard matrices and their applications. Mulier F. Cervellera et al. Ithaca. [47] Chen VCP. An introduction to computing with neural nets. A decomposition method for the long-term scheduling of reservoirs in series. In: Barndorff-Nielsen OE. Zoppoli R. IEEE Transactions on Information Theory 1993. and methods. 1971. IEEE Transaction on Neural Networks 1994. Water Resources Research 1982.18:673–96. Water Resources Research 1981. NJ: Prentice-Hall. Cornell University. Learning from data: concepts theory.6:1184–238. Neural Computation 1989. [55] Haykin SS. Su G. [60] Cheng B. 1985. New York: Wiley. Whitin TM. Amsterdam: Elsevier Science Ltd.90 C. Thomas LC. In: Rumelhart DE. On difference schemes and orthogonal arrays of strength t. [52] Dey A. June 2000. editors. Hypercubes of strength ‘d’ leading to confounded designs in factorial experiments. Cambridge. MA: MIT Press. Stufken J. [56] Cherkassky V. Upper Saddle River.39:930–45. Learning internal representations by error propagation. Chen J. Exact and approximate solutions of afﬁne reservoirs models. Annals of Mathematical Statistics 1952. Analysis of inventory systems. [64] Hagan MT. Statistical learning networks: a unifying view. 2nd ed.

- 1131.pdfUploaded byAnonymous 8e2KnC
- Numerical Methods (Read-Only)Uploaded bytperea
- Arithmetic CodingUploaded byRakesh Inani
- Formula Book Multimedia InternetUploaded byPamelaAncelJuachon
- Answers to Notes Graph Translate Absolute Value Functions 1-31-13Uploaded byRocket Fire
- Dorrie Heinrich ProblemsUploaded byRaghuveer Chandra
- Svd InequalitiesUploaded bymauromanetti
- 05_Kali_Dasar_5sd10Uploaded bymirnasitikurnia
- F6_MS_0708_MEUploaded bySeefood Yo
- A COMPARISON STUDY FOR INTRUSION DATABASE.pdfUploaded byAnonymous N22tyB6UN
- Slides Lecture5Uploaded byBatoul Zargar
- תכן לוגי מתקדם- הרצאה 3 | הוספת שקופיות ב-2011Uploaded byRon
- Vissim Training - 12. Dynamic RoutesUploaded byBayu Kusumo Nugroho
- Research--Trace Ability With LSIUploaded byAndrew Denner
- fence_xenUploaded byjailcreator
- Add to Cart Not Go to CartUploaded bySai Kiran
- HV 3 Arthit TUUploaded byMohammedSaadaniHassani
- ReadME v1.01.txtUploaded byRhesa Wagiu
- DualPorosity ModelUploaded byJustin Perez
- SchoolUploaded bySelvakumar Sethuraj
- Martyn Hadoop AwsUploaded byNguyen Ba Nhiem
- 4. Mathematics - Ijmcar - A Study on Fixed Point Theorem - Prashant ChauhanUploaded byTJPRC Publications
- Checklist: Check a Cloud Computing provider for authenticityUploaded byRene Buest
- MscUploaded byAriel Lancaster
- Caem1 2013 AugUploaded byTony WANG
- sammy.jsUploaded byben7229
- Solution2_1_18Uploaded byArsy
- hasil prepost cerlefUploaded byDanyx Reistanza
- Robotics1_Midterm_Test_2016-17_16.11.18Uploaded byAland Bravo
- Module 4Uploaded byMogbekeloluwa Koye-Ladele