
Advances in Water Resources 31 (2008) 1387-1398

Contents lists available at ScienceDirect

Advances in Water Resources


journal homepage: www.elsevier.com/locate/advwatres

Incorporating multiple observations for distributed hydrologic model calibration: An approach using a multi-objective evolutionary algorithm and clustering
Soon-Thiam Khu a,*, Henrik Madsen b, Francesco di Pierro a

a Centre for Water Systems, School of Engineering, Computer Science and Mathematics, University of Exeter, North Park Road, Exeter EX4 4QF, United Kingdom
b DHI, Water, Environment and Health, Agern Allé 5, DK-2970 Hørsholm, Denmark

Article info

Article history:
Received 23 January 2007
Received in revised form 11 July 2008
Accepted 13 July 2008
Available online 26 July 2008

Keywords: Calibration; Distributed modelling; Multi-objective; Self-organising map (SOM); Multiple observations; Evolutionary algorithms; Groundwater

Abstract

The use of distributed data for model calibration is becoming more popular with the growing availability of spatially distributed observations. Hydrological model calibration has traditionally been carried out using single objective optimisation and has only recently been extended to a multi-objective optimisation domain. By formulating the calibration problem with several objectives, each objective relating to a set of observations, the parameter sets can be constrained more effectively. However, many previous multi-objective calibration studies do not consider individual observations or catchment responses separately, but instead utilise some form of aggregation of objectives. This paper proposes a multi-objective calibration approach that can efficiently handle many objectives using both clustering and preference ordered ranking. The algorithm is applied to calibrate the MIKE SHE distributed hydrologic model and tested on the Karup catchment in Denmark. The results indicate that the preferred solutions selected using the proposed algorithm are good compromise solutions and the parameter values are well defined. Clustering with Kohonen mapping was able to reduce the number of objective functions from 18 to 5. Calibration using the standard deviation of groundwater level residuals enabled us to identify a group of wells that may not be simulated properly, thus highlighting potential problems with the model parameterisation. © 2008 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: s.t.khu@exeter.ac.uk (S.-T. Khu).
0309-1708/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.advwatres.2008.07.011

1. Introduction

Application of distributed hydrological modelling has put emphasis on the use of data for model calibration and validation (e.g. [40,32]). Use of distributed measurements of different state variables is essential in order to document the predictive capability and credibility of the model for prediction of internal state variables within a catchment. This calls for formulation of the calibration problem within a multi-objective context [30].

The idea of using multiple observations as sources of data for calibration and validation of hydrological models is not new. Recently, the US National Weather Service conducted an extensive comparative study of different distributed hydrologic models on several catchments in the US, known as the Distributed Model Inter-comparison Project (DMIP). The broad aim of this study was to determine the conditions under which distributed models are most suitable for implementation [45]. An interesting finding from DMIP is that although interior observations were available, none of the participants took advantage of their distributed model to perform multi-site calibration using a multi-objective framework. Di Luzio and Arnold [28] undertook further calibration exercises on the same catchments with interior observations. However, they adopted a two-stage process instead of a general, one-step multi-objective calibration approach. In the calibration of a groundwater model of 125 sub-catchments within the North Rhine-Westphalia region of Germany, Bogena et al. [3] proposed to correlate the baseflow indices of the sub-catchments for ease of calibration rather than solving the complete calibration problem within a multi-objective framework.

As mentioned in McCabe et al. [32], multi-objective calibration is intrinsically different from single objective model calibration. While conventional single objective calibration usually tries to identify a set of model parameters based on the model's ability to reproduce a single independent observation record, multi-objective calibration inherently recognises that models have multiple outputs [32]. For instance, Madsen [30] proposed a multi-objective framework for the automatic calibration of a distributed hydrologic catchment model, MIKE SHE [12], using both groundwater level and runoff observations. In this case, performance indices from individual groundwater wells were aggregated into one objective function and optimised with the performance index of the


catchment runoff in a two-objective optimisation framework. Meixner et al. [34] applied a multi-objective algorithm, MOCOM-UA [50], to calibrate a hydrochemical model with a total of 21 hydrologic and chemical criteria to evaluate the model performance. They tested different combinations of performance criteria and found that some combinations of four criteria gave better performance than others. van Griensven and Bauwens [13] proposed a methodology that handles multiple observations by converting multiple objective functions into a single objective function, so that it is amenable to an existing single objective global optimisation algorithm. This is a neat way to handle multiple objectives, but it does not provide any information on the trade-off between different simulated outputs. Such trade-off information may provide the modeller with vital knowledge of the deficiencies of the model parameterisation and model structure.

Model calibration using multiple sources of data has several important advantages. First, it better constrains the calibration process, resulting in better defined model parameter estimates [35,44,17,18]. In addition, by incorporating new types of information in the calibration, the model prediction uncertainty may be reduced (e.g. [27,19]). Parameter non-uniqueness in terms of equifinality [2] can be partly attributed to single objective calibration, where multiple model outputs (either temporal or spatial) are mapped to some form of singular index that is optimised. Multi-objective analysis may be seen as a way to unfold the equifinality problem by looking at model performance with respect to different model responses [31].

From a calibration and optimisation perspective, a multi-objective formulation of a problem could have the attractive capability of creating a smooth transition for parts of the objective functions, a characteristic known as multi-objectivisation [24,48]. Such a smoothing effect could allow the calibration algorithm to escape from multiple local optima, which are characteristic of many hydrological calibration problems (e.g. [10]). Although there is no mathematical proof that this is the case for water resources calibration problems, empirical studies have indicated accelerated search capabilities when converting a single objective problem to a multi-objective problem [41,23].

A further justification for adopting the multi-objective approach is that it allows the model results to be incorporated into a multi-criteria decision support framework. The resultant output of the multi-objective calibration is some form of Pareto trade-off curve between different objective functions, and each combination of objective functions is a possible option or choice for the modeller or decision maker. Very often, however, the number of choices presented to the decision maker should be limited in order to assist the decision maker in assessing the implications of each option. In Khu and Madsen [22], it was shown that preference ordering is an effective method that can sift through myriad options and promote the selection of a small group of good compromise solutions.

However, the answer to the question of how to effectively use multiple observations for calibration remains elusive despite many years of research in hydrological modelling. This paper investigates how to incorporate multiple responses (multi-variable and multi-site measurements) and multiple performance criteria within a multi-objective calibration framework. The paper proposes that each set of observations should be considered independently and formulated as a different objective function when performing calibration. However, such a radical formulation cannot be effectively solved by any current optimisation algorithm [20,39]. In order to effectively handle several spatially distributed observations, a new approach using a non-linear classifier based on artificial neural networks and a multi-objective genetic algorithm based on preference ordered ranking is proposed and applied to a distributed hydrological model.

2. Multi-objective calibration framework

In a multi-objective context, model calibration can, in general, be performed for multiple responses. These responses can be divided into the following three groups [30]:

• Multi-variable measurements. These refer to different types of observations within the modelling domain such as groundwater level, sub-catchment runoff, water quality parameters, soil moisture content in the unsaturated zone, etc. Usually distributed hydrological models simulate several of these variables. In such conditions, each of these measurements can be formulated as a performance objective for multi-objective calibration.
• Multi-site measurements. These refer to the same type of variable observed or measured at different locations distributed within the modelling domain. Several variables simulated by hydrological models are site-specific (such as groundwater levels and soil moisture), and measurements of these variables can be formulated as a performance objective for multi-objective calibration. Besides point measurements, it is also important to include runoff measurements in the calibration to evaluate the water balance simulations at sub-catchment level. Individual runoff measurements in the catchment can be formulated as a performance objective.
• Multi-criteria modes. These refer to formulating different performance indices for either the same time series or for different parts of the time series. For example, different objective functions can be formulated that (i) measure various responses of the hydrological processes, such as the general water balance, peak flows, and low flows [29]; (ii) partition the observed hydrograph into different components [4]; or (iii) use different mathematical formulations to analyse residual errors, such as root-mean-square error, maximum error, etc. [14]. The ASCE Task Committee [1] provides a comprehensive list of different ways of evaluating model performance.

Usually the performance measures to be used as objective functions in the calibration are derived from a single time series, such as the root-mean-square error (RMSE), mean absolute error, Nash-Sutcliffe coefficient [36], etc. Recently, Wealands et al. [49] highlighted several problems associated with current practices in assessing spatial predictions from distributed models. They proposed a number of comparative measures such as importance map, weighted local variance, category comparison and fuzzy map for comparing distributed model outputs with spatial observations. These comparative measures can all be taken into account if the calibration is performed within a multi-objective context.

To solve the multi-objective calibration problem, numerical optimisation tools can be applied. Mathematically, the multi-objective optimisation problem can be formulated as

$$\min \{F_1(\theta), F_2(\theta), \ldots, F_m(\theta)\}, \quad \theta \in \Theta, \tag{1}$$

where $m$ is the number of objective functions, $F_i(\theta)$, $i = 1, 2, \ldots, m$, are the individual objective functions, and $\theta$ is the set of model parameters to be optimised. We consider a constrained optimisation problem with a feasible parameter space $\Theta$, which reflects the prior information on the model parameters. Due to trade-offs between the different objectives, the solution to Eq. (1) will not, in general, be a single unique parameter set. Instead, it will consist of several non-dominated or Pareto optimal solutions according to the trade-offs. For a Pareto optimal solution, none of the objectives can be further improved without deterioration of one or more of the other objectives.
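The notion of non-dominated solutions can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names `dominates` and `pareto_front` and the objective values are hypothetical, all objectives are minimised, and the brute-force pairwise comparison is not the search strategy of any algorithm discussed in this paper.

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(F):
    """Indices of the non-dominated rows of F (one row per candidate parameter set)."""
    F = np.asarray(F, float)
    keep = []
    for i in range(len(F)):
        # A row is kept only if no other row dominates it.
        if not any(dominates(F[j], F[i]) for j in range(len(F)) if j != i):
            keep.append(i)
    return keep

# Hypothetical objective values (F1, F2) for five candidate parameter sets.
F = [[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0], [2.0, 5.0]]
front = pareto_front(F)
print(front)
```

In this made-up example the third and fifth candidates are dominated, so the remaining three form the trade-off set among which no objective can be improved without worsening the other.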


There are a number of computing algorithms which can handle multiple objective functions without the need to resort to any form of aggregation or weighting. Examples are a large class of algorithms collectively known as multi-objective genetic algorithms (MOGA) [11], multi-objective variants of the shuffled complex evolution algorithm (the multi-objective complex evolution (MOCOM) algorithm [50] and the multi-objective shuffled complex evolution metropolis (MOSCEM) algorithm [47]), and others [42]. These algorithms have recently been applied for multi-objective calibration of distributed hydrological models (e.g. [43,31,46]) and also for other types of rainfall-runoff models [7,6,5].

[Fig. 1 flowchart boxes: Input raw data; Cluster the raw data using Kohonen self-organising map; Estimate the no. of clusters (range); Formulate objective functions for each cluster of data; Run POGA to calibrate the model]

3. Proposed methodology

To solve the multi-objective optimisation problem we consider here multiple objective functions in terms of performance indices related to individual observed time series of hydrological variables and possible individual response modes. Thus, potentially we may have a very large number of objective functions to be optimised. State-of-the-art multi-objective optimisation algorithms incur serious performance deterioration when used to solve problems consisting of more than 3-4 objectives [38]. The reasons are twofold. Firstly, finding a good representation of the optimal trade-off curve of a problem consisting of a large number of objective functions requires an exponentially high number of function evaluations, and this leads to computation times that are often unaffordable. Secondly, most of the traditional population-based algorithms for multi-objective optimisation suffer from a lack of discriminating power when assessing the relative quality of solutions that must be evaluated on a large set of objectives, and this has a serious impact on their performance.

To effectively solve a multi-objective optimisation problem with a large number of objectives, a two-step approach is proposed here. This includes: (i) classification of multi-site measurements into groups according to temporal dynamics using an artificial neural network (ANN); and (ii) calibration using an automatic scheme utilising the latest developments in multi-objective optimisation with many objectives. The approach is illustrated in Fig. 1 and explained below.

3.1. Grouping of multi-site measurements

To better constrain the model calibration it is, in general, preferable to include new data of a different variable rather than more data of the same variable, and new data at a different location rather than more data at existing locations (e.g. [33]). However, when considering multi-site measurements, each new data series does not provide independent, non-commensurable information. Thus, rather than considering each data series independently in the optimisation, one should utilise the dependency between the data series to limit the inclusion of redundant information. It is therefore necessary to investigate the amount of useful information content present in the observation records [19]. The dependency between the observation records is also a useful characteristic that can be exploited for grouping some of the observations in order to reduce the complexity of the problem. As such, grouping of measurement sets can be seen as both crucial and pragmatic. Grouping could be performed in various ways, for example based on physical proximity, expert knowledge, or data mining techniques. In this study, a data mining technique known as artificial

Output: sets of compromise Pareto optimal parameters


Fig. 1. Flowchart of proposed methodology.

neural network is used. We have chosen an unsupervised ANN, the Kohonen self-organising map (SOM) [26], as a tool to assist in classifying the groundwater level observations from different wells, since it preserves the topological structure of the original data by creating a topology-preserving map in the training process. A topological map is simply a mapping that preserves neighbourhood relations. The Kohonen SOM is a highly effective tool for visualising high-dimensional, complex data with inherent relationships between the various features comprising the data. As such, it can be used to extract salient, multi-scale features from the raw data and thereby automatically organise the data into clusters. For more information on the Kohonen SOM and its applications, see Haykin [15].

3.2. Multi-objective calibration using preference ordering

The grouping of multi-site measurements will effectively reduce the number of objective functions to be included in the multi-objective optimisation. However, the number of objective functions may still be too large (say more than 3-4) for current multi-objective optimisation algorithms to be effectively applied. The multi-objective calibration is here performed using a newly developed multi-objective genetic algorithm known as the preference order genetic algorithm (POGA) [37]. POGA has been found to be very effective in dealing with many objectives (i.e. more than three objective functions), both for hypothetical problems and for water resources calibration problems [22,21]. The basic working principles of POGA, i.e. the use of multiple generations where good solutions are probabilistically more often selected to exploit their genetic material and form new good candidate solutions, are similar to those of the elitist non-dominated sorting genetic algorithm, NSGA-II [8], a well-known algorithm that has been successfully used to solve many difficult test problems and complex applications.

POGA differs from NSGA-II in the way elitism, i.e. the maintenance of good solutions across generations, is performed. As opposed to NSGA-II, which uses Pareto dominance to assess the relative quality of solutions, POGA resorts to Preference Ordering. Preference Ordering is a generalisation of Pareto dominance; however, it is more stringent, and therefore more effective in achieving a better grading of a set of solutions to a problem that consists of many objective functions. With Preference Ordering, Pareto optimal solutions that are also Pareto optimal in different subspace combinations of the objective function space are preferred. For instance, consider a three-objective optimisation problem. A Pareto optimal point in the three-objective space that is also a Pareto optimal point in all of the three subspaces of


two-objective combinations has higher efficiency than (or dominates) three-objective Pareto optimal points that are only Pareto optimal in two of the three subspaces. Similarly, the points that are Pareto optimal in two of the three subspaces of two objectives dominate points that are Pareto optimal in only one of the subspaces, etc. The use of Preference Ordering has been shown to have a considerable impact on the convergence properties of the optimisation algorithm, which is less affected by the lack of discriminating power of commonly applied multi-objective ranking procedures (such as those used in NSGA-II). For a detailed description of the POGA algorithm the reader is referred to di Pierro et al. [37]. The concept of Preference Ordering and its use in model calibration is described in Khu and Madsen [22].
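The three-objective example above can be sketched in a few lines of Python. This is a simplified illustration of the subspace-counting idea only, not the POGA implementation of di Pierro et al. [37]: the helper names `nondominated_mask` and `subspace_counts` and the objective values are hypothetical, and all objectives are assumed to be minimised.

```python
from itertools import combinations
import numpy as np

def nondominated_mask(F):
    """Boolean mask of the rows of F not Pareto-dominated by any other row."""
    F = np.asarray(F, float)
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        for j in range(len(F)):
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                mask[i] = False
                break
    return mask

def subspace_counts(F, k=2):
    """For each row, count the k-objective subspaces in which it stays non-dominated."""
    F = np.asarray(F, float)
    counts = np.zeros(len(F), dtype=int)
    for cols in combinations(range(F.shape[1]), k):
        counts += nondominated_mask(F[:, list(cols)])
    return counts

# Three hypothetical points, all Pareto optimal in the full three-objective space.
F = np.array([[1.0, 1.0, 4.0],
              [4.0, 4.0, 1.0],
              [2.0, 3.0, 2.0]])
full_space = nondominated_mask(F)
counts = subspace_counts(F, k=2)
print(full_space, counts)
```

Pareto dominance alone cannot separate these three points, whereas the subspace counts single out the first point, which remains non-dominated in all three two-objective subspaces. This is the extra discriminating power the text attributes to Preference Ordering.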

4. Application example: catchment and model setup

4.1. Catchment

The proposed calibration approach was used to calibrate the MIKE SHE model applied to the Danish Karup catchment, which was previously used in Madsen [30] for a two-objective calibration, considering an aggregate performance measure of all groundwater level measurements and a performance measure of the runoff at the catchment outlet. The Karup catchment has an area of 440 km² and is located in the western part of Denmark. The catchment elevation varies from about 20 m to 100 m. The geology is relatively homogeneous with highly permeable sand and gravel deposits and small lenses of moraine clay. The aquifer is mainly unconfined and varies in thickness from about 10 m at the western and central part to more than 90 m at the upstream eastern water divide. The depth of the unsaturated zone varies from 25 m at the eastern water divide to less than 1 m in the wetland areas along the river. The land use consists of agriculture (67%), forest (18%), heath (10%), and wetland areas (5%). The catchment is drained by the Karup River and about 20 tributaries. A more detailed description of the Karup catchment can be found in Refsgaard [40].

Daily precipitation data are available from nine measurement stations in the catchment, and data from these stations are spatially distributed in the model using Thiessen polygons. Daily potential evapotranspiration and average temperature data are available from one measurement station and are used as spatially homogeneous input in the model. For each of the four vegetation types, time series of leaf area index and root depth are available [40].

The available measurements for calibration consist of groundwater level data sampled every 2 weeks from 35 locations in the catchment and daily discharge data from four stations in the river system, including the runoff at the catchment outlet. Similar to the calibration performed by Madsen [30], groundwater level data from 17 wells as well as runoff data from the catchment outlet are used (Fig. 2) for the multi-objective calibration in this study. The remaining 18 groundwater wells are used for validation of the multi-site classification procedure (see below). Data in the period 1 January 1971 to 31 December 1974 are used in the calibration. To minimise the effect of the initial conditions on the calculation of the objective functions, a 2-year warm-up period is applied in the simulations. For validation, data in the period 1 January 1975 to 31 December 1977 are used.

Fig. 2. Map of Karup catchment showing the locations of runoff stations and groundwater wells.

4.2. Model setup and parameterisation

MIKE SHE is a flexible, integrated hydrological modelling system that combines different process-oriented modelling components within the same modelling framework. For each component several model descriptions are available, ranging from complex, physically-based descriptions that solve the governing partial differential equations to simple, conceptual models. In the present application the following model descriptions are applied: (i) the Kristensen-Jensen model for calculating actual evapotranspiration [25], (ii) 2D diffusive wave approximation of the Saint Venant equations for overland flow, (iii) Muskingum routing for flow in the river system, (iv) 3D Boussinesq equation for flow in the saturated zone, (v) 1D Richards equation for vertical flow in the

Table 1
Model parameters included in the calibration and parameter bounds

Model component            Model parameter                              Symbol      Unit   Lower limit   Upper limit
Saturated zone             Hydraulic conductivity of soil type 1        Kh1         m/s    0.00005       0.005
                           Hydraulic conductivity of soil type 3        Kh3         m/s    0.0001        0.01
Unsaturated zone           Saturated hydraulic conductivity of soil 1   Ks1         m/s    0.000001      0.0001
                           Van Genuchten N-parameter of soil 1          N1          -      1.2           2.5
                           Van Genuchten α-parameter of soil 1          α1          m^-1   0.05          0.5
                           Saturated hydraulic conductivity of soil 2   Ks2         m/s    0.00005       0.005
                           Van Genuchten N-parameter of soil 2          N2          -      1.2           2.5
                           Van Genuchten α-parameter of soil 2          α2          m^-1   0.01          0.1
Drainage                   Drainage level                               DrainLevel  m      -1.3          -0.8
                           Drainage coefficient                         DrainCoef   m/s    1 × 10^-8     1 × 10^-6
River-aquifer interaction  Leakage coefficient                          LeakCoef    m/s    1 × 10^-8     1 × 10^-6

Fig. 3. Normalised water levels in the four groups of the groundwater wells using the Kohonen SOM classification scheme (grey lines: calibration wells; red dotted lines: validation wells). Panels: Well Group A, Well Group B, Well Group C and Well Group D; x-axis: date; y-axis: normalised water levels. (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)

Table 2
Classification of wells according to observed dynamics in piezometric levels

Groups   Calibration wells identification number   Validation wells identification number
A        9, 22, 39, 78                             8, 38, 41, 72
B        5, 12, 21, 24, 64                         11
C        27, 35, 37, 45, 69                        6, 34, 44, 52, 62, 63
D        49, 55, 56                                25, 36, 46, 47, 51, 54, 66
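The grouping summarised in Table 2 rests on two steps: normalising each well series by its mean level and clustering the normalised series with a Kohonen SOM. The following Python sketch illustrates both steps on synthetic data. It is a minimal sketch under stated assumptions, not the network used in this study: the one-dimensional map, the linear decay of learning rate and neighbourhood width, the function names and the synthetic "wells" are all hypothetical choices made for illustration.

```python
import numpy as np

def normalise_by_mean(series):
    """Divide each well's series (one row per well) by its own mean level."""
    series = np.asarray(series, float)
    return series / series.mean(axis=1, keepdims=True)

def train_som(X, n_nodes, n_iter=500, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D Kohonen map on the rows of X; returns the node weight vectors."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    # Initialise the node weights from randomly chosen input series.
    W = X[rng.choice(len(X), size=n_nodes, replace=False)].copy()
    grid = np.arange(n_nodes)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood width
        x = X[rng.integers(len(X))]             # present one random series
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)          # pull BMU and neighbours towards x
    return W

def assign(X, W):
    """Index of the best-matching node for every series in X."""
    X = np.asarray(X, float)
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Synthetic wells: three with weak and three with strong seasonal dynamics,
# each at a different absolute level (all values are made up).
t = np.linspace(0.0, 2.0 * np.pi, 12)
levels = np.array([10.0, 20.0, 30.0])
weak = levels[:, None] * (1.0 + 0.02 * np.sin(t))
strong = levels[:, None] * (1.0 + 0.30 * np.sin(t))
X = normalise_by_mean(np.vstack([weak, strong]))

W = train_som(X, n_nodes=2)
labels = assign(X, W)
print(labels)
```

After normalisation the absolute level of each well no longer matters, so the map separates the series by their dynamics alone. New series, such as validation wells, could be passed through `assign` with the trained weights, which mirrors how the trained network is reused for the validation wells.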

unsaturated zone, (vi) linear reservoir model for flow in drains, (vii) Darcy equation for the river-aquifer interaction, and (viii) a degree-day approach for snow melt. For a detailed description of the MIKE SHE modelling system, the reader is referred to Graham and Butts [12].

The model is discretised in a 1 × 1 km horizontal grid. For the saturated zone modelling the geological conceptualisation is taken from the Danish National Water Resources model (DK-model), which is defined in 1 × 1 km grids and 10 m thick layers [16]. For each grid element a soil type is assigned (a total of five soil types are identified for the Karup catchment), and the hydraulic properties of these soils are included in the calibration. For the unsaturated zone the parameterisation used by Refsgaard [40] is applied. This includes definition of two soil profiles for the entire catchment, each defined with two soil types (a top layer ranging from 55 to 100 cm and a homogeneous layer below). For each of the resulting four soil types, van Genuchten retention and conductivity curve parameters are specified, and the parameters of these equations are included in the calibration.

The Karup River and the main tributaries are included in the river model. To describe river-aquifer interaction, a thin permeable layer is assumed between the river and the main aquifer. The leakage coefficient that characterises this layer is assumed homogeneous in the catchment and is included in the calibration. The wetland areas are drained by ditches and drain pipes, which are modelled conceptually using a linear reservoir description in each cell. The drainage level (relative to ground surface) and the time constant of the linear reservoir model are assumed homogeneous in the catchment and are included in the calibration. The empirical parameters in the Kristensen-Jensen model for the evapotranspiration calculation are based on experience values [40] and are not subject to calibration.

Fig. 5. Radar plot of normalised values for the five- and nine-objective calibration problems. STD_Well9, STD_Well5, STD_Well27 and STD_Well49 are the standard deviations of residuals for the wells representing groups A, B, C and D, respectively. RMSE_20.05 is the RMSE of runoff at the catchment outlet (grey lines: 74 Pareto optimal parameters for the five-objective calibration, i.e. set A; red dotted lines: preferred parameters according to preference ordering for the five-objective calibration problem, i.e. set B; black lines: preferred parameters according to preference ordering for the nine-objective calibration problem, i.e. set C). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)
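The two residual statistics named in this caption, the RMSE and the standard deviation of residuals, respond differently to a constant offset between simulated and observed levels. The short Python sketch below illustrates this with made-up numbers; the function names and the values are hypothetical, and the population form of the standard deviation is an assumption of the sketch.

```python
import numpy as np

def rmse(sim, obs):
    """Root-mean-square error of the residuals sim - obs."""
    r = np.asarray(sim, float) - np.asarray(obs, float)
    return float(np.sqrt(np.mean(r ** 2)))

def std_residuals(sim, obs):
    """Population standard deviation of the residuals; the mean error drops out."""
    r = np.asarray(sim, float) - np.asarray(obs, float)
    return float(np.std(r))

# Made-up groundwater levels (m): same simulated dynamics with and without a 2 m offset.
obs = np.array([10.0, 10.5, 11.0, 10.5, 10.0])
sim = obs + np.array([0.1, -0.1, 0.0, 0.1, -0.1])
sim_biased = sim + 2.0

bias = float(np.mean(sim_biased - obs))
print(rmse(sim, obs), rmse(sim_biased, obs))
print(std_residuals(sim, obs), std_residuals(sim_biased, obs))
```

Adding the offset inflates the RMSE but leaves the standard deviation of the residuals unchanged, since RMSE² equals bias² plus the squared standard deviation of the residuals. A statistic of this kind therefore scores the simulated dynamics independently of any constant bias.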

A preliminary sensitivity analysis was carried out to identify the most sensitive parameters [9]. Based on this analysis, 11 parameters were included in the multi-objective calibration (see Table 1).

4.3. Classification of groundwater wells

In order to reduce the complexity of the multi-objective calibration problem, the 17 calibration wells were classified into distinct groups according to the fluctuations of observed piezometric head using the Kohonen SOM classification procedure described above. To facilitate the classification, the groundwater level data were transformed using the ground level as the datum and normalised using the mean water level in each well. The Kohonen SOM procedure was able to classify the 17 calibration wells into four groups (Fig. 3). This Kohonen network was later used to sort

Fig. 4. Normalised parameters for Pareto optimal sets; x-axis: parameters; y-axis: normalised values (grey lines: 74 Pareto optimal parameters for the five-objective calibration, i.e. set A; red dotted lines: preferred parameters according to preference ordering for the five-objective calibration problem, i.e. set B; black lines: preferred parameters according to preference ordering for the nine-objective calibration problem, i.e. set C). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Various plots showing different combinations of any two of the five objectives (dots: all Pareto efficient points, i.e. set A; circles: set B preferred points; crosses: set C preferred points). Panels (y-axis versus x-axis): (a) RMSE_runoff_20.05 versus STD_Well group A; (b) RMSE_runoff_20.05 versus STD_Well group B; (c) RMSE_runoff_20.05 versus STD_Well group C; (d) RMSE_runoff_20.05 versus STD_Well group D; (e) STD_Well group B versus STD_Well group A; (f) STD_Well group C versus STD_Well group A; (g) STD_Well group D versus STD_Well group A; (h) STD_Well group C versus STD_Well group B; (i) STD_Well group B versus STD_Well group D; (j) STD_Well group C versus STD_Well group D.

the remaining 18 validation wells into one of these groupings. The reason for doing so is to confirm that the trained Kohonen network was indeed able to perform the classification of groundwater wells. The resultant groupings are shown in Table 2, and Fig. 3 shows the graphical representations of the water levels in the validation wells normalised by the mean water level in each well. From Fig. 3, the water level dynamics of each group are quite distinct: group A wells are less dynamic than the other groups, as there are only two major trends, from June 1971 to December 1974 and from July 1975 to January 1977. Group B wells are similar to group A wells except for some slight fluctuations at around February 1976. Group C and group D wells behave similarly in that they fluctuate considerably during the study period but, in general, group D wells fluctuate with a larger magnitude than group C wells.

4.4. Formulation of objective functions based on well groupings

Once the wells have been classified, a representative well from each group is selected for use in the calibration. Since the dynamics of the wells within each group are similar to each other, the overall

Fig. 7. Hydrograph at catchment outlet compared with observations for all 74 optimal simulations (crosses: observed values; grey lines: 74 Pareto optimal simulations). Axes: discharge (m3/s) against time (year), 1970 to 1974, for runoff at location 20.05.

Fig. 8. Simulation results of water levels in some calibration groundwater wells compared with observations (crosses: observed values; grey lines: 74 Pareto optimal simulations, i.e. set A; blue lines: set B preferred solutions; black lines: set C preferred solutions). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)


results are not sensitive to the well selected for calibration. Analysis of the model residuals from the calibrated model at the different wells also supports this finding. In this paper, wells 9, 5, 27 and 49 are selected as calibration wells to represent groups A, B, C and D, respectively.

A total of nine objective functions were formulated: two for each group of wells and one for the runoff measurements at the catchment outlet. The objective functions were defined as: (i) root-mean-square error (RMSE) of the groundwater levels; (ii) standard deviations of the groundwater level residuals; (iii) RMSE of the runoff at the catchment outlet. The use of the standard deviations of residuals as objective functions to evaluate groundwater level simulation is motivated by the fact that large biases may exist due to scaling problems, i.e. the dissimilarity between the measurement scale (at a point) and the modelling scale (on a grid) and the associated heterogeneity within the model grid. The standard deviation of model residuals measures the dynamic behaviour of the model response and hence, when used in the calibration, allows the bias to be ignored. The bias problem is important to take into account when comparing any point measurement with a grid-based model response. On the other hand, catchment aggregated values such as catchment (or sub-catchment) runoff may be directly compared since they refer to the same scale (the catchment or sub-catchment scale).

Two different calibration setups were tested using POGA, one using five objective functions (from (ii) and (iii)) and the other with all nine objective functions. This was done in order to evaluate the effect of including more objective functions in the calibration to better constrain the parameter optimisation.

4.5. Setup of optimisation algorithm

The 11 parameters of the simulation model were encoded as binary variables in POGA. Here, binary coding was preferred to real coding because a preliminary sensitivity-type analysis of the algorithmic parameterisation was not viable due to the considerable simulation time, and the authors had prior expertise in parameterising the binary-coded optimisation algorithm used here on similar problems. The variables with a feasible range covering several decades (see Table 1) were logarithmically transformed to better represent the search space. After a number of trial runs with different selection and recombination operators, bit-wise tournament selection was implemented together with the uniform crossover and uniform mutation operators. The population size and the probabilities of crossover and mutation were set to 200, 0.5 and 1/200, respectively. The maximum number of iterations was set to 100. Each model simulation required approximately 5 min on a Pentium dual core 2.0 GHz, leading to a total duration of the optimisation process of approximately 1650 h. This prevented running a series of independent optimisation runs to assess and filter out the impact of randomness affecting POGA (mainly in the form of probability of occurrence of the genetic operators) on the results.

5. Results and discussion

In the five-objective calibration, there were 74 sets of parameter combinations that satisfy the condition of Pareto efficiency. Fig. 4 shows the variations of the set of Pareto optimal parameters (set A) scaled with respect to the feasible ranges given in Table 1. When the conditions of preference ordering were applied to distinguish between these 74 sets of parameters, five sets of parameters were identified as non-dominated (set B, indicated as dotted red lines in Fig. 4). Fig. 5 shows a radar plot with all 74 sets of the five objective function values standardised with respect to their maximum values.
It can be seen that the objective functions STD_Well9, STD_Well5 and STD_Well27 (representing standard deviations of well groups A, B and C, respectively) behaved very much in tandem with each other, i.e. changing a parameter set will either increase or decrease all these three objective functions most of the time.
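As an illustration of how a Pareto efficient set such as set A can be extracted from a pool of evaluated parameter sets, and how its objective values can be standardised by their maxima as in the radar plot, a minimal Python sketch is given below. It is not the POGA implementation and does not include preference ordering. The function names (`dominates`, `pareto_set`, `standardise_by_max`) and the two-objective sample values are hypothetical, and all objectives are assumed to be minimised and strictly positive.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]  # one objective vector, all objectives minimised


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_set(points: Sequence[Point]) -> List[Point]:
    """Keep the points that are not dominated by any other point."""
    return [
        p for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def standardise_by_max(points: Sequence[Point]) -> List[Point]:
    """Divide each objective by its maximum over the set (maxima assumed > 0)."""
    n_obj = len(points[0])
    maxima = [max(p[k] for p in points) for k in range(n_obj)]
    return [tuple(p[k] / maxima[k] for k in range(n_obj)) for p in points]


# Hypothetical (RMSE_runoff, STD_Well) pairs, for illustration only.
candidates = [(1.0, 0.5), (0.8, 0.6), (1.2, 0.7), (0.9, 0.4)]
set_a = pareto_set(candidates)
scaled = standardise_by_max(set_a)
```

With these sample values, `(1.0, 0.5)` and `(1.2, 0.7)` are dominated and only two points remain. The pairwise comparison costs O(n^2) dominance checks, which is negligible next to the model simulation time for a set of the size reported here.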

Fig. 9. Simulation results of water levels in some validation groundwater wells compared with observations (crosses: observed values; grey lines: 74 Pareto optimal simulations, i.e. set A; blue lines: set B preferred solutions; black lines: set C preferred solutions). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)


Fig. 10. Radar plot of normalised values for RMSE of wells and runoff. RMSE_Well9, RMSE_Well5, RMSE_Well27 and RMSE_Well49 are RMSE for the different wells representing, respectively, group A, B, C and D. RMSE_20.05 is the RMSE of runoff at the catchment outlet (grey lines: 74 Pareto optimal parameters, i.e. set A; red dotted lines: set B preferred parameters; black lines: set C preferred parameters). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)

It can be seen that the preferred set B parameters (dotted red lines) perform well in general, with small and similar RMSE values for runoff and standard deviations for the selected groundwater wells. However, as shown in Fig. 4, there is still a large variability in some parameter values in set B. In other words, these parameters are quite ill-defined with respect to the objective functions selected. It is expected that when more performance measures are considered as objective functions, the calibration problem will be better constrained, resulting in better identifiable parameters.
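The difference between the two groundwater performance measures defined in Section 4.4 can be made concrete with a small Python sketch. It assumes the population form of the standard deviation (division by n), which the text does not specify, and the function names and water level values are hypothetical. Under that assumption the squared RMSE equals the squared bias plus the squared standard deviation of the residuals, so a constant offset between a point measurement and a grid-scale simulation inflates the RMSE but leaves the standard deviation unchanged.

```python
import math
from typing import Sequence


def residuals(sim: Sequence[float], obs: Sequence[float]) -> list:
    """Simulated minus observed values, element by element."""
    return [s - o for s, o in zip(sim, obs)]


def bias(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Mean residual."""
    res = residuals(sim, obs)
    return sum(res) / len(res)


def rmse(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error: sensitive to both bias and dynamics."""
    res = residuals(sim, obs)
    return math.sqrt(sum(r * r for r in res) / len(res))


def residual_std(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Population standard deviation of the residuals: insensitive to bias."""
    res = residuals(sim, obs)
    mean = sum(res) / len(res)
    return math.sqrt(sum((r - mean) ** 2 for r in res) / len(res))


# Hypothetical water levels (m): the simulation reproduces the dynamics
# exactly but sits 2 m above the observations.
observed = [10.0, 10.5, 11.0, 10.5]
simulated = [12.0, 12.5, 13.0, 12.5]

offset_rmse = rmse(simulated, observed)         # dominated by the 2 m bias
offset_std = residual_std(simulated, observed)  # zero: dynamics are matched
```

Using both measures as separate objectives, as in the nine-objective setup, therefore rewards parameter sets that match the dynamics and also penalises those that match them at the wrong level.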

In the nine-objective calibration, where both RMSE and standard deviation of groundwater levels as well as the RMSE of runoff were considered, 3 sets of parameters were identified using POGA (set C, shown as black lines in Fig. 4) and their ranges were narrower than those from the five-objective calibration. This indicates that the preferred solutions are better defined compared to the five-objective function case. The ranges of some of these parameters (Kh1, Kh3, DrainConst, LeakCoef, N1, Ks1) were considerably narrower than those in the five-objective calibration. This shows that through POGA, we were able to identify 7 out of 11 parameters. There are still considerable variations in some parameters (such as DrainLevel, a2, N2 and Ks2). Fig. 5 also shows the normalised objective function values of this calibration (in black lines) compared to those from the five-objective calibration.

Fig. 6 shows the different combinations of two objectives for the five-objective calibration problem. It can be seen clearly that preference ordering offers a useful approach to sieve through all the Pareto efficient points and select points that are good compromise solutions in each of the two-objective combinatorial plots (Fig. 6a-j). However, when compared with the preferred solutions from the nine-objective calibration, most of the preferred five-objective solutions are dominated by the preferred nine-objective solutions except for those in Fig. 6a, b and e. Hence, the use of more objective functions has clear benefits in discriminating very good solutions from the preferred solutions based on fewer objective functions.

Fig. 7 compares the simulated runoff at the catchment outlet for the 74 Pareto optimal parameter sets of the five-objective calibration (set A) with the observed hydrograph. The range of simulated runoff is seen to bracket the observed runoff well. Fig. 8 compares the groundwater level simulations for the set A Pareto optimal solutions with the observations in four wells used for calibration. It can be seen that for all wells, the dynamics of the water levels can be captured reasonably well by all Pareto optimal parameter sets. This also applies to the groundwater levels in other calibration wells. Fig. 9 compares the set A groundwater level simulations with observations for some validation wells. Since

Fig. 11. Simulation results compared with observations in a groundwater well that exhibits phase error (crosses: observed values; grey lines: 74 Pareto optimal simulations, set A; blue lines: set B preferred solutions; black lines: set C preferred solutions). (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)


these wells were not used for calibration, we would expect that the results are not as good as for calibration wells, but nevertheless good. The RMSE values of the validation wells are around 1.19-1.44 m, compared to 0.98-1.26 m for the calibration wells. At some well locations (see e.g. well 27 in Fig. 8), there are considerable differences between simulated and observed water levels. These differences may be up to about 3 m. In the Karup catchment, the groundwater table is characterized by a high spatial gradient, up to about 3.5 m per km. Since model results are representative of a 1 × 1 km grid scale, the errors in simulated groundwater levels are within the acceptable limits when scaling uncertainties are taken into account.

The radar plot of RMSE of groundwater levels (Fig. 10) clearly indicates that there is considerable difficulty in reducing the RMSE of well group C represented by well 27. On the other hand, Fig. 5 shows a small standard deviation of the preferred solutions for well 27, indicating a good dynamic description (see also Fig. 8). It can also be noted that there are a few groundwater wells (calibration wells 49 and 55, and validation well 51) that cannot be adequately simulated in terms of temporal dynamics, and the model results always contain some form of phase error as shown in Fig. 11. These wells all belong to group D and are located at the upstream part of a tributary in the western part of the catchment (see Fig. 2). The phase shift is also illustrated in the relatively poor standard deviation results for this group of wells compared to the other well groups (Fig. 5). Thus, this may indicate errors in the model related to the parameterisation or, more generally, model structural errors in this part of the model domain which should be further investigated.

6. Conclusions

A new multi-objective calibration approach incorporating multi-variable, multi-site and multi-response measurements has been proposed in this paper. The methodology consists of two steps: (i) grouping of multi-site measurements into clusters using Kohonen SOM, and (ii) multi-objective optimisation with POGA, which is particularly powerful for dealing with many objectives. The methodology has been demonstrated on calibration of a MIKE SHE model setup of the Karup catchment in Denmark. Measurements of groundwater levels and runoff were considered in the multi-objective context. The groundwater observations were taken from different measurement locations within the catchment, and two different criteria for assessing goodness-of-fit (i.e. RMSE and standard deviations of simulation residuals) were considered.

It was found that clustering played a vital role in the methodology by: (i) reducing the number of representative observation wells by grouping wells in clusters of similar behaviour, and (ii) reducing the dimensionality of the multi-objective calibration problem. Hence, it provided a balanced approach between using a large number of observation well data with individual objective functions and a single objective function that aggregates all data into one performance measure. It was also found that the wells were not clustered solely based on geographical locations but also according to dynamical behaviour of the observed water levels. Although through clustering we were able to reduce the number of objective functions from 18 to 5 (considering only standard deviations of the simulated groundwater residuals) and from 35 to 9 (considering also RMSE of groundwater levels), automatic calibration would not be possible without the use of a powerful multi-objective optimisation algorithm, POGA.

Overall, the results obtained were very encouraging in that the dynamics in the groundwater level observations can be modelled adequately together with the catchment runoff. At the same time, the multi-objective calibration approach provides the modeller a

choice of calibrated parameter sets, which the modeller can select based on other considerations. For some of the simulated groundwater levels that exhibit phase shifts, and which cannot be reduced by the proposed methodology, one should re-examine the observations and the applied model parameterisations of the different process descriptions. This was, however, not further elaborated in the current project.