
Journal of Hydrology 434–435 (2012) 78–94

Contents lists available at SciVerse ScienceDirect

Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol

Comparing methods for estimating flow duration curves at ungauged sites


D.J. Booker, T.H. Snelder
National Institute of Water and Atmospheric Research, PO Box 8602, Riccarton, Christchurch, New Zealand

Article info

Article history:
Received 26 July 2011
Received in revised form 8 January 2012
Accepted 14 February 2012
Available online 22 February 2012
This manuscript was handled by Andras Bardossy, Editor-in-Chief, with the assistance of Erwin Zehe, Associate Editor

Keywords:
Flow duration curves
Ungauged sites
Probability distribution functions
Random forests
Mixed-effects
Dissimilarity modelling

Summary

Flow duration curves (FDCs) are a useful tool for characterising hydrological regimes and flow variability. FDCs observed at 379 gauging stations located across New Zealand were analysed with the aim of investigating how parameterisation and generalisation combine to influence the accuracy of empirically predicted FDCs at ungauged sites. The appropriateness of four strategies for estimating FDCs was compared: (a) parameterise then generalise; (b) parameterise then regionalise then generalise; (c) parameterise and generalise together; and (d) FDC substitution. These strategies were deployed using various combinations of methods for calculating parameters that describe the shape of FDCs (polynomial expressions and probability distribution functions) and then methods for estimating these parameters at ungauged sites using available catchment characteristics (stepwise linear regression and random forests). A parameterise and generalise together strategy was devised by applying a mixed-effects approach. A jack-knife cross-validation procedure was used to provide an independent test of each method for estimating the FDC at ungauged sites. For parameterise then regionalise strategies, it was found that the combination of parameterisation method and generalisation method together, rather than either in isolation, was important in determining overall performance. Results indicated that predictive capability varied between methods and across exceedance percentiles. The mixed-effects approach provided the most parsimonious method for estimating FDCs at ungauged sites. A method using the generalised extreme value probability distribution that was generalised using random forests was the most accurate method of estimating flow duration curves at ungauged sites across New Zealand.
© 2012 Elsevier B.V. All rights reserved.

Corresponding author. Tel.: +64 (0)3 348 8987; fax: +64 (0)3 348 7891.
E-mail addresses: d.booker@niwa.co.nz (D.J. Booker), t.snelder@niwa.co.nz (T.H. Snelder).
0022-1694/$ - see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.jhydrol.2012.02.031

1. Introduction

The flow duration curve (FDC) is a tool used to describe hydrological regimes. The FDC represents the relationship between magnitude and frequency of flow by defining the proportion of time for which any discharge is equalled or exceeded (Vogel and Fennessey, 1994). FDCs represent useful graphical and analytical tools for evaluating flow variability at a particular site. Information contained within the FDC can be used for water resource assessments including hydropower design schemes (Warnick, 1984), reliability of water supply (McMahon, 1993), water quality assessments (Vogel and Fennessey, 1995) and the evaluation of river habitats (Booker and Dunbar, 2004).

FDCs can be calculated from historical discharge time-series where records of sufficient length are available. However, these records are only available where the required data have been observed at gauging stations, and water resource assessments are often required for ungauged locations (Nathan and McMahon, 1992). This creates a need for estimation of hydrological statistics including the FDC at ungauged sites. Examples of attempts to estimate FDCs at ungauged sites include those applied in Canada (LeBoutillier and Waylen, 1993), France (Sauquet and Catalogne, 2011), Greece (Mimikou and Kaemaki, 1985), India (Singh et al., 2001), Italy (Franchini and Suppo, 1996; Castellarin et al., 2004), Turkey (Cigizoglu and Bayazit, 2000), Philippines (Quimpo et al., 1983), Portugal (Croker et al., 2003), South Africa (Smakhtin et al., 1997), Switzerland (Ganora et al., 2009), Taiwan (Yu et al., 2002), United Kingdom (Holmes et al., 2002) and United States (Fennessey and Vogel, 1990). Many of these empirical approaches comprise two steps. First, the shapes of many FDCs are described by calculating values of statistical parameters, a process referred to herein as parameterisation. Second, the shape parameters are related to catchment characteristics. This allows estimation of FDCs at ungauged sites. In this paper we refer to this process as generalisation. Therefore, empirical methods for estimation of FDCs at ungauged sites typically comprise parameterisation and generalisation.

Various methods have been used to estimate FDCs for ungauged sites, including: regional regression approaches (e.g., Nathan and McMahon, 1992); defining regional prediction curves (e.g., Smakhtin et al., 1997); mapping and interpolation of dimensionless flow indices (e.g., Vandewiele and Elias, 1995; Arnell, 1995); deriving properties of synthetic time-series (e.g., Clausen et al., 1994); and
empirical orthogonal functions in combination with a hydrological regionalisation (Sauquet and Catalogne, 2011). Smakhtin (2001) provides a detailed review of these procedures. FDCs for ungauged sites can also be estimated by substituting the unknown FDC with an available observed FDC. For example, Ganora et al. (2009) used a dissimilarity method whereby a matrix describing dissimilarities between all pairs of FDCs was related to a matrix describing dissimilarities between the catchment characteristics of all pairs of gauging stations. This study focused on empirical methods for estimating FDCs at ungauged sites, rather than process-based methods that seek to transfer parameters of physically-based hydrological models (e.g., Wagener and Wheater, 2006; Bárdossy, 2007; Buytaert and Bevan, 2009). The overall aim was to investigate how parameterisation and generalisation combine to influence the accuracy of empirically predicted FDCs at ungauged sites. In order to fulfil this aim the appropriateness of several empirical methods for estimating FDCs at ungauged sites was investigated. The specific objectives of the study were to: (a) compare the accuracy and parsimony of several methods of statistical description of FDCs observed at many gauging stations across New Zealand; (b) compare methods for parameterising these statistical descriptions from catchment characteristics for application at ungauged sites; (c) identify the most accurate method of FDC-substitution for application at ungauged sites; and (d) quantify the uncertainty associated with estimated FDCs for ungauged sites.

2. Data description

2.1. Hydrological data and flow duration curves

FDCs were calculated using mean daily flows observed at 379 gauging stations with publicly available records of five full years or longer. Only gauging stations whose catchments were natural or had only minimal abstraction and impoundment were included in the analysis. See Snelder et al. (2005a) for further details on gauging station selection. These gauging stations were located throughout New Zealand (Fig. 1), and represented a wide range of hydrological conditions (Table 1, Table 2). The observed time-series did not all cover the same time periods (Fig. 2). To allow comparisons between the shapes of FDCs, each was standardised by dividing by the mean flow for that gauging station.

Fig. 1. Map showing the locations and climate category of the gauging stations used in this study. Legend categories: Cool Dry, Cool Wet, Cool Extremely Wet, Warm Dry, Warm Wet, Warm Extremely Wet.

2.2. Catchment characteristics

A GIS representation of the New Zealand river network comprising 550,000 segments, their unique upstream catchments and an associated database of catchment characteristics were used to provide information for each gauging station. The catchment characteristics include a range of categorical and continuous variables (e.g., Table 1) (Snelder and Biggs, 2002; Snelder et al., 2004; Leathwick et al., 2011). The GIS river network and associated databases have previously been used to define a hierarchical classification of New Zealand's rivers called the River Environment Classification (REC; Snelder and Biggs, 2002). These databases provide inventories for river resource analysis and management purposes (Snelder and Hughey, 2005; Leathwick et al., 2011; Clapcott et al., 2010, 2011). They have also been used to create nationwide models for estimating flow statistics such as flood flows (Pearson and McKerchar, 1989), low flows (Pearson, 1995) and mean flow (Woods et al., 2006) at ungauged sites using relationships between these hydrological metrics and catchment characteristics. Snelder et al. (2005b) showed that grouping river segments by nested categorical subdivisions of climate and topography, known as the Source-of-Flow grouping factor (Table 1), provided an a priori hydrological regionalisation.

3. Strategies

Four strategies were employed for estimating flow duration curves at ungauged sites: (a) parameterise then generalise (PG); (b) parameterise then regionalise then generalise (PRG); (c) parameterise and generalise together (PGT); and (d) FDC substitution. Variations in the statistical approaches within these strategies led to the definition of 19 methods which are fully described below and summarised in Table 3.

3.1. Parameterise then generalise

PG strategies comprise two separate steps. In the first step pre-defined characteristics of each observed FDC are quantified by calculating parameter values using parametric methods. This step can be accomplished using regression methods (Fennessey and Vogel, 1990) or by calculating linear moments (Hosking, 1990). In the second step the values calculated for each parameter for all observed FDCs are related to catchment characteristics. This provides a method for estimating FDCs at ungauged sites from known catchment characteristics.

3.2. Parameterise then regionalise then generalise

Hydrological regionalisations attempt to classify river network locations into groups with similar hydrological characteristics (Burn, 1997). Regionalisations can be created either by defining regions a priori (e.g., Snelder and Biggs, 2002) or by classifying
natural flow regimes according to their similarity with respect to hydrological indices (e.g., Isik and Singh, 2008; Snelder et al., 2009; Sadri and Burn, 2011). PRG strategies are those that apply separate parameterise then generalise procedures within groups of a pre-defined hydrological regionalisation.

Table 1
Summary of the defining characteristics, categories and category membership criteria that combine to define Source-of-Flow groupings within the REC.

| Defining characteristic | Categories (notation) | Category membership criteria |
|---|---|---|
| Climate | Warm-extremely-wet (WX); Warm-wet (WW); Warm-dry (WD); Cool-extremely-wet (CX); Cool-wet (CW); Cool-dry (CD) | Warm: mean annual temperature ≥ 12 °C. Cool: mean annual temperature < 12 °C. Extremely Wet: mean annual effective precipitation (a) ≥ 1500 mm. Wet: mean annual effective precipitation > 500 and < 1500 mm. Dry: mean annual effective precipitation ≤ 500 mm |
| Topography | Glacial-mountain (GM); Mountain (M); Hill (H); Low-elevation (L); Lake (Lk) | GM: M and % permanent ice > 1.5%. M: > 50% annual rainfall volume above 1000 m. H: 50% rainfall volume between 400 and 1000 m. L: 50% rainfall below 400 m. Lk: Lake influence index (b) > 0.033 |

(a) Effective precipitation = annual rainfall − annual potential evapotranspiration.
(b) See Snelder and Biggs (2002) for a description.

Table 2
Source-of-Flow groupings within the REC for gauging stations used in this study. Rows are climate categories and columns are topography categories.

| Climate | Glacial-mountain | Mountain | Hill | Lowland | Lake |
|---|---|---|---|---|---|
| Warm Extremely Wet | 0 | 0 | 1 | 2 | 0 |
| Warm Wet | 0 | 0 | 0 | 68 | 0 |
| Warm Dry | 0 | 0 | 0 | 12 | 0 |
| Cool Extremely Wet | 8 | 14 | 39 | 6 | 13 |
| Cool Wet | 2 | 28 | 73 | 31 | 4 |
| Cool Dry | 0 | 1 | 40 | 36 | 1 |

3.3. Parameterise and generalise together

One difficulty with PG strategies, regardless of regionalising, arises because the parameters used to describe the FDCs are not necessarily independent, but they may be generalised independently. This can lead to combinations of parameter values being calculated for new sites that, when combined, produce spurious FDC shapes. The same phenomenon may occur in physically-based hydrological modelling where interpolation of individual parameter values can lead to unreasonable model parameters and results (Bárdossy, 2007). One method for dealing with uncertainties in the parameter transformation process is to relate parameter behaviour to catchment properties and intercatchment similarities by creating model ensembles (Buytaert and Bevan, 2009). In the case of empirical estimation of FDCs, independent generalisation of parameter values could produce FDCs that are not monotonically increasing. One method of overcoming this difficulty is to employ a PGT strategy, such as a mixed-effects approach (Pinheiro and Bates, 2000) in which all FDCs are parameterised and generalised together within the same procedure through specification of both the fixed and random effects.

3.4. FDC substitution

An alternative approach to estimating FDCs for ungauged sites is to substitute with an observed FDC. Various methods can be used to identify which observed FDC should be used as the substitute. These methods include: finding the nearest gauging station to the ungauged site; isolating gauges that might have similar catchment characteristics and then finding the nearest gauging station to the ungauged site; and dissimilarity modelling (e.g., Ganora et al., 2009).

4. Methods

4.1. Parameterise then generalise

Two different statistical approaches to parameterising an FDC were compared for the first step in PG strategies. These approaches were: (a) fitting of various linear regression models; and (b) calculating linear moments and then generating an FDC using various probability distribution functions. For standardised flows the first linear moment, l1, represents the mean and is always equal to 1, the second moment, l2, represents the variance, the third moment, lca, represents skewness and the fourth moment, lkur, represents kurtosis (Hosking, 1990; Vogel and Fennessey, 1993). For the second step in PG strategies, two statistical approaches for generalisation were compared. These approaches were: (a) stepwise linear regression; and (b) random forests.

4.1.1. Linear regression models

Several polynomial equations were fitted to the observed FDCs. For each of the 379 observed FDCs, j, discharge, Q, was standardised by dividing by the mean and then logged to the base 10. Log standardised Q was modelled as a function of exceedance percentile by applying five linear regression models each with a different formulation (Table 4). For these linear regressions the percentage of time that flow was not exceeded was transformed to be normal-reduced, Zi (Fennessey and Vogel, 1990). Use of Zi − Zmin (where Zmin was a constant of 4.66) allowed a base 10 log transformation of the already normal-reduced exceedance percentiles. Use of Zi − 1 allowed the FDC to be centred within the range of percentiles, and therefore the possibility for odd polynomial terms to describe asymptotes at both high and low flows. A Zi − 1 transformation centres the FDC around the 16th flow percentile. In further discussion Eqs. (1)–(5) (Table 4) are referred to as the Linear, Log, Cubic, Squared and Comb3 models respectively. Each linear model was fitted using ordinary least squares linear regression with the assumptions of normal errors and constant variance (Chambers, 1992). The different linear models were compared to determine which of Eqs. (1)–(5) provided the most accurate, yet parsimonious model of the observed FDCs. For each site and each model the Akaike information criterion (AIC), residual standard error and adjusted r-squared were calculated and compared. AIC is a measure of the trade-off between degrees of freedom and fit of the model (Akaike, 1973).
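The per-site fitting and comparison described in Section 4.1.1 can be illustrated with a minimal Python sketch. It is not the authors' code. It fits only the Linear and Cubic formulations (Eqs. (1) and (3) in Table 4) by ordinary least squares and compares them by AIC. The function names, the Weibull plotting position i/(n + 1), and the convention of counting the error variance as a parameter in the AIC are assumptions made for illustration; the sketch also assumes strictly positive flows.

```python
from statistics import NormalDist

import numpy as np


def normal_reduced_fdc(daily_flows):
    """Return (Z, log10(Q / mean Q)) for one site's flow duration curve."""
    q = np.sort(np.asarray(daily_flows, dtype=float))  # ascending flows
    n = q.size
    # Proportion of time flow is not exceeded, using the Weibull plotting
    # position (an assumption; the paper does not state which one was used).
    p = np.arange(1, n + 1) / (n + 1)
    z = np.array([NormalDist().inv_cdf(pi) for pi in p])
    return z, np.log10(q / q.mean())


def fit_ols(design, y):
    """Ordinary least squares fit; returns (coefficients, Gaussian AIC)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    n, k = design.shape
    # Guard against log(0) when the fit is numerically exact.
    rss = max(float(resid @ resid), np.finfo(float).tiny)
    log_lik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    # k regression coefficients plus the error variance count as parameters.
    aic = 2 * (k + 1) - 2 * log_lik
    return coef, aic


def fit_linear_and_cubic(daily_flows, c=1.0):
    """Fit Eq. (1) (Linear) and Eq. (3) (Cubic) to one observed FDC."""
    z, y = normal_reduced_fdc(daily_flows)
    x = z - c  # centred normal-reduced variate, Z_i - c with c = 1
    ones = np.ones_like(x)
    designs = {
        "Linear": np.column_stack([ones, x]),
        "Cubic": np.column_stack([ones, x, x ** 3]),
    }
    return {name: fit_ols(design, y) for name, design in designs.items()}
```

Calling `fit_linear_and_cubic(flows)` on a site's daily flow record returns, for each formulation, the fitted coefficients and an AIC value; the formulation with the lower AIC is preferred for that site under this criterion.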
As an additional aid to finding the most appropriate national-scale equation from the candidate equations, a saturated model was also defined that included many terms that could possibly combine to best model the FDC:

$$\log(Q_i/\bar{Q}) = a_0 + a_1 (Z_i - 1) + a_2 \log(Z_i - Z_{\min}) + a_3 (Z_i - Z_{\min})^2 + a_4 (Z_i - 1)^3 \tag{6}$$

Standard forwards and backwards stepwise linear regression was applied to Eq. (6) to identify the minimal adequate model from the terms included in this saturated model for each FDC separately. The Akaike information criterion (AIC; Akaike, 1973) was used to apply a penalised log likelihood method to evaluate the trade-off between degrees of freedom and fit of the model as explanatory parameters are added or removed (Crawley, 2002). As selecting terms on the basis of AIC alone has been shown to be somewhat liberal in its choice of terms, a value of k = 4 was used for the multiple of the number of degrees of freedom used for the penalty in the stepwise procedure (Venables and Ripley, 2002). Minimal adequate models obtained from stepwise reduction of Eq. (6) for each observed FDC are referred to as Step models in further discussion.

Fig. 2. Time-periods for which data were available from each gauging station (x-axis: Year, 1950 to 2010; y-axis: Sites). Note seven records started before 1950.

4.1.2. Probability distribution functions

Seven probability distribution functions were identified as possible methods for generating FDCs as defined by the distribution of standardised discharge. These probability distribution functions were based on various pre-defined probability distribution functions and their transformations (Table 5), which are themselves calculated from linear moments (Laio et al., 2009). For each of the 379 observed FDCs, the AIC and the Anderson-Darling Criteria (ADC) were calculated for each probability distribution following the method of Laio et al. (2009). These two different model selection criteria were both designed to assess the appropriateness of a distribution to represent hydrological frequency data. The distributions of both AIC and ADC calculated for each probability distribution and for each observed FDC were compared. This comparison was made for all FDCs together and after having grouped FDCs by Source-of-Flow in order to assess the appropriateness of each probability distribution function to summarise FDC characteristics across a range of hydrological settings.

The majority of the 379 observed FDCs exhibited permanent flow, but 33 contained zero flows. The presence of zero flows can be problematic for probability distribution functions requiring log transformations. In these cases zero flows were set to be one tenth of the lowest observed non-zero flow for each FDC.

4.1.3. Generalisation

Each of the parameters from each of the various linear regression models (Eqs. (1)–(5); Table 4), along with the linear moments of raw flow data (required for Eqs. (7)–(10); Table 5) and linear moments of the log transformed flow data (required for Eqs. (11)–(13); Table 5) was modelled independently as a function of available catchment characteristics. This provided several methods for estimating FDCs at ungauged sites. For each parameter the distribution of values fitted to the observed FDCs was modelled as a function of a suite of available independent continuous variables describing the physical characteristics of the upstream catchment (Table 6). These variables were chosen to include characteristics that were likely to influence hydrological processes, but no attempt was made to explicitly model any hydrological processes. Two different statistical methods were used to model each fitted parameter separately. To avoid any subjectivity in the fitting process we chose to use only fully automated statistical techniques.

The first generalisation method used stepwise multiple-linear regression. This method was used to identify the minimal adequate linear model of each parameter as a function of the candidate explanatory variables including all two-way interactions (Table 6). As above, AIC with a value of k = 4 was used to apply a penalised log likelihood method to evaluate the trade-off between degrees of freedom and fit of the model as more explanatory parameters are added into it (Crawley, 2002; Venables and Ripley, 2002). Completion of this process created a statistical model that could be used for estimating each parameter at an ungauged location.

The second generalisation method used random forests (Breiman, 2001). This method uses machine-learning to combine many regression trees to produce more accurate regressions (Cutler et al., 2007). Random forests were used to model each parameter as a function of the explanatory variables (Table 6). A Random Forest model comprises an ensemble of regression trees (a forest) from which a final prediction is based on the predictions averaged over
all trees (Breiman, 2001; Cutler et al., 2007). A random forest model is created by drawing several bootstrap samples from the original training data and fitting a single classification tree to each sample. Independent predictions (i.e. independent of the model fitting procedure) are made for each tree from the observations that were excluded from the bootstrap sample (the out-of-bag, OOB, samples). These predictions are aggregated over all trees (the OOB predictions) and provide an estimate of the predictive performance of the model for new cases (Breiman, 2001). By-products of the random forest calculations include measures of variable importance, which are evaluated by randomly permuting each predictor variable in turn and predicting the response for the OOB observations. The decrease in prediction performance is the measure of importance of the original variable. Importance represents the contribution to accuracy of independent predictions for each explanatory variable and is equivalent to the error resulting from dropping a term from a linear model. Each random forest was developed by growing 500 trees. Because the generalisation error always converges as the number of trees (k) increases, it was assumed that 500 was sufficiently high to ensure convergence.

Table 3
Descriptions of methods used to estimate FDCs.

| Strategy | Method name | Description |
|---|---|---|
| Parameterise then generalise | Linear random forest | Parameterise using a straight line regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded. Generalisation using random forests |
| Parameterise then generalise | Linear step | Parameterise using a straight line regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded. Generalisation using stepwise reduction of linear models with two-way interactions |
| Parameterise then generalise | Comb3 RandomForest | Parameterise using a polynomial regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded, which includes a quadratic and log term. Generalisation using random forests |
| Parameterise then generalise | Comb3 Step | Parameterise using a polynomial regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded, which includes a quadratic and log term. Generalisation using stepwise reduction of linear models with two-way interactions |
| Parameterise then generalise | GEV RandomForest | Parameterise by calculating linear moments of standardised flow. Generalise using random forests. Generate FDC using the GEV probability distribution function |
| Parameterise then generalise | GEV Step | Parameterise by calculating linear moments of standardised flow. Generalisation using stepwise reduction of linear models with two-way interactions. Generate FDC using the GEV probability distribution function |
| Parameterise then generalise | LP3 RandomForest | Parameterise by calculating linear moments of log standardised flow. Generalise using random forests. Generate FDC using the LP3 probability distribution function |
| Parameterise then generalise | LP3 Step | Parameterise by calculating linear moments of log standardised flow. Generalisation using stepwise reduction of linear models with two-way interactions. Generate FDC using the LP3 probability distribution function |
| Parameterise then regionalise then generalise | Linear Reg Then Gen | Parameterise using a straight line regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded. Regionalise using Source-of-Flow classification. Generalisation by calculating the average in each class |
| Parameterise then regionalise then generalise | Comb3 Reg Then Gen | Parameterise using a polynomial regression of log standardised flow against normal-reduced percentage of time that flow is not exceeded, which includes a quadratic and log term. Regionalise using Source-of-Flow classification. Generalisation by calculating the average in each class |
| Parameterise then regionalise then generalise | GEV Reg Then Gen | Parameterise by calculating linear moments of standardised flow. Regionalise using Source-of-Flow classification of the REC. Generalisation by calculating the average in each class. Generate FDC using the GEV probability distribution function |
| Parameterise then regionalise then generalise | LP3 Reg Then Gen | Parameterise by calculating linear moments of log standardised flow. Regionalise using Source-of-Flow classification. Generalisation by calculating the average in each class. Generate FDC using the LP3 probability distribution function |
| Parameterise and generalise together | Mixed | Apply a mixed-effects model. Fixed effects are a polynomial of log standardised flow against normal-reduced percentage of time that flow is not exceeded, which includes a quadratic and log term. Random effects are stream order within Source-of-Flow category |
| FDC substitution | Nearest REC | Find the nearest site (Euclidean distances) within the same Source-of-Flow class |
| FDC substitution | Distance matrix | Substitute using the FDC with the least dissimilar FDC shape parameters |
| FDC substitution | Dissimilarity | Substitute using the FDC that a dissimilarity model predicts to be the least dissimilar, where dissimilarities in FDC shape parameters are estimated from dissimilarities between catchment characteristics |
| FDC substitution | Random gauge | Substitute using a randomly selected FDC |

Table 4
Definitions of equations used to model flow duration curves. Where Z is normal-reduced percent of time flow is exceeded, Q is flow and c is 1.

| Acronym | Linear function | Equation no. |
|---|---|---|
| Linear | $\log(Q_i/\bar{Q}) = a_0 + a_1 (Z_i - c)$ | (1) |
| Log | $\log(Q_i/\bar{Q}) = a_0 + a_1 \log(Z_i - Z_{\min})$ | (2) |
| Cubic | $\log(Q_i/\bar{Q}) = a_0 + a_1 (Z_i - c) + a_2 (Z_i - c)^3$ | (3) |
| Squared | $\log(Q_i/\bar{Q}) = a_0 + a_1 (Z_i - c) + a_2 (Z_i - Z_{\min})^2$ | (4) |
| Comb3 | $\log(Q_i/\bar{Q}) = a_0 + a_1 (Z_i - c) + a_2 (Z_i - Z_{\min})^2 + a_3 (Z_i - c)^3$ | (5) |

Table 5
Definitions of probability distribution functions used to model flow duration curves (after Laio et al. (2009)).

| Distribution description | Acronym | Cumulative distribution function (G) or probability distribution function (g) | Equation no. |
|---|---|---|---|
| Gumbel or extreme value type I | GUMBEL | $G(x, \theta) = \exp[-\exp(-(x - \theta_1)/\theta_2)]$ | (7) |
| Normal or Gaussian | NORM | $g(x, \theta) = \frac{1}{\sqrt{2\pi}\,\theta_2} \exp\left[-\frac{1}{2}\left(\frac{x - \theta_1}{\theta_2}\right)^2\right]$ | (8) |
| Generalised extreme value | GEV | $G(x, \theta) = \exp\left\{-\left[1 - \theta_3 \frac{x - \theta_1}{\theta_2}\right]^{1/\theta_3}\right\}$ | (9) |
| Gamma or Pearson type III | P3 | $g(x, \theta) = \frac{1}{\lvert\theta_2\rvert\,\Gamma(\theta_3)} \left(\frac{x - \theta_1}{\theta_2}\right)^{\theta_3 - 1} \exp\left[-\frac{x - \theta_1}{\theta_2}\right]$ | (10) |
| Frechet or log transformed Gumbel | EV2 | Log transformation of Eq. (7) | (11) |
| Log transformed NORM | LN | Log transformation of Eq. (8) | (12) |
| Log transformed P3 | LP3 | Log transformation of Eq. (10) | (13) |

Table 6
Variables used to generalise the shape of flow duration curves for prediction at ungauged locations.

| Variable name | Description |
|---|---|
| LogusAnRainVar | Coefficient of variation of annual catchment rainfall (m) (logged) |
| usAvTCold | Mean minimum July air temperature (°C) |
| usAvTWarm | Mean January air temperature (°C) |
| LogusRainDays10 | Catchment rain days, greater than 10 mm/month (days/year) (logged) |
| LogusRainDays50 | Catchment rain days, greater than 50 mm/month (days/year) (logged) |
| usPET | Annual potential evapotranspiration of catchment (mm) |
| LogusCatElev | Average elevation in the upstream catchment (m) (logged) |
| usLake | Lake index (dimensionless) |
| LogusCalc | Catchment average of calcium (ordinal scale) (logged) |
| usHard | Catchment average of hardness, induration (ordinal scale) |
| usParticleSize | Catchment average of particle size (ordinal scale) |
| usAveSlope | Catchment average of slope (m/m) |
| usSolarRadWin | Catchment average of winter solar radiation (W/m2) |
| usSolarRadSum | Catchment average of summer solar radiation (W/m2) |
| LogCATCHAREA | Catchment area (m2) (logged) |
| ORDER | Stream order (Strahler stream order) |

4.2. Parameterise then regionalise then generalise

The PG method could have been applied separately within hydrological regions as pre-defined by a modified version of the Source-of-Flow classes (Table 1). However, running regressions (using either linear regression or random forests) with fewer data points than independent variables is nonsensical. Therefore, stations were grouped into classes and the mean of each required FDC shape parameter was calculated for each group. These group-mean parameters were then used to generate a representative FDC for each group. Stations were grouped into hydrological regions as defined by the Source-of-Flow classes (Table 2). For this procedure the Source-of-Flow classes were modified such that no group contained fewer than five stations. To do this: one Cool Dry Lake was placed with the Cool Wet Lakes; one Cool Dry Mountain was placed with the Cool Wet Mountains; two Cool Wet Glacials were placed with the Cool Wet Mountains; and one Warm Extremely Wet Hill and two Warm Extremely Wet Lowlands were placed with the Warm Wet Lowlands.

4.3. Parameterise and generalise together

Linear multilevel models (Snijders and Bosker, 1999), also known as mixed-effects models (Pinheiro and Bates, 2000), were applied to parameterise a linear formulation describing observed FDCs. Mixed-effects models allow consideration of a nested hierarchy of variation in the response variable, and explanatory variables at various levels of that hierarchy. Multilevel models account for correlation of data within each level of the hierarchy. Correct degrees of freedom are retained and unbalanced data (e.g., unequal numbers of sites within classes) do not bias results.

Although the results of multilevel models are initially more difficult to interpret, they provide more information and allow more questions to be answered than PG strategies, which do not consider the nested nature of the data (Gelman and Hill, 2007; Booker and Dunbar, 2008; Booker, 2010). In this context, the results take the form of fixed effects and random effects. Fixed effects model the overall response of Q in relation to Z. Random effects model variation around the overall response. Mixed-effects models were formulated using Eq. (1) to specify fixed effects, and allowed each of the coefficients of Eq. (1) to vary within a hierarchical grouping structure:

$$
\begin{aligned}
&\text{Level 1 (percentile) } i && \log(Q_{ijkl}/\bar{Q}_{jkl}) = a_0 + a_1 (Z_i - c) + \varepsilon_{ikl} \\
&\text{Level 2 (Source-of-Flow) } k && \\
&\quad \text{intercept} && a_{0kl} = a_{0l} + U_{0kl} \\
&\quad \text{slope} && a_{1kl} = a_{1l} + U_{1kl} \\
&\quad \text{intercept random component} && U_{0kl} \sim N(0, \tau_0^2) \\
&\quad \text{slope random component} && U_{1kl} \sim N(0, \tau_1^2) \\
&\text{Level 3 (stream order) } l && \\
&\quad \text{intercept} && a_{0l} = a_0 + U_{0l} \\
&\quad \text{slope} && a_{1l} = a_1 + U_{1l} \\
&\quad \text{intercept random component} && U_{0l} \sim N(0, \tau_0^2) \\
&\quad \text{slope random component} && U_{1l} \sim N(0, \tau_1^2) \\
&\text{Covariances and residual} && \\
&\quad \text{covariance} && \operatorname{cov}(U_{0kl}, U_{1kl}) = \tau \\
&\quad \text{covariance} && \operatorname{cov}(U_{0l}, U_{1l}) = \nu \\
&\quad \text{residual} && \varepsilon_{jklm} \sim N(0, \sigma^2)
\end{aligned}
\tag{14}
$$

Several different hierarchical grouping structures could have been used to describe between-station variability within the FDC data, but we used a grouping structure which nested stream order, k, within the REC Source-of-Flow grouping factor, l (Table 1). It was hypothesised that this grouping structure was likely to represent differences in FDC shapes across the landscape, because Source-of-Flow incorporates both climatic and topographic information and because hydrological differences are likely between catchments of different sizes (e.g., Wiltshire, 1986; Burn, 1997). This model was compared with several reduced models by removing various random effects. The formulation of the random effects was assessed by comparing AIC (Akaike, 1973) calculated for each model. P-values for Wald tests on individual (fixed effects) parameters were also examined.

4.4. FDC substitution

4.4.1. Nearest in REC class

For this method, the unknown FDC at an ungauged site was substituted with an observed FDC by identifying the nearest gauging station belonging to the same Source-of-Flow grouping. The nearest gauging station was defined from Euclidean distances calculated from geographic coordinates. This method therefore used information on mapped locations of sites in combination with an a priori regionalisation of sites based on their catchment characteristics (i.e. Table 1).

4.4.2. Dissimilarity modelling

Dissimilarity modelling involves the regression of a response matrix containing dissimilarities between FDCs calculated from all gauging stations, and several explanatory matrices containing the equivalent dissimilarities between gauging stations based on catchment characteristics (Legendre et al., 1994). Specifically, for n gauging stations there are n(n − 1)/2 dissimilarities that correspond to the off-diagonal entries in a dissimilarity matrix. We defined dissimilarities between all pairs of linear moments (l2, lca and lkur) calculated from observed FDCs using the Manhattan distance
metric. Although more complex regression methods can be used (e.g., Ferrier et al., 2007; Lichstein, 2007), we used multiple linear regression to relate the FDC dissimilarities to the dissimilarities calculated from the catchment characteristics (Table 6). Thus, the model expressed the FDC dissimilarity ($\mu$) as a function of the catchment characteristic dissimilarities in the form:

$$\mu = \beta_0 + \beta_1 \eta_1 + \beta_2 \eta_2 + \beta_3 \eta_3 + \cdots + \beta_n \eta_n \qquad (15)$$

where $\eta_i$ are dissimilarities derived from the ith catchment characteristic.

An improvement to the dissimilarity modelling method of Ganora et al. (2009) was included to account for non-linear relationships between FDC dissimilarity and dissimilarities in catchment characteristics. The formulation shown in Eq. (15) models dissimilarities between FDCs as linear combinations of dissimilarities between catchment characteristics. However, the rate of change in dissimilarity between gauging stations is likely to vary in a non-linear manner with respect to changes in catchment characteristics. For example, dissimilarity between FDCs is likely to increase rapidly with change in catchment size for small catchments but this rate of change is likely to decrease as catchments become larger. The dissimilarity model fitting procedure tested a range of parametric transformations of the catchment characteristic dissimilarities and chose those with the strongest correlation with the response. These transformations consisted of applying a log to the base 10, applying a double log transformation and expansion of the original values either by raising them to the power of 2 or 3 (Snelder et al., 2009).

Table 7
Percentage of terms maintained in the model by the stepwise procedure by Source-of-Flow. The intercept, a0, was retained in all cases.

Source-of-Flow grouping        a1(Z − 1)   a2 log(Z − Zmin)   a3(Z − Zmin)^2   a4(Z − 1)^3
Cool Dry Hill                  88          92                 92               85
Cool Dry Lake                  100         100                100              100
Cool Dry Lowland               94          97                 86               94
Cool Dry Mountain              100         100                100              100
Cool Extremely Wet Glacial     50          88                 100              100
Cool Extremely Wet Hill        92          95                 100              100
Cool Extremely Wet Lake        100         77                 92               100
Cool Extremely Wet Lowland     100         100                100              100
Cool Extremely Wet Mountain    93          79                 93               100
Cool Wet Glacial               100         100                100              100
Cool Wet Hill                  90          93                 89               93
Cool Wet Lake                  100         100                100              50
Cool Wet Lowland               94          90                 87               100
Cool Wet Mountain              86          96                 96               82
Warm Dry Lowland               100         75                 92               100
Warm Extremely Wet Hill        100         100                100              100
Warm Extremely Wet Lowland     50          100                100              100
Warm Wet Lowland               87          90                 84               87

Model fitting used a forward stepwise procedure to iteratively add catchment characteristics to the model. A five-fold cross validation (CV) was used to determine model complexity, i.e., to determine the justifiable number of predictors to include in the model and prevent over fitting (Snelder et al., 2009). In this CV procedure, five mutually exclusive subsets of the full dataset each containing

Fig. 3. Distributions of AIC and adjusted r-squared for all sites (n = 379) and various model formulations. [Two panels: AIC and adjusted r², each plotted by model name (Linear, Log, Cubic, Squared, Comb3, Stepwise).]

Fig. 4. L-kurtosis against L-skewness for each standardised flow duration curve by Source-of-Flow class and stream order (n = 379). Lines represent various probability distribution functions. [One panel per Source-of-Flow class; x-axis: L-skewness; y-axis: L-kurtosis; lines: generalized extreme-value, Pearson type III, generalized Pareto, generalized logistic and log normal; points by stream order 1 to 7.]

20% of the stations were selected randomly. For each cross validation subset, models were fitted to the remaining 80% of the data by incrementally increasing model complexity. At each increment, predictions were made for the 20% of withheld stations for which the predictive performance of the model was evaluated. The complexity of the final model was determined as the number of predictors that produced the maximum average predictive performance. The fitted model was then used to calculate the catchment characteristic dissimilarities between new sites and all possible substitute gauging stations and the gauging station whose dissimilarity was least was used as the substitute FDC.

4.4.3. Distance matrix

A Manhattan distance matrix representing differences between all pairs of FDC linear moments (l2, lca and lkur) was calculated. For each FDC, a substitute FDC was chosen by finding the paired FDCs with the smallest dissimilarity. These estimated FDCs represent those that would have been estimated for ungauged sites assuming that the catchment characteristics could explain all differences between all pairs of linear moments. In other words, they represent the best estimation of the FDC that could possibly have been achieved using dissimilarity modelling. This method is referred to as the FDC Distance Matrix method in further analysis.

4.4.4. Random substitution

For comparison with the above statistical approaches, a further method for estimating the FDC at an ungauged site was defined. For this method, the unknown FDC at an ungauged site was substituted with a randomly chosen observed FDC. This method was
Fig. 5. Anderson–Darling criteria (ADC) and Akaike Information Criteria (AIC) for different probability distributions by hydrological Source-of-Flow. [Panels: ALL (n = 379), Glacial (n = 10), Hill (n = 153), Lake (n = 18), Lowland (n = 155), Mountain (n = 43); y-axis: probability distribution function (EV2, GEV, GUMBEL, LN, LP3, NORM, P3); x-axes: ADC on a log scale, and AIC.]

used to provide a measure of estimation performance through comparison with the other methods.

4.5. Model testing

A jack-knife cross-validation procedure (Efron, 1982) was used to provide a test of each method for estimating the FDC at ungauged sites. For each method, this cross-validation procedure was applied by leaving out all data associated with each of the 379 gauging stations and then estimating the FDC for the left-out gauging station using data from all remaining gauging stations. The results from this procedure produced estimates of each FDC for each method as if that gauging station were an ungauged site (Ganora et al., 2009). These jack-knifed comparisons allowed an assessment of both the robustness and reliability of each method for FDC estimation at ungauged sites (Castellarin et al., 2004).

After having calculated these jack-knifed estimated FDCs, three methods were employed for assessing different aspects of correspondence between observed and estimated FDCs. The first method was to calculate root-mean-square-deviance (RMSD) for each FDC estimation method for each FDC.

$$\mathrm{RMSD}_j = \sqrt{\frac{\sum_{i=1}^{n_j} \left(\log Q_{\mathrm{obs},ij} - \log Q_{\mathrm{est},ij}\right)^2}{n_j}} \qquad (16)$$

RMSDj represents a measure of the overall difference between observed and estimated FDC over all exceedance percentiles, i, for each FDC, j. Zero values for either observed or estimated flows were excluded from calculation of RMSD. Therefore, nj is the number of percentiles with positive values for both observed and estimated FDCs.

For the second method, error bands around each flow percentile were calculated by subtracting log estimated standardised flow from log observed standardised flow for each FDC estimation method for each FDC for each flow percentile. Again, zero values for either observed or estimated flows could not be included in this analysis. These error bands were plotted around the mean FDC calculated over all FDCs. As zero flows could not be included in either the first or second methods for model testing, the third method was designed to assess correspondence between observed and estimated proportion of time for which flows were zero. Observed and estimated proportions of time for which flows were zero were compared.

Many of the FDC estimation methods did not permit zero flows to be estimated. We set a threshold of standardised flow at a very low value of 0.0001, below which all estimated standardised flows were set to zero. Only 9% of sites had any standardised observed flows that were below this value. The proportion of sites with zero flows was also 9%.

Where possible FDCs were also calculated using parameters fitted directly from the observed FDC data (i.e. with no generalisation or regionalisation). These fitted FDCs represent those derived using linear models and probability distribution functions that would have been achieved assuming a perfect method for generalising all required parameters. In other words, they represent the best estimation of the FDC that could possibly have been achieved using PG methods. These FDC estimates are referred to as Fitted in further analysis.

Fig. 6. Fitted versus jack-knifed predictions of various example parameters used to describe the shape of FDCs calculated using different methods. PRG refers to the parameterise then regionalise then generalise method. [Rows of panels for Log l2, Linear a1, l2 and Comb3 a1; columns for RandomForest, Stepwise and PRG; x-axis: Fitted; y-axis: Predicted.]

5. Results

5.1. Parameterise then generalise

5.1.1. Linear regression models

Summary statistics describing the fitted performance of linear model formulations (Eqs. (1)–(6)) to each FDC were compared (Fig. 3). Adjusted r² increased consistently across FDCs as terms were added to model formulations. AIC was reduced as terms were added across the suite of model formulations. Reduced AIC for the Comb3 model in comparison with the Squared model indicated that adding a cubic term provided a more appropriate formulation of more FDCs, and therefore a better fitting formulation for the FDCs. This finding was supported by a reduction in residual standard error for 94% of the FDCs when modelled using the Comb3 model rather than the Squared model. AIC was reduced most when a stepwise procedure was used because this procedure is designed to minimise AIC.

Models resulting from stepwise reduction of Eq. (6) for each observed FDC showed that there was some between-FDC variation in the combination of terms that best fitted the shape of the log-normal FDC (Table 7). The most complicated stepwise models included all five possible terms; however, many reduced models excluded the log term. Table 7 showed no systematic patterns in the inclusion of each of the terms with Source-of-Flow categories.

These results indicated that most, but not all, of the 379 observed FDCs were better fitted by more complicated linear models. The greatest reduction in AIC was made when squared and cubic, but not log, terms were used together. The Comb3 model (Eq. (5)) was therefore identified as a candidate national-scale model from amongst the available linear models (Eqs. (1)–(5)).
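As a concrete illustration of the first model-testing method in Section 4.5, the RMSD of Eq. (16) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' code: the function name `rmsd_log`, the use of base-10 logarithms and the five-value example FDC are all assumptions made for illustration; only the 0.0001 threshold and the exclusion of zero flows come from the text.

```python
import math

ZERO_THRESHOLD = 1e-4  # threshold from the text: smaller estimates are treated as zero


def rmsd_log(observed, estimated, threshold=ZERO_THRESHOLD):
    """RMSD between log10 observed and log10 estimated standardised flows.

    Estimated flows below `threshold` are set to zero, and percentiles where
    either value is zero are dropped, so n_j counts the remaining pairs.
    Base-10 logs are an assumption; the text only says "log".
    """
    sq_diffs = []
    for q_obs, q_est in zip(observed, estimated):
        if q_est < threshold:
            q_est = 0.0
        if q_obs <= 0.0 or q_est <= 0.0:
            continue  # zero flows cannot be logged
        sq_diffs.append((math.log10(q_obs) - math.log10(q_est)) ** 2)
    if not sq_diffs:
        return float("nan"), 0
    n_j = len(sq_diffs)
    return math.sqrt(sum(sq_diffs) / n_j), n_j


# Hypothetical five-percentile FDC (the study uses up to 1001 percentiles).
obs = [10.0, 1.0, 0.1, 0.0, 0.0]
est = [1.0, 1.0, 0.1, 0.00005, 0.01]
value, n_used = rmsd_log(obs, est)
```

In this toy case only the first three percentiles contribute, because the last two involve a zero observed flow or an estimate below the threshold, so `n_used` is 3 rather than 5.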
Table 8
Summary statistics for mixed-effects models of log standardised flow against normal-reduced exceedance percentile. Each model has a different structure for the random effects. Values in brackets indicate standard errors.

Parameter          Symbol  Model 0          Model 1          Model 2          Model 3          Model 4          Model 5
Fixed effects
  Intercept        a0      0.143 (0.0082)   0.135 (0.0130)   0.144 (0.0071)   0.143 (0.0073)   0.178 (0.0310)   0.141 (0.0005)
  Slope            a1      0.370 (0.0288)   0.354 (0.0278)   0.370 (0.0282)   0.353 (0.0277)   0.386 (0.0004)   0.352 (0.0297)
Random effects
 Source-of-Flow
  Intercept        s0      0.025            0.026            0.030            0.031            –                –
  Slope            s1      0.109            0.118            0.106            0.118            –                –
  Corr             s       0.874            0.982            0.634            0.495            –                –
 Stream order
  Intercept        t0      0.046            0.099            –                –                0.131            –
  Slope            t1      0.094            –                0.100            –                –                0.126
  Corr             t       0.136            –                –                –                –                –
 Residual          e       0.216            0.225            0.218            0.237            0.252            0.238
AIC                        −82805.866       −52929.033       −79265.201       −16232.443       29563.896        −13367.834
BIC                        −82708.331       −52853.173       −79189.340       −16167.420       29607.245        −13324.485
Log likelihood             41411.933        26471.517        39639.600        8122.222         −14777.948       6687.917
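The relationship between the log-likelihood and the information criteria in Table 8 can be checked with a short worked example. The sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. Two points are assumptions rather than statements from the text: the Model 0 log-likelihood is taken as positive, and the parameter count k = 9 is obtained by counting every fixed-effect, random-effect and residual term listed in the table.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Log-likelihood magnitude as tabulated for Model 0, taken as positive (assumption).
log_lik_model0 = 41411.933
k_model0 = 9  # assumed count: 2 fixed effects + 6 random-effect terms + residual

aic_model0 = aic(log_lik_model0, k_model0)
```

Under these assumptions `aic_model0` is about −82805.866, the same magnitude as the AIC tabulated for Model 0, and lower (more negative) values indicate the preferred model.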

Table 9
Rank of predictor variable importance for each random forest.

Predictor variable    Response parameter
                      Linear       Comb3                    Q            Log Q
                      a0    a1     a0    a1    a2    a3     l2    lca    l1    l2    lca
LogusAnRainVar        1     14     2     13    12    1      10    3      11    13    13
usAvTCold             7     5      12    2     4     8      6     8      4     4     4
usAvTWarm             5     4      5     4     9     5      5     11     8     11    6
LogusRainDays10       3     8      4     11    14    3      4     5      12    7     2
LogusRainDays50       12    12     8     10    8     11     14    12     7     6     8
usPET                 2     7      7     12    15    3      8     9      6     9     10
LogusCatElev          4     1      6     9     4     5      1     1      1     1     6
usLake                7     10     16    7     11    14     12    4      3     3     15
LogusCalc             13    13     11    16    10    7      16    15     13    14    8
usHard                15    11     14    7     3     10     12    14     14    12    12
usParticleSize        7     2      15    1     7     12     2     2      2     2     3
usAveSlope            10    9      9     3     2     13     3     6      10    9     11
usSolarRadWin         5     6      1     5     6     9      7     13     5     8     5
usSolarRadSum         11    3      3     6     1     2      9     9      8     5     1
LogCATCHAREA          14    15     10    15    16    16     11    7      16    15    14
OrderNumeric          16    16     13    14    13    15     15    16     15    16    16

Fig. 7. Comparison of catchment areas, and source of flow category for FDCs being substituted (x-axis) and substitute FDCs (y-axis) chosen using various methods. [Panels: Nearest Gauge REC, Random Site, Dissimilarity, Distance Matrix; both axes: Log (Catchment area (m2)); legend: different class, same class.]

5.1.2. Probability distribution functions

There were strong interrelations between the L-moments calculated from each FDC (Fig. 4). The appropriateness of various probability functions can be assessed by comparing empirical patterns between L-moments and theoretical relationships defined by the probability distribution functions (Hosking and Wallis, 1997). This is because probability distribution functions are defined by relationships between L-moments (Eqs. (7)–(10)) or log transformations of these data (Eqs. (11)–(13)) (Table 5: Laio et al., 2009). For example, a relatively small change in L-kurtosis with changes in L-skewness is indicative of a Pearson type III (P3: Eq. (10)) distribution. A greater rate of change or more curvature, as in Fig. 4, is indicative of a generalised extreme value (GEV: Eq. (9)) or a log Pearson Type III (LP3) distribution respectively (Ganora et al., 2009). Visual inspection of Fig. 4 indicates a relatively tight relationship between L-kurtosis and L-skewness for the 379 observed FDCs, with different river types (as defined by Source-of-Flow) occupying different locations along this relationship. There was a lack of systematic differences in the relationship between L-kurtosis and L-skewness across stream order.

Both AIC and ADC values indicated that the Gumbel (Eq. (9)) and normal (NORM: Eq. (10)) distributions were the least appropriate distributions for the 379 observed FDCs (Fig. 5). AIC values indicated that FDCs from rivers predominately fed by lakes were equally well described by all distributions other than the NORM distribution. However, ADC distributions indicated that these FDCs were better described by the LP3 and GEV distributions. When all sites were considered, regardless of Source-of-Flow class, there was great overlap in AIC values calculated for the Frechet (EV2), GEV, LN, LP3 and P3 distributions. There was less overlap in the distribution of ADC, with the LP3 and GEV exhibiting the lowest spread of ADC values. The LP3 and GEV distributions reduced ADC more than any other distribution for 49% and 25% of the observed FDCs respectively. Further analysis therefore concentrated on the LP3 and GEV probability distribution functions, as both AIC and ADC values indicated that these were candidates for probability functions from which generalised FDCs could be generated (Fig. 5). PG and PRG of linear moments calculated from raw standardised FDC data was implemented to enable calculation of GEV
Fig. 8. Root-mean-square-deviance in predicted log standardised flow for all sites (n = 379) across all flow percentiles (n = 1001 unless zero flow are either predicted or observed) using various estimation methods. Reg Then Gen refers to parameterise then regionalise then generalise strategies, where Linear, Comb3, GEV or LP3 refers to the parameterisation method. [y-axis: model description (Fitted, RandomForest, Step and Reg Then Gen variants of Linear, Comb3, GEV and LP3; Fitted Mixed; Mixed; Nearest REC; Distance matrix; Dissimilarity; Random Gauge); x-axis: root mean square deviance in predicted log standardised flow.]

Fig. 9. Root-mean-square-deviance in predicted log standardised flow across all flow percentiles (n = 1001 unless zero flow are either predicted or observed) by climate category and stream order using the GEV Random Forest method. [Panels: Cool Dry, Cool Wet, Cool Extremely Wet, Warm Dry, Warm Wet, Warm Extremely Wet; y-axis: stream order; x-axis: root mean square deviance in predicted log standardised flow.]

distributions, and linear moments calculated from the log transformed standardised FDC data was implemented to enable calculation of LP3 distributions.

5.1.3. Generalisation

Jack-knifing procedures were used to re-calculate each parameter required for: (a) the Linear model; (b) the Comb3 model; (c) L-moments of raw observed FDC data; and (d) L-moments of logged observed FDC data. Each parameter was calculated from catchment characteristics (Table 6) using two jack-knifed regression methods: (a) stepwise linear regression with two-way interactions; and (b) random forests. Comparison between fitted and jack-knifed calculated parameter values showed that parameter values for more sites were more accurately calculated using random forests, but that extreme values were more accurately calculated by linear stepwise models. This is because random forests are unable to predict outside of the range of observed values (Fig. 6).

Variable importance was calculated for each explanatory variable for each response that was generalised using random forests. Results indicated that there were differences in importance between explanatory variables within each response (Table 9). There were some consistent patterns between modelled responses, for example, upstream catchment elevation (LogusCatElev) had the highest importance for five of the eleven response variables and particle size in the upstream catchment (usParticleSize) had the second highest importance for six of the eleven responses. However, variation in rainfall (LogusAnRainVar), summer radiation (usSolarRadSum) and potential evapotranspiration (usPET) also had high importance for some responses. This indicates that different FDC characteristics may be linked with different catchment characteristics and that there may be considerable interactions between explanatory predictors within these random forests models.
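The remark that random forests cannot predict outside the range of observed values follows from how tree ensembles work: every prediction is an average of training responses. The sketch below illustrates this with a deliberately simplified stand-in, an ensemble of bootstrap-fitted one-split regression trees, compared against an ordinary least-squares line. It is not the random forest implementation used in the study, and all names and data (`fit_stump`, `bagged_predict`, `linear_predict`, the ten-point dataset) are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x identical: fall back to the overall mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def bagged_predict(xs, ys, x_new, n_trees=50, seed=0):
    """Average the predictions of stumps fitted to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        threshold, left_mean, right_mean = fit_stump(
            [xs[i] for i in idx], [ys[i] for i in idx]
        )
        preds.append(left_mean if x_new <= threshold else right_mean)
    return sum(preds) / len(preds)


def linear_predict(xs, ys, x_new):
    """Ordinary least-squares line, which extrapolates freely."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return my + (sxy / sxx) * (x_new - mx)


xs = [float(x) for x in range(1, 11)]  # hypothetical catchment characteristic
ys = [2.0 * x for x in xs]             # hypothetical FDC shape parameter
forest_at_20 = bagged_predict(xs, ys, 20.0)
linear_at_20 = linear_predict(xs, ys, 20.0)
```

For a new site at x = 20, well beyond the training range, the linear model extrapolates to 40, whereas the ensemble prediction cannot exceed the largest training response of 20. This is consistent with extreme parameter values being better recovered by the stepwise linear models.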

Fig. 10. Observed minus predicted standardised flow (n = 379) at each flow percentile (n = 1001) plotted around the averaged observed FDC. Note that points with either zero observed or predicted flows cannot be plotted. Reg Then Gen refers to parameterise then regionalise then generalise strategies, where Linear, Comb3, GEV or LP3 refers to the parameterisation method. [One panel per estimation method; x-axis: percent of time that flow is not exceeded; y-axis: observed Log (Q/Qbar) − predicted Log (Q/Qbar); legend: 80th, 90th and 95th percentiles and mean FDC.]
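The error bands summarised in Fig. 10 could be computed along the following lines. This is a sketch under stated assumptions, not the authors' procedure: a band is taken here to be the central interval containing a chosen share of the between-site residuals at each flow percentile, logs are base 10, and the function names `quantile` and `error_band` and the three-site dataset are hypothetical.

```python
import math


def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list, 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def error_band(observed_fdcs, predicted_fdcs, coverage=0.8):
    """Per-percentile central band of log10(observed) - log10(predicted).

    Each FDC is a list of standardised flows, one value per flow percentile.
    Pairs with a non-positive flow are skipped because they cannot be logged.
    Returns a list of (lower, upper) tuples, or None where no pair is usable.
    """
    tail = (1 - coverage) / 2
    bands = []
    n_percentiles = len(observed_fdcs[0])
    for i in range(n_percentiles):
        residuals = sorted(
            math.log10(obs[i]) - math.log10(pred[i])
            for obs, pred in zip(observed_fdcs, predicted_fdcs)
            if obs[i] > 0 and pred[i] > 0
        )
        if residuals:
            bands.append((quantile(residuals, tail), quantile(residuals, 1 - tail)))
        else:
            bands.append(None)
    return bands


# Three hypothetical sites, two flow percentiles each.
observed = [[10.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
predicted = [[1.0, 1.0], [10.0, 0.5], [100.0, 10.0]]
bands = error_band(observed, predicted, coverage=0.8)
```

At the second percentile one site has a zero observed flow, so the band there rests on two residuals only, which mirrors the caveat in the caption that zero flows cannot be plotted.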
5.2. Parameterise then regionalise then generalise

Comparison between fitted parameter values and those calculated from the jack-knifed estimations after regionalisation showed that generalisation by averaging within Source-of-Flow groups did not result in improved parameter predictions (Fig. 6). This suggests that, although separating the analysis by river type may have isolated FDCs into groups with more consistent relationships between FDC shape parameters and catchment characteristics, there was still considerable within group variation in parameter values (Fig. 6).

5.3. Parameterise and generalise together

The appropriateness of several mixed-effects models for estimating logged standardised Q as a function of Z was assessed by comparing model summaries. Six mixed-effects models, each with a different formulation of the random-effects, are shown here as examples (Table 8). AIC, BIC (Bayesian Information Criterion: Schwarz, 1978) and model residuals indicated that inclusion of stream order nested within Source-of-Flow for both terms in Eq. (1) created improvements in explanatory power in comparison with employing either of these grouping factors in isolation. Likelihood ratio tests showed that Model 0 (Eq. (14)) had significantly greater explanatory power than any of the reduced models (p < 0.001). Comparisons between the fixed and random effects for Model 0 indicated that there were considerable variations in a0 and a1 between stream orders within Source-of-Flow groups (Table 8). These results also indicated differences in model performance with changes in the formulation of random-effects, with Model 0 outperforming other formulations. P-values for Wald tests on individual parameters (intercept: F = 1106, p < 0.0001, slope: F = 164, p < 0.001) showed that both fixed effects were significant. Model 0 was therefore used as an example PGT strategy in further analysis.

5.4. FDC substitution

Fig. 7 indicates the degree to which each FDC-substitution method was able to select substitute FDCs from gauging stations with matching catchment characteristics in terms of Source-of-Flow category (Table 1) and catchment area. The Nearest in REC class method identified substitute FDCs by selecting the FDC from the nearest gauging station from within the same Source-of-Flow category. Since no additional criteria were used, some FDCs were substituted with FDCs whose gauging stations had very different catchment areas. The Distance matrix method identified pairs of FDCs with the most similar L-moments. Spread away from the one-to-one line and the prevalence of FDCs paired with FDCs from gauging stations with different Source-of-Flow classes indicates that similar FDC shapes can result from different catchment characteristics. The Dissimilarity method produced a tight fit around the one-to-one line and many FDCs were paired with FDCs from gauging stations with the same Source-of-Flow classes. This was because the dissimilarity method is designed to identify differences in FDCs that can be explained by differences in catchment characteristics.

5.5. Model testing

RMSD between observed and estimated log standardised Q was calculated from various FDC estimation methods (Fig. 8). Results from fitted models indicated that the Comb3 model and LP3 distribution were better able to represent the observed FDCs in comparison with both the Linear model and the GEV distribution. Results suggested that, given perfect generalisation, the LP3 model would have best matched the observed FDCs in comparison with the other methods used to apply a PG strategy. However, after generalisation, RMSD was lower for the Linear model and GEV distribution than for the Comb3 model and LP3 distribution respectively, regardless of generalisation method. This indicated that, although the Comb3 model and LP3 distribution were better able to describe the shapes of the observed FDCs, generalisation of the parameters required for these methods was less reliable than was the case for both the Linear model and the GEV distribution.

For the PG strategies, RMSD was reduced most when parameters were generalised using random forests, rather than stepwise linear models, regardless of FDC parameterisation method. Generalisation using random forests also out-performed the PRG strategy, regardless of FDC parameterisation method.

RMSD was slightly more for the mixed-effects model than for the PG strategies that were generalised using random forests. However, the mixed-effects model required less input information (only Source-of-Flow and stream order) as opposed to the random forest methods which took all variables listed in Table 6 as input. RMSD for the mixed-effects method was greater than that for the dissimilarity method and the nearest in REC class method.

RMSD was not the same across river sizes or catchment types (e.g., Fig. 9). For the GEV Random Forest method, RMSD was generally less for larger rivers and rivers in wetter climates. Although inequality in sample size makes it difficult to draw firm conclusions, these results indicate that links between flow variability and catchment characteristics are less consistent for smaller rivers and for catchments with drier climates.

Error bands were calculated across percentiles and plotted in relation to the average of the 379 observed FDCs. Results indicated that error was not the same across exceedance percentiles (Fig. 10). Estimated standardised FDCs exhibited wider error bands at lower flows than at higher flows when plotted in log space. Results agreed with calculated RMSD in that there was less error for the fitted Comb3 model in comparison with either the fitted LP3 or fitted GEV models. This was even the case for very low flows, where the fitted Comb3 model still performed well. The GEV model tended to overestimate flows in the medium range and underestimate lower flows, whereas the LP3 method did not exhibit systematic over- or under-estimation. Errors increased considerably after jack-knifed generalisation for all four methods that employed PG strategies. This was particularly the case when linear stepwise regression with two-way interactions was used to generalise the Comb3 model, which exhibited nearly as much error as choosing to substitute with a random gauge. Results suggested that, in this case, random forests produced smaller errors in comparison to stepwise linear regression.

Errors for the methods used to employ a PRG strategy were an improvement over replacement with an FDC from a random gauge, but errors were still considerable, especially at lower flows. This shows that averaging within Source-of-Flow classes only captures limited between-site variation in the shapes of FDCs. This finding matches well with the comparison of various formulations of the random-effects within mixed-effects models, which showed considerable improvements when stream order was nested within Source-of-Flow (Table 8).

After jack-knifing, errors associated with the mixed-effects predictions were comparable with the other methods (Fig. 10). However, the mixed-effects model tended to overestimate low flows in a few FDCs with steep slopes. This indicates both a departure from a linear relationship in log-normal space, and that information on just Source-of-Flow together with stream order could be used to explicate between-FDC variation for many, but not all, sites.

The FDC distance matrix method represents the best results that would be achieved if a model explaining all differences in
Fig. 11. Observed versus predicted (using various methods) proportion of zero flows for 379 FDCs. [One panel per estimation method; x-axis: observed percentage of time with zero flow; y-axis: estimated percentage of time with zero flow; shading: counts.]

FDC L-moments with differences in catchment characteristics was employed within a dissimilarity model (i.e. a perfect dissimilarity model). This is equivalent to the fitted performance for the dissimilarity model. The difference in errors between FDCs estimated using the distance matrix method and the dissimilarity method can be attributed to the degree to which the dissimilarity model cannot explain differences in FDC L-moments using differences in catchment characteristics. Results indicate that, had the dissimilarity model been able to identify all pairs of stations whose FDCs were most similar using catchment characteristics, this method would have out-performed all other methods for the majority (e.g., 80%) of sites. However, the performance of the particular dissimilarity method implemented here was similar to that of the relatively simple Nearest in REC method (Fig. 10).

Neither comparison of RMSD nor plotting of errors across percentiles allowed assessment of the ability of each estimation method to estimate zero flows. This was because these assessments were made in log space, and zero flows cannot be logged. Estimated zero flows were defined as those calculated to be below 0.0001. The number of observed zero flow days for each FDC was compared with that calculated by the various methods (Fig. 11). There was strong correspondence between observed zero flow days and those calculated using the Fitted Comb3 model, Fitted LP3 distribution, and to a lesser extent the Fitted GEV distribution. A considerable loss in performance was evident after jack-knifed generalisation, with little difference in the performance of the random forest method in comparison to the stepwise linear method. However, the generalised GEV and LP3 methods as well as the Nearest in REC and
dissimilarity methods were all considerable improvements on previously published results for estimating number of zero flow days in New Zealand (Pearson, 1995). Methods that employed PRG strategies consistently under-estimated the number of zero flow days. This is further evidence to support the presence of within Source-of-Flow class variation in FDC characteristics. All methods that assumed linearity of the FDC in log-normal space, including the mixed-effects model, showed an inability to estimate any zero flows.

6. Discussion

This work indicated that, after jack-knifing, the shapes of some FDCs were hard to estimate regardless of the methods used. This may be caused by a combination of several factors. First, an attempt was made to use data from catchments that were reasonably natural, but despite these efforts, the dataset may have contained sites whose hydrological regimes were influenced by human activities (e.g., abstraction or water storage). Second, data on catchment characteristics were extracted from available national-scale databases. Improved spatial resolution may have produced more accurate representations of the true catchment characteristics, and therefore produced more accurate estimates of FDCs. Third, some hydrological processes may be difficult to generalise empirically given available information on catchment characteristics, the number of sites, and range of hydrological conditions represented by the dataset. Fourth, it was assumed that the sample of 379 gauging stations was broadly representative of the range of catchments found throughout New Zealand. It was assumed that these gauging stations represented various locations along a continuum of catchments and hydrological regimes found across New Zealand. Therefore a jack-knifing procedure was used to quantify errors for each FDC as if it were an ungauged site, rather than repeatedly holding out sets of sites, as would be done for a cross-validation procedure (Picard and Cook, 1984). This assumption of representativeness may not have held. For example, larger catchments may have been over-represented. Fifth, records of various lengths and covering various time periods were used. This is not ideal since extreme events are more likely to appear in longer records. In this study, consistency of record length and period of record coverage were sacrificed in order to include more FDCs with greater spatial coverage. It was assumed that any errors caused by variation in record length did not bias comparison between FDC parameterisation methods, or generalisation methods.

Throughout the analysis we concentrated on the log-normal standardised FDCs. All estimated standardised FDCs could be transformed into units of m³ s⁻¹ by raising 10 to the power of the estimated value and multiplying by mean flow. Estimates of mean flow at ungauged sites in New Zealand are available (Woods et al., 2006). Thus our strategies assume that an accurate method of estimating mean flow at ungauged sites was available. However, any errors in estimating mean flow would affect the estimated FDC.

Although purely empirical approaches that did not include any attempt to include physical processes were used, many of the patterns that were found were physically meaningful. For example, l2 (the second linear moment) and a1 (the slope of the flow against percentile) both decreased with increasing catchment area and elevation. This is indicative of the less flashy flow regimes expected in larger and lower elevation catchments. Furthermore, of the available explanatory variables, catchment elevation, catchment area, potential evapotranspiration and the number of rain days with rainfall over 10 mm generally had high importance for the random forest models of each parameter. This is consistent with previous broad-scale descriptions of hydrology across New Zealand (e.g., Toebes and Palmer, 1969; Woods et al., 2006). Patterns in flood flows (Pearson and McKerchar, 1989), low flows (Pearson, 1995) and flow variability (Jowett and Duncan, 1990) across New Zealand have been previously linked to precipitation, potential evaporation and catchment area, with more variable flow regimes present in higher elevation, wetter and smaller catchments.

Stepwise linear regression for generalisation performed poorly in comparison with random forests. This may be because complex non-linear relationships and high-order interactions exist between FDC parameters and combinations of catchment characteristics. After jack-knifed generalisation, the Comb3 method performed poorly in comparison with the GEV and LP3 methods. This may have been because each parameter was generalised independently, even though the parameters were not independent of each other. In contrast, the mixed-effects model was able to estimate sets of parameters together and therefore provide estimated FDCs that better matched the observed data. For this reason, the performance of the mixed-effects model was similar to that for the other methods despite the mixed-effects model requiring less input information. In this case, stream order nested within Source-of-Flow was used to specify random-effects within mixed-effects models to illustrate that mixed-effects could be used as a PRT strategy. Various alternative grouping factors could have been used to specify the random-effects. For example, a geology categorisation of the gauging stations could have been added. The REC database also contains categorical data describing the dominant valley landform and landcover (Snelder and Biggs, 2002). Alternative combinations of random-effects were not trialled to avoid further increasing the degrees of freedom within the mixed-effects model, and to avoid debate regarding the nested nature of these categorical variables.

7. Conclusion

There are several different strategies and many methods that can be used to estimate FDCs at ungauged sites. For parameterise then regionalise strategies, it was found that the combination of parameterisation method and generalisation method together, rather than either in isolation, was important in determining the overall performance. Results indicated that predictive performance varied between methods and across exceedance percentiles. The mixed-effects approach provided the most parsimonious method for estimating FDCs at ungauged sites. A method using the generalised extreme value probability distribution function generalised using random forests was the most accurate method of estimating flow duration curves at ungauged sites across New Zealand.

Acknowledgements

This research was funded by the New Zealand Ministry of Science and Innovation, Environmental Flows Programme (C01X1004). We thank Maurice Duncan, Eric Sauquet and an anonymous referee for comments on earlier drafts of this manuscript.

References

Akaike, H., 1973. Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (Eds.), Second International Symposium on Information Theory, Akademiai Kiado, Budapest, pp. 267-281.
Arnell, N.W., 1995. Grid mapping of river discharge. J. Hydrol. 167, 39-56.
Bárdossy, A., 2007. Calibration of hydrological model parameters for ungauged catchments. Hydrol. Earth Syst. Sci. 11, 703-710.
Booker, D.J., 2010. Predicting width in any river at any discharge. Earth Surf. Proc. Land. 35, 828-841.
Booker, D.J., Dunbar, M.J., 2004. Application of Physical HAbitat SIMulation (PHABSIM) modelling to modified urban river channels. River Res. Appl. 20, 167-183.
Booker, D.J., Dunbar, M.J., 2008. Predicting river width, depth and velocity at ungauged sites in England and Wales using multilevel models. Hydrol. Process. 22, 4049-4057.
Breiman, L., 2001. Random forests. Machine Learning 45, 15-32.
Burn, D.H., 1997. Catchment similarity for regional flood frequency analysis using seasonality measures. J. Hydrol. 202, 212-230.
Buytaert, W., Beven, K., 2009. Regionalization as a learning process. Water Resour. Res. 45, W11419. doi:10.1029/2008WR007359.
Castellarin, A., Galeati, G., Brandimarte, L., Montanari, A., Brath, A., 2004. Regional flow-duration curves: reliability for ungauged basins. Adv. Water Resour. 27, 953-965.
Chambers, J.M., 1992. Linear models. In: Chambers, J.M., Hastie, T.J. (Eds.), Statistical Models in S. Wadsworth & Brooks/Cole.
Cigizoglu, H.K., Bayazit, M., 2000. A generalized seasonal model for flow duration curves. Hydrol. Process. 14, 1053-1067.
Clapcott, J.E., Young, R.G., Goodwin, E.O., Leathwick, J.R., 2010. Exploring the response of functional indicators of stream health to land-use gradients. Freshw. Biol. 55, 2181-2199.
Clapcott, J.E., Collier, K.J., Death, R.G., Goodwin, E.O., Harding, J.S., Kelly, D., Leathwick, J.R., Young, R.G., 2011. Quantifying relationships between land-use gradients and structural and functional indicators of stream ecological integrity. Freshw. Biol. doi:10.1111/j.1365-2427.2011.02696.x.
Clausen, B., Young, A.R., Gustard, A., 1994. Modelling the impact of groundwater abstractions on low-river flow. In: Seuna, P., Gustard, A., Arnell, N.W., Cole, G.A. (Eds.), FRIEND: Flow Regimes from International Experimental and Network Data, IAHS Publication No. 221, pp. 77-85.
Crawley, M.J., 2002. Statistical Computing: An Introduction to Data Analysis Using S-Plus. Wiley, Chichester, UK, 761pp.
Croker, K.M., Young, A.R., Zaidman, M.D., Rees, H.G., 2003. Flow duration curve estimation in ephemeral catchments in Portugal. Hydrol. Sci. J. 48, 427-439.
Cutler, D.R., Edwards, T.C., Beard, K.H., Cutler, A., Hess, K.T., Gibson, J., Lawler, J.J., 2007. Random forests for classification in ecology. Ecology 88, 2783-2792.
Efron, B., 1982. The Jackknife, The Bootstrap and Other Resampling Plans. Society for Industrial and Applied Mathematics, Philadelphia, PA.
Fennessey, N.M., Vogel, R.M., 1990. Regional flow-duration curves for ungauged sites in Massachusetts. J. Water Resour. Plann. Manage. (ASCE) 116, 531-549.
Ferrier, S., Manion, G., Elith, J., Richardson, K., 2007. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Divers. Distrib. 13, 252-264.
Franchini, M., Suppo, M., 1996. Regional analysis of flow duration curves for a limestone region. Water Resour. Manage. 10, 199-218.
Ganora, D., Claps, P., Laio, F., Viglione, A., 2009. An approach to estimate nonparametric flow duration curves in ungauged basins. Water Resour. Res. 45, W10418. doi:10.1029/2008WR007472.
Gelman, A.G., Hill, J., 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, 625pp.
Holmes, M.G.R., Young, A.R., Gustard, A., Grew, R., 2002. A region of influence approach to predicting flow duration curves within ungauged catchments. Hydrol. Earth Syst. Sci. 6, 721-731.
Hosking, J.R.M., 1990. L-moments: analysis and estimation of distributions using linear combinations of order statistics. J. Roy. Statist. Soc. B (Methodol.) 52, 105-124.
Hosking, J.R.M., Wallis, J.R., 1997. Regional Frequency Analysis. Cambridge University Press, Cambridge, UK.
Isik, S., Singh, V.P., 2008. Hydrologic regionalization of watersheds in Turkey. J. Hydrol. Eng. 13, 824-834.
Jowett, I.G., Duncan, M.J., 1990. Flow variability in New Zealand rivers and its relationship to in-stream habitat and biota. New Zeal. J. Marine Freshwater Res. 24, 305-317.
Laio, F., Di Baldassarre, G., Montanari, A., 2009. Model selection techniques for the frequency analysis of hydrological extremes. Water Resour. Res. 45, W07416. doi:10.1029/2007WR006666.
Leathwick, J.R., Snelder, T., Chadderton, W.L., Elith, J., Julian, K., Ferrier, S., 2011. Use of generalised dissimilarity modelling to improve the biological discrimination of river and stream classifications. Freshw. Biol. 56, 21-38.
LeBoutillier, D.V., Waylen, P.R., 1993. A stochastic model of flow duration curves. Water Resour. Res. 29, 3535-3541.
Legendre, P., Lapointe, F.J., Casgrain, P., 1994. Modeling brain evolution from behavior: a permutational regression approach. Evolution 48, 1487-1499.
Lichstein, J.W., 2007. Multiple regression on distance matrices: a multivariate spatial analysis tool. Plant Ecol. 188, 117-131.
McMahon, T.A., 1993. Hydrologic design for water use. In: Maidment, D.R. (Ed.), Handbook of Hydrology. McGraw-Hill, New York.
Mimikou, M., Kaemaki, S., 1985. Regionalization of flow duration characteristics. J. Hydrol. 82, 77-91.
Nathan, R.J., McMahon, T.A., 1992. Estimating low flow characteristics in ungauged catchments. Water Resour. Manage. 6, 85-100.
Pearson, C.P., 1995. Regional frequency analysis of low flows in New Zealand rivers. J. Hydrol. (NZ) 30, 53-64.
Pearson, C.P., McKerchar, A.I., 1989. Flood estimation - a revised procedure. Trans. Inst. Profess. Eng. New Zeal. 16 (2/CE), 59-65.
Picard, R., Cook, D., 1984. Cross-validation of regression models. J. Am. Statist. Assoc. 79, 575-583.
Pinheiro, J.C., Bates, D.M., 2000. Mixed-Effects Models in S and S-Plus. Springer-Verlag, New York, p. 541.
Quimpo, R.G., Alejandrino, A.A., McNally, T.A., 1983. Regionalised flow duration curves for Philippines. J. Water Resour. Plann. Manage. (ASCE) 109, 320-330.
Sadri, S., Burn, D.H., 2011. A fuzzy C-means approach for regionalization using a bivariate homogeneity and discordancy approach. J. Hydrol. doi:10.1016/j.jhydrol.2011.02.027.
Sauquet, E., Catalogne, C., 2011. Comparison of catchment grouping methods for flow duration curve estimation at ungauged sites in France. Hydrol. Earth Syst. Sci. 15, 2421-2435.
Schwarz, G., 1978. Estimating the dimension of a model. Ann. Statist. 6, 461-464.
Singh, R.D., Mishra, S.K., Chowdhary, H., 2001. Regional flow-duration models for large number of ungauged Himalayan catchments for planning microhydro projects. J. Hydrol. Eng. 6, 310-316.
Smakhtin, V.U., 2001. Low flow hydrology: a review. J. Hydrol. 240, 147-186.
Smakhtin, V.Y., Hughes, D.A., Creuse-Naudine, E., 1997. Regionalization of daily flow characteristics in part of the Eastern Cape, South Africa. Hydrol. Sci. J. 42, 919-936.
Snelder, T.H., Biggs, B.J.F., 2002. Multi-scale river environment classification for water resources management. J. Am. Water Resour. Assoc. 38, 1225-1240.
Snelder, T.H., Hughey, K.F.D., 2005. On the use of an ecological classification to improve water resource planning in New Zealand. Environ. Manage. 36, 741-756.
Snelder, T.H., Cattaneo, F., Suren, A.M., Biggs, B.J.F., 2004. Is the river environment classification an improved landscape-scale classification of rivers? J. North Am. Benthol. Soc. 23, 580-598.
Snelder, T.H., Woods, R., Biggs, B.J.F., 2005a. Improved eco-hydrological classification of rivers. River Res. Appl. 21, 609-628.
Snelder, T.H., Leathwick, J.R.L., Dey, K.L., 2005b. Definition of the Multivariate Environment River Classification: Freshwater Environments of New Zealand. NIWA Client Report: CHC2005-049. Department of Conservation, New Zealand.
Snelder, T.H., Lamouroux, N., Leathwick, J.R., Pella, H., Sauquet, E., Shankar, U., 2009. Predictive mapping of the natural flow regimes of France. J. Hydrol. 373, 57-67.
Snijders, T.A.B., Bosker, R.J., 1999. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modelling. Sage, London, p. 266.
Toebes, C., Palmer, B.R., 1969. Hydrological Regions of New Zealand. New Zealand Ministry of Works, Miscellaneous Hydrological Publication No. 4, p. 45.
Vandewiele, G.L., Elias, A., 1995. Monthly water balance of ungauged catchments obtained by geographical regionalization. J. Hydrol. 170, 277-291.
Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, fourth ed. Springer, New York, p. 495.
Vogel, R.M., Fennessey, N.M., 1993. L moment diagrams should replace product moment diagrams. Water Resour. Res. 29, 1745-1752.
Vogel, R.M., Fennessey, N.M., 1994. Flow duration curves I: new interpretation and confidence intervals. J. Water Resour. Plann. Manage. (ASCE) 120, 485-504.
Vogel, R.M., Fennessey, N.M., 1995. Flow duration curves II: a review of applications in water resources planning. J. Am. Water Resour. Assoc. 31, 1029-1039.
Wagener, T., Wheater, H.S., 2006. Parameter estimation and regionalization for continuous rainfall-runoff models including uncertainty. J. Hydrol. 320, 132-154.
Warnick, C.C., 1984. Hydropower Engineering. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, pp. 59-73.
Wiltshire, E.E., 1986. Regional flood frequency analysis I: homogeneity statistics. Hydrol. Sci. J. 31, 321-333.
Woods, R.A., Hendrikx, J., Henderson, R.D., Tait, A.B., 2006. Estimating mean flow of New Zealand rivers. J. Hydrol. (NZ) 45, 95-110.
Yu, P.S., Yang, T.C., Wang, Y.C., 2002. Uncertainty analysis of regional flow duration curves. J. Water Resour. Plann. Manage. (ASCE) 128, 424-430.
