
Hydrologic Statistics

Reading: Chapter 11, Sections 12-1 and 12-2 of Applied Hydrology

04/04/2006
Probability
A measure of how likely an event is to occur
A number expressing the ratio of favorable outcomes to all possible outcomes
Probability is usually represented as P(.)
P (getting a club from a deck of playing cards) = 13/52 = 0.25 = 25 %
P (getting a 3 after rolling a die) = 1/6
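The two ratios above can be checked with exact fractions. This is a minimal sketch; the helper name `probability` is an illustrative choice, and it assumes all outcomes are equally likely.

```python
from fractions import Fraction

def probability(favorable, possible):
    """Ratio of favorable outcomes to all equally likely outcomes."""
    return Fraction(favorable, possible)

p_club = probability(13, 52)   # 13 clubs in a 52-card deck
p_three = probability(1, 6)    # one face out of six on a fair die

print(float(p_club), p_three)
```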
Random Variable
Random variable: a quantity used to represent
probabilistic uncertainty
Incremental precipitation
Instantaneous streamflow
Wind velocity
Random variable (X) is described by a probability
distribution
A probability distribution is the set of probabilities associated with the values in a random variable's sample space
Sampling terminology
Sample: a finite set of observations $x_1, x_2, \ldots, x_n$ of the random variable
A sample comes from a hypothetical infinite population
possessing constant statistical properties
Sample space: set of possible samples that can be drawn from a
population
Event: subset of a sample space
Example
Population: streamflow
Sample space: instantaneous streamflow, annual
maximum streamflow, daily average streamflow
Sample: 100 observations of annual max. streamflow
Event: daily average streamflow > 100 cfs
Summary statistics
Also called descriptive statistics
If $x_1, x_2, \ldots, x_n$ is a sample, then
Mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$ ($\mu$ for continuous data)
Variance: $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$ ($\sigma^2$ for continuous data)
Standard deviation: $S = \sqrt{S^2}$ ($\sigma$ for continuous data)
Coeff. of variation: $CV = \frac{S}{\bar{X}}$
Also included in summary statistics are the median, skewness, and correlation coefficient, among others.
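As a rough illustration, the mean, variance, standard deviation, and coefficient of variation can be computed as follows. The function name and the sample values are hypothetical and are not data from the lecture; the variance divides by n - 1.

```python
import math

def summary_statistics(x):
    """Return (mean, variance, standard deviation, CV) of a sample."""
    n = len(x)
    mean = sum(x) / n
    # Sample variance divides by n - 1.
    variance = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    std = math.sqrt(variance)
    cv = std / mean
    return mean, variance, std, cv

# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance, std, cv = summary_statistics(sample)
print(f"mean={mean:.3f} var={variance:.3f} std={std:.3f} cv={cv:.3f}")
```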
Graphical display
Time Series plots
Histograms/Frequency distribution
Cumulative distribution functions
Flow duration curve
Time series plot
Plot of variable versus time (bar/line/points)
Example. Annual maximum flow series

[Figure: time series of annual max flow (10^3 cfs) versus year, 1905 to 1998, Colorado River near Austin]
Histogram
Plots of bars whose height is the number $n_i$, or fraction ($n_i/N$), of data falling into one of several intervals of equal width
[Figures: histograms of annual max flow (10^3 cfs) versus no. of occurrences, for interval = 50,000 cfs, interval = 25,000 cfs, and interval = 10,000 cfs]
Dividing the number of occurrences in each interval by the total number of points gives the probability mass function
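A minimal sketch of the counting behind a histogram and its relative frequencies is given below. The flow values are hypothetical (in 10^3 cfs), the function name is illustrative, and two conventions are assumed: values are non-negative, and each interval includes its lower edge but not its upper edge.

```python
def relative_frequencies(data, width, n_bins):
    """Count values per equal-width interval and convert counts to fractions."""
    counts = [0] * n_bins
    for value in data:
        # Interval k covers [k*width, (k+1)*width); anything beyond the
        # last edge is lumped into the final interval.
        k = min(int(value // width), n_bins - 1)
        counts[k] += 1
    total = len(data)
    fractions = [c / total for c in counts]
    return counts, fractions

# Hypothetical annual maximum flows in 10^3 cfs.
flows = [12, 48, 55, 60, 95, 110, 140, 160, 210, 480]
counts, fractions = relative_frequencies(flows, width=50, n_bins=10)
print(counts)
print(fractions)
```

Changing `width` shows the same effect as the three interval sizes in the figures: narrower intervals give lower, more ragged bars.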
Using Excel to plot histograms
1) Make sure the Analysis ToolPak is added in Tools.
This will add the Data Analysis command to Tools
2) Fill one column with the data, and another with
the intervals (e.g. for a 50 cfs interval, fill
0, 50, 100, ...)
3) Go to Tools > Data Analysis > Histogram
4) Organize the plot in a presentable form
(change fonts, scale, color, etc.)
Probability density function
Continuous form of probability mass function is probability
density function
[Figures: histogram of annual max flow (10^3 cfs) versus no. of occurrences, and the corresponding probability versus annual max flow (10^3 cfs)]
The pdf is the first derivative of the cumulative distribution function
Cumulative distribution function
Cumulate the pdf to produce a cdf
The cdf describes the probability that a random variable is less than or equal to a specified value of x
[Figure: cumulative probability versus annual max flow (10^3 cfs)]
P (Q ≤ 50,000) = 0.8
P (Q ≤ 25,000) = 0.4
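For a sample, the same idea can be sketched as an empirical cdf: the fraction of observations less than or equal to a given value. The data below are hypothetical flows in 10^3 cfs, and the function name is an illustrative choice.

```python
def empirical_cdf(data, x):
    """Fraction of observations less than or equal to x."""
    return sum(1 for value in data if value <= x) / len(data)

# Hypothetical annual maximum flows in 10^3 cfs.
flows = [12, 48, 55, 60, 95, 110, 140, 160, 210, 480]

p_50 = empirical_cdf(flows, 50)    # P(Q <= 50 x 10^3 cfs)
p_100 = empirical_cdf(flows, 100)  # P(Q <= 100 x 10^3 cfs)
print(p_50, p_100)
```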
Hydrologic extremes
Extreme events
Floods
Droughts
The magnitude of extreme events is inversely related to their frequency of occurrence:
$\text{Magnitude} \propto \frac{1}{\text{Frequency of occurrence}}$
The objective of frequency analysis is to relate the magnitude of events to their frequency of occurrence through a probability distribution
It is assumed that the events (data) are independent and come from an identical distribution
Return Period
Random variable: $X$
Threshold level: $x_T$
Extreme event occurs if: $X \ge x_T$
Recurrence interval: $\tau$ = time between occurrences of $X \ge x_T$
Return period: $E(\tau)$
Average recurrence interval between events equalling or exceeding a threshold
If p is the probability of occurrence of an extreme event, then
$E(\tau) = T = \frac{1}{p}$
or
$P(X \ge x_T) = \frac{1}{T}$
More on return period
If p is the probability of success, then (1-p) is the probability of failure
Find the probability that ($X \ge x_T$) at least once in N years.
$p = P(X \ge x_T)$
$P(X < x_T) = (1 - p)$
$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - P(X < x_T \text{ all } N \text{ years})$
$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - (1 - p)^N = 1 - \left(1 - \frac{1}{T}\right)^N$
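The result 1 - (1 - 1/T)^N is easy to evaluate directly. The sketch below assumes independent years, as in the derivation; the function name and the 100-year, 50-year inputs are illustrative choices.

```python
def prob_at_least_once(T, N):
    """Probability that the T-year event is equalled or exceeded at least once in N years."""
    p = 1.0 / T                    # annual exceedance probability
    return 1.0 - (1.0 - p) ** N    # complement of "no exceedance in all N years"

# Illustrative inputs: a 100-year event over a 50-year horizon.
risk = prob_at_least_once(T=100, N=50)
print(f"{risk:.3f}")
```

For N = 1 the expression reduces to p = 1/T, which is a quick sanity check.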
Hydrologic data series
Complete duration series
All the data available
Partial duration series
Magnitude greater than base value
Annual exceedance series
Partial duration series with # of values = # years
Extreme value series
Includes largest or smallest values in equal intervals
Annual series: interval = 1 year
Annual maximum series: largest values
Annual minimum series: smallest values
Return period example
Dataset: annual maximum discharge for 106 years on Colorado River near Austin
[Figure: time series of annual max flow (10^3 cfs) versus year, 1905 to 1998]
$x_T$ = 200,000 cfs
No. of occurrences = 3
2 recurrence intervals in 106 years
T = 106/2 = 53 years

If $x_T$ = 100,000 cfs
7 recurrence intervals
T = 106/7 = 15.1 yrs

$P(X \ge 100{,}000 \text{ cfs at least once in the next 5 years}) = 1 - (1 - 1/15.1)^5 = 0.29$
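The arithmetic of this example can be reproduced as below. The sketch follows the slide's approach of dividing the record length by the number of recurrence intervals; the function name is illustrative, and the count of 8 exceedances for 100,000 cfs is an assumption inferred from the 7 recurrence intervals stated above.

```python
def return_period(record_years, n_exceedances):
    """Record length divided by the number of recurrence intervals."""
    n_intervals = n_exceedances - 1   # k exceedances bound k - 1 intervals
    if n_intervals < 1:
        raise ValueError("need at least two exceedances to form an interval")
    return record_years / n_intervals

T_200 = return_period(106, 3)   # 3 occurrences, 2 recurrence intervals
T_100 = return_period(106, 8)   # assumed 8 occurrences, 7 recurrence intervals

# Chance of at least one exceedance of 100,000 cfs in the next 5 years.
risk_5yr = 1 - (1 - 1 / T_100) ** 5
print(T_200, round(T_100, 1), round(risk_5yr, 2))
```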
Probability distributions
Normal family
Normal, lognormal, lognormal-III
Generalized extreme value family
EV1 (Gumbel), GEV, and EVIII (Weibull)
Exponential/Pearson type family
Exponential, Pearson type III, Log-Pearson type III
Normal distribution
Central limit theorem: if X is the sum of n independent and identically distributed random variables with finite variance, then with increasing n the distribution of X approaches a normal distribution regardless of the distribution of the random variables
pdf for normal distribution
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$
$\mu$ is the mean and $\sigma$ is the standard deviation
Hydrologic variables such as annual precipitation, annual average streamflow, or annual average pollutant loadings tend to follow a normal distribution
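A small sketch of the normal pdf, written directly from its standard form with mean `mu` and standard deviation `sigma`. The function name and the precipitation numbers are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical annual precipitation: mean 40 in, standard deviation 8 in.
density_at_mean = normal_pdf(40.0, mu=40.0, sigma=8.0)
print(density_at_mean)
```

The density is symmetric about the mean, so `normal_pdf(32, 40, 8)` and `normal_pdf(48, 40, 8)` agree.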
Standard Normal distribution
A standard normal distribution is a normal distribution with mean ($\mu$) = 0 and standard deviation ($\sigma$) = 1
Normal distribution is transformed to standard normal distribution by using the following formula:
$z = \frac{X - \mu}{\sigma}$
z is called the standard normal variable
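As an illustration of the transformation, the sketch below standardizes a value and evaluates the standard normal cdf through the error function, which is one common way to avoid a lookup table. The function names and input numbers are hypothetical.

```python
import math

def standardize(x, mu, sigma):
    """Standard normal variable z for a value x."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical annual precipitation: mean 40 in, standard deviation 8 in.
z = standardize(48.0, mu=40.0, sigma=8.0)
p = standard_normal_cdf(z)   # P(X <= 48 in)
print(z, round(p, 4))
```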
Lognormal distribution
If the pdf of X is skewed, it is not normally distributed
If the pdf of Y = log (X) is normally distributed, then X is said to be lognormally distributed.
$f(x) = \frac{1}{x \sigma_y \sqrt{2\pi}} \exp\left[-\frac{(y - \mu_y)^2}{2\sigma_y^2}\right], \quad x > 0, \text{ and } y = \log x$
Hydraulic conductivity and the distribution of raindrop sizes in a storm follow a lognormal distribution.
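A sketch of the lognormal pdf in terms of the mean and standard deviation of y = log x. It assumes natural logarithms, which the slide does not state; the function name and test values are illustrative.

```python
import math

def lognormal_pdf(x, mu_y, sigma_y):
    """Lognormal density; mu_y and sigma_y describe y = ln(x) (natural log assumed)."""
    if x <= 0:
        return 0.0   # the density is defined only for x > 0
    y = math.log(x)
    exponent = -((y - mu_y) ** 2) / (2.0 * sigma_y ** 2)
    return math.exp(exponent) / (x * sigma_y * math.sqrt(2.0 * math.pi))

print(lognormal_pdf(1.0, mu_y=0.0, sigma_y=1.0))
```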
Extreme value (EV) distributions
Extreme values: maximum or minimum values of sets of data
Annual maximum discharge, annual minimum
discharge
When the number of selected extreme values
is large, the distribution converges to one of
the three forms of EV distributions called Type
I, II and III
EV type I distribution
Let $M_1, M_2, \ldots, M_n$ be a set of daily rainfall or streamflow values, and let $X = \max(M_i)$ be the maximum for the year. If the $M_i$ are independent and identically distributed, then for large n, X has an extreme value type I or Gumbel distribution.
$f(x) = \frac{1}{\alpha} \exp\left[-\frac{x-u}{\alpha} - \exp\left(-\frac{x-u}{\alpha}\right)\right]$
$\alpha = \frac{\sqrt{6}\, s_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$
Distribution of annual maximum streamflow follows an EV1 distribution
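The EV1 parameters can be estimated from a sample mean and standard deviation using the two relations on this slide: the scale is sqrt(6) times the standard deviation divided by pi, and the location is the mean minus 0.5772 times the scale. The sketch below assumes that reading of the formulas; the function names and input values are hypothetical.

```python
import math

def gumbel_parameters(mean, std):
    """Scale alpha and location u from the sample mean and standard deviation."""
    alpha = math.sqrt(6.0) * std / math.pi
    u = mean - 0.5772 * alpha
    return alpha, u

def gumbel_pdf(x, alpha, u):
    """EV1 (Gumbel) density for maxima."""
    z = (x - u) / alpha
    return math.exp(-z - math.exp(-z)) / alpha

# Hypothetical annual maximum flows: mean 100, standard deviation 50 (10^3 cfs).
alpha, u = gumbel_parameters(100.0, 50.0)
print(alpha, u, gumbel_pdf(u, alpha, u))
```

The density peaks at x = u, where it equals exp(-1) divided by the scale.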
EV type III distribution
If $W_i$ are the minimum streamflows in different days of the year, let $X = \min(W_i)$ be the smallest. X can be described by the EV type III or Weibull distribution.
$f(x) = \left(\frac{k}{\alpha}\right) \left(\frac{x}{\alpha}\right)^{k-1} \exp\left[-\left(\frac{x}{\alpha}\right)^{k}\right], \quad x > 0; \ \alpha, k > 0$
Distribution of low flows (e.g. 7-day min flow) follows EV3 distribution.
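A sketch of the two-parameter Weibull density follows. The parameter names `scale` and `k` (shape) are illustrative labels for the two positive parameters of the distribution, and the inputs are hypothetical.

```python
import math

def weibull_pdf(x, scale, k):
    """Weibull (EV3) density for x > 0 with positive scale and shape k."""
    if x <= 0:
        return 0.0
    ratio = x / scale
    return (k / scale) * ratio ** (k - 1) * math.exp(-(ratio ** k))

# With k = 1 the Weibull density reduces to an exponential density.
print(weibull_pdf(2.0, scale=2.0, k=1.0))
```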
Exponential distribution
Poisson process: a stochastic process in which the numbers of events occurring in two disjoint subintervals are independent random variables.
In hydrology, the interarrival time (time between stochastic hydrologic events) is described by the exponential distribution
$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0; \quad \lambda = \frac{1}{\bar{x}}$
Interarrival times of polluted runoffs, rainfall intensities, etc. are described by the exponential distribution.
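The rate parameter of the exponential distribution is the reciprocal of the mean interarrival time, which the sketch below estimates from a sample. The function names and the interarrival times (in days) are hypothetical.

```python
import math

def exponential_rate(interarrival_times):
    """Rate estimated as 1 / (mean interarrival time)."""
    return len(interarrival_times) / sum(interarrival_times)

def exponential_pdf(x, lam):
    """Exponential density for x >= 0 with rate lam."""
    if x < 0:
        return 0.0
    return lam * math.exp(-lam * x)

# Hypothetical times between storm events, in days.
times = [2.0, 4.0, 6.0]
lam = exponential_rate(times)   # mean is 4 days, so the rate is 0.25 per day
print(lam, exponential_pdf(4.0, lam))
```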
Gamma Distribution
The time taken for a number of events ($\beta$) in a Poisson process is described by the gamma distribution
Gamma distribution: a distribution of the sum of $\beta$ independent and identical exponentially distributed random variables.
Skewed distributions (e.g. hydraulic conductivity) can be represented using gamma without log transformation.
$f(x) = \frac{\lambda^{\beta} x^{\beta-1} e^{-\lambda x}}{\Gamma(\beta)}, \quad x \ge 0; \quad \Gamma = \text{gamma function}$
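A sketch of the gamma density using the standard library's gamma function. Here `lam` is the rate of the underlying Poisson process and `beta` is the number of events; both names are illustrative, and the function is meant for x > 0 (at x = 0 with beta < 1 the density is unbounded and the code raises an error).

```python
import math

def gamma_pdf(x, lam, beta):
    """Gamma density with rate lam and shape beta, for x >= 0."""
    if x < 0:
        return 0.0
    return lam ** beta * x ** (beta - 1) * math.exp(-lam * x) / math.gamma(beta)

# With beta = 1 (a single event) the gamma density is the exponential density.
print(gamma_pdf(2.0, lam=0.5, beta=1.0))
```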
Pearson Type III
Named after the statistician Pearson, it is also called the three-parameter gamma distribution. A lower bound is introduced through the third parameter ($\epsilon$)
$f(x) = \frac{\lambda^{\beta} (x-\epsilon)^{\beta-1} e^{-\lambda (x-\epsilon)}}{\Gamma(\beta)}, \quad x \ge \epsilon; \quad \Gamma = \text{gamma function}$
It is also a skewed distribution, first applied in hydrology for describing the pdf of annual maximum flows.
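Since the Pearson Type III distribution is a gamma distribution shifted by a lower bound, its density can be sketched by evaluating the gamma form at the distance above that bound. The names `lam`, `beta`, and `eps` (the lower bound) are illustrative.

```python
import math

def pearson3_pdf(x, lam, beta, eps):
    """Pearson Type III density: a gamma density shifted to start at eps."""
    if x < eps:
        return 0.0   # no probability below the lower bound
    s = x - eps
    return lam ** beta * s ** (beta - 1) * math.exp(-lam * s) / math.gamma(beta)

print(pearson3_pdf(3.0, lam=1.0, beta=2.0, eps=2.0))
```

Setting `eps = 0` recovers the two-parameter gamma density.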
Log-Pearson Type III
If log X follows a Pearson Type III distribution, then X is said to have a log-Pearson Type III distribution
$f(x) = \frac{\lambda^{\beta} (y-\epsilon)^{\beta-1} e^{-\lambda (y-\epsilon)}}{x\,\Gamma(\beta)}, \quad y = \log x \ge \epsilon$
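A sketch of the log-Pearson Type III density: the Pearson Type III form is evaluated at y = log x and divided by x, the factor that comes from the change of variable. Natural logarithms are assumed here, which the slide does not state; the parameter names are illustrative.

```python
import math

def log_pearson3_pdf(x, lam, beta, eps):
    """Log-Pearson Type III density of x, with y = ln(x) >= eps (natural log assumed)."""
    if x <= 0:
        return 0.0
    y = math.log(x)
    if y < eps:
        return 0.0
    s = y - eps
    # Pearson Type III density of y, divided by x for the change of variable.
    return lam ** beta * s ** (beta - 1) * math.exp(-lam * s) / (x * math.gamma(beta))

print(log_pearson3_pdf(math.e, lam=1.0, beta=2.0, eps=0.0))
```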
