
04/04/2006

Hydrologic Statistics

Reading: Chapter 11, Sections 12-1 and 12-2 of Applied Hydrology
Probability
A measure of how likely an event is to occur
A number expressing the ratio of favorable
outcomes to all possible outcomes
Probability is usually represented as P(.)
P (getting a club from a deck of playing cards) = 13/52 = 0.25 = 25 %
P (getting a 3 after rolling a die) = 1/6

Random Variable
Random variable: a quantity used to represent
probabilistic uncertainty
Incremental precipitation
Instantaneous streamflow
Wind velocity
Random variable (X) is described by a probability
distribution
A probability distribution is a set of probabilities
associated with the values in a random variable's sample
space

Sampling terminology
Sample: a finite set of observations x1, x2, ..., xn of the random
variable
A sample comes from a hypothetical infinite population
possessing constant statistical properties
Sample space: set of possible samples that can be drawn from a
population
Event: subset of a sample space
Example
Population: streamflow
Sample space: instantaneous streamflow, annual
maximum streamflow, daily average streamflow
Sample: 100 observations of annual max. streamflow
Event: daily average streamflow > 100 cfs
Hydrologic extremes
Extreme events
Floods
Droughts
Magnitude of extreme events is related to their
frequency of occurrence
$\text{Magnitude} \propto \dfrac{1}{\text{Frequency of occurrence}}$
The objective of frequency analysis is to relate the
magnitude of events to their frequency of
occurrence through a probability distribution
It is assumed that the events (data) are independent and
identically distributed

Return Period
Random variable: $X$
Threshold level: $x_T$
Extreme event occurs if: $X \ge x_T$
Recurrence interval $\tau$: time between occurrences of $X \ge x_T$
Return period: $E(\tau)$
Average recurrence interval between events equalling or
exceeding a threshold
If p is the probability of occurrence of an extreme
event, then $E(\tau) = T = \dfrac{1}{p}$
or $P(X \ge x_T) = \dfrac{1}{T}$
More on return period
If p is probability of success, then (1-p) is the
probability of failure
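Find the probability that $X \ge x_T$ at least once in N years.
$p = P(X \ge x_T)$
$P(X < x_T) = 1 - p$
$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - P(X < x_T \text{ all } N \text{ years})$
$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - (1 - p)^N = 1 - \left(1 - \dfrac{1}{T}\right)^N$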
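The at-least-once probability can be checked numerically. The following is a minimal Python sketch; the function name `prob_at_least_once` and the sample values of T and N are illustrative assumptions, not part of the slides.

```python
def prob_at_least_once(T, N):
    """Probability that X >= x_T occurs at least once in N years.

    T is the return period in years, so the annual exceedance
    probability is p = 1/T. Years are assumed independent.
    """
    if T < 1:
        raise ValueError("return period must be at least 1 year")
    p = 1.0 / T
    return 1.0 - (1.0 - p) ** N


# Illustrative values: a 100-year event over a 100-year horizon.
risk = prob_at_least_once(T=100, N=100)
print(f"P(at least one exceedance) = {risk:.3f}")
```

Even over a horizon equal to the return period, the probability of at least one exceedance is well below 1.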
Hydrologic data series
Complete duration series
All the data available
Partial duration series
Magnitude greater than base value
Annual exceedance series
Partial duration series with # of
values = # years
Extreme value series
Includes largest or smallest values in
equal intervals
Annual series: interval = 1 year
Annual maximum series: largest
values
Annual minimum series : smallest
values
Return period example
Dataset: annual maximum discharge for 106
years on the Colorado River near Austin
[Figure: annual maximum flow (10^3 cfs) versus year, 1905-1998]
If $x_T$ = 200,000 cfs
No. of occurrences = 3
2 recurrence intervals in 106 years
T = 106/2 = 53 years
If $x_T$ = 100,000 cfs
7 recurrence intervals
T = 106/7 = 15.1 years
$P(X \ge 100{,}000 \text{ cfs at least once in the next 5 years}) = 1 - (1 - 1/15.1)^5 = 0.29$
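The counting procedure in this example (record length divided by the number of recurrence intervals) could be sketched as follows. The function name `empirical_return_period` and the 10-year record are hypothetical; the actual Colorado River data are not reproduced here.

```python
def empirical_return_period(annual_max, threshold):
    """Estimate T as record length / number of recurrence intervals.

    annual_max holds one value per year in chronological order.
    Returns None when fewer than two exceedances exist, because
    no recurrence interval can then be counted.
    """
    n_years = len(annual_max)
    exceedances = sum(1 for q in annual_max if q >= threshold)
    intervals = exceedances - 1
    if intervals < 1:
        return None
    return n_years / intervals


# Hypothetical 10-year record (10^3 cfs), not the Colorado River data.
flows = [40, 120, 60, 35, 150, 80, 20, 55, 110, 70]
T = empirical_return_period(flows, threshold=100)
```

With three exceedances of the threshold there are two recurrence intervals, mirroring the 200,000 cfs case in the example.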
Summary statistics
Also called descriptive statistics
If x1, x2, ..., xn is a sample, then
Mean: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ ($\mu$ for continuous data)
Variance: $S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2$ ($\sigma^2$ for continuous data)
Standard deviation: $S = \sqrt{S^2}$ ($\sigma$ for continuous data)
Coeff. of variation: $CV = \dfrac{S}{\bar{X}}$
Also included in summary statistics are median, skewness, and correlation coefficient.
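As a minimal sketch of these sample statistics in Python (the function name `summary_statistics` and the sample values are assumptions for illustration):

```python
import math


def summary_statistics(sample):
    """Return mean, sample variance, standard deviation, and CV."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(sample) / n
    # Sample variance uses the n - 1 divisor.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std = math.sqrt(variance)
    cv = std / mean
    return mean, variance, std, cv


# Hypothetical sample values, for illustration only.
mean, variance, std, cv = summary_statistics([2, 4, 4, 4, 5, 5, 7, 9])
```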


Time series plot
Plot of variable versus time (bar/line/points)
Example. Annual maximum flow series

[Figure: annual maximum flow (10^3 cfs) versus year, Colorado River near Austin]
Histogram
Plots of bars whose height is the number ni, or fraction
(ni/N), of data falling into one of several intervals of
equal width
[Figure: histograms of number of occurrences versus annual max flow (10^3 cfs), for interval widths of 50,000 cfs, 25,000 cfs, and 10,000 cfs]
Dividing the number of occurrences by the total number of points gives the probability
mass function
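A minimal Python sketch of binning data into equal-width intervals and converting counts to fractions is given below. The function names `histogram` and `relative_frequency` and the flow values are hypothetical.

```python
def histogram(data, width):
    """Count observations in equal-width intervals starting at 0.

    Returns a dict mapping each interval's lower edge to its count n_i.
    Assumes non-negative data, as for streamflow.
    """
    counts = {}
    for x in data:
        lower = int(x // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def relative_frequency(counts):
    """Divide each count n_i by the total N to get the fraction n_i / N."""
    total = sum(counts.values())
    return {edge: n / total for edge, n in counts.items()}


# Hypothetical annual maximum flows (10^3 cfs), for illustration only.
flows = [12, 48, 51, 75, 99, 103, 140, 160, 210, 260]
counts = histogram(flows, width=50)
pmf = relative_frequency(counts)
```

Changing `width` reproduces the effect shown in the figure: narrower intervals give more bars with fewer observations in each.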
Probability density function
The continuous form of the probability mass function is the probability
density function
[Figure: probability and number of occurrences versus annual max flow (10^3 cfs)]
The pdf is the first derivative of the cumulative distribution function
Cumulative distribution function
Cumulate the pdf to produce a cdf
The cdf describes the probability that a random variable is less
than or equal to a specified value of x
[Figure: cumulative probability versus annual max flow (10^3 cfs)]
$P(Q \le 50{,}000) = 0.8$
$P(Q \le 25{,}000) = 0.4$
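For a finite sample, the cdf can be approximated by the fraction of observations less than or equal to x. A minimal sketch, assuming a hypothetical helper `empirical_cdf` and made-up flow values:

```python
from bisect import bisect_right


def empirical_cdf(data):
    """Return a function F(x) giving the fraction of data <= x."""
    ordered = sorted(data)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F


# Hypothetical annual maximum flows (10^3 cfs), for illustration only.
F = empirical_cdf([12, 48, 51, 75, 99, 103, 140, 160, 210, 260])
```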
Probability distributions
Normal family
Normal, lognormal, lognormal-III
Generalized extreme value family
EV1 (Gumbel), GEV, and EV3 (Weibull)
Exponential/Pearson type family
Exponential, Pearson type III, Log-Pearson type III

Normal distribution
Central limit theorem: if X is the sum of n
independent and identically distributed random variables
with finite variance, then with increasing n the distribution of
X becomes normal regardless of the distribution of the random
variables
pdf for the normal distribution
$f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^2\right]$
$\mu$ is the mean and $\sigma$ is the standard
deviation
Hydrologic variables such as annual precipitation, annual average
streamflow, or annual average pollutant loadings tend to follow a normal distribution
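The central limit theorem can be illustrated with a small simulation. This is only a sketch: the choice of uniform variates, 12 terms per sum, 2000 sums, and the fixed seed are all assumptions made for the illustration.

```python
import random
import statistics


def sums_of_uniforms(n_terms, n_sums, seed=0):
    """Draw n_sums values, each the sum of n_terms independent U(0, 1) variates."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n_terms)) for _ in range(n_sums)]


# Illustrative choice: a sum of 12 uniforms has mean 6 and variance 1.
sums = sums_of_uniforms(n_terms=12, n_sums=2000)
mean = statistics.mean(sums)
std = statistics.stdev(sums)
# For a normal distribution about 68% of values lie within one standard deviation.
within_one_std = sum(abs(s - mean) <= std for s in sums) / len(sums)
```

Although each term is uniformly distributed, the fraction of sums within one standard deviation of the mean should come out close to the normal value of about 0.68.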
Standard Normal distribution
A standard normal distribution is a normal
distribution with mean ($\mu$) = 0 and standard
deviation ($\sigma$) = 1
A normal distribution is transformed to the
standard normal distribution by using the
following formula:
$z = \dfrac{X - \mu}{\sigma}$
z is called the standard normal variable
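A minimal sketch of the transformation, together with the standard normal cdf written in terms of the error function. The cdf expression is a standard identity not shown on the slide, and the precipitation mean and standard deviation are hypothetical.

```python
import math


def standardize(x, mu, sigma):
    """Convert x to the standard normal variable z = (x - mu) / sigma."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical annual precipitation: mean 40 in, standard deviation 8 in.
z = standardize(50.0, mu=40.0, sigma=8.0)
p = standard_normal_cdf(z)  # P(X <= 50)
```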
Lognormal distribution
If the pdf of X is skewed, it is not
normally distributed
If Y = log (X) is
normally distributed, then X is
said to be lognormally
distributed.
$f(x) = \dfrac{1}{x\sigma_y\sqrt{2\pi}} \exp\left[-\dfrac{(y-\mu_y)^2}{2\sigma_y^2}\right], \quad x > 0, \text{ and } y = \log x$
Hydraulic conductivity and the distribution of raindrop sizes in a storm
follow the lognormal distribution.
Extreme value (EV) distributions
Extreme values: maximum or minimum
values of sets of data
Annual maximum discharge, annual minimum
discharge
When the number of selected extreme values
is large, the distribution converges to one of
the three forms of EV distributions called Type
I, II and III

EV type I distribution
Let M1, M2, ..., Mn be a set of daily rainfall or streamflow values,
and let X = max(Mi) be the maximum for the year. If
Mi are independent and identically distributed, then
for large n, X has an extreme value type I or Gumbel
distribution.
$f(x) = \dfrac{1}{\alpha} \exp\left[-\dfrac{x-u}{\alpha} - \exp\left(-\dfrac{x-u}{\alpha}\right)\right]$
$\alpha = \dfrac{\sqrt{6}\, s_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$
The distribution of annual maximum streamflow follows an EV1
distribution
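The EV1 parameter estimates can be turned into a T-year magnitude. The sketch below assumes the standard Gumbel cdf $F(x) = \exp[-\exp(-(x-u)/\alpha)]$ and its inverse, which are not shown on the slide; the function names and the sample mean and standard deviation are hypothetical.

```python
import math

EULER_GAMMA = 0.5772  # rounded Euler-Mascheroni constant, as on the slide


def gumbel_parameters(mean, std):
    """Estimate alpha = sqrt(6) * s_x / pi and u = mean - 0.5772 * alpha."""
    alpha = math.sqrt(6.0) * std / math.pi
    u = mean - EULER_GAMMA * alpha
    return alpha, u


def gumbel_pdf(x, alpha, u):
    """EV1 density (1/alpha) * exp(-y - exp(-y)), with y = (x - u) / alpha."""
    y = (x - u) / alpha
    return math.exp(-y - math.exp(-y)) / alpha


def gumbel_cdf(x, alpha, u):
    """EV1 distribution function F(x) = exp(-exp(-(x - u) / alpha))."""
    return math.exp(-math.exp(-(x - u) / alpha))


def gumbel_quantile(T, alpha, u):
    """Magnitude x_T with return period T, i.e. F(x_T) = 1 - 1/T."""
    return u - alpha * math.log(-math.log(1.0 - 1.0 / T))


# Hypothetical sample statistics (cfs), for illustration only.
alpha, u = gumbel_parameters(mean=50000.0, std=20000.0)
x100 = gumbel_quantile(100, alpha, u)
```

Here `x100` is the magnitude whose annual exceedance probability is 1/100 under the fitted distribution.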
EV type III distribution
If Wi are the minimum streamflows
on different days of the year, let X =
min(Wi) be the smallest. X can be
described by the EV type III or
Weibull distribution.
$f(x) = \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1} \exp\left[-\left(\dfrac{x}{\lambda}\right)^{k}\right], \quad x > 0; \ \lambda, k > 0$
The distribution of low flows (e.g. 7-day min flow)
follows the EV3 distribution.
Exponential distribution
Poisson process: a stochastic
process in which the numbers of
events occurring in two disjoint
subintervals are independent
random variables.
In hydrology, the interarrival time
(time between stochastic hydrologic
events) is described by the exponential
distribution
$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0; \ \lambda = \dfrac{1}{\bar{x}}$
Interarrival times of polluted runoffs, rainfall intensities, etc. are described by the
exponential distribution.
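A minimal sketch of fitting the rate from the sample mean and evaluating the exponential distribution. The cdf $1 - e^{-\lambda x}$ is the integral of the pdf and is not shown on the slide; the function names and interarrival times are hypothetical.

```python
import math


def fit_exponential(interarrival_times):
    """Estimate the rate lambda = 1 / (sample mean)."""
    mean = sum(interarrival_times) / len(interarrival_times)
    return 1.0 / mean


def exponential_pdf(x, lam):
    """f(x) = lambda * exp(-lambda * x) for x >= 0, else 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def exponential_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lambda * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


# Hypothetical times between storms (days), for illustration only.
lam = fit_exponential([2.0, 5.0, 1.0, 8.0, 4.0])
p_within_3_days = exponential_cdf(3.0, lam)
```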
Gamma Distribution
The time taken for a number of
events ($\beta$) in a Poisson process is
described by the gamma distribution
Gamma distribution: the distribution
of the sum of $\beta$ independent and
identical exponentially distributed
random variables.
$f(x) = \dfrac{\lambda^{\beta} x^{\beta-1} e^{-\lambda x}}{\Gamma(\beta)}, \quad x \ge 0; \ \Gamma = \text{gamma function}$
Skewed distributions (e.g. hydraulic
conductivity) can be represented using
the gamma distribution without log transformation.
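The gamma pdf can be evaluated directly with the standard-library gamma function. In this sketch the function name `gamma_pdf` and the parameter values are assumptions for illustration.

```python
import math


def gamma_pdf(x, lam, beta):
    """f(x) = lam**beta * x**(beta - 1) * exp(-lam * x) / Gamma(beta), x >= 0."""
    if x < 0:
        return 0.0
    if x == 0 and beta < 1:
        # The density is unbounded at the origin when beta < 1.
        return math.inf
    return lam ** beta * x ** (beta - 1) * math.exp(-lam * x) / math.gamma(beta)


# Hypothetical rate (events per day) and event count, for illustration only.
lam, beta = 0.25, 3
density = gamma_pdf(10.0, lam, beta)

# With beta = 1 the gamma pdf reduces to the exponential pdf lam * exp(-lam * x).
assert math.isclose(gamma_pdf(2.0, lam, 1), lam * math.exp(-lam * 2.0))
```

The final check reflects the definition above: a sum of one exponential variable is itself exponentially distributed.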
Pearson Type III
Named after the statistician Pearson, it is also
called the three-parameter gamma distribution. A
lower bound is introduced through the third
parameter ($\varepsilon$)
$f(x) = \dfrac{\lambda^{\beta} (x-\varepsilon)^{\beta-1} e^{-\lambda(x-\varepsilon)}}{\Gamma(\beta)}, \quad x \ge \varepsilon; \ \Gamma = \text{gamma function}$
It is also a skewed distribution, first applied in hydrology
for describing the pdf of annual maximum flows.
Log-Pearson Type III
If log X follows a Pearson Type III distribution,
then X is said to have a log-Pearson Type III
distribution
$f(x) = \dfrac{\lambda^{\beta} (y-\varepsilon)^{\beta-1} e^{-\lambda(y-\varepsilon)}}{x\,\Gamma(\beta)}, \quad y = \log x \ge \varepsilon$
