
FREQUENCY ANALYSIS

Basic Problem:
To relate the magnitude of extreme events to
their frequency of occurrence through the use
of probability distributions.

Basic Assumptions:
(a) The data analyzed are statistically independent
and identically distributed. This governs the selection of data
(time dependence, time scale, mechanisms).
(b) Changes over time due to man-made (e.g.,
urbanization) or natural processes do not alter the
frequency relation, i.e., there is no temporal trend in the data
(stationarity).

Practical Problems:
(a) Selection of reasonable & simple distribution.
(b) Estimation of parameters in distribution.
(c) Assessment of risk with reasonable accuracy.

REVIEW OF BASIC CONCEPTS


Probabilistic
Outcome of a hydrologic event (e.g., rainfall amount & duration; flood
peak discharge; wave height, etc.) is random and cannot be predicted with
certainty.
Terminologies
Population
The collection of all possible outcomes relevant to the process of
interest. Example:
(1) Max. 2-hr rainfall depth: all non-negative real numbers;
(2) No. of storms in June: all non-negative integers.
Sample
A measured segment (or subset) of the population.

- Random Variable
A variable described by a probability distribution, which
specifies the chance that the variable will assume a particular
value.
Convention: a capital letter (say, X) denotes the random variable,
whereas the lower-case letter (say, x) denotes a numerical realization
of X.
Example:
X = rainfall amount in 2 hours (a random variable);
x = 100.2 mm/2hr (realization).
- Random variables can be
- discrete (e.g., no. of rainy days in June) or
- continuous (e.g., max. 2-hr rainfall amount, flood discharge).

Frequency & Relative Frequency
o For discrete random variables:
o Frequency is the number of occurrences of a specific event. Relative frequency
is the frequency divided by the total number of events, e.g.,

n = no. of years having exactly 50 rainy days; N = total no. of years.

Let n = 10 years and N = 100 years. Then the frequency of having
exactly 50 rainy days is 10, and the relative frequency of having
exactly 50 rainy days in 100 years is n/N = 0.1.
o For continuous random variables:
o Frequency needs to be defined for a class interval.
o A plot of frequency or relative frequency versus class intervals is called a
histogram or probability polygon.
o As the sample size becomes infinitely large and the class interval length
approaches zero, the histogram (expressed as relative frequency per unit
interval length) becomes a smooth curve, called the
probability density function.
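As a minimal sketch of the class-interval idea above, the following Python computes relative frequencies for a small sample. The rainfall depths, the class limits, and the helper name `relative_frequencies` are hypothetical and chosen only for illustration.

```python
def relative_frequencies(values, lower, width, n_classes):
    """Relative frequency of each class interval
    [lower + i*width, lower + (i+1)*width), i = 0 .. n_classes-1."""
    counts = [0] * n_classes
    for v in values:
        i = int((v - lower) // width)      # index of the class containing v
        if 0 <= i < n_classes:
            counts[i] += 1
    n_total = len(values)
    return [c / n_total for c in counts]

# Hypothetical annual maximum 2-hr rainfall depths (mm); illustrative only.
depths = [42.0, 55.5, 61.2, 48.3, 70.1, 58.8, 66.4, 51.9, 45.0, 63.7]
class_width = 10.0
rel_freq = relative_frequencies(depths, lower=40.0, width=class_width, n_classes=4)

# Dividing by the class width gives an empirical density, the quantity that
# approaches the PDF as the sample grows and the class width shrinks.
density = [f / class_width for f in rel_freq]
```

With more data and narrower classes, `density` would trace the smooth curve described above; with ten values it is only a coarse step function.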

Probability Density Function (PDF)
For a continuous random variable, the PDF must satisfy

\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]

and $f(x) \ge 0$ for all values of x.

For a discrete random variable, the probability mass function (PMF) must satisfy

\[ \sum_{\text{all } x} p(x) = 1 \]

and $1 \ge p(x) \ge 0$ for all values of x.

- Cumulative Distribution Function (CDF)
For a continuous random variable,

\[ \Pr(X \le x_o) = \int_{-\infty}^{x_o} f(x)\,dx \]

For a discrete random variable,

\[ \Pr(X \le x_o) = \sum_{\text{all } x_i \le x_o} p(x_i) \]
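The two CDF definitions can be sketched numerically as follows. The PMF for the number of storms and the density f(x) = 2x on [0, 1] are hypothetical examples, and the trapezoidal rule is just one convenient way to approximate the integral.

```python
def discrete_cdf(pmf, x_o):
    """Pr(X <= x_o) for a PMF given as {value: probability}."""
    return sum(p for x, p in pmf.items() if x <= x_o)

def continuous_cdf(pdf, lower, x_o, n_steps=1000):
    """Pr(X <= x_o) by trapezoidal integration of the PDF from `lower`
    (the lower bound of the variable's range) up to x_o."""
    h = (x_o - lower) / n_steps
    total = 0.5 * (pdf(lower) + pdf(x_o))
    for i in range(1, n_steps):
        total += pdf(lower + i * h)
    return total * h

# Hypothetical PMF for the number of storms in June.
storms_pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
p_at_most_2 = discrete_cdf(storms_pmf, 2)

# Hypothetical PDF f(x) = 2x on [0, 1], whose exact CDF is x**2.
p_below_half = continuous_cdf(lambda x: 2.0 * x, 0.0, 0.5)
```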

Statistical Properties of Random Variables


Population - Synonymous with the sample space, which describes
the complete assemblage of all the values representative of a
particular random process.
Sample - Any subset of the population.
Parameters - Quantities that are descriptive of the population
in a statistical model. Normally, Greek letters are used to
denote statistical parameters.
Sample statistics (or simply statistics): Quantities calculated
on the basis of sample observations.

Statistical Moments of Random Variables


Descriptors commonly used to show the statistical
properties of a RV are those indicating its
(1) Central tendency;
(2) Dispersion;
(3) Asymmetry.
Frequently used descriptors in these three categories
are related to statistical moments of a RV.
Two types of statistical moments are commonly used
in hydrosystem engineering applications:
(1) product-moments and
(2) L-moments.

Product-Moments

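The rth-order product-moment of X about any reference point X = x_o is defined, for the continuous case, as

\[ E\left[(X - x_o)^r\right] = \int_{-\infty}^{\infty} (x - x_o)^r f_x(x)\,dx = \int_{-\infty}^{\infty} (x - x_o)^r\,dF_x(x) \]

whereas for the discrete case,

\[ E\left[(X - x_o)^r\right] = \sum_{k=1}^{K} (x_k - x_o)^r\, p_x(x_k) \]

where E[·] is the statistical expectation operator.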


In practice, the first three moments (r = 1, 2, 3) are used to describe the central tendency,
variability, and asymmetry.
Two types of product-moments are commonly used:
Raw moments: $\mu'_r = E[X^r]$ = rth-order moment about the origin; and
Central moments: $\mu_r = E[(X - \mu_x)^r]$ = rth-order central moment.
Relations between the two types of product-moments are:

\[ \mu_r = \sum_{i=0}^{r} (-1)^i\, C_{r,i}\, \mu_x^{\,i}\, \mu'_{r-i} \]

\[ \mu'_r = \sum_{i=0}^{r} C_{r,i}\, \mu_x^{\,i}\, \mu_{r-i} \]

where $C_{n,x}$ = binomial coefficient = n!/(x!(n-x)!)
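The raw-to-central and central-to-raw relations can be checked with a short sketch. It assumes the relations read mu_r = sum_i (-1)^i C(r,i) mu_x^i mu'_(r-i) and mu'_r = sum_i C(r,i) mu_x^i mu_(r-i); the fair six-sided die and the function names are illustrative choices, not part of the original material.

```python
from math import comb

def central_from_raw(raw, r):
    """r-th central moment from raw moments; raw[j] = E[X**j], raw[0] = 1."""
    mean = raw[1]
    return sum((-1) ** i * comb(r, i) * mean ** i * raw[r - i]
               for i in range(r + 1))

def raw_from_central(central, mean, r):
    """r-th raw moment from central moments; central[0] = 1, central[1] = 0."""
    return sum(comb(r, i) * mean ** i * central[r - i]
               for i in range(r + 1))

# Illustrative discrete variable: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
raw = [sum(v ** j for v in faces) / len(faces) for j in range(4)]
central = [central_from_raw(raw, r) for r in range(4)]

# Round trip: recover the 3rd raw moment from the central moments.
raw3_again = raw_from_central(central, raw[1], 3)
```

Here `central[2]` is the variance of the die and `central[3]` vanishes because the distribution is symmetric.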


Main disadvantages of the product-moments are:
(1) Estimation from sample observations is sensitive to the presence of outliers; and
(2) Accuracy of sample product-moments deteriorates rapidly with increase in the order of the
moments.

Mean, Mode, Median, and Quantiles

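Expectation (1st-order moment) measures the central tendency of a random variable X:

\[ E[X] = \mu_x = \int_{-\infty}^{\infty} x f_x(x)\,dx = \int_{-\infty}^{\infty} x\,dF_x(x) = \int_{0}^{\infty} \left[1 - F_x(x)\right] dx \]

(the last form holds for a non-negative X).

Mean ($\mu_x$) = Expectation = $\mu'_1$ = location of the centroid of the PDF or PMF.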


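Two operational properties of the expectation are useful:

\[ E\left[\sum_{k=1}^{K} a_k X_k\right] = \sum_{k=1}^{K} a_k \mu_k \]

in which $\mu_k = E[X_k]$ for k = 1, 2, ..., K.

For independent random variables,

\[ E\left[\prod_{k=1}^{K} X_k\right] = \prod_{k=1}^{K} \mu_k \]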

Mode ($x_{mo}$) - the value of a RV at which its PDF is peaked. The mode, $x_{mo}$, can be obtained by solving

\[ \left[\frac{\partial f_x(x)}{\partial x}\right]_{x = x_{mo}} = 0 \]

Median ($x_{md}$) - the value that splits the distribution into two equal halves, i.e.,

\[ F_x(x_{md}) = \int_{-\infty}^{x_{md}} f_x(x)\,dx = 0.5 \]

Quantiles - the 100pth quantile of a RV X is a quantity $x_p$ that satisfies $P(X \le x_p) = F_x(x_p) = p$.


A PDF could be uni-modal, bimodal, or multi-modal. Generally, the mean, median, and mode
of a random variable are different, unless the PDF is symmetric and uni-modal.
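To illustrate how the mean, median, mode, and quantiles differ for an asymmetric PDF, the sketch below evaluates them numerically on a grid. The density f(x) = 12 x^2 (1 - x) on [0, 1] is a hypothetical example; the grid size and the midpoint rule are arbitrary numerical choices.

```python
def pdf(x):
    """Hypothetical uni-modal, asymmetric PDF on [0, 1]."""
    return 12.0 * x ** 2 * (1.0 - x)

n = 100_000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]       # midpoints of the grid cells

# Mean: centroid of the PDF (midpoint-rule integration of x * f(x)).
mean = sum(x * pdf(x) for x in xs) * h

# Mode: grid point where the PDF peaks (where df/dx = 0).
mode = max(xs, key=pdf)

def quantile(p):
    """Smallest grid point x_p at which the accumulated probability reaches p."""
    acc = 0.0
    for x in xs:
        acc += pdf(x) * h
        if acc >= p:
            return x
    return xs[-1]

median = quantile(0.5)                        # the 50th percentile
```

For this density the three measures come out in the order mean < median < mode, the ordering associated with a negatively skewed distribution.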

[Figure: Uni-modal and bi-modal distributions, f_x(x) versus x. Panel (a): uni-modal distribution. Panel (b): bi-modal distribution.]

Variance, Standard Deviation, and Coefficient of Variation

Variance is the second-order central moment, measuring the spreading of a
RV over its range,

\[ \mathrm{Var}[X] = \sigma_x^2 = E\left[(X - \mu_x)^2\right] = \int_{-\infty}^{\infty} (x - \mu_x)^2 f_x(x)\,dx \]

Standard deviation ($\sigma_x$) is the positive square root of the variance.

Coefficient of variation, $\Omega_x = \sigma_x / \mu_x$, is a dimensionless measure; useful for
comparing the degree of uncertainty of two RVs with different units.
Three important properties of the variance are:

(1) Var[c] = 0 when c is a constant.

(2) $\mathrm{Var}[X] = E[X^2] - E^2[X]$

(3) For multiple independent random variables,

\[ \mathrm{Var}\left[\sum_{k=1}^{K} a_k X_k\right] = \sum_{k=1}^{K} a_k^2 \sigma_k^2 \]

where $a_k$ = a constant and $\sigma_k$ = standard deviation of $X_k$, k = 1, 2, ..., K.
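The variance properties can be verified on a small discrete example. The sketch below uses a fair six-sided die and two arbitrary constants a1 and a2, all hypothetical, and checks the property for a weighted sum of independent variables by enumerating the joint PMF.

```python
from itertools import product

faces = [1, 2, 3, 4, 5, 6]
p = 1.0 / 6.0                                   # PMF of a fair die

mean = sum(x * p for x in faces)
var_central = sum((x - mean) ** 2 * p for x in faces)    # E[(X - mu)^2]
var_raw = sum(x ** 2 * p for x in faces) - mean ** 2     # E[X^2] - E^2[X]
std = var_central ** 0.5
cv = std / mean                                 # coefficient of variation

# Weighted sum of two independent dice, Y = a1*X1 + a2*X2.
a1, a2 = 2.0, 3.0
joint = [(a1 * x1 + a2 * x2, p * p) for x1, x2 in product(faces, faces)]
y_mean = sum(y * q for y, q in joint)
y_var = sum((y - y_mean) ** 2 * q for y, q in joint)

# Property (3) predicts y_var = a1^2 * sigma^2 + a2^2 * sigma^2.
y_var_predicted = (a1 ** 2 + a2 ** 2) * var_central
```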

Skewness Coefficient
Measures the asymmetry of the PDF of a random variable.
The skewness coefficient, $\gamma_x$, is defined as

\[ \gamma_x = \frac{\mu_3}{\mu_2^{1.5}} = \frac{E\left[(X - \mu_x)^3\right]}{\sigma_x^3} \]

The sign of the skewness coefficient indicates the direction of
asymmetry of the probability distribution: positive for a longer right tail,
negative for a longer left tail, and zero for a symmetric distribution.
Pearson skewness coefficient:

\[ \gamma_1 = \frac{\mu_x - x_{mo}}{\sigma_x} \]

In practice, product-moments higher than 3rd order are seldom used
because they are unreliable and inaccurate when estimated from
a small number of samples.
See Table for equations to compute the sample product-moments.

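[Figure: Relative locations of mean, median, and mode for positively skewed, symmetric, and negatively skewed distributions, f_x(x) versus x. Panel (a): positively skewed, $\gamma_x > 0$, with $x_{mo} < x_{md} < \mu_x$. Panel (b): symmetric, $\gamma_x = 0$, with $\mu_x = x_{mo} = x_{md}$. Panel (c): negatively skewed, $\gamma_x < 0$, with $\mu_x < x_{md} < x_{mo}$.]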

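Table: Product-moments of random variables

First moment (measure of central location)
  Definition: mean, expected value, $E(X) = \mu_x$
  Continuous variable: $\mu_x = \int_{-\infty}^{\infty} x f_x(x)\,dx$
  Discrete variable: $\mu_x = \sum_{\text{all } x\text{'s}} x_k\, p_x(x_k)$
  Sample estimator: $\bar{x} = \sum x_i / n$

Second moment (measure of dispersion)
  Definition: variance, $\mathrm{Var}(X) = \mu_2 = \sigma_x^2$; standard deviation, $\sigma_x$; coefficient of variation, $\Omega_x$
  Continuous variable: $\sigma_x^2 = \int_{-\infty}^{\infty} (x - \mu_x)^2 f_x(x)\,dx$; $\sigma_x = \sqrt{\mathrm{Var}(X)}$; $\Omega_x = \sigma_x / \mu_x$
  Discrete variable: $\sigma_x^2 = \sum_{\text{all } x\text{'s}} (x_k - \mu_x)^2 p_x(x_k)$; $\sigma_x = \sqrt{\mathrm{Var}(X)}$; $\Omega_x = \sigma_x / \mu_x$
  Sample estimator: $s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2$; $s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$; $C_v = s / \bar{x}$

Third moment (measure of asymmetry)
  Definition: skewness, $\mu_3$; skewness coefficient, $\gamma_x$
  Continuous variable: $\mu_3 = \int_{-\infty}^{\infty} (x - \mu_x)^3 f_x(x)\,dx$; $\gamma_x = \mu_3 / \sigma_x^3$
  Discrete variable: $\mu_3 = \sum_{\text{all } x\text{'s}} (x_k - \mu_x)^3 p_x(x_k)$; $\gamma_x = \mu_3 / \sigma_x^3$
  Sample estimator: $m_3 = \frac{n}{(n-1)(n-2)} \sum (x_i - \bar{x})^3$; $g = m_3 / s^3$

Fourth moment (measure of peakedness)
  Definition: kurtosis, $\kappa_x$; excess coefficient, $\varepsilon_x$
  Continuous variable: $\mu_4 = \int_{-\infty}^{\infty} (x - \mu_x)^4 f_x(x)\,dx$; $\kappa_x = \mu_4 / \sigma_x^4$; $\varepsilon_x = \kappa_x - 3$
  Discrete variable: $\mu_4 = \sum_{\text{all } x\text{'s}} (x_k - \mu_x)^4 p_x(x_k)$; $\kappa_x = \mu_4 / \sigma_x^4$; $\varepsilon_x = \kappa_x - 3$
  Sample estimator: $m_4 = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum (x_i - \bar{x})^4$; $k = m_4 / s^4$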
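The sample estimators in the table of product-moments can be computed with a few lines of Python. This is a sketch under the assumption that the estimators are s^2 = sum(d^2)/(n-1), m3 = n sum(d^3)/((n-1)(n-2)), and m4 = n(n+1) sum(d^4)/((n-1)(n-2)(n-3)), with d = x_i minus the sample mean; the function name and the data are hypothetical.

```python
def sample_product_moments(data):
    """Sample mean, standard deviation, coefficient of variation,
    skewness coefficient g, and kurtosis k of a list of observations."""
    n = len(data)
    if n < 4:
        raise ValueError("at least 4 observations are needed for m4")
    xbar = sum(data) / n
    dev = [x - xbar for x in data]
    s2 = sum(d ** 2 for d in dev) / (n - 1)
    s = s2 ** 0.5
    m3 = n * sum(d ** 3 for d in dev) / ((n - 1) * (n - 2))
    m4 = n * (n + 1) * sum(d ** 4 for d in dev) / ((n - 1) * (n - 2) * (n - 3))
    return {
        "mean": xbar,
        "std": s,
        "cv": s / xbar,          # assumes a non-zero sample mean
        "g": m3 / s ** 3,
        "k": m4 / s ** 4,
    }

# Hypothetical, perfectly symmetric sample, so g should vanish.
stats = sample_product_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because the higher-order terms raise deviations to the 3rd and 4th power, a single outlier can dominate `g` and `k`, which is the sensitivity noted earlier for product-moments.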

Kurtosis ($\kappa_x$)
Measure of the peakedness of a distribution.
Related to the 4th central product-moment as

\[ \kappa_x = \frac{\mu_4}{\mu_2^2} = \frac{E\left[(X - \mu_x)^4\right]}{\sigma_x^4} \]

For a normal RV, the kurtosis is equal to 3. Sometimes the
coefficient of excess, $\varepsilon_x = \kappa_x - 3$, is used.
For all feasible distribution functions, the skewness coefficient and
kurtosis must satisfy

\[ \gamma_x^2 + 1 \le \kappa_x \]

Some Commonly Used Distributions


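NORMAL DISTRIBUTION

\[ f_N(x \mid \mu_x, \sigma_x^2) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left[-\frac{1}{2}\left(\frac{x - \mu_x}{\sigma_x}\right)^2\right], \quad \text{for } -\infty < x < \infty \]

Standardized Variable:

\[ Z = \frac{X - \mu_x}{\sigma_x} \]

Z has mean 0 and standard deviation 1.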
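A short sketch of the normal PDF and the standardized variable follows, using Python's standard-library `statistics.NormalDist` for the CDF. The mean of 100 and standard deviation of 20 are hypothetical values chosen for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal PDF with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sqrt(2.0 * pi) * sigma)

# Hypothetical parameters of some hydrologic variable.
mu, sigma = 100.0, 20.0
x = 130.0

# Standardize, then read the non-exceedance probability from the
# standard normal CDF.
z = (x - mu) / sigma
p_nonexceed = NormalDist().cdf(z)
```

Standardizing first means a single standard normal table (or function) serves every choice of mean and standard deviation.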

STANDARD NORMAL DISTRIBUTION:
[Figure: PDF of the standard normal distribution, $\phi(z)$ versus z.]

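LOG-NORMAL DISTRIBUTION

\[ f_{LN}(x \mid \mu_{\ln x}, \sigma_{\ln x}^2) = \frac{1}{\sqrt{2\pi}\,\sigma_{\ln x}\, x} \exp\left\{-\frac{1}{2}\left[\frac{\ln(x) - \mu_{\ln x}}{\sigma_{\ln x}}\right]^2\right\}, \quad x > 0 \]

[Figure: Shapes of log-normal PDFs, f_LN(x) versus x. Panel (a): $\mu_x = 1.0$ with $\Omega_x$ = 0.3, 0.6, 1.3. Panel (b): $\Omega_x = 1.30$ with $\mu_x$ = 1.65, 2.25, 4.50.]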
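The log-normal PDF can be sketched as below. The conversion from the mean and coefficient of variation of X to the parameters of ln X uses the standard log-normal relations sigma_lnx^2 = ln(1 + CV^2) and mu_lnx = ln(mean) - sigma_lnx^2 / 2, which are not stated in the material above and are included here as an assumption; the function names are hypothetical.

```python
from math import exp, log, pi, sqrt

def lognormal_params(mean_x, cv_x):
    """Parameters (mu_lnx, sigma_lnx) of ln X from the mean and CV of X,
    using the standard log-normal moment relations (assumed here)."""
    var_ln = log(1.0 + cv_x ** 2)
    mu_ln = log(mean_x) - 0.5 * var_ln
    return mu_ln, sqrt(var_ln)

def lognormal_pdf(x, mu_ln, sigma_ln):
    """Log-normal PDF; zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    z = (log(x) - mu_ln) / sigma_ln
    return exp(-0.5 * z * z) / (sqrt(2.0 * pi) * sigma_ln * x)

# One illustrative case: mean 1.0 and coefficient of variation 0.3.
mu_ln, sigma_ln = lognormal_params(1.0, 0.3)

# Midpoint-rule checks that the PDF integrates to 1 and has mean 1.0.
n, upper = 100_000, 10.0
h = upper / n
grid = [(i + 0.5) * h for i in range(n)]
area = sum(lognormal_pdf(x, mu_ln, sigma_ln) for x in grid) * h
mean_check = sum(x * lognormal_pdf(x, mu_ln, sigma_ln) for x in grid) * h
```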

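Gumbel (Extreme-Value Type I) Distribution

\[ F_{EV1}(x \mid \xi, \beta) = \exp\left[-\exp\left(-\frac{x - \xi}{\beta}\right)\right] \quad \text{for maxima} \]

\[ F_{EV1}(x \mid \xi, \beta) = 1 - \exp\left[-\exp\left(+\frac{x - \xi}{\beta}\right)\right] \quad \text{for minima} \]

\[ f_{EV1}(x \mid \xi, \beta) = \frac{1}{\beta} \exp\left[-\frac{x - \xi}{\beta} - \exp\left(-\frac{x - \xi}{\beta}\right)\right] \quad \text{for maxima} \]

\[ f_{EV1}(x \mid \xi, \beta) = \frac{1}{\beta} \exp\left[+\frac{x - \xi}{\beta} - \exp\left(+\frac{x - \xi}{\beta}\right)\right] \quad \text{for minima} \]

[Figure: Gumbel PDFs, f_EV1(y) versus y, for maxima (Max) and minima (Min).]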
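Because the Gumbel CDF has a closed form, quantiles follow by direct inversion, which is how a magnitude is tied to a non-exceedance probability. The sketch below assumes a location parameter `xi` and a scale parameter `beta > 0`; the numerical values and function names are hypothetical.

```python
from math import exp, log

def gumbel_max_cdf(x, xi, beta):
    """Gumbel CDF for maxima: exp(-exp(-(x - xi)/beta))."""
    return exp(-exp(-(x - xi) / beta))

def gumbel_max_pdf(x, xi, beta):
    """Gumbel PDF for maxima."""
    y = (x - xi) / beta
    return exp(-y - exp(-y)) / beta

def gumbel_max_quantile(p, xi, beta):
    """Invert the CDF for maxima: x_p = xi - beta * ln(-ln p), 0 < p < 1."""
    return xi - beta * log(-log(p))

def gumbel_min_cdf(x, xi, beta):
    """Gumbel CDF for minima: 1 - exp(-exp(+(x - xi)/beta))."""
    return 1.0 - exp(-exp((x - xi) / beta))

# Hypothetical parameters, e.g. for annual maximum discharge.
xi, beta = 50.0, 12.0
x_99 = gumbel_max_quantile(0.99, xi, beta)   # magnitude with F(x) = 0.99
```

The minima form mirrors the maxima form about `xi`: the probability of falling below `xi - d` for minima equals the probability of exceeding `xi + d` for maxima.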

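Pearson Type 3 Distribution

\[ f_{P3}(x \mid \xi, \alpha, \beta) = \frac{1}{|\beta|\,\Gamma(\alpha)} \left(\frac{x - \xi}{\beta}\right)^{\alpha - 1} e^{-(x - \xi)/\beta} \]

[Figure: gamma-type PDF, f_G(x) versus x.]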

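Log-Pearson Type 3 Distribution

\[ f_{LP3}(x \mid \xi, \alpha, \beta) = \frac{1}{|\beta|\,\Gamma(\alpha)\, x} \left[\frac{\ln(x) - \xi}{\beta}\right]^{\alpha - 1} e^{-[\ln(x) - \xi]/\beta} \]

with $\alpha > 0$, $x \ge e^{\xi}$ when $\beta > 0$ and with $\alpha > 0$, $x \le e^{\xi}$ when $\beta < 0$.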
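The following sketch evaluates a log-Pearson Type 3 density by applying a three-parameter gamma (Pearson Type 3) density to ln(x) and dividing by x. It assumes the common parameterization with location `xi`, shape `alpha > 0`, and scale `beta` (positive or negative); these symbols, the parameter values, and the function names are assumptions made for illustration.

```python
from math import exp, gamma, log

def pearson3_pdf(x, xi, alpha, beta):
    """Assumed Pearson Type 3 PDF:
    ((x - xi)/beta)**(alpha - 1) * exp(-(x - xi)/beta) / (|beta| * Gamma(alpha)).
    Returns 0 outside the support (and, as a simplification, at its boundary)."""
    t = (x - xi) / beta
    if t <= 0.0:
        return 0.0
    return t ** (alpha - 1.0) * exp(-t) / (abs(beta) * gamma(alpha))

def log_pearson3_pdf(x, xi, alpha, beta):
    """Log-Pearson Type 3 PDF: the Pearson Type 3 PDF of ln(x), divided by x."""
    if x <= 0.0:
        return 0.0
    return pearson3_pdf(log(x), xi, alpha, beta) / x

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Hypothetical parameters; the Pearson Type 3 PDF should integrate to 1.
xi, alpha, beta = 2.0, 3.0, 1.5
area_p3 = midpoint_integral(lambda x: pearson3_pdf(x, xi, alpha, beta), xi, xi + 60.0)
```

With `beta > 0` the support starts at `xi` (at `exp(xi)` for the log form); with `beta < 0` it ends there, which matches the bounds stated for x.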
