Parameter Estimation For 3-Parameter Pareto

Hydrological Sciences - Journal - des Sciences Hydrologiques, 40, 2, April 1995

Parameter estimation for 3-parameter generalized Pareto distribution by the
principle of maximum entropy (POME)
V. P. SINGH & H. GUO
Department of Civil Engineering, Louisiana State University, Baton Rouge,
Louisiana 70803-6405, USA
Abstract The principle of maximum entropy (POME) is employed to
derive a new method of parameter estimation for the 3-parameter
generalized Pareto (GP) distribution. Monte Carlo simulated data are
used to evaluate this method and compare it with the methods of
moments (MOM), probability weighted moments (PWM), and maximum
likelihood estimation (MLE). The parameter estimates yielded by the
POME are either superior or comparable for high skewness.
Estimation of the parameters of a three-parameter generalized Pareto
distribution by the maximum entropy method
Summary The principle of maximum entropy was used to establish a new method
for estimating the parameters of the three-parameter generalized Pareto
distribution. Synthetic data generated by a Monte Carlo procedure were used
to evaluate this method and to compare it with the methods of moments,
weighted moments and maximum likelihood. Parameter estimation based on the
principle of maximum entropy is preferable or comparable to that of the other
methods, in particular when the skewness is high.
GENERALIZED PARETO DISTRIBUTION

Consider a random variable Y with the standard exponential distribution. Let
a random variable X be defined as X = b[1 - exp(-aY)]/a, where a and b are
parameters. Then the distribution of X is the 2-parameter generalized Pareto
distribution. If c is the threshold or lower bound of X, then the distribution of
X is the 3-parameter generalized Pareto (GP) distribution, which can be
expressed as:
$$F(x) = 1 - \left[1 - \frac{a(x - c)}{b}\right]^{1/a}, \quad a \neq 0 \tag{1a}$$
$$F(x) = 1 - \exp\left[-\frac{x - c}{b}\right], \quad a = 0 \tag{1b}$$
where c is a location parameter, b is a scale parameter, a is a shape parameter,
and F(x) is the distribution function. The probability density function (PDF) of
the GP distribution is given by:
$$f(x) = \frac{1}{b}\left[1 - \frac{a(x - c)}{b}\right]^{1/a - 1}, \quad a \neq 0 \tag{2a}$$
$$f(x) = \frac{1}{b}\exp\left[-\frac{x - c}{b}\right], \quad a = 0 \tag{2b}$$

Open for discussion until 1 October 1995
The Pareto distributions are obtained for a < 0. Figure 1 shows the PDF
for c = 0, b = 1.0, and various values of a. Pickands (1975) has shown that
the GP distribution given by equation (1) occurs as a limiting distribution for
excesses over thresholds if and only if the parent distribution is in the domain
of attraction of one of the extreme value distributions. The GP distribution
reduces to the 2-parameter GP distribution for c = 0, the exponential
distribution for a = 0 and c = 0, and the uniform distribution on [0, b] for
c = 0 and a = 1.
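The definitions above can be sketched in a few lines of Python. This is only an illustration of equations (1) and (2); the function names `gp_cdf` and `gp_pdf` are assumptions introduced here and do not come from the paper.

```python
import math

def gp_cdf(x, a, b, c):
    """CDF of the 3-parameter GP distribution, equations (1a) and (1b)."""
    if x <= c:
        return 0.0
    z = (x - c) / b
    if a == 0.0:
        return 1.0 - math.exp(-z)       # exponential case, equation (1b)
    u = 1.0 - a * z
    if u <= 0.0:                        # only for a > 0: x at or above c + b/a
        return 1.0
    return 1.0 - u ** (1.0 / a)

def gp_pdf(x, a, b, c):
    """PDF of the 3-parameter GP distribution, equations (2a) and (2b)."""
    if x < c:
        return 0.0
    z = (x - c) / b
    if a == 0.0:
        return math.exp(-z) / b         # exponential case, equation (2b)
    u = 1.0 - a * z
    if u <= 0.0:                        # outside the support when a > 0
        return 0.0
    return u ** (1.0 / a - 1.0) / b

# Special cases named in the text: a = 1, c = 0 is uniform on [0, b];
# a = 0, c = 0 is exponential with scale b.
uniform_cdf = gp_cdf(0.5, 1.0, 2.0, 0.0)
exponential_cdf = gp_cdf(1.0, 0.0, 1.0, 0.0)
```

With a = 1, b = 2 and c = 0 the CDF at x = 0.5 equals x/b, as expected for the uniform case, and a very small positive a approaches the exponential case.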
Fig. 1 Probability density function of generalized Pareto distribution with
(a) c = 0, b = 1.0, a = 0.5 (line), 0.75 (plus), 1.0 (star) and 1.25 (dash);
and (b) c = 0, b = 1.0, a = -0.1 (line), -0.5 (dash) and -1.0 (plus).
Some important properties of the GP distribution are worth mentioning:

(1) By comparison with the exponential distribution, the GP distribution has
a heavier tail for a < 0 (long-tailed distribution) and a lighter tail for
a > 0 (short-tailed distribution). When a ≤ 0, X has no upper limit,
i.e. c < x < ∞; for a > 0 there is an upper bound, i.e. c < x < c + b/a.
This property makes the GP distribution suitable for the analysis of
independent cluster peaks.
(2) In the context of the partial duration series, a truncated GP distribution
remains a GP distribution with the original shape parameter a remaining
unchanged. This property is popularly referred to as the "threshold
stability" property. Consequently, if X has a GP distribution for a fixed
threshold level Q0, then the conditional distribution of X - c, given
X > c, corresponding to a higher threshold Q0 + c also has a GP
distribution. This is one of the properties that justify the use of the GP
distribution to model excesses.
(3) Let Z = max(c, X_1, X_2, ..., X_N), where N > 0 is a number. If X_i, i =
1, 2, ..., N, are independent and identically distributed as a GP
distribution, and N has a Poisson distribution, then Z has a generalized
extreme value (GEV) distribution (Smith, 1984; Jin & Stedinger, 1989;
Wang, 1990), as defined by Jenkinson (1955). Thus, a Poisson process
of exceedance times with generalized Pareto excesses implies the
classical extreme value distributions. As a special case, the maximum of
a Poisson number of exponential variates has a Gumbel distribution. So
exponential peaks lead to Gumbel maxima, and GP distribution peaks
lead to GEV maxima. The GEV can be expressed as:
$$F(z) = \exp\left\{-\left[1 - \frac{\delta(z - \gamma)}{\beta}\right]^{1/\delta}\right\}, \quad \delta \neq 0,\ z > 0 \tag{3a}$$
$$F(z) = \exp\left\{-\exp\left[-\frac{z - \gamma}{\beta}\right]\right\}, \quad \delta = 0,\ z > 0 \tag{3b}$$
where the parameters δ, β and γ are independent of z. Furthermore,
δ = a; that is, the shape parameters of the GEV and GP distributions are
the same. Note that Z is not allowed to take on negative values, and
P(Z < 0) = 0 and P(Z = 0) = exp(-λ), with λ the mean of the
Poisson-distributed N, and only for z > 0 is the CDF modelled by the GEV
distribution. This property makes the GP distribution suitable for
modelling flood magnitudes exceeding a fixed threshold.
(4) The properties given in (2) and (3) characterize the GP distribution such
that no other family has either property, making it a practical family for
statistical estimation, provided that the threshold is assumed sufficiently
high.
(5) The failure rate r(x) = f(x)/[1 - F(x)] is expressed as:
$$r(x) = \frac{1}{b - a(x - c)}$$
and is monotonic in x, decreasing if a < 0, constant if a = 0 and
increasing if a > 0.
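Properties (2) and (5) in the list above can be checked numerically. The sketch below is an illustration under stated assumptions: the parameter values, the higher threshold `u` and the excess `y` are hypothetical, and the rescaled scale parameter b - a(u - c) follows from dividing the survival functions of equation (1a), not from a formula printed in the paper.

```python
def gp_survival(x, a, b, c):
    """1 - F(x) for the GP distribution with a != 0, from equation (1a)."""
    return (1.0 - a * (x - c) / b) ** (1.0 / a)

def gp_failure_rate(x, a, b, c):
    """Failure rate r(x) = 1 / [b - a(x - c)]."""
    return 1.0 / (b - a * (x - c))

a, b, c = -0.5, 1.0, 0.0     # illustrative values, not from the paper's tables
u, y = 2.0, 1.5              # hypothetical higher threshold u and excess y

# Threshold stability: P(X - u > y | X > u) is again a GP survival function
# with the same shape a, location 0 and scale b - a(u - c).
conditional = gp_survival(u + y, a, b, c) / gp_survival(u, a, b, c)
rescaled = gp_survival(y, a, b - a * (u - c), 0.0)

# Failure rate: decreasing in x for a < 0 and increasing for a > 0
# (for a > 0 the points must stay below the upper bound c + b/a).
rates_neg = [gp_failure_rate(x, -0.5, 1.0, 0.0) for x in (0.0, 1.0, 2.0)]
rates_pos = [gp_failure_rate(x, 0.5, 1.0, 0.0) for x in (0.0, 0.5, 1.0)]
```

The two survival values agree to rounding error, which is the threshold stability property in numerical form.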
LITERATURE REVIEW
The generalized Pareto (GP) distribution was introduced by Pickands (1975)
and has since been applied to a number of areas including socio-economic
phenomena, physical and biological processes (Saksena & Johnson, 1984),
reliability studies and the analysis of environmental extremes. Davison & Smith
(1990) pointed out that the GP distribution might form the basis of a broad
modelling approach to high-level exceedances. DuMouchel (1983) applied it to
estimate the stable index α to measure tail thickness, whereas Davison (1984a,
1984b) modelled contamination due to long-range atmospheric transport of
radionuclides. Van Montfort & Witter (1985, 1986) and van Montfort & Otten
(1991) applied the GP distribution to model the peaks over a threshold (POT)
streamflows and rainfall series, and Smith (1984, 1987, 1991) applied it to
analyse flood frequencies and wave heights. Similarly, Joe (1987) employed it
to estimate quantiles of the maximum of N observations. Wang (1991) applied
it to develop a POT model for flood peaks with Poisson arrival time, whereas
Rosbjerg et al. (1992) compared the use of the 2-parameter GP and exponential
distributions as distribution models for exceedances with the parent distribution
being a generalized GP distribution. In an extreme value analysis of the flow
of Burbage Brook, Barrett (1992) used the GP distribution to model the POT
flood series with Poisson inter-arrival times. Davison & Smith (1990) presented
a comprehensive analysis of the extremes of data by use of the GP distribution
for modelling the sizes and occurrences of exceedances over high thresholds.
Methods for estimating the parameters of the 2-parameter GP distribution
were reviewed by Hosking & Wallis (1987). Quandt (1966) used the method
of moments (MOM), while Baxter (1980) and Cook & Mumme (1981) used the
method of maximum likelihood estimation (MLE) for the Pareto distribution.
The MOM, MLE and probability weighted moments (PWM) were included in
the review. Van Montfort & Witter (1986) used the MLE to fit the GP
distribution to represent the Dutch POT rainfall series and used an empirical
correction formula to reduce bias of the scale and shape parameter estimates.
Davison & Smith (1990) used the MLE, PWM, a graphical method and least
squares to estimate the GP distribution parameters. Wang (1991) derived the
PWM for both known and unknown thresholds.
OBJECTIVE OF STUDY
The objective of this paper is to develop a new competitive method of
parameter estimation based on the principle of maximum entropy (POME), and
to compare it with the MOM, MLE and PWM using Monte Carlo simulated
data. The review of the literature shows that the POME does not appear to
have been employed for estimating parameters of the GP distribution.
DERIVATION OF PARAMETER ESTIMATION METHOD BY POME
Shannon (1948) defined entropy as a numerical measure of uncertainty, or
conversely the information content associated with a probability distribution,
f(x;θ), with a parameter vector θ and used to describe a random variable X. The
Shannon entropy function H(f) for continuous X can be expressed as:
$$H(f) = -\int f(x;\theta) \ln f(x;\theta)\,dx \quad \text{with} \quad \int f(x;\theta)\,dx = 1 \tag{4}$$
where H(f) is the entropy of f(x;θ), and can be thought of as the mean value of
-ln f(x;θ).
According to Jaynes (1961), the minimally biased distribution of X is the
one which maximizes entropy subject to given information, or which satisfies
the principle of maximum entropy (POME). Therefore, the parameters of the
distribution can be obtained by achieving the maximum of H(f). The use of this
principle for generating the least-biased probability distributions on the basis
of limited and incomplete data has been discussed by several authors and has
been applied to many diverse problems (e.g. a recent review by Singh &
Fiorentino (1992)). Jaynes (1968) has reasoned that the POME is the logical
and rational criterion for choosing some specific f(x;θ) that maximizes H and
satisfies the given information expressed as constraints. In other words, for
given information (e.g. mean, variance, skewness, lower limit, upper limit,
etc.), the distribution derived by the POME would best represent X; implicitly,
this distribution would best represent the sample from which the information
was derived. Inversely, if it is desired to fit a particular probability distribution
to a sample of data, then the POME can uniquely specify the constraints (or the
information) needed to derive that distribution. The distribution parameters are
then related to these constraints. An excellent discussion of the underlying
mathematical rationale is given in Levine & Tribus (1979).
Given m linearly independent constraints C_i, i = 1, 2, ..., m, in the form
$$C_i = \int w_i(x) f(x;\theta)\,dx, \quad i = 1, 2, \ldots, m \tag{5}$$
where w_i(x) are some functions whose averages over f(x;θ) are specified, then
the maximum of H subject to equation (5) is given by the distribution:
$$f(x;\theta) = \exp\left[-a_0 - \sum_{i=1}^{m} a_i w_i(x)\right] \tag{6a}$$
where a_i, i = 0, 1, 2, ..., m, are the Lagrange multipliers, and can be
determined from equations (5) and (6a). Inserting equation (6a) in equation (4)
yields the entropy of f(x;θ) in terms of the constraints and Lagrange multipliers:
$$H(f) = a_0 + \sum_{i=1}^{m} a_i C_i \tag{6b}$$
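The step from equations (4) and (6a) to equation (6b) is short and can be written out as follows. This is a worked restatement using only the symbols already defined, together with the normalization condition of equation (4).

```latex
\begin{aligned}
H(f) &= -\int f(x;\theta)\,\ln f(x;\theta)\,dx
      = \int f(x;\theta)\left[a_0 + \sum_{i=1}^{m} a_i w_i(x)\right]dx \\
     &= a_0 \int f(x;\theta)\,dx + \sum_{i=1}^{m} a_i \int w_i(x)\,f(x;\theta)\,dx
      = a_0 + \sum_{i=1}^{m} a_i C_i .
\end{aligned}
```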
Maximization of H then establishes the relationships between constraints

and Lagrange multipliers. Thus, to derive a method using the POME for the
estimation of the parameters a, b and c of equation (2), three steps are
involved: (i) specification of the appropriate constraints; (ii) derivation of the

entropy of the distribution; and (iii) derivation of the relationships between the
Lagrange multipliers and constraints. A complete mathematical discussion of
this method can be found in Tribus (1969), Jaynes (1968), Levine & Tribus
(1979) and Singh & Rajagopal (1986).
Specification of constraints
The entropy of the GP distribution can be derived by inserting equation (2) in
equation (4):
$$H(f) = \ln b \int f(x;\theta)\,dx - \left(\frac{1}{a} - 1\right) \int \ln\left[1 - \frac{a(x - c)}{b}\right] f(x;\theta)\,dx \tag{6c}$$
Comparing equation (6c) with equation (6b), the constraints appropriate for
equation (2) can be written (Singh & Rajagopal, 1986) as:
$$\int f(x;\theta)\,dx = 1 \tag{7}$$
$$\int \ln\left[1 - \frac{a(x - c)}{b}\right] f(x;\theta)\,dx = E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{8}$$
in which E[·] denotes expectation of the bracketed quantity. These constraints
are unique and specify the information that is sufficient for the GP distribution.
The first constraint specifies the total probability. The second constraint
specifies the mean of the logarithm of the inverse ratio of the scale parameter
to the failure rate. Conceptually, this defines the expected value of the negative
logarithm of the scaled failure rate. The distribution parameters are related to
these constraints.
Construction of the entropy function

The PDF of the GP distribution corresponding to the POME and consistent
with equations (7) and (8) takes the form:
$$f(x;\theta) = \exp\left\{-a_0 - a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{9}$$
where a_0 and a_1 are Lagrange multipliers. The mathematical rationale for
equation (9) has been presented by Tribus (1969).
By applying equation (9) to the total probability condition in equation
(7), one obtains:
$$\exp(a_0) = \int \exp\left\{-a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} dx \tag{10}$$
which yields the partition function:
$$\exp(a_0) = \frac{b}{a}\,\frac{1}{1 - a_1} \tag{11}$$
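The zeroth Lagrange multiplier is given by:
$$a_0 = \ln\left[\frac{b}{a}\,\frac{1}{1 - a_1}\right] \tag{12}$$
Inserting equation (11) in equation (9) yields:
$$f(x;\theta) = \frac{a(1 - a_1)}{b}\left[1 - \frac{a(x - c)}{b}\right]^{-a_1} \tag{13}$$
A comparison of equation (13) with equation (2) yields:
$$\frac{1}{1 - a_1} = a \tag{14}$$
Taking logarithms of equation (13) gives:
$$\ln f(x;\theta) = \ln a + \ln(1 - a_1) - \ln b - a_1 \ln\left[1 - \frac{a(x - c)}{b}\right] \tag{15}$$
Therefore, the entropy H(f) of the GP distribution follows:
$$H(f) = -\ln a - \ln(1 - a_1) + \ln b + a_1 E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} \tag{16}$$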
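As a check on equation (16), the expectation can be evaluated directly. The sketch below assumes only that U = 1 - F(X) is uniformly distributed on (0, 1), a standard fact that is not stated in the paper, and uses a_1 = 1 - 1/a from equation (14). The resulting closed form reduces to 1 + ln b in the exponential case (a = 0) and to ln b in the uniform case (a = 1).

```latex
\begin{aligned}
U &= 1 - F(X) = \left[1 - \frac{a(X - c)}{b}\right]^{1/a} \sim \text{Uniform}(0,1), \\
E\left\{\ln\left[1 - \frac{a(X - c)}{b}\right]\right\}
  &= a\,E[\ln U] = a\int_0^1 \ln u\,du = -a, \\
H(f) &= -\ln\left[a(1 - a_1)\right] + \ln b + a_1(-a)
      = \ln b - a\left(1 - \frac{1}{a}\right) = \ln b + 1 - a .
\end{aligned}
```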
Relationships between distribution parameters and constraints

According to Singh & Rajagopal (1986), the relationships between the
distribution parameters and constraints are obtained by taking partial derivatives
of the entropy H(f) with respect to the Lagrange multipliers as well as the
distribution parameters, and then equating these derivatives to zero, and making
use of the constraints. To that end, taking partial derivatives of equation (16)
with respect to a_1, a, b and c separately and equating each derivative to zero
yields:
$$\frac{\partial H}{\partial a_1} = \frac{1}{1 - a_1} + E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = 0 \tag{17}$$
$$\frac{\partial H}{\partial a} = -\frac{1}{a} - a_1 E\left[\frac{(x - c)/b}{1 - a(x - c)/b}\right] = 0 \tag{18}$$
$$\frac{\partial H}{\partial b} = \frac{1}{b} + \frac{a a_1}{b} E\left[\frac{(x - c)/b}{1 - a(x - c)/b}\right] = 0 \tag{19}$$
$$\frac{\partial H}{\partial c} = \frac{a a_1}{b} E\left[\frac{1}{1 - a(x - c)/b}\right] = 0 \tag{20}$$
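Simplification of equations (17) to (20) yields, respectively:
$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -\frac{1}{1 - a_1} \tag{21}$$
$$E\left[\frac{(x - c)/b}{1 - a(x - c)/b}\right] = -\frac{1}{a a_1} \tag{22}$$
$$E\left[\frac{(x - c)/b}{1 - a(x - c)/b}\right] = -\frac{1}{a a_1} \tag{23}$$
$$E\left[\frac{1}{1 - a(x - c)/b}\right] = 0 \tag{24}$$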
Clearly, equation (24) does not hold. Equation (22) is the same as equation
(23). In order to get a unique solution, additional equations are needed which
can be obtained by differentiating the zeroth Lagrange multiplier with respect
to the Lagrange multipliers and equating the derivatives to zero. To that end,
equation (10) is written as:
$$a_0 = \ln \int \exp\left\{-a_1 \ln\left[1 - \frac{a(x - c)}{b}\right]\right\} dx \tag{25}$$
Differentiating equation (25) with respect to a_1:
$$\frac{\partial a_0}{\partial a_1} = -\frac{\int \exp\{-a_1 \ln[1 - a(x - c)/b]\}\,\ln[1 - a(x - c)/b]\,dx}{\int \exp\{-a_1 \ln[1 - a(x - c)/b]\}\,dx} = -\int \exp\{-a_0 - a_1 \ln[1 - a(x - c)/b]\}\,\ln[1 - a(x - c)/b]\,dx = -E\{\ln[1 - a(x - c)/b]\} \tag{26}$$
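Following Tribus (1969):
$$\frac{\partial^2 a_0}{\partial a_1^2} = \mathrm{var}\{\ln[1 - a(x - c)/b]\} \tag{27}$$
where var[·] is the variance of the bracketed quantity. From equation (11):
$$a_0 = \ln(b/a) - \ln(1 - a_1) \tag{28}$$
Differentiating equation (28) with respect to a_1:
$$\frac{\partial a_0}{\partial a_1} = \frac{1}{1 - a_1} \tag{29}$$
$$\frac{\partial^2 a_0}{\partial a_1^2} = \frac{1}{(1 - a_1)^2} \tag{30}$$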
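Equating equation (29) to equation (26) leads to:
$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -\frac{1}{1 - a_1} \tag{31}$$
which is the same as equation (21). When equation (30) is equated to equation
(27), the following is obtained:
$$\mathrm{var}\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = \frac{1}{(1 - a_1)^2} \tag{32}$$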
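Therefore, the parameter estimation equations for the POME consist of
equations (21), (22) and (32). Inserting a_1 = 1 - 1/a from equation (14) into
these three equations, one gets:
$$E\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = -a \tag{33}$$
$$E\left[\frac{(x - c)/b}{1 - a(x - c)/b}\right] = \frac{1}{1 - a} \tag{34}$$
$$\mathrm{var}\left\{\ln\left[1 - \frac{a(x - c)}{b}\right]\right\} = a^2 \tag{35}$$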
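One way to see what the three POME estimation equations ask for is to evaluate their residuals on a sample. The sketch below is an illustration, not the authors' solution procedure: it assumes the right-hand sides -a, 1/(1 - a) and a^2 that follow from a_1 = 1 - 1/a, replaces expectations by sample averages, and uses hypothetical helper names `gp_sample` and `pome_residuals`. A root finder of the reader's choice would drive these residuals to zero in a, b and c.

```python
import math
import random

def gp_sample(n, a, b, c, rng):
    """Inverse-transform sampling for a != 0: x = c + b[1 - (1 - F)^a]/a."""
    return [c + b * (1.0 - (1.0 - rng.random()) ** a) / a for _ in range(n)]

def pome_residuals(xs, a, b, c):
    """Left side minus right side of equations (33), (34) and (35),
    with expectations replaced by sample averages (assumption of this sketch)."""
    n = len(xs)
    u = [1.0 - a * (x - c) / b for x in xs]          # must stay positive
    logs = [math.log(ui) for ui in u]
    mean_log = sum(logs) / n
    var_log = sum((v - mean_log) ** 2 for v in logs) / n
    mean_ratio = sum(((x - c) / b) / ui for x, ui in zip(xs, u)) / n
    return (mean_log + a,                     # equation (33)
            mean_ratio - 1.0 / (1.0 - a),     # equation (34)
            var_log - a * a)                  # equation (35)

rng = random.Random(2024)                     # fixed seed for repeatability
sample = gp_sample(50000, 0.2, 1.0, 0.5, rng) # illustrative parameters
residuals = pome_residuals(sample, 0.2, 1.0, 0.5)
```

At the parameters used to generate the sample, all three residuals should be close to zero, up to sampling error.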
THREE OTHER METHODS OF PARAMETER ESTIMATION
Three of the most popular methods of parameter estimation are the method of
moments (MOM), the method of probability-weighted moments (PWM), and
the method of maximum likelihood estimation (MLE). The POME does not
appear to have been used for estimating parameters of the GP distribution.
Therefore, virtually no literature exists on the comparison of parameter
estimates by the POME with those by the MLE, PWM and MOM. For the
sake of completeness, these methods are briefly summarized.
Method of moments (MOM)

Moment estimators of the GP distribution were derived by Hosking & Wallis
(1987). Note that E{[1 - a(x - c)/b]^r} = 1/(1 + ar) if 1 + ra > 0. The rth
moment of X exists if a > -1/r. Provided that they exist, the moment
estimators are:
$$\bar{x} = c + \frac{b}{1 + a} \tag{36}$$
$$S^2 = \frac{b^2}{(1 + a)^2 (1 + 2a)} \tag{37}$$
$$G = \frac{2(1 - a)(1 + 2a)^{0.5}}{1 + 3a} \tag{38}$$
where x̄, S² and G are the mean, variance and skewness, respectively. First,
the moment estimate of a is obtained by solving equation (38). The relation
between G and a is illustrated in Fig. 2. With a calculated, b and c follow
from equations (37) and (36) as:
$$b = S(1 + a)(1 + 2a)^{0.5} \tag{39}$$
$$c = \bar{x} - \frac{b}{1 + a} \tag{40}$$
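The moment estimation steps can be sketched as follows. This is a minimal illustration that assumes the skewness relation G = 2(1 - a)(1 + 2a)^0.5/(1 + 3a) and solves it for a by bisection, a choice made here because G decreases monotonically in a for a > -1/3; the paper does not say which solver was used, and the function names are hypothetical.

```python
import math

def skewness_of_shape(a):
    """Skewness G as a function of the shape parameter a (a > -1/3), equation (38)."""
    return 2.0 * (1.0 - a) * math.sqrt(1.0 + 2.0 * a) / (1.0 + 3.0 * a)

def mom_estimates(mean, std, skew, lo=-1.0 / 3.0 + 1e-9, hi=50.0, iters=200):
    """Solve equation (38) for a by bisection, then apply equations (39) and (40).

    The bracket [lo, hi] is an assumption of this sketch; it covers skewness
    values between roughly -6.5 and very large positive values.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if skewness_of_shape(mid) > skew:
            lo = mid          # G decreases with a, so the root lies to the right
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = std * (1.0 + a) * math.sqrt(1.0 + 2.0 * a)   # equation (39)
    c = mean - b / (1.0 + a)                          # equation (40)
    return a, b, c

# Population moments with mean 1, standard deviation 0.5 and skewness 0.5.
a_hat, b_hat, c_hat = mom_estimates(1.0, 0.5, 0.5)
```

For a mean of 1, a standard deviation of 0.5 and a skewness of 0.5, the estimates come out near a ≈ 0.54, b ≈ 1.12 and c ≈ 0.28.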
Probability-weighted moments (PWM)

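The PWM estimators for the GP distribution (Hosking & Wallis, 1987) are
given as:
$$a = \frac{W_0 - 8W_1 + 9W_2}{-W_0 + 4W_1 - 3W_2} \tag{41}$$
$$b = \frac{(W_0 - 2W_1)(W_0 - 3W_2)(4W_1 - 6W_2)}{(-W_0 + 4W_1 - 3W_2)^2} \tag{42}$$
$$c = \frac{2W_0 W_1 - 6W_0 W_2 + 6W_1 W_2}{-W_0 + 4W_1 - 3W_2} \tag{43}$$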
Fig. 2 Parameter a vs skewness G for GPD3.
where the rth probability-weighted moment W_r is:
$$W_r = E\{x(F)[1 - F(x)]^r\} = \int_0^1 \left\{c + \frac{b}{a}\left[1 - (1 - F)^a\right]\right\}(1 - F)^r\,dF = \frac{1}{r + 1}\left[c + \frac{b}{a + r + 1}\right], \quad r = 0, 1, 2, \ldots \tag{44}$$
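A round trip through the PWM relations gives a quick consistency check. The sketch below is illustrative only: `gp_pwm` uses the closed form W_r = [c + b/(a + r + 1)]/(r + 1), and `pwm_estimates` inverts W_0, W_1 and W_2 with the sign of the b numerator chosen so that the round trip returns a positive scale. Both function names are hypothetical, and in practice the W_r would be replaced by sample estimates.

```python
def gp_pwm(r, a, b, c):
    """Population PWM W_r = [c + b/(a + r + 1)] / (r + 1), equation (44)."""
    return (c + b / (a + r + 1.0)) / (r + 1.0)

def pwm_estimates(w0, w1, w2):
    """Recover (a, b, c) from the first three probability-weighted moments."""
    d = -w0 + 4.0 * w1 - 3.0 * w2                     # common denominator
    a = (w0 - 8.0 * w1 + 9.0 * w2) / d
    b = (w0 - 2.0 * w1) * (w0 - 3.0 * w2) * (4.0 * w1 - 6.0 * w2) / d ** 2
    c = (2.0 * w0 * w1 - 6.0 * w0 * w2 + 6.0 * w1 * w2) / d
    return a, b, c

# Round trip with illustrative parameter values.
true_params = (0.544, 1.116, 0.277)
moments = [gp_pwm(r, *true_params) for r in range(3)]
recovered = pwm_estimates(*moments)
```

Feeding the exact population moments back into `pwm_estimates` should reproduce the parameters to rounding error.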
Method of maximum likelihood estimation

The MLE estimators can be expressed as:
$$\sum_{i=1}^{n} \frac{(x_i - c)/b}{1 - a(x_i - c)/b} = \frac{n}{1 - a} \tag{45}$$
$$\sum_{i=1}^{n} \ln[1 - a(x_i - c)/b] = -na \tag{46}$$
A maximum likelihood estimator cannot be obtained for c, because the
likelihood function is unbounded with respect to c, as shown in Fig. 3. Since
c is the lower bound of the random variable X, we may use the constraint
c ≤ x_1, the lowest sample value. Clearly, the likelihood function is maximum
with respect to c when c = x_1.
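The behaviour of the likelihood in c can be illustrated with a small sketch. It assumes the log-likelihood ln L = -n ln b + (1/a - 1) Σ ln[1 - a(x_i - c)/b] implied by the PDF for a ≠ 0; the sample below is hypothetical, and a and b are held fixed at illustrative values rather than estimated.

```python
import math

def gp_loglik(xs, a, b, c):
    """Log-likelihood of the 3-parameter GP distribution for a != 0."""
    total = -len(xs) * math.log(b)
    for x in xs:
        total += (1.0 / a - 1.0) * math.log(1.0 - a * (x - c) / b)
    return total

# Hypothetical sample of size 10; a and b are fixed for illustration.
xs = [0.57, 0.61, 0.66, 0.70, 0.78, 0.85, 0.97, 1.10, 1.34, 1.90]
a, b = -0.116, 0.387

# Evaluate the log-likelihood on c = 0.0, 0.1, ..., 0.5 and at the lowest value.
grid = [0.1 * k for k in range(6)] + [min(xs)]
values = [gp_loglik(xs, a, b, c) for c in grid]
```

For a < 1 the log-likelihood increases with c over the admissible range, so on this grid the largest value occurs at c equal to the lowest sample value.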
Fig. 3 Likelihood function of GPD3 vs parameter c for sample size 10
(line: a = -0.116, b = 0.387, c = 0.562; dash: a = 0.544, b = 1.116, c = 0.277).
APPLICATION TO MONTE CARLO-SIMULATED DATA

Monte Carlo samples
To assess the performance of the POME estimation method by comparison with
the MOM, PWM and MLE, Monte Carlo sampling experiments were conducted. Two distribution population cases, listed in Table 1, were considered.
For each population case, 1000 random samples of size 20, 50 and 100 were
generated, and then parameters and quantiles were estimated.
Table 1 GP distribution population cases considered in the sampling experiment

GP distribution                       Parameters
population         Cv      G       a        b        c
Case 1             0.5     0.5     0.554    1.116    0.277
Case 2             0.5     2.5    -0.069    0.433    0.536

Cv = coefficient of variation; G = skewness.
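The sampling experiment can be sketched with inverse-transform sampling, x = c + b[1 - (1 - F)^a]/a with F uniform on (0, 1). The paper does not describe its random number generator, so the generator, the seed and the function names below are assumptions made for illustration; the parameter values are those listed in Table 1.

```python
import random

# Population cases as listed in Table 1: (a, b, c).
CASES = {"Case 1": (0.554, 1.116, 0.277), "Case 2": (-0.069, 0.433, 0.536)}

def gp_quantile(p, a, b, c):
    """Inverse of the GP distribution function for a != 0."""
    return c + b * (1.0 - (1.0 - p) ** a) / a

def monte_carlo_samples(a, b, c, size, n_samples, seed=0):
    """Generate n_samples independent GP samples, each of the given size."""
    rng = random.Random(seed)
    return [[gp_quantile(rng.random(), a, b, c) for _ in range(size)]
            for _ in range(n_samples)]

# 1000 samples of size 20 for the first population case.
samples = monte_carlo_samples(*CASES["Case 1"], size=20, n_samples=1000, seed=1)
grand_mean = sum(sum(s) for s in samples) / (20 * 1000)
```

Each inner list is one Monte Carlo sample, to which any of the four estimation methods could then be applied.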
Performance indices
The performance of the POME was evaluated using the following performance
indices:
Standard bias
$$\text{BIAS} = \frac{E(\hat{x}) - x}{x} \tag{47}$$
Root mean square error
$$\text{RMSE} = \frac{\{E[(\hat{x} - x)^2]\}^{0.5}}{x} \tag{48}$$
where x̂ is an estimate of x (parameter or quantile) and:
$$E(\hat{x}) = \frac{1}{N}\sum_{i=1}^{N} \hat{x}_i \tag{49}$$
where N is the number of Monte Carlo samples (N = 1000 in this study). A total
of 1000 samples may arguably not be large enough to produce the true
values of BIAS and RMSE, but will suffice to compare the performances of the
estimation methods.
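The two performance indices are simple to compute once the Monte Carlo estimates are available. The sketch below assumes that both indices are standardized by the true value x, which is consistent with the negative RMSE entries reported for the negative shape parameter; the function names and the example numbers are hypothetical.

```python
import math

def standardized_bias(estimates, true_value):
    """BIAS = [E(x_hat) - x] / x, with E taken as the average over samples."""
    mean_est = sum(estimates) / len(estimates)
    return (mean_est - true_value) / true_value

def standardized_rmse(estimates, true_value):
    """RMSE = {E[(x_hat - x)^2]}^0.5 / x, with E taken as the sample average."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / true_value

# Hypothetical estimates of a quantity whose true value is 1.0.
estimates = [1.1, 0.9, 1.0, 1.2]
bias = standardized_bias(estimates, 1.0)
rmse = standardized_rmse(estimates, 1.0)
```

Because of the division by x, both indices change sign when the true value is negative, as for the shape parameter a = -0.069.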
BIAS in parameter estimation

The bias of parameters estimated by the four methods is summarized in
Table 2. For G = 0.5, in absolute terms the MOM produced the least bias of
the four methods for all sample sizes. The MLE had the second least bias in
the parameter estimates. With increasing sample size, there was significant
reduction in bias for all four methods. The POME produced less bias than the
PWM in estimates of b and c for all sample sizes, but that was not uniformly
true in the case of the estimate of parameter a. When G = 2.5, these methods
performed quite differently. For all sample sizes, the MLE and the POME
were comparable, producing the least bias. For the a and c parameter
estimates, the POME had the least bias, but the MLE had the least bias for the
b parameter estimate. The PWM had the highest bias in all three parameter
estimates for all sample sizes. Thus, if the value of G is high, the POME or
MLE may be the preferred method. For lower values of G, the MOM or MLE
may be preferable, especially when the sample size is small.

Table 2 BIAS of parameter estimates

                              G = 0.5                       G = 2.5
Sample size  Method      a        b        c           a        b        c
20           MOM       0.156    0.094   -0.053      -4.144    0.509   -0.143
             PWM       0.488    0.632   -0.948      -9.141    1.799   -0.584
             MLE       0.217    0.037    0.215       0.474   -0.077    0.034
             POME     -0.397   -0.122   -0.094       0.013    0.147   -0.094
50           MOM       0.063    0.042   -0.025      -1.981    0.260   -0.085
             PWM       0.230    0.258   -0.396      -3.821    0.626   -0.231
             MLE       0.132    0.060    0.067       0.244   -0.024    0.009
             POME     -0.407   -0.096   -0.156       0.009    0.115   -0.079
100          MOM       0.040    0.028   -0.019      -1.196    0.165   -0.057
             PWM       0.132    0.138   -0.208      -1.964    0.304   -0.116
             MLE       0.086    0.048    0.039       0.185   -0.017    0.008
             POME     -0.288   -0.060   -0.126       0.012    0.099   -0.068
RMSE in parameter estimation

The values of RMSE of parameters estimated by the four methods are given in
Table 3. For G = 0.5, of the four methods the MOM produced the least
RMSE in the a parameter estimate. However, as the sample size increased, the
MOM, PWM and MLE became comparable. In the cases of the b and c
parameter estimates, the MLE had the least RMSE, but all four methods were
comparable. For G = 2.5, the comparative behaviour of the four methods was
markedly different. In absolute terms, the MOM and the PWM produced the
highest RMSE in parameter estimates for all sample sizes, with the POME
having the least RMSE in the a parameter estimate but the MLE in the b and c
parameter estimates. Thus, it may be concluded that for lower values of G, the
MOM or PWM may be the preferred method, but for higher values of G, the
MLE or POME is the preferred method.

Table 3 RMSE of parameter estimates

                              G = 0.5                       G = 2.5
Sample size  Method      a        b        c           a        b        c
20           MOM       0.448    0.310    0.336      -5.178    0.688    0.205
             PWM       0.780    0.820    0.984     -10.990    2.005    0.593
             MLE       0.502    0.284    0.357      -1.926    2.580    0.053
             POME      0.785    0.371    0.348      -0.067    0.394    0.182
50           MOM       0.301    0.213    0.201      -2.785    0.376    0.120
             PWM       0.419    0.365    0.427      -4.830    0.710    0.236
             MLE       0.329    0.234    0.146      -1.475    0.177    0.019
             POME      0.696    0.271    0.262      -0.061    0.250    0.125
100          MOM       0.203    0.144    0.139      -1.925    0.249    0.083
             PWM       0.268    0.211    0.237      -2.710    0.360    0.121
             MLE       0.224    0.176    0.056      -1.205    0.127    0.011
             POME      0.590    0.233    0.185      -0.061    0.181    0.097

BIAS in quantile estimation

The results of bias in quantile estimates by the GP distribution are summarized
in Table 4. The performance of the four estimation methods varied with the
value of G, and probability of non-exceedance P. For G = 0.5, all four
methods had comparable bias for P < 0.9 for all sample sizes. When P >
0.99, the MOM and the PWM produced the smallest bias and the POME the
highest, with the MLE in the intermediate range. However, for G = 2.5, the
POME produced the least bias, especially when P was greater than 0.99. For
all sample sizes, all four methods were somewhat comparable. In conclusion,
for lower values of G, any one of the four methods may be used for P < 0.99,
but the PWM, MOM or MLE may be preferable for P exceeding 0.99. For
higher values of G, all four methods were comparable, but for P exceeding
0.99 the POME is the preferred method.

Table 4 BIAS and RMSE of quantile estimates

                                    G = 0.5             G = 2.5
P       Sample size  Method     BIAS     RMSE       BIAS     RMSE
0.8     20           MOM        0.000    0.112      0.058    0.172
                     PWM        0.091    0.152      0.169    0.224
                     MLE       -0.011    0.118     -0.018    0.134
                     POME      -0.030    0.128      0.046    0.176
        50           MOM        0.001    0.078      0.037    0.107
                     PWM        0.041    0.090      0.083    0.125
                     MLE        0.010    0.093     -0.004    0.090
                     POME       0.000    0.076      0.033    0.109
        100          MOM       -0.012    0.098      0.024    0.197
                     PWM        0.076    0.131      0.115    0.225
                     MLE       -0.037    0.153     -0.068    0.221
                     POME       0.031    0.151      0.068    0.221
0.9     20           MOM       -0.021    0.149      0.015    0.273
                     PWM        0.153    0.231      0.186    0.348
                     MLE       -0.082    0.157     -0.047    0.224
                     POME      -0.062    0.158      0.064    0.271
        50           MOM       -0.004    0.065      0.024    0.123
                     PWM        0.032    0.074      0.063    0.131
                     MLE       -0.005    0.072     -0.001    0.106
                     POME       0.066    0.126      0.051    0.137
        100          MOM        0.000    0.043      0.019    0.084
                     PWM        0.018    0.048      0.038    0.085
                     MLE        0.004    0.055      0.001    0.073
                     POME       0.048    0.092      0.044    0.098
0.99    20           MOM       -0.026    0.113     -0.131    0.309
                     PWM        0.036    0.153     -0.129    0.372
                     MLE       -0.067    0.128      0.031    0.286
                     POME       0.287    0.484      0.104    0.297
        50           MOM       -0.009    0.070     -0.059    0.205
                     PWM        0.007    0.087     -0.074    0.235
                     MLE       -0.029    0.063      0.031    0.203
                     POME       0.323    0.491      0.080    0.186
        100          MOM       -0.005    0.048     -0.031    0.154
                     PWM        0.002    0.060     -0.065    0.165
                     MLE        0.014    0.039      0.023    0.150
                     POME       0.230    0.399      0.070    0.135
0.999   20           MOM       -0.022    0.141     -0.266    0.427
                     PWM        0.028    0.192     -0.296    0.600
                     MLE       -0.063    0.174      0.152    0.572
                     POME       0.582    0.888      0.121    0.332
        50           MOM       -0.005    0.090     -0.141    0.310
                     PWM        0.000    0.113     -0.198    0.393
                     MLE       -0.034    0.079      0.100    0.406
                     POME       0.612    0.906      0.093    0.207
        100          MOM       -0.004    0.063     -0.083    0.252
                     PWM       -0.004    0.078     -0.120    0.289
                     MLE       -0.019    0.047      0.069    0.295
                     POME       0.439    0.474      0.081    0.151
RMSE in quantile estimation

The values of RMSE in quantile estimates for the four methods are given in
Table 4. For G = 0.5 and P < 0.9, all four methods produced comparable
values of RMSE for all sample sizes; for P > 0.99, the performance of the
POME deteriorated. When G = 2.5, all methods produced comparable values
of RMSE for all sample sizes for P < 0.9; for P > 0.99 the POME had the
least RMSE. Thus, it is inferred that the MOM, PWM or MLE may be used
for smaller values of G, but for higher values of G, the POME may be the
preferred method.
CONCLUSIONS
The following conclusions can be drawn from this study: (1) the POME offers
an alternative method for estimating the parameters of the 3-parameter
generalized Pareto distribution; (2) when the skewness was high (G = 2.5), the
POME yielded superior parameter estimates; (3) for low skewness (G = 0.5),
the POME was better in parameter estimates than the MLE and PWM but
worse than the MOM; however, for large sample size, its performance
improved significantly; (4) the POME produced either better or comparable
quantile estimates as compared with the MOM, MLE and PWM for high
skewness (G = 2.5); (5) for low skewness (G = 0.5), the POME was
comparable to the MOM, the MLE and the PWM for lower probabilities of
non-exceedance, whereas for higher values the MOM or PWM was better than the
POME.
REFERENCES
Barrett, J. H. (1992) An extreme value analysis of the flow of Burbage Brook. Stochastic Hydrol. Hydraul.
6, 151-165.
Baxter, M. A. (1980) Minimum variance unbiased estimation of the parameter of the Pareto distribution.
Biometrika 27, 133-138.
Cook, W. L. & Mumme, D. C. (1981) Estimation of Pareto parameters by numerical methods. In:
Statistical Distributions in Scientific Work, ed. C. Taillie et al. 5, 127-132.
Davison, A. C. (1984a) Modelling excesses over high thresholds, with an application. In: Statistical
Extremes and Applications, ed. J. Tiago de Oliveira, 461-482. Reidel, Dordrecht, The Netherlands.
Davison, A. C. (1984b) A statistical model for contamination due to long-range atmospheric transport of
radionuclides. PhD thesis, Department of Mathematics, Imperial College of Science and
Technology, London, UK.
Davison, A. C. & Smith, R. L. (1990) Models for exceedances over high thresholds. J. Roy. Statist. Soc.
B 52(3), 393-442.
DuMouchel, W. (1983) Estimating the stable index α in order to measure tail thickness. Ann. Statist. 11,
1019-1036.
Hosking, J. R. M. & Wallis, J. R. (1987) Parameter and quantile estimation for the generalized Pareto
distribution. Technometrics 29(3), 339-349.
Jaynes, E. T. (1961) Probability Theory in Science and Engineering. McGraw-Hill, New York, USA.
Jaynes, E. T. (1968) Prior probabilities. IEEE Trans. Syst. Man. Cybern. 3(SSC-4), 227-241.
Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological
elements. Quart. J. Roy. Meteorol. Soc. 81, 158-171.
Jin, M. & Stedinger, J. R. (1989) Partial duration series analysis for a GEV annual flood distribution with
systematic and historical flood information (unpublished paper). Department of Civil Engineering,
Pennsylvania State University, State College, PA, USA.
Joe, H. (1987) Estimation of quantiles of the maximum of N observations. Biometrika 74, 347-354.
Levine, R. D. & Tribus, M. (1979) The Maximum Entropy Formalism. MIT Press, Cambridge,
Massachusetts, USA.
Pickands, J. (1975) Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131.
Quandt, R. E. (1966) Old and new methods of estimation of the Pareto distribution. Biometrika 10, 55-82.
Rosbjerg, D., Madsen, H. & Rasmussen, P. F. (1992) Prediction in partial duration series with generalized
Pareto-distributed exceedances. Wat. Resour. Res. 28(11), 3001-3010.
Saksena, S. K. & Johnson, A. M. (1984) Best unbiased estimators for the parameters of a two-parameter
Pareto distribution. Biometrika 31, 77-83.
Shannon, C. E. (1948) The mathematical theory of communication, I-IV. Bell System Tech. J. 27, 279-428,
612-656.
Singh, V. P. & Fiorentino, M. (1992) A historical perspective of entropy applications in water resources.
In: Entropy and Energy Dissipation in Water Resources, ed. V. P. Singh & M. Fiorentino, 21-61.
Kluwer, Dordrecht, The Netherlands.
Singh, V. P. & Rajagopal, A. K. (1986) A new method of parameter estimation for hydrologic frequency
analysis. Hydrol. Sci. Technol. 2(3), 33-40.
Smith, J. A. (1991) Estimating the upper tail of flood frequency distributions. Wat. Resour. Res. 23(8),
1657-1666.
Smith, R. L. (1984) Threshold methods for sample extremes. In: Statistical Extremes and Applications, ed.
J. Tiago de Oliveira, 621-638. Reidel, Dordrecht, The Netherlands.
Smith, R. L. (1987) Estimating tails of probability distributions. Ann. Statist., 15, 1174-1207.
Tribus, M. (1969) Rational Descriptions, Decisions and Designs. Pergamon, New York, USA.
van Montfort, M. A. J. & Witter, J. V. (1985) Testing exponentiality against generalized Pareto
distribution. J. Hydrol. 78, 305-315.
van Montfort, M. A. J. & Witter, J. V. (1986) The generalized pareto distribution applied to rainfall
depths. Hydrol. Sci. J. 31(2), 151-162.
van Montfort, M. A. J. & Otten, A. (1991) The first and the second e of the extreme value distribution,
EV1. Stochastic Hydrol. Hydraul. 5, 69-76.
Wang, Q. J. (1990) Studies on statistical methods of flood frequency analysis. PhD dissertation, National
University of Ireland, Galway, Ireland.
Wang, Q. J. (1991) The POT model described by the generalized Pareto distribution with Poisson arrival
rate. J. Hydrol. 129, 263-280.
Received 8 February 1993; 22 September 1994
