Abstract. We introduce and study the beta exponentiated Nadarajah-Haghighi model, which
has increasing, decreasing, upside-down bathtub and bathtub shaped hazard functions. Some of
its mathematical properties are determined including a power series for the quantile function. We
perform a Monte Carlo simulation study to assess the finite sample behavior of the maximum
likelihood estimates of the parameters. We define a new regression model based on the new
distribution. The potential of this regression model is demonstrated empirically by means of a real
dataset from a diabetic retinopathy study.
1. Introduction
The exponential distribution is the first lifetime model for which statistical methods were
extensively developed in the life testing literature. In many applied sciences such as medicine,
engineering and finance, among others, modeling and analyzing lifetime data are crucial. Several
lifetime distributions have been used to model these types of data, including the exponential,
Weibull, gamma and Rayleigh distributions and their generalizations (see, e.g., [1, 9]). Each
distribution has its own characteristics, due specifically to the shape of its hazard rate function
(hrf), which can be monotonically decreasing or increasing, bathtub-shaped or unimodal.
Nadarajah and Haghighi [10] introduced an extension of the exponential distribution as an alter-
native to the gamma, Weibull and exponentiated exponential (EE) distributions. The cumulative
distribution function (cdf) of the Nadarajah-Haghighi (NH) distribution is
\[ G(x) = 1 - e^{1-(1+\lambda x)^{\alpha}}, \quad x > 0, \tag{1.1} \]
where λ > 0 and α > 0 are scale and shape parameters, respectively. The corresponding probability
density function (pdf) and hrf are
\[ g(x) = \alpha \lambda (1+\lambda x)^{\alpha-1} e^{1-(1+\lambda x)^{\alpha}} \tag{1.2} \]
and $h(x) = \alpha \lambda (1+\lambda x)^{\alpha-1}$, respectively.
Equation (1.2) has two parameters, like the gamma, Weibull and EE distributions. Note also
that the NH model has a closed-form survival function and hrf, as do the Weibull and EE
distributions. For α = 1, it becomes the exponential distribution. For general properties of the
NH model, the reader is referred to [10]. The NH distribution has its mode at zero and allows for
increasing (α > 1), decreasing (α < 1) and constant (α = 1) hrfs.
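The NH quantities above can be sketched in a few lines of Python. This is only an illustrative sketch of (1.1), (1.2) and the hrf; the function names `nh_cdf`, `nh_pdf` and `nh_hrf` are our own labels and do not come from [10].

```python
import math

def nh_cdf(x, lam, alpha):
    """NH cdf (1.1): 1 - exp(1 - (1 + lam*x)**alpha), for x > 0."""
    return 1.0 - math.exp(1.0 - (1.0 + lam * x) ** alpha)

def nh_pdf(x, lam, alpha):
    """NH pdf (1.2)."""
    t = 1.0 + lam * x
    return alpha * lam * t ** (alpha - 1.0) * math.exp(1.0 - t ** alpha)

def nh_hrf(x, lam, alpha):
    """NH hazard rate: alpha * lam * (1 + lam*x)**(alpha - 1)."""
    return alpha * lam * (1.0 + lam * x) ** (alpha - 1.0)
```

With alpha = 1 the three functions reduce to those of the exponential distribution with rate lam, and the sign of alpha - 1 decides whether the hazard increases or decreases in x.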
Lemonte [8] proposed a three-parameter generalization of the NH distribution called the expo-
nentiated NH (ENH) model with cdf
\[ G(x; \alpha, \lambda, \theta) = \left\{ 1 - e^{1-(1+\lambda x)^{\alpha}} \right\}^{\theta}, \tag{1.3} \]
where α > 0 and θ > 0 are shape parameters and λ > 0 is the scale parameter. The ENH density
function is
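\[ g(x; \alpha, \lambda, \theta) = \alpha \lambda \theta (1+\lambda x)^{\alpha-1} e^{1-(1+\lambda x)^{\alpha}} \left\{ 1 - e^{1-(1+\lambda x)^{\alpha}} \right\}^{\theta-1}. \tag{1.4} \]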
The beta generalized family pioneered by Eugene et al. [3] includes nearly all well-known
models as special or limiting cases, such as the exponentiated distributions. It is a rich class of
generalized distributions, which allows for greater flexibility of the tails and can be widely applied
in many areas such as engineering, biology and medicine, among others. One major benefit of this
family is its ability to fit skewed data that cannot be properly fitted by existing distributions.
Several models have been investigated in the beta family in the last fifteen years. In fact,
this class has been studied in the literature for some special baselines, among them, we cite the
beta normal (BN) [3], beta exponential (BE) [9], beta generalized half-normal (BGHN) [11], beta
generalized exponential (BGE) [13], beta generalized Weibull (BGW) [12], beta gamma (BG) [7]
and beta exponentiated Weibull (BEW) [1] distributions.
The cdf of the beta-G class takes the form
\[ F(x) = \frac{1}{B(a,b)} \int_0^{G(x)} w^{a-1} (1-w)^{b-1}\, dw = \frac{B(G(x); a, b)}{B(a,b)}, \tag{1.5} \]
where G(x) denotes the baseline cdf, a > 0 and b > 0 are two additional parameters,
$B(a,b) = \int_0^1 t^{a-1} (1-t)^{b-1}\, dt$ is the beta function and
$B(z; a, b) = \int_0^z t^{a-1} (1-t)^{b-1}\, dt$ is the incomplete beta function. The role of these
parameters is to control skewness and vary tail weights of the generated model.
The pdf corresponding to (1.5) is
\[ f(x) = \frac{1}{B(a,b)}\, G(x)^{a-1} \{1 - G(x)\}^{b-1}\, g(x), \tag{1.6} \]
where $g(x) = dG(x)/dx$. The density f(x) will be most tractable when G(x) and g(x) have simple
forms.
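As an illustration of (1.6), the following Python sketch builds a beta-G density from any baseline pair (G, g). The names `beta_fn`, `beta_g_pdf`, `exp_cdf` and `exp_pdf` are hypothetical, and the unit exponential baseline is chosen only as a convenient example.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def beta_g_pdf(x, a, b, G, g):
    """Beta-G density (1.6) for a baseline cdf G and baseline pdf g."""
    Gx = G(x)
    return Gx ** (a - 1.0) * (1.0 - Gx) ** (b - 1.0) * g(x) / beta_fn(a, b)

# Example baseline (an assumption for illustration): unit exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)
```

Setting a = b = 1 returns the baseline density itself, and with a = 1 the unit exponential baseline gives an exponential density with rate b, which is a quick sanity check on the formula.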
The rest of the paper is organized as follows. In Section 2, we define the beta exponentiated
Nadarajah-Haghighi (BENH) distribution by using the ENH model as baseline in the beta family.
We show the flexibility of its hrf and present some special models. In Section 3, we study some
of its structural properties including ordinary and conditional moments and mean deviations. We
derive a useful power series for its quantile function (qf) in Section 4. In Section 5, we obtain the
maximum likelihood estimates (MLEs) of the unknown parameters. A Monte Carlo simulation
study is performed in Section 6. We define the logarithm of the BENH distribution in Section
7. We propose a linear regression model based on the log-transformed variable in Section 8. In
Section 9, we provide an application from medical data to illustrate that the new regression model
can yield better fits than some other known lifetime regression models. Finally, Section 10 offers
some conclusions.
Henceforth, let $X \sim \mathrm{BENH}_{a,b}(\lambda, \theta, \alpha)$ be a random variable with cdf (2.1). The pdf of X is
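The displays for the BENH cdf and pdf are not shown at this point. As a hedged reconstruction, assuming they are obtained by inserting the ENH baseline (1.3) and (1.4) into the beta-G forms (1.5) and (1.6), they would read as follows; the equation numbers (2.1) and (2.2) are presumed to refer to these two expressions.

```latex
F(x; \varphi) = \frac{1}{B(a,b)} \int_0^{\left\{1 - e^{1-(1+\lambda x)^{\alpha}}\right\}^{\theta}}
    w^{a-1} (1-w)^{b-1}\, dw ,
\qquad
f(x; \varphi) = \frac{\alpha \lambda \theta}{B(a,b)}\, (1+\lambda x)^{\alpha-1}
    e^{1-(1+\lambda x)^{\alpha}}
    \left\{1 - e^{1-(1+\lambda x)^{\alpha}}\right\}^{\theta a - 1}
    \left[1 - \left\{1 - e^{1-(1+\lambda x)^{\alpha}}\right\}^{\theta}\right]^{b-1},
\qquad x > 0 .
```

Here φ collects the five parameters (α, λ, θ, a, b). This form agrees term by term with the series expansion of Section 2.4 and with the log-likelihood (5.1).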
2.1. Shapes of the density and hazard functions. The BENH density (2.2) is much more
flexible than the ENH density since it allows for greater flexibility of the tails. In fact, it can
approach different distributions when its parameters change. Figures 1 and 2 reveal that the
BENH density function can exhibit different behaviors depending on its parameter values. The
density function (2.2) provides more flexible forms than the ENH model and other extended forms
of the exponential and NH distributions. Further, Figures 3 and 4 display increasing, decreasing,
upside-down bathtub and bathtub shaped forms for the hrf of X.
Figure 1. Plots of the BENH density function. (a) a = 1.9, b = 3.9, λ = 0.5, θ =
2.3 and α = 2 (dotted line), α = 3 (dashed line), α = 4 (solid line), α = 5 (thick
line). (b) a = 2.9, b = 1.9, λ = 1.3, α = 1.3 and θ = 1 (dotted line), θ = 3 (dashed
line), θ = 5 (solid line), θ = 7 (thick line). (c) a = 9, b = 1.5, α = 3, θ = 1.5 and
λ = 1 (dotted line), λ = 1.3 (dashed line), λ = 1.5 (solid line), λ = 1.7 (thick
line).
2.2. Special models of the BENH distribution. The $\mathrm{BENH}_{a,b}(\lambda, \theta, \alpha)$ model contains as
special cases some well-known distributions. The $\mathrm{BENH}_{a,b}(\lambda, 1, \alpha)$ model is the beta NH (BNH)
distribution [2], $\mathrm{BENH}_{a,b}(\lambda, \theta, 1)$ is the beta generalized exponential (BGE) [13],
$\mathrm{BENH}_{a,b}(\lambda, 1, 1)$ is the beta exponential (BE) pioneered by [9], $\mathrm{BENH}_{1,1}(\lambda, \theta, \alpha)$ is the
ENH [8], $\mathrm{BENH}_{1,1}(\lambda, 1, \alpha)$ is identical to the NH model [10] and $\mathrm{BENH}_{1,1}(\lambda, \theta, 1)$ is the
generalized exponential (GE) [5].
2.3. Simulation. The BENH distribution is easily simulated from (2.1) as follows: if v is a
generated beta variate with shape parameters a and b, then
\[ x = \lambda^{-1} \left\{ \left[ 1 - \log\left( 1 - v^{1/\theta} \right) \right]^{1/\alpha} - 1 \right\} \tag{2.3} \]
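The inversion step can be sketched in Python with the standard library alone. The function name `rbenh` and its argument order are our own choices; the transformation applied to the beta variate v is λ^{-1}{[1 − log(1 − v^{1/θ})]^{1/α} − 1}, i.e. the inverse of the ENH cdf (1.3).

```python
import math
import random

def rbenh(n, a, b, lam, theta, alpha, rng=random):
    """Draw n BENH variates by transforming Beta(a, b) variates."""
    out = []
    while len(out) < n:
        v = rng.betavariate(a, b)       # beta variate with shapes a and b
        w = v ** (1.0 / theta)
        if w >= 1.0:                    # guard against log(0) after rounding
            continue
        inner = 1.0 - math.log1p(-w)    # 1 - log(1 - v^(1/theta))
        out.append((inner ** (1.0 / alpha) - 1.0) / lam)
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible. With a = b = θ = α = 1 the transformation reduces to −log(1 − v)/λ, so the sample should behave like an exponential sample with rate λ.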
Figure 3. Plots of the BENH hazard function. (f) a = 2.5, b = 1.5, α = 1.4, λ =
0.3, θ = 0.6 and (g) a = 2, b = 4, α = 1, λ = 1.1, θ = 0.5.
2.4. Linear representation. Equations (2.1) and (2.2) are straightforward to evaluate using
any software with algebraic facilities. In this section, we derive a linear representation for the
BENH density in terms of ENH densities.
THE BETA EXPONENTIATED NADARAJAH-HAGHIGHI DISTRIBUTION 5
where Γ(·) is the gamma function. By using (2.4), the pdf (2.2) can be rewritten as
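\[ f(x; \varphi) = \frac{\alpha \lambda \theta}{B(a,b)} \sum_{j=0}^{\infty} \frac{(-1)^j\, \Gamma(b)}{\Gamma(b-j)\, j!}\, e^{1-(1+\lambda x)^{\alpha}} (1+\lambda x)^{\alpha-1} \left\{ 1 - e^{1-(1+\lambda x)^{\alpha}} \right\}^{\theta(a+j)-1} \]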
and then
\[ f(x; \varphi) = \sum_{j=0}^{\infty} v_j\, f_{\mathrm{ENH}}(x; \alpha, \lambda, (a+j)\theta), \tag{2.5} \]
where $f_{\mathrm{ENH}}(x; \alpha, \lambda, (a+j)\theta)$ denotes the ENH density with parameters α, λ and (a + j)θ and the
coefficient $v_j$ (for j ≥ 0) is
\[ v_j = \frac{(-1)^j\, \Gamma(b)}{(a+j)\, j!\, \Gamma(b-j)\, B(a,b)}. \]
Equation (2.5) reveals that the BENH density is an infinite linear combination of ENH densities
with parameters α, λ and (a + j)θ (for j ≥ 0) (see [8]). It is the main result of this section.
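The linear representation can be checked numerically. The sketch below is an illustration under stated assumptions: the closed-form BENH density is taken to be the beta-G density (1.6) with the ENH baseline (1.3) and (1.4), the series is truncated at a finite number of terms, and all function names are hypothetical. The ratio (−1)^j Γ(b)/{Γ(b − j) j!} is updated recursively, which avoids evaluating the gamma function at its poles when b is an integer.

```python
import math

def beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def enh_pdf(x, alpha, lam, theta):
    """ENH density (1.4)."""
    t = 1.0 + lam * x
    e = math.exp(1.0 - t ** alpha)
    return alpha * lam * theta * t ** (alpha - 1.0) * e * (1.0 - e) ** (theta - 1.0)

def benh_pdf(x, a, b, lam, theta, alpha):
    """BENH density from (1.6) with the ENH baseline (assumed form)."""
    e = math.exp(1.0 - (1.0 + lam * x) ** alpha)
    G = (1.0 - e) ** theta              # ENH cdf (1.3)
    g = enh_pdf(x, alpha, lam, theta)
    return G ** (a - 1.0) * (1.0 - G) ** (b - 1.0) * g / beta_fn(a, b)

def benh_pdf_series(x, a, b, lam, theta, alpha, terms=200):
    """Truncated version of the linear representation (2.5)."""
    B = beta_fn(a, b)
    w = 1.0                             # w_j = (-1)^j Gamma(b) / (Gamma(b-j) j!)
    total = 0.0
    for j in range(terms):
        v_j = w / ((a + j) * B)
        total += v_j * enh_pdf(x, alpha, lam, (a + j) * theta)
        w *= -(b - 1.0 - j) / (j + 1.0)  # update w_j -> w_{j+1}
    return total
```

For moderate x the two evaluations agree to many digits; the truncated series converges more slowly in the right tail, where the ENH cdf approaches one. When b is a positive integer the weights vanish for j ≥ b and the sum is finite.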
3. Mathematical properties
We obtain some structural properties of the BENH distribution, specifically moments and mean
deviations. Established algebraic expansions to determine some structural properties of this
distribution can be more efficient than computing them directly by numerical integration of the
density function (2.2), which can be prone to rounding errors, among other problems. The formulae
derived throughout the paper can be easily handled in software such as Mathematica and Maple.
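Theorem 3.1. The rth ordinary moment of $X \sim \mathrm{BENH}_{a,b}(\lambda, \theta, \alpha)$ can be expressed as
\[ \mu'_r = E(X^r) = \frac{\theta}{\lambda^{r}} \sum_{j,k=0}^{\infty} \sum_{i=0}^{r} \frac{(-1)^{r+k-i}\, e^{k+1}\, v_j}{(k+1)^{i/\alpha+1}} \binom{(a+j)\theta-1}{k} \binom{r}{i}\, \Gamma\!\left( \frac{i}{\alpha}+1,\; k+1 \right). \tag{3.1} \]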
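The integration step behind (3.1) can be sketched as follows. We assume here that r is a non-negative integer and that Γ(s, z) = ∫_z^∞ u^{s−1} e^{−u} du denotes the upper incomplete gamma function (its definition is not shown in this part of the text). The substitution u = (k + 1)(1 + λx)^α and the binomial expansion of x^r give

```latex
\begin{aligned}
\int_0^{\infty} x^{r} (1+\lambda x)^{\alpha-1} e^{-(k+1)(1+\lambda x)^{\alpha}}\, dx
 &= \frac{1}{\alpha \lambda^{r+1} (k+1)} \int_{k+1}^{\infty}
    \left[ \left( \frac{u}{k+1} \right)^{1/\alpha} - 1 \right]^{r} e^{-u}\, du \\
 &= \frac{1}{\alpha \lambda^{r+1}} \sum_{i=0}^{r} \binom{r}{i}
    \frac{(-1)^{r-i}}{(k+1)^{i/\alpha+1}}\,
    \Gamma\!\left( \frac{i}{\alpha}+1,\; k+1 \right).
\end{aligned}
```

The lower limit k + 1 arises because x = 0 corresponds to u = k + 1, which explains the second argument of the incomplete gamma function in (3.1).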
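Remark 1. The conditional moments of X, say $E(X^s \mid X > t) = \int_t^{\infty} x^s f(x; \varphi)\, dx$, are
\[ \begin{aligned}
E(X^s \mid X > t) &= \alpha \lambda \theta \sum_{j,k=0}^{\infty} (-1)^k e^{k+1} v_j \binom{(a+j)\theta-1}{k} \int_t^{\infty} x^s (1+\lambda x)^{\alpha-1} e^{-(k+1)(1+\lambda x)^{\alpha}}\, dx \\
&= \frac{\theta}{\lambda^{s}} \sum_{j,k=0}^{\infty} \sum_{i=0}^{s} \frac{(-1)^{s+k-i}\, e^{k+1}\, v_j}{(k+1)^{i/\alpha+1}} \binom{(a+j)\theta-1}{k} \binom{s}{i}\, \Gamma\!\left( \frac{i}{\alpha}+1,\; (k+1)(1+\lambda t)^{\alpha} \right).
\end{aligned} \]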
Generally, there has been great interest in the first incomplete moment of a distribution.
Based on this quantity, we can obtain, for example, the mean deviations, which provide important
information about characteristics of a population. Indeed, the amount of dispersion in a population
may be measured to some extent by all the deviations from the mean and median. The mean
deviations of X about the mean $\mu'_1 = E(X)$ and about the median M can be expressed as
$\delta_1 = 2\mu'_1 F(\mu'_1) - 2 m_1(\mu'_1)$ and $\delta_2 = \mu'_1 - 2 m_1(M)$, respectively, where $F(\mu'_1)$ is easily
evaluated from (2.1), the median M follows from (2.1) as the solution of $F(M; \varphi) = 0.5$ and
$m_1(z) = \int_0^z x f(x)\, dx$.
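We can write
\[ \begin{aligned}
m_1(z) &= \alpha \lambda \theta \sum_{j,k=0}^{\infty} (-1)^k e^{k+1} v_j \binom{(a+j)\theta-1}{k} \int_0^{z} x (1+\lambda x)^{\alpha-1} e^{-(k+1)(1+\lambda x)^{\alpha}}\, dx \\
&= \frac{\theta}{\lambda} \sum_{j,k=0}^{\infty} \sum_{i=0}^{1} (-1)^{1+k-i}\, e^{k+1}\, v_j \binom{(a+j)\theta-1}{k} \binom{1}{i}\, \Gamma\!\left( \frac{i}{\alpha}+1,\; k+1 \right) \left\{ 1 - (1+\lambda z)^{\alpha} \right\}.
\end{aligned} \]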
4. Quantile function
In this section, we derive a power series for the qf of the BENH distribution by inverting (2.1)
as x = Q(u) = F −1 (u). First, we can expand (2.3) using Mathematica as
\[ x = Q(u) = \sum_{i=1}^{\infty} s_i\, v^{i/\theta}, \tag{4.1} \]
1http://functions.wolfram.com/06.23.06.0004.01
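By expanding $\left( \sum_{j=1}^{\infty} t_j u^{j/a} \right)^{i/\theta}$ in Taylor series, we can rewrite (4.1) as
\[ \left( \sum_{j=1}^{\infty} t_j u^{j/a} \right)^{i/\theta} = \sum_{k=0}^{\infty} f_k(i\, \theta^{-1}) \left( \sum_{j=1}^{\infty} t_j u^{j/a} \right)^{k}, \tag{4.4} \]
where $f_k(p) = \sum_{m=k}^{\infty} (-1)^{m-k} \binom{m}{k} (p)_m / m!$ (for k ≥ 0) and $(p)_m = p(p-1) \cdots (p-m+1)$ is the
descending factorial.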
We consider a power series raised to a positive integer k (Gradshteyn and Ryzhik [4])
\[ \left( \sum_{j=0}^{\infty} a_j z^j \right)^{k} = \sum_{j=0}^{\infty} c_{k,j}\, z^j, \]
where the coefficients $c_{k,j}$ (for k, j = 1, 2, . . .) are determined from the recurrence equation
\[ c_{k,j} = (j\, a_0)^{-1} \sum_{m=1}^{j} [m(k+1) - j]\, a_m\, c_{k,j-m} \]
and, for k ≥ 0, $c_{k,0} = a_0^{k}$. The coefficient $c_{k,j}$ can be obtained from $c_{k,0}, \ldots, c_{k,j-1}$ and hence from
the quantities $a_0, \ldots, a_j$.
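The recurrence for the coefficients of a power series raised to an integer power is easy to implement. The sketch below assumes the Gradshteyn and Ryzhik form, with prefactor (j a_0)^{-1} and the inner sum running over m = 1, ..., j; the function name `power_series_coeffs` is hypothetical, and the input series is taken to be a finite list of coefficients with a non-zero leading term.

```python
def power_series_coeffs(a, k, n_terms):
    """Return c_{k,0}, ..., c_{k,n_terms-1} for (sum_j a[j] z^j)^k.

    `a` lists a_0, a_1, ...; coefficients beyond the end of the list
    are treated as zero. Requires a[0] != 0 and integer k >= 0.
    """
    if a[0] == 0:
        raise ValueError("the recurrence needs a_0 != 0")
    c = [float(a[0]) ** k]               # c_{k,0} = a_0^k
    for j in range(1, n_terms):
        s = 0.0
        for m in range(1, j + 1):
            a_m = a[m] if m < len(a) else 0.0
            s += (m * (k + 1) - j) * a_m * c[j - m]
        c.append(s / (j * a[0]))
    return c
```

For example, `power_series_coeffs([1, 2, 3], 2, 6)` reproduces the expansion (1 + 2z + 3z^2)^2 = 1 + 4z + 10z^2 + 12z^3 + 9z^4, with a zero coefficient for z^5.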
Based on the last two equations, we can write
\[ \left( u^{1/a} \sum_{j=0}^{\infty} p_j u^{j/a} \right)^{k} = u^{k/a} \sum_{j=0}^{\infty} d_{k,j}\, u^{j/a}, \]
where $p_j = t_{j+1}$ for j ≥ 0 and, for k ≥ 0, $d_{k,0} = p_0^{k}$ and, for j ≥ 1,
$d_{k,j} = (j\, p_0)^{-1} \sum_{m=1}^{j} [m(k+1) - j]\, p_m\, d_{k,j-m}$.
Inserting the last equation in (4.4) gives
\[ \left( \sum_{j=1}^{\infty} t_j u^{j/a} \right)^{i/\theta} = \sum_{k,j=0}^{\infty} f_k(i\, \theta^{-1})\, d_{k,j}\, u^{(j+k)/a}. \]
Equations (4.5) and (4.6) are the main results of this section since we can find numerically from
them various BENH structural quantities. They can be determined by using the integral on the
right-hand side for special W (·) functions, which can be simpler than if they are evaluated from
the left-hand integral.
5. Estimation
Several approaches for parameter estimation have been proposed in the literature, but maximum
likelihood is the most commonly employed method. The MLEs enjoy desirable properties and
can be used when constructing confidence intervals for the parameters. Given the observed values
x1 , . . . , xn , the MLEs of the parameters of the BENH distribution are determined by maximizing
the log-likelihood function
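\[ \begin{aligned}
\ell(\varphi) = \ell(x_i, \alpha, \lambda, \theta, a, b) &= n \left[ \log \alpha + \log \theta + \log \lambda - \log\{B(a,b)\} \right] + (\alpha-1) \sum_{i=1}^{n} \log(1+\lambda x_i) \\
&\quad + \sum_{i=1}^{n} \left\{ 1 - (1+\lambda x_i)^{\alpha} \right\} + (\theta a - 1) \sum_{i=1}^{n} \log\left\{ 1 - e^{1-(1+\lambda x_i)^{\alpha}} \right\} \\
&\quad + (b-1) \sum_{i=1}^{n} \log\left[ 1 - \left\{ 1 - e^{1-(1+\lambda x_i)^{\alpha}} \right\}^{\theta} \right].
\end{aligned} \tag{5.1} \]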
This log-likelihood can be maximized numerically by using R (optim function), SAS (PROC
NLMIXED), Ox (MaxBFGS sub-routine) or the NMaximize command in Mathematica, among
others.
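For readers who prefer to evaluate the objective directly, the log-likelihood (5.1) can be coded term by term. The following Python sketch is only an illustration; the function name `benh_loglik` and its argument order are our own, and all observations are assumed to be strictly positive.

```python
import math

def benh_loglik(x, alpha, lam, theta, a, b):
    """Log-likelihood (5.1) for a sample x of positive lifetimes."""
    n = len(x)
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    ll = n * (math.log(alpha) + math.log(theta) + math.log(lam) - log_B)
    for xi in x:
        t = (1.0 + lam * xi) ** alpha
        e = math.exp(1.0 - t)                      # exp{1 - (1 + lam*xi)^alpha}
        ll += (alpha - 1.0) * math.log(1.0 + lam * xi)
        ll += 1.0 - t
        ll += (theta * a - 1.0) * math.log(1.0 - e)
        ll += (b - 1.0) * math.log(1.0 - (1.0 - e) ** theta)
    return ll
```

The negative of this function is what a general-purpose numerical optimizer would minimize. For α = θ = a = b = 1 the expression reduces to the exponential log-likelihood n log λ − λ Σ x_i, which is a convenient check.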
6. A simulation study
We perform a Monte Carlo simulation study to assess the finite sample behavior of the MLEs
of λ, θ, α, a and b. The results are obtained from 2,000 Monte Carlo simulations carried out
using the R statistical software. In each replication, a random sample of size n is drawn from the
BEN Ha,b (λ, θ, α) distribution and the parameters are estimated by maximum likelihood. The
random variable X is generated using the inversion method. We consider two setups with the
following values for the parameters: λ = 1.5, θ = 10.0, α = 2.0, a = 4.0 and b = 2.0 (setup 1)
and λ = 3.0, θ = 5.0, α = 1.5, a = 2.5 and b = 3.0 (setup 2). The mean estimates of the five
parameters and the corresponding root mean squared errors (RMSEs) for the sample sizes n = 50,
100 and 200 are given in Tables 1 and 2, respectively. In both setups, we note that the biases
and RMSEs of the MLEs of the parameters decay toward zero when the sample size increases in
agreement with first-order asymptotic theory. There is a small sample bias in the estimation of
these parameters. Future research should be conducted to obtain bias corrections for these
estimators.
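The summary measures reported for each parameter can be computed from the replicated estimates as in the short Python sketch below. The function name `mc_summary` is hypothetical, and the sketch covers only the summary step, not the maximization carried out in each replication.

```python
import math

def mc_summary(estimates, true_value):
    """Mean estimate, bias and RMSE over Monte Carlo replications."""
    r = len(estimates)
    mean_est = sum(estimates) / r
    bias = mean_est - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / r)
    return mean_est, bias, rmse
```

Here the RMSE is measured around the true parameter value, so it combines the variance of the estimator with its squared bias.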
Table 1. Monte Carlo simulation results for Setup 1: Mean estimates and RMSEs
of λ, θ, α, a and b.
Table 2. Monte Carlo simulation results for Setup 2: Mean estimates and RMSEs
of λ, θ, α, a and b.
\[ y_i = \mathbf{v}_i^{T} \boldsymbol{\beta} + z_i, \quad i = 1, \ldots, n, \tag{8.1} \]
where the random error $z_i$ has density function (7.3) with unknown parameters α > 0, θ > 0,
a > 0 and b > 0. The parameter $\mu_i = \mathbf{v}_i^{T} \boldsymbol{\beta} \in \mathbb{R}$ is the location parameter of $y_i$. The vector
$\boldsymbol{\mu} = (\mu_1, \ldots, \mu_n)^{T}$ of location parameters is represented by a linear model $\boldsymbol{\mu} = \mathbf{V} \boldsymbol{\beta}$, where
$\mathbf{V} = (\mathbf{v}_1, \ldots, \mathbf{v}_n)^{T}$ is a known model matrix. The LBENH model (8.1) opens new possibilities for fitting
many different types of data.
Consider a sample $(y_1, \mathbf{v}_1), \ldots, (y_n, \mathbf{v}_n)$ of n independent observations, where the random
response variable is defined by $y_i = \min\{\log(x_i), \log(c_i)\}$. We assume non-informative censoring
such that the observed lifetimes and censoring times are independent. Let F and C be the sets
of individuals for which $y_i$ is the log-lifetime and log-censoring, respectively. The log-likelihood
function for the vector of parameters $\boldsymbol{\eta} = (\boldsymbol{\beta}^{T}, \theta, \alpha, a, b)^{T}$ from model (8.1) is
$l(\boldsymbol{\eta}) = \sum_{i \in F} \log[f(y_i)] + \sum_{i \in C} \log[S(y_i)]$, where $f(y_i)$ is the density function (7.1) and $S(y_i)$ is the
survival function (7.2) of
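\[ \cdots + (b-1) \sum_{i \in F} \log\left[ 1 - \left\{ 1 - e^{1-\left(1+e^{y_i - \mathbf{v}_i^{T} \boldsymbol{\beta}}\right)^{\alpha}} \right\}^{\theta} \right] + \sum_{i \in C} \log\left[ 1 - \frac{B\left( \left\{ 1 - e^{1-\left(1+e^{y_i - \mathbf{v}_i^{T} \boldsymbol{\beta}}\right)^{\alpha}} \right\}^{\theta};\, a, b \right)}{B(a,b)} \right], \tag{8.2} \]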
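The structure of a log-likelihood under right censoring, with uncensored observations contributing log f and censored ones contributing log S, can be sketched generically in Python. The function name `censored_loglik` is hypothetical. Since the LBENH density and survival function are not reproduced here, the usage example below plugs in the standard extreme value (log-Weibull) error distribution purely as a stand-in.

```python
import math

def censored_loglik(y, censored, log_f, log_S):
    """Sum of log f(y_i) over failures and log S(y_i) over censored cases.

    `censored[i]` is True when y[i] is a log-censoring time.
    """
    total = 0.0
    for yi, ci in zip(y, censored):
        total += log_S(yi) if ci else log_f(yi)
    return total

# Stand-in error distribution (an assumption for illustration only):
# standard extreme value, f(y) = exp(y - e^y), S(y) = exp(-e^y).
def log_f_ev(y):
    return y - math.exp(y)

def log_S_ev(y):
    return -math.exp(y)
```

In a regression setting the functions `log_f` and `log_S` would be evaluated at the residuals y_i − v_i^T β, and the resulting sum would be maximized over β and the remaining parameters.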
Table 3. MLEs of the parameters from the LBENH regression model fitted to
the diabetic retinopathy data, their SEs (given in parentheses), p-values in [·] and
the AIC values.

Model         θ          β0         β1         α          a          b          AIC
LBENH         10.7832    -3.6222    0.6399     0.4536     2.5057     0.0219     367.9
  SEs         (0.1247)   (1.1698)   (0.3984)   (0.1887)   (2.5872)   (0.0327)
  p-values: [0.4994] [0.0274]

Model         σ          β0         β1                                          AIC
LGHN          0.9742     4.4570     0.5947                                      372.5
  SEs         (0.1261)   (0.5014)   (0.3622)
  p-values: [< 0.001] [< 0.001]
Log-Weibull   1.2644     4.3683     0.6383                                      370.7
  SEs         (0.1584)   (0.5238)   (0.3715)
  p-values: [< 0.001] [< 0.001]
the log-generalized half-normal (LGHN) (Pescim et al., 2013) and log-Weibull regression models.
Also, the explanatory variable x1 is marginally significant for the model at the significance level of
5%.
In order to assess if the model is appropriate, the empirical survival function and the estimated
survival function (8.3) from the fitted LBENH regression model are plotted in Figure 6. In fact,
the LBENH regression model provides a good fit for these data.
Figure 6 (plot not reproduced). Vertical axis: S(x|y). Legend: Kaplan-Meier; LBENH regression
model (X = 0); LBENH regression model (X = 1).
10. Conclusions
We introduce a five-parameter lifetime model called the beta exponentiated Nadarajah-Haghighi
(BENH) distribution, which extends some well-known distributions. The proposed distribution
is useful to model lifetime data with increasing, decreasing, upside-down bathtub and bathtub
shaped hazard functions. We provide some closed-form expressions for the ordinary, incomplete
and conditional moments, mean deviations and quantile function. We prove empirically that the
regression model based on the log-transform of the BENH distribution can be superior to some
models generated from other families in terms of goodness-of-fit by means of a medical application.
Acknowledgement
The research of Abdus Saboor has been supported in part by the Higher Education Commission
of Pakistan under NRPU project No. 3104. The research of Gauss M. Cordeiro has been supported
by CNPq (Brazil).
References
[1] Cordeiro, G.M., Gomes, A.E., da-Silva, C.Q. and Ortega, E.M.M. The beta exponentiated
Weibull distribution. Journal of Statistical Computation and Simulation, (2013), 83, 114–138.
[2] Dias, C.R., Alizadeh, M. and Cordeiro, G.M. The beta Nadarajah-Haghighi distribution.
Hacettepe University Bulletin of Natural Sciences and Engineering Series B: Mathematics and
Statistics, (2018), 47, 1302–1320.
[3] Eugene, N., Lee, C. and Famoye, F. Beta-normal distribution and its applications.
Communications in Statistics: Theory and Methods, (2002), 31, 497–512.
[4] Gradshteyn, I.S. and Ryzhik, I.M. Table of Integrals, Series, and Products. Academic
Press, San Diego, (2007).
[5] Gupta, R.D. and Kundu, D. Discriminating between the Weibull and the GE distributions.
Computational Statistics and Data Analysis, (2003), 43, 179–196.
[6] Huster, W.J., Brookmeyer, R. and Self, S.G. Modelling paired survival data with covariates.
Biometrics, (1989), 45, 145–156.
[7] Kong, L., Lee, C. and Sepanski, J.H. On the properties of beta-gamma distribution. Journal
of Modern Applied Statistical Methods, (2007), 6, 187–211.
[8] Lemonte, A.J. A new exponential-type distribution with constant, decreasing, increasing,
upside-down bathtub and bathtub-shaped failure rate function. Computational Statistics and
Data Analysis, (2013), 62, 149–170.
[9] Nadarajah, S. and Kotz, S. The beta exponential distribution. Reliability Engineering and
System Safety, (2006), 91, 689–697.
[10] Nadarajah, S. and Haghighi, F. An extension of the exponential distribution. Statistics, (2011),
45, 543–558.
[11] Pescim, R.R., Demétrio, C.G.B., Cordeiro, G.M., Ortega, E.M.M. and Urbano, M.R. The
beta generalized half-normal distribution. Computational Statistics and Data Analysis, (2010),
54, 945–957.
[12] Singla, N., Jain, K. and Sharma, S.K. The beta generalized Weibull distribution: Properties
and applications. Reliability Engineering and System Safety, (2012), 102, 5–15.
[13] Barreto-Souza, W., Santos, A.H.S. and Cordeiro, G.M. The beta generalized exponential
distribution. Journal of Statistical Computation and Simulation, (2010), 80, 159–172.
* Department of Mathematics
Kohat University of Science & Technology
Kohat, 26000
Pakistan
E-mail address: zaybasdf@gmail.com (M.N. Khan)
** Departamento de Estatística
Universidade Federal de Pernambuco
Recife, 50740-540
Brazil
14 A. SABOOR, M. N. KHAN, G. M. CORDEIRO, I. ELBATAL, AND R. R. PESCIM