of Finance and Statistics, East China Normal University, Shanghai 200241, P.R. China
2 Department of Statistics, University of British Columbia, Vancouver, BC, Canada V6T 1Z4
Key words and phrases: Bartlett correction; bootstrap; coverage probability; empirical likelihood; empty
set problem; estimating equation.
MSC 2010: Primary 62G15
Abstract: Empirical-likelihood-based inference for parameters defined by the general estimating equations of
Qin & Lawless (1994) remains an active research topic. When the sample size is small and/or the dimension
of the accompanying estimating equations is high, the resulting confidence regions often have a lower
than nominal coverage probability. In addition, the empirical likelihood can be hindered by an empty set
problem. The adjusted empirical likelihood (AEL) tackles both problems simultaneously. However, the AEL
confidence region with high-order precision relies on accurate estimation of the required level of adjustment.
This has proved difficult, particularly in over-identified cases. In this article, we show that the general AEL
is Bartlett-correctable and propose a two-stage procedure for constructing accurate confidence regions. A
naive AEL is first employed to address the empty set problem, and it is then Bartlett-corrected through a
resampling procedure. The finite-sample performance of the proposed method is illustrated by simulations
and an example. The Canadian Journal of Statistics 43: 42–59; 2015 © 2014 Statistical Society of Canada
Résumé: Parametric inference based on the empirical likelihood, as defined by the estimating equations
of Qin and Lawless (1994), remains an active research topic. When the sample size is small or the
dimension of the estimating equations is high, the resulting confidence regions often have a coverage
rate below their nominal value. In addition, the empty set problem can cause difficulties. The adjusted
empirical likelihood resolves both problems simultaneously. However, the confidence regions arising from
these higher-order results require an accurate estimate of the required level of adjustment, which proves
difficult, especially in over-identified cases. In this article, the authors show that the Bartlett correction
can be applied to the adjusted empirical likelihood and propose a two-stage procedure for constructing
accurate confidence regions. They first use a naive version of the adjusted empirical likelihood to resolve
the empty set problem, and then apply the Bartlett correction by means of a resampling method. They
illustrate the finite-sample performance of their method through simulations and an example. La
revue canadienne de statistique 43: 42–59; 2015 © 2014 Société statistique du Canada
1. INTRODUCTION
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) $d$-dimensional random
vectors from a distribution $F$. The problem of interest is inference on the $p$-dimensional parameter
vector $\theta = \theta(F)$ defined to be the unique solution to a $q$-dimensional estimating equation,
$$E_F\, g(X; \theta) = 0, \tag{1}$$
where the expectation is taken under distribution $F$. The parameters are said to be just-identified
if $q = p$ and over-identified if $q > p$. The choice of the vector estimating function $g(X; \theta)$ is
flexible and accommodates a wide range of scenarios. Examples can be found in Hansen (1982),
Liang & Zeger (1986), Kitamura & Stutzer (1997), Imbens (2002) and Bravo (2004).
Empirical likelihood (EL; Owen, 1988; Qin & Lawless, 1994) is a broadly applicable platform
for constructing confidence regions for the parameters defined by (1). Unlike the confidence
regions constructed via normal approximation, the EL confidence regions are transformation
invariant, are range respecting, have a data-driven shape, and are free of the burden of estimating
scaling parameters (Owen, 1990). However, when the sample size is small and/or $q$ is large,
the coverage probabilities of the EL confidence regions are often lower than the nominal value
(under-coverage problem; DiCiccio, Hall, & Romano, 1991; Owen, 2001; Liu & Chen, 2010). In
addition, the EL may not be properly defined because of the so-called empty set problem (Tsao,
2004; Chen, Variyath, & Abraham, 2008; Grendar & Judge, 2009; Tsao & Wu, 2013).
A number of approaches have been proposed to improve the accuracy of the EL confidence
regions and to address the empty set problem. Those for the former problem include bootstrap
calibration (Owen, 1988; Hall & Horowitz, 1996) and Bartlett correction (DiCiccio, Hall, &
Romano, 1991; Chen & Cui, 2007). Those for the latter problem include the AEL of Chen,
Variyath, & Abraham (2008) and Emerson & Owen (2009). When the level of adjustment is
properly chosen (Liu & Chen, 2010; Li et al., 2011; Chen & Liu, 2012), the AEL also improves
the accuracy of the EL confidence regions.
Applying Bartlett correction to EL or tuning AEL requires the accurate estimation of a Bartlett
correction factor $b$. This task is challenging in many situations. For instance, when the parameters
are over-identified, the Bartlett correction factor has a lengthy analytical expression involving
high-order moments. In theory, replacing $b$ by a $\sqrt{n}$-consistent estimator retains the high-order
precision of the Bartlett correction. In practice, however, the finite-sample accuracy often falls short of this.
Estimating $b$ by some straightforward resampling procedures (Chen & Cui, 2007) helps, but they
often lack stability and suffer from the empty set problem. For the asset pricing example given
later, estimating $b$ with satisfactory precision and stability was found difficult by both Liu & Chen
(2010, p. 1356) and Matsushita & Otsu (2013, p. 342).
In this article, we propose a two-stage procedure to construct accurate confidence regions.
The first stage completely addresses the empty set problem by employing an AEL with its level of
adjustment tuned as if the estimating functions are observations from a normal distribution. Thus,
the proposed method does not rely on an accurate estimate of the required level of adjustment $b$
and substantially enhances the applicability of the AEL method. In the second stage, a bootstrap
procedure is applied to this AEL to estimate its Bartlett correction factor and further calibrate the
AEL confidence region. Simulation results indicate that the confidence regions constructed by the
proposed method have coverage probabilities that are comparable to, or substantially more accurate than,
those of the original EL and other competitors.
The rest of this article is organized as follows. In Section 2, we review the empirical likelihood
method and its extensions. The Bartlett correction of the general AEL confidence regions and the
proposed method are given in Section 3. Simulation results are given in Section 4. A real-data
example is given in Section 5, and some conclusions are in Section 6. The technical details are
given in the Appendix.
2. METHODOLOGY REVIEW
In this section, we define the empirical likelihood, give concrete descriptions of the empty set
problem, and discuss the accuracy of the confidence regions. Some existing approaches to these
two problems are introduced and discussed.
DOI: 10.1002/cjs
Given observations $x_1, \ldots, x_n$, the EL function of $\theta$ is defined as
$$L_n(\theta) = \sup\Big\{\prod_{i=1}^{n} p_i : p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\, g(x_i; \theta) = 0\Big\}, \tag{2}$$
and the EL ratio function of $\theta$ is defined to be $R_n(\theta) = -2\log\{n^n L_n(\theta)\}$. If the convex hull of
$\{g(x_i; \theta)\}_{i=1}^{n}$ does not contain the $q$-dimensional vector 0, the set for $p_i$ in (2) is empty. In this case, we
either declare that $L_n(\theta)$ is not well defined at $\theta$ or set $L_n(\theta) = 0$. If $L_n(\theta) = 0$ for all $\theta$, we have
the empty set problem discussed by Grendar & Judge (2009) and many others. Even if the set
$\{\theta : L_n(\theta) \ne 0\}$ is not empty, we may have difficulty locating this set. Asymptotically, the empty set
problem occurs with probability tending to 0 as $n \to \infty$ if the model is correctly specified. Yet
this result does not solve all the problems arising in applications.
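The definition in (2) can be made concrete in the simplest setting. The sketch below is a minimal illustration, not the authors' code: it assumes the scalar-mean estimating function $g(x; \theta) = x - \theta$ and uses the standard Lagrange-multiplier form of the EL weights; the function name `el_ratio_mean` is hypothetical. It returns infinity when 0 is not strictly inside the range of the $g(x_i; \theta)$, which is the one-dimensional version of the convex hull condition.

```python
import math

def el_ratio_mean(xs, theta):
    """EL ratio R_n(theta) = -2 log{n^n L_n(theta)} for g(x; theta) = x - theta.

    Illustrative sketch only. Returns math.inf when L_n(theta) = 0, i.e. when
    0 is not strictly inside the range of the g_i (constraint set empty or
    forcing some p_i = 0).
    """
    g = [x - theta for x in xs]
    lo_g, hi_g = min(g), max(g)
    if lo_g == 0.0 and hi_g == 0.0:
        return 0.0          # degenerate data: uniform weights are optimal
    if not (lo_g < 0.0 < hi_g):
        return math.inf
    # Optimal weights are p_i = 1 / {n (1 + lam * g_i)}, where lam solves
    # sum_i g_i / (1 + lam * g_i) = 0. The sum is strictly decreasing in lam
    # on the open interval (-1/hi_g, -1/lo_g), so bisection locates the root.
    a, b = -1.0 / hi_g, -1.0 / lo_g
    lam = 0.0
    for _ in range(200):
        mid = 0.5 * (a + b)
        if mid <= a or mid >= b:
            break
        lam = mid
        if sum(gi / (1.0 + lam * gi) for gi in g) > 0.0:
            a = mid
        else:
            b = mid
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)
```

For example, with `xs = [1.0, 2.0, 3.0, 4.0]` the ratio is essentially zero at the sample mean 2.5 and infinite at `theta = 5.0`, where every $g(x_i; \theta)$ is negative.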
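Let $W_n(\theta) = R_n(\theta) - \inf_{\theta} R_n(\theta)$. When $q \ge p$ and under some general conditions (Chen &
Cui, 2007), it is well known that
$$\Pr\{W_n(\theta_0) \le x\} = \Pr\{\chi^2_p \le x\} + O(n^{-1}) \quad \text{as } n \to \infty,$$
where $\theta_0$ is the unique solution to (1) and $\chi^2_p$ is the $\chi^2$-distributed random variable with $p$ degrees
of freedom. An approximate $(1 - \alpha)$ EL confidence region of $\theta$ is
$$I_{EL}(\alpha) = \{\theta : W_n(\theta) \le \chi^2_p(1 - \alpha)\}, \tag{3}$$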
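$$b = \frac{1}{p}\Big\{\frac{1}{2}\sum_{r,s}\frac{\alpha^{rrss}}{\alpha^{rr}\alpha^{ss}} - \frac{1}{3}\sum_{r,s,t}\frac{\alpha^{rst}\alpha^{rst}}{\alpha^{rr}\alpha^{ss}\alpha^{tt}}\Big\}. \tag{4}$$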
$$L_n(\theta; a) = \sup\Big\{\prod_{i=1}^{n+1} p_i : p_i \ge 0,\ \sum_{i=1}^{n+1} p_i = 1,\ \sum_{i=1}^{n+1} p_i\, g_i(\theta) = 0\Big\} \tag{5}$$
and the corresponding AEL ratio function of $\theta$ is defined as $R_n(\theta; a) = -2\log\{(n+1)^{n+1} L_n(\theta; a)\}$.
That is, the adjusted empirical likelihood is the usual empirical likelihood formed
on the augmented data set $\{g_1(\theta), \ldots, g_n(\theta), g_{n+1}(\theta)\}$ obtained by adding the pseudo-observation
$g_{n+1}(\theta)$. Note that $p_i = a/\{n(1 + a)\}$ for $i = 1, \ldots, n$ and $p_{n+1} = 1/(1 + a)$ satisfy the restrictions in (5),
which implies that $L_n(\theta; a)$ is well defined at all $\theta$. Let
$$W_n(\theta; a) = R_n(\theta; a) - \inf_{\theta} R_n(\theta; a). \tag{6}$$
Chen, Variyath, & Abraham (2008) proved that inference based on $W_n(\theta; a)$ shares all first-order
asymptotic properties with $W_n(\theta)$. Furthermore, when $a = b/2$ with $b$ being the Bartlett correction
factor, Liu & Chen (2010) showed that
$$\Pr\{W_n(\theta_0; b/2) \le x\} = \Pr\{\chi^2_p \le x\} + O(n^{-2}) \quad \text{as } n \to \infty.$$
(7)
(8)
so that the level of adjustment decreases as $\bar g_n(\theta)$ moves away from 0. There is no theory behind the
choice of the constant 0.1 in $K(\theta)$, but it worked well in simulation studies. When $\bar g_n(\theta)$ deviates
from 0 by one unit measured by $S_n^{-1}$, this choice recommends a 10% reduction from the initial
level of adjustment $a$. Because $K(\theta_0) = 1 + O_p(n^{-1})$, the high-order asymptotic conclusions for
AEL are not altered at the true parameter value $\theta_0$ when the model is correct. In particular, when
$a = b/2$, this AEL retains the high-order precision. The two-stage procedure that we propose
relies on a resampling procedure to improve the accuracy of the AEL used at the first stage. Thus,
both the size of $a$ and the tuning constant 0.1 are of secondary importance. After this modification,
the AEL ratio function becomes $R_n(\theta; aK(\theta))$.
3. BARTLETT-CORRECTING THE ADJUSTED EMPIRICAL LIKELIHOOD
In theory, the AEL confidence region with second-order precision completely solves the under-coverage and empty set problems when the level of adjustment is correctly specified. However,
when $q > p$, accurately estimating $b$ is a serious challenge. This prevents the AEL confidence
region from achieving its full potential.
To overcome this difficulty, we propose a two-stage procedure. Given any level of adjustment
according to (8), we find that the corresponding AEL is Bartlett-correctable. Based on this result,
we recommend an adjustment level of $a = \beta_p/2$, where $\beta_p$ is the Bartlett correction factor when
the equations are from the $p$-dimensional standard normal distribution. A bootstrap procedure is
then introduced to estimate the Bartlett correction factor of this AEL. The first stage allows the
AEL to avoid the empty set problem. We choose $\beta_p$ as if all the equations were approximately normal.
Since this is generally not the case, the second stage combines the Bartlett correction and the resampling
method to improve the precision.
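As a worked check of the first-stage adjustment level, the following sketch evaluates the just-identified Bartlett factor for a standardized $p$-variate normal vector $Y$. It assumes the moment form of $b$ quoted in the Appendix, $b = p^{-1}\{\frac{1}{2}\sum_{r,s}\alpha^{rrss} - \frac{1}{3}\sum_{r,s,t}\alpha^{rst}\alpha^{rst}\}$, and reads $\alpha^{r \cdots t}$ as the joint moment $E(Y_r \cdots Y_t)$; that reading, and the label $\beta_p$ for the resulting value, are assumptions made for this illustration.

```latex
\begin{align*}
\alpha^{rst} &= E(Y_r Y_s Y_t) = 0 \quad \text{for all } r, s, t
  \quad \text{(odd moments of a centred normal vanish)},\\
\sum_{r,s} \alpha^{rrss}
  &= \sum_{r} E(Y_r^4) + \sum_{r \ne s} E(Y_r^2)\,E(Y_s^2)
   = 3p + p(p-1) = p(p+2),\\
\beta_p &= \frac{1}{p}\Big\{\frac{1}{2}\,p(p+2) - 0\Big\} = \frac{p}{2} + 1 .
\end{align*}
```

Under these assumptions the first-stage adjustment would be $a = \beta_p/2 = (p+2)/4$, for example $a = 0.75$ when $p = 1$ and $a = 1$ when $p = 2$.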
The Canadian Journal of Statistics / La revue canadienne de statistique
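Theorem 2.
where $R^*_n(\theta; (\beta_p/2)K(\theta))$ is the AEL ratio function based on the bootstrap sample.
Step 2. Repeat Step 1 $B$ times and obtain $W^*_{n1}(\hat\theta_n; (\beta_p/2)K(\hat\theta_n)), \ldots, W^*_{nB}(\hat\theta_n; (\beta_p/2)K(\hat\theta_n))$.
Step 3. Estimate $b_p$ by
$$\hat b_{Bp} = n\Big\{\sum_{j=1}^{B} W^*_{nj}(\hat\theta_n; (\beta_p/2)K(\hat\theta_n))/(Bp) - 1\Big\}. \tag{10}$$
$$\{\theta : W_n(\theta; (\beta_p/2)K(\theta))/(1 + \hat b_{Bp}/n) \le \chi^2_p(1 - \alpha)\}. \tag{11}$$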
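The resampling steps and the rescaled region in (10) and (11) can be sketched for the simplest case. This is a hedged illustration, not the authors' procedure in full: it assumes a scalar mean ($p = 1$, $g(x; \theta) = x - \theta$), takes the pseudo-observation as $-a\,\bar g_n(\theta)$, and sets $K(\theta) \equiv 1$ because the form of $K$ is not available in this section. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

def _el_stat(g):
    """2 * sum log(1 + lam * g_i) at the EL Lagrange multiplier lam."""
    lo_g, hi_g = min(g), max(g)
    if lo_g == 0.0 and hi_g == 0.0:
        return 0.0
    if not (lo_g < 0.0 < hi_g):
        return math.inf
    a, b = -1.0 / hi_g, -1.0 / lo_g
    lam = 0.0
    for _ in range(200):          # bisection on a decreasing score function
        mid = 0.5 * (a + b)
        if mid <= a or mid >= b:
            break
        lam = mid
        if sum(gi / (1.0 + lam * gi) for gi in g) > 0.0:
            a = mid
        else:
            b = mid
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)

def ael_ratio_mean(xs, theta, a):
    """AEL ratio for the scalar mean with K(theta) = 1 (assumption)."""
    g = [x - theta for x in xs]
    g.append(-a * sum(g) / len(g))   # assumed pseudo-observation
    return _el_stat(g)

def bootstrap_bartlett_factor(xs, a, B=200, seed=0):
    """Estimate the Bartlett factor as in (10), with p = 1."""
    rng = random.Random(seed)
    n, p = len(xs), 1
    theta_hat = mean(xs)
    w_star = []
    for _ in range(B):
        resample = [rng.choice(xs) for _ in range(n)]
        # For the mean, the AEL ratio of the resample is 0 at the resample
        # mean, so W* at theta_hat equals the ratio evaluated at theta_hat.
        w_star.append(ael_ratio_mean(resample, theta_hat, a))
    return n * (mean(w_star) / p - 1.0)

def in_bc_ael_region(xs, theta, a, b_hat, level=0.95):
    """Membership test for the rescaled region in (11), with p = 1."""
    cutoff = NormalDist().inv_cdf(0.5 + level / 2.0) ** 2  # chi^2_1 quantile
    return ael_ratio_mean(xs, theta, a) / (1.0 + b_hat / len(xs)) <= cutoff
```

A typical call would be `b_hat = bootstrap_bartlett_factor(xs, a=0.75, B=200)` followed by `in_bc_ael_region(xs, theta, 0.75, b_hat)` over a grid of `theta` values; the value `a = 0.75` corresponds to the normal-theory choice for $p = 1$ under the moment formula for $b$.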
4. SIMULATION STUDIES
In this section, we conduct simulation studies to examine the finite-sample performance of the
BcAEL confidence regions and a number of potential competitors. In particular, we obtain the
simulated coverage probability (CP), average size (AS) and median size (MS) of the confidence
regions. We also compute the coefficient of variation (CV), defined to be the ratio of the standard
deviation to the average size of the confidence regions. A large CV value is an indication of
unstable performance. All the simulated results are based on 5,000 replications, and the number
of bootstrap replications is set to $B = 200$.
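The four summary measures can be computed directly from the simulated regions. The sketch below assumes one-dimensional regions stored as `(lower, upper)` intervals and uses the sample standard deviation for CV; both choices, and the function name, are assumptions made for illustration.

```python
from statistics import mean, median, stdev

def summarize_intervals(intervals, true_value):
    """Return CP, AS, MS and CV for a list of (lower, upper) intervals."""
    sizes = [hi - lo for lo, hi in intervals]
    covered = [lo <= true_value <= hi for lo, hi in intervals]
    avg_size = mean(sizes)
    return {
        "CP": sum(covered) / len(covered),   # simulated coverage probability
        "AS": avg_size,                      # average size
        "MS": median(sizes),                 # median size
        "CV": stdev(sizes) / avg_size,       # sd of the sizes over their mean
    }
```

For instance, `summarize_intervals([(0, 2), (1, 2), (-1, 3)], 1.5)` gives full coverage, an average size of 7/3 and a median size of 2.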
4.1. Confidence Regions for Population Mean
A classical problem is the construction of confidence regions for the population mean based on
$n$ i.i.d. observations. The most widely used method is based on Hotelling's $T^2$ (Hotelling, 1931):
$$T_n^2(\theta) = n(\bar x_n - \theta)^T S_n^{-1} (\bar x_n - \theta),$$
where $\bar x_n$ and $S_n$ are the sample mean and the sample variance–covariance matrix, respectively.
If the population distribution is multivariate normal of dimension $d$ and $\theta_0$ is the true parameter
value, then $(n - d)T_n^2(\theta_0)/[d(n - 1)]$ has an $F$-distribution with $d$ and $n - d$ degrees of freedom.
Hence, a $(1 - \alpha)$ Hotelling's $T^2$ confidence region is given by
$$I_{T^2}(\alpha) = \{\theta : T_n^2(\theta) \le [d(n - 1)/(n - d)]\,F_{d,n-d}(1 - \alpha)\},$$
where $F_{d,n-d}(1 - \alpha)$ is the $(1 - \alpha)$th quantile of the $F$-distribution with $d$ and $n - d$ degrees of
freedom. When $d = 1$, Hotelling's $T^2$ statistic becomes the square of the well-known Student's
$t$-statistic. We use Hotelling's $T^2$ as a yardstick for competing methods.
We study the performance of the following six methods: (a) Hotelling's $T^2$, denoted $T^2$; (b)
the usual empirical likelihood, denoted EL; (c) the bootstrap empirical likelihood, denoted BEL;
(d) the Bartlett-corrected empirical likelihood with a moment estimate of $b$, denoted BcEL; (e) the
adjusted empirical likelihood with $a = b/2$ and a moment estimate of $b$ (Chen & Huang, 2013),
denoted AEL; (f) the Bartlett-corrected adjusted empirical likelihood, BcAEL, proposed in this
article.
In the univariate case ($d = 1$), we generated data from each of the following three distributions:
(1) the standard normal distribution, denoted $N(0, 1)$; (2) the exponential distribution with
mean 1, denoted Exp(1); (3) a $\chi^2$ distribution with one degree of freedom, denoted
$\chi^2_1$. For the sample sizes $n = 20, 50$ we constructed confidence intervals at the nominal levels of
90%, 95%, and 99%.
In the bivariate case ($d = 2$), we used the simulation settings of Liu & Chen (2010). We
obtained confidence regions at the nominal levels $1 - \alpha$ = 95% and 99%. The area of the
confidence region is approximated by partitioning the region into triangles, as in Chen & Huang
(2013). For the sample sizes $n = 20$ and 50 we generated data from each of the following four
bivariate distributions with $D \sim \mathrm{Uniform}(1, 2)$: (1) the bivariate standard normal distribution
$N(0, I_2)$; (2) the distribution $(X_1, X_2)$ where $X_1 \sim N(0, D^2)$ and $X_2 \sim \Gamma(D, 1)$; (3) the
distribution $(X_1, X_2)$ where $X_1 \sim \Gamma(D^{-1}, 1)$ and $X_2 \sim \Gamma(D, 1)$; (4) the distribution $(X_1, X_2)$
where $X_1 = D$ and $X_2 \sim \Gamma(D, 1)$.
Simulation results for $d = 1$ and 2 are presented in Tables 1–3.
It is clear that Hotelling's $T^2$ confidence regions are superior when the data are from a
normal population. The EL confidence regions have mild under-coverage in these cases. The
other confidence regions have comparable performance in terms of the precision of the coverage
probability and the mean and median sizes. The BEL and BcAEL, however, have larger CV
values. Clearly, resampling has led to additional variation in these two methods. Thus, when the
data appear normal, there is no reason to use any method other than Hotelling's $T^2$.
When the population distribution deviates from normal, the performance of $T^2$ deteriorates.
The EL confidence regions have even lower coverage probabilities. In these cases, a correction of
some form on the EL becomes useful when $n = 20$ and $d = 1$ or when $d = 2$ with both sample
sizes. The proposed BcAEL is generally a strong competitor.
4.2. Confidence Regions with Just-Identified Estimating Equations
Example 1. The parameters are defined by the vector estimating function $g(y, x; \beta) = x(y - x^T\beta)$
with the data generated from the linear regression model
$$y_i = x_i^T\beta + \epsilon_i, \quad 1 \le i \le n,$$
with the error distribution for $\epsilon_i$ chosen as $N(0, 1)$ in the first case and as an exponential distribution with mean 1 in the second case. We used the fixed design points $x_i$ given by Chen
(1993).
Example 2. The parameters are defined by the vector estimating function $g(x; \mu, \sigma^2) =
(x - \mu,\ x^2 - \mu^2 - \sigma^2)^T$ with the data $x_1, \ldots, x_n$ generated i.i.d. from $N(\mu, \sigma^2)$. We refer to
this as the mean-variance model.
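The two estimating functions are simple to write down in code. The sketch below is illustrative only: it assumes the regression parameter is a coefficient vector `beta` and represents vectors as plain Python lists; the function names are hypothetical.

```python
def g_regression(y, x, beta):
    """Estimating function x * (y - x^T beta) for one observation (y, x)."""
    resid = y - sum(xj * bj for xj, bj in zip(x, beta))
    return [xj * resid for xj in x]

def g_mean_variance(x, mu, sigma2):
    """Estimating function (x - mu, x^2 - mu^2 - sigma^2) for one observation."""
    return [x - mu, x * x - mu * mu - sigma2]

def average(vectors):
    """Componentwise sample average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]
```

As a sanity check on the mean-variance model, averaging `g_mean_variance` over a sample at the sample mean and the divisor-$n$ sample variance gives the zero vector, which is the sample analogue of the estimating equation (1).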
Under the linear regression model, Chen (1993) showed that the EL confidence region for $\beta$
is Bartlett-correctable. In the simulation, we followed Chen (1993) with the sample sizes $n = 30$
and 50, $p = 2$, and $\beta_0 = (1, 1)^T$. For the second example, we chose $(\mu_0, \sigma_0^2)^T = (0, 1)^T$ with the
sample sizes $n = 20$ and 50. We constructed confidence regions at the nominal coverage levels
of 90% and 95% in both examples. The simulation results are in Table 4.
Under the linear regression model, the EL has low coverage probabilities in all cases. The
BcEL and AEL have low coverage probabilities when the error distribution is exponential. The
BEL and the proposed BcAEL generally have closer to nominal coverage probabilities. Of these
two methods, the BcAEL is more stable since it has lower CV values.
In terms of coverage rates, the two strong competitors are BEL and the proposed BcAEL.
Of these two methods, BcAEL has a slightly more accurate coverage probability, and it is much
more stable as indicated by its lower CV values.
In conclusion, the proposed BcAEL is preferred in the just-identified cases.
4.3. Confidence Regions with Over-Identified Estimating Equations
This subsection examines the performance with over-identified parameters. To implement the
BcEL and AEL, the estimate of $b$ is obtained by the robust modification bootstrap method (Liu
& Chen, 2010).
Table 1 (univariate case, $d = 1$). For each nominal level (90%, 95%, 99%) the columns are CP, AS, MS, CV.
Panels: $X \sim N(0, 1)$, $X \sim \mathrm{Exp}(1)$, $X \sim \chi^2_1$, each with $n = 20$ and $n = 50$ and the methods T2, EL, BEL, BcEL, AEL, BcAEL.
Most entries are missing from the source. The surviving rows, which appear in the $X \sim \mathrm{Exp}(1)$, $n = 50$ panel without recoverable method labels, are:
90%: 88.84 0.469 0.459 0.205    95%: 94.10 0.564 0.550 0.208    99%: 98.40 0.753 0.734 0.215
                                95%: 94.18 0.566 0.552 0.209
90%: 89.32 0.488 0.472 0.238    95%: 94.28 0.587 0.566 0.244    99%: 98.56 0.786 0.754 0.253
CP, coverage probability; AS, average size; MS, median size; CV, coefficient of variation.
$$g(x; \theta) = \big(r(x, \theta),\ x_2\, r(x, \theta),\ (x_3 - 1)\, r(x, \theta),\ \ldots,\ (x_q - 1)\, r(x, \theta)\big)^T,$$
where $r(x, \theta) = \exp\{-4.5\sigma^2 - \theta(x_1 + x_2) + 3x_2\} - 1$. We generated $x = (x_1, x_2, \ldots, x_q)$ as a
vector with independent entries such that $x_1, x_2$ are from $N(0, \sigma^2)$ and $x_3, \ldots, x_q$ are from $\chi^2_1$.
In the Qin–Lawless example, we chose $\theta_0 = 1$ with the sample sizes $n = 30$ and 60 in the
simulation. In the asset pricing model, we fixed $\sigma = 0.4$. The current choice of $r(x, \theta)$ and the
data distributions make $\theta_0 = 3$ the unique solution to the defining vector estimating function
for any $\sigma > 0$. We conducted simulations with the sample sizes $n = 100, 200$ and $q = 2, 5$. We
obtained confidence regions with nominal coverage levels 95% and 99%. The simulation results
are presented in Table 5.
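A short calculation shows why $\theta = 3$ solves the estimating equation in the asset pricing model. The sketch assumes $r(x, \theta) = \exp\{-4.5\sigma^2 - \theta(x_1 + x_2) + 3x_2\} - 1$ with independent $x_1, x_2 \sim N(0, \sigma^2)$ and $x_3, \ldots, x_q \sim \chi^2_1$, and uses the normal moment generating function $E\,e^{t x_1} = e^{t^2\sigma^2/2}$. It verifies only that $\theta = 3$ is a solution, not that it is unique.

```latex
\begin{align*}
r(x, 3) &= \exp\{-4.5\sigma^2 - 3x_1\} - 1
  \quad \text{(the $x_2$ terms cancel at } \theta = 3),\\
E\, r(x, 3) &= e^{-4.5\sigma^2}\, E\, e^{-3x_1} - 1
  = e^{-4.5\sigma^2}\, e^{9\sigma^2/2} - 1 = 0,\\
E\{x_2\, r(x, 3)\} &= E(x_2)\, E\, r(x, 3) = 0
  \quad \text{(independence of $x_2$ and $x_1$)},\\
E\{(x_k - 1)\, r(x, 3)\} &= E(x_k - 1)\, E\, r(x, 3) = 0,
  \quad k = 3, \ldots, q .
\end{align*}
```

The calculation holds for every $\sigma > 0$, which is consistent with the statement that the solution does not depend on $\sigma$.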
It can be seen that the coverage probabilities of the EL are quite low in both examples. The
BEL confidence intervals improve the EL in terms of the accuracy of the coverage probability.
However, the empty set problem is particularly serious for BEL. The AEL confidence regions
have accurate coverage probabilities. However, the sizes and variations of the AEL confidence
regions are much larger. Overall, the proposed BcAEL is the best choice.
5. REAL DATA
We illustrate the proposed method with a real-data example from Larsen & Marx (2006). Fourteen
men without a history of coronary incidents were examined in a heart-disease study. Their weights
(in pounds) and blood cholesterol levels (in mg/dl) were measured. Figure 1 shows these records,
together with 95% confidence regions for the mean based on Hotelling's $T^2$, EL and BcAEL.
It can be seen that the confidence region based on the proposed method, BcAEL, preserves the
data-driven shape. It has also expanded the EL region to ensure the coverage probability.

Table 2 (bivariate case, $d = 2$; nominal levels 95% and 99%). Panels A and B are two of the four bivariate distributions; their labels are missing from the source.
Panel  n   Method  95%: CP  AS     MS     CV      99%: CP  AS     MS     CV
A      20  T2      94.30    1.115  1.094  0.231   98.60    1.886  1.851  0.231
A      20  EL      90.04    0.880  0.860  0.235   96.64    1.362  1.332  0.237
A      20  BEL     93.24    1.193  1.150  0.306   98.04    2.217  2.036  0.419
A      20  BcEL    92.10    0.955  0.933  0.237   97.20    1.477  1.443  0.240
A      20  AEL     92.46    0.968  0.946  0.238   97.40    1.513  1.476  0.242
A      20  BcAEL   94.20    1.112  1.079  0.268   98.46    1.757  1.702  0.273
A      50  T2      94.98    0.400  0.396  0.144   99.10    0.636  0.631  0.144
A      50  EL      94.02    0.376  0.373  0.145   98.64    0.591  0.587  0.146
A      50  BEL     94.44    0.397  0.393  0.188   98.30    0.621  0.605  0.229
A      50  BcEL    94.52    0.391  0.388  0.146   98.84    0.614  0.610  0.147
A      50  AEL     94.52    0.391  0.389  0.146   98.86    0.615  0.611  0.147
A      50  BcAEL   94.78    0.400  0.395  0.166   98.96    0.629  0.622  0.168
B      20  T2      93.20    0.972  0.931  0.324   97.98    1.644  1.575  0.324
B      20  EL      88.70    0.764  0.728  0.328   95.78    1.179  1.126  0.330
B      20  BEL     93.70    1.205  1.029  0.692   98.00    2.504  1.930  0.834
B      20  BcEL    90.66    0.832  0.792  0.334   96.44    1.284  1.219  0.336
B      20  AEL     91.18    0.847  0.805  0.336   96.68    1.324  1.252  0.342
B      20  BcAEL   94.36    1.052  0.953  0.440   98.26    1.685  1.507  0.516
B      50  T2      94.18    0.355  0.350  0.204   98.20    0.564  0.556  0.204
B      50  EL      93.22    0.337  0.331  0.211   98.20    0.531  0.521  0.215
B      50  BEL     94.40    0.371  0.355  0.276   98.60    0.594  0.558  0.333
B      50  BcEL    93.92    0.352  0.345  0.216   98.44    0.555  0.544  0.220
B      50  AEL     93.98    0.353  0.346  0.216   98.48    0.557  0.545  0.220
B      50  BcAEL   94.76    0.370  0.356  0.250   98.66    0.584  0.562  0.256
CP, coverage probability; AS, average size; MS, median size; CV, coefficient of variation.
6. CONCLUSION
In this article, we have shown that the AEL with any given level of adjustment can be Bartlett-corrected. Thus, the AEL can be used to address both the empty set and under-coverage problems.
However, particularly in over-identified cases, there is no simple and accurate method for estimating the Bartlett correction factor. We hence propose a two-stage procedure for constructing
accurate confidence regions. The simulation results show that the new method is competitive at
constructing confidence regions for the population mean and works better in just-identified and
over-identified cases, compared with the EL and its variants. The real-data example also provides
evidence that the proposed method has some advantages over the EL.

Table 3 (bivariate case, $d = 2$; nominal levels 95% and 99%). Panels A and B are two of the four bivariate distributions; their labels are missing from the source.
Panel  n   Method  95%: CP  AS     MS     CV      99%: CP  AS     MS     CV
A      20  T2      89.14    1.108  1.024  0.434   95.18    1.875  1.732  0.434
A      20  EL      86.90    0.861  0.794  0.440   93.70    1.323  1.217  0.443
A      20  BEL     93.10    1.765  1.283  1.045   96.90    4.309  2.754  1.229
A      20  BcEL    88.40    0.942  0.863  0.449   94.76    1.445  1.323  0.452
A      20  AEL     88.82    0.964  0.880  0.455   95.02    1.507  1.366  0.467
A      20  BcAEL   92.66    1.387  1.151  0.672   96.86    2.371  1.832  0.926
A      50  T2      91.96    0.412  0.395  0.284   96.96    0.655  0.629  0.284
A      50  EL      92.10    0.394  0.376  0.296   97.72    0.620  0.593  0.301
A      50  BEL     93.30    0.466  0.426  0.411   98.24    0.790  0.701  0.500
A      50  BcEL    92.66    0.415  0.396  0.304   97.96    0.654  0.622  0.310
A      50  AEL     92.68    0.416  0.397  0.306   98.04    0.657  0.625  0.313
A      50  BcAEL   94.00    0.455  0.424  0.365   98.76    0.719  0.667  0.375
B      20  T2      92.72    0.387  0.374  0.291   97.70    0.655  0.633  0.291
B      20  EL      90.10    0.295  0.286  0.288   95.96    0.451  0.437  0.287
B      20  BEL     93.50    0.433  0.379  0.542   97.90    0.897  0.706  0.828
B      20  BcEL    91.46    0.318  0.308  0.291   96.50    0.486  0.471  0.290
B      20  AEL     91.64    0.323  0.312  0.293   96.62    0.498  0.481  0.294
B      20  BcAEL   93.76    0.392  0.367  0.376   98.36    0.619  0.573  0.421
B      50  T2      94.14    0.141  0.138  0.182   98.60    0.224  0.220  0.182
B      50  EL      94.06    0.130  0.128  0.187   98.56    0.203  0.198  0.188
B      50  BEL     94.20    0.145  0.136  0.247   98.60    0.222  0.209  0.311
B      50  BcEL    94.52    0.135  0.132  0.190   98.76    0.210  0.206  0.192
B      50  AEL     94.56    0.135  0.132  0.191   98.78    0.211  0.206  0.193
B      50  BcAEL   94.88    0.141  0.136  0.227   99.08    0.219  0.212  0.230
CP, coverage probability; AS, average size; MS, median size; CV, coefficient of variation.
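Table 4 (just-identified estimating equations; nominal levels 90% and 95%)
Model                               n   Method  90%: CP  AS     MS     CV      95%: CP  AS     MS     CV
Linear regression, N(0, 1) errors   30  EL      84.68    0.187  0.180  0.287   90.86    0.245  0.236  0.286
Linear regression, N(0, 1) errors   30  BEL     89.00    0.244  0.231  0.355   94.04    0.342  0.321  0.369
Linear regression, N(0, 1) errors   30  BcEL    86.40    0.202  0.194  0.292   92.14    0.265  0.254  0.291
Linear regression, N(0, 1) errors   30  AEL     86.74    0.205  0.196  0.293   92.38    0.269  0.258  0.293
Linear regression, N(0, 1) errors   30  BcAEL   90.16    0.242  0.230  0.334   94.68    0.319  0.302  0.334
Linear regression, N(0, 1) errors   50  EL      86.60    0.071  0.069  0.225   92.70    0.093  0.091  0.225
Linear regression, N(0, 1) errors   50  BEL     89.34    0.079  0.077  0.265   94.36    0.107  0.103  0.278
Linear regression, N(0, 1) errors   50  BcEL    88.24    0.075  0.073  0.228   93.52    0.098  0.096  0.228
Linear regression, N(0, 1) errors   50  AEL     88.32    0.075  0.073  0.228   93.60    0.098  0.096  0.228
Linear regression, N(0, 1) errors   50  BcAEL   89.86    0.080  0.078  0.254   94.70    0.105  0.102  0.254
Linear regression, exponential errors  30  EL      76.60    0.346  0.282  0.732   83.60    0.455  0.371  0.726
Linear regression, exponential errors  30  BEL     87.36    0.897  0.473  2.212   92.98    1.472  0.737  2.540
Linear regression, exponential errors  30  BcEL    78.90    0.382  0.309  0.750   85.38    0.502  0.408  0.743
Linear regression, exponential errors  30  AEL     79.28    0.386  0.315  0.743   85.60    0.512  0.416  0.743
Linear regression, exponential errors  30  BcAEL   87.96    0.724  0.469  1.160   92.18    1.010  0.623  1.238
Linear regression, exponential errors  50  EL      82.86    0.142  0.119  0.644   89.44    0.189  0.158  0.643
Linear regression, exponential errors  50  BEL     88.14    0.235  0.157  2.245   93.40    0.348  0.229  2.037
Linear regression, exponential errors  50  BcEL    84.94    0.155  0.128  0.665   90.44    0.206  0.170  0.666
Linear regression, exponential errors  50  AEL     85.00    0.154  0.129  0.658   90.44    0.205  0.171  0.664
Linear regression, exponential errors  50  BcAEL   89.16    0.221  0.159  1.036   93.48    0.295  0.212  1.140
Mean-variance model                 20  EL      78.04    0.407  0.386  0.382   84.20    0.526  0.497  0.382
Mean-variance model                 20  BEL     87.10    0.943  0.640  1.508   92.50    1.784  1.031  1.588
Mean-variance model                 20  BcEL    80.00    0.442  0.418  0.395   85.54    0.571  0.539  0.395
Mean-variance model                 20  AEL     80.34    0.452  0.426  0.402   85.74    0.588  0.553  0.406
Mean-variance model                 20  BcAEL   88.16    0.859  0.659  1.118   91.76    1.301  0.873  1.307
Mean-variance model                 50  EL      85.72    0.191  0.187  0.253   91.36    0.250  0.243  0.255
Mean-variance model                 50  BEL     88.50    0.232  0.216  0.375   94.04    0.319  0.291  0.418
Mean-variance model                 50  BcEL    86.74    0.201  0.196  0.263   92.04    0.263  0.256  0.264
Mean-variance model                 50  AEL     86.84    0.202  0.197  0.265   92.20    0.265  0.257  0.267
Mean-variance model                 50  BcAEL   89.86    0.235  0.219  0.362   93.96    0.308  0.287  0.364
CP, coverage probability; AS, average size; MS, median size; CV, coefficient of variation.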
Table 5 (over-identified estimating equations; nominal levels 90% and 95%)
Model                  n    Method  90%: CP  AS     MS     CV      95%: CP  AS     MS     CV
Qin–Lawless model      30   EL      84.76    0.563  0.548  0.221   90.42    0.673  0.655  0.218
Qin–Lawless model      30   BEL     88.88    0.709  0.637  0.446   93.00    0.890  0.793  0.454
Qin–Lawless model      30   BcEL    86.76    0.637  0.587  0.364   91.35    0.762  0.701  0.368
Qin–Lawless model      30   AEL     89.40    1.311  0.646  1.034   93.72    1.606  0.789  0.938
Qin–Lawless model      30   BcAEL   90.42    0.721  0.656  0.401   93.14    0.863  0.784  0.401
Qin–Lawless model      60   EL      86.78    0.411  0.407  0.141   92.12    0.491  0.486  0.141
Qin–Lawless model      60   BEL     88.68    0.443  0.427  0.233   93.02    0.537  0.525  0.243
Qin–Lawless model      60   BcEL    88.50    0.435  0.423  0.222   92.90    0.520  0.513  0.222
Qin–Lawless model      60   AEL     90.22    0.711  0.441  1.159   94.36    0.904  0.537  1.092
Qin–Lawless model      60   BcAEL   89.70    0.451  0.434  0.225   93.94    0.519  0.522  0.227
Asset pricing, q = 2   100  EL      80.40    0.774  0.759  0.181   87.12    0.923  0.906  0.178
Asset pricing, q = 2   100  BEL     85.94    0.956  0.874  0.363   92.24    1.164  1.072  0.357
Asset pricing, q = 2   100  BcEL    84.24    0.883  0.831  0.285   90.28    1.050  0.990  0.280
Asset pricing, q = 2   100  AEL     87.86    2.087  0.949  1.054   92.50    2.587  1.206  0.934
Asset pricing, q = 2   100  BcAEL   88.58    0.972  0.892  0.347   92.92    1.154  1.064  0.341
Asset pricing, q = 2   200  EL      84.96    0.574  0.564  0.150   91.60    0.683  0.671  0.147
Asset pricing, q = 2   200  BEL     88.36    0.657  0.615  0.284   93.56    0.794  0.744  0.285
Asset pricing, q = 2   200  BcEL    87.22    0.628  0.602  0.240   92.68    0.747  0.717  0.235
Asset pricing, q = 2   200  AEL     89.26    1.376  0.639  1.283   93.96    1.751  0.775  1.153
Asset pricing, q = 2   200  BcAEL   89.40    0.664  0.622  0.277   93.70    0.789  0.741  0.270
Asset pricing, q = 5   100  EL      72.90    0.729  0.713  0.183   80.76    0.870  0.850  0.184
Asset pricing, q = 5   100  BEL     87.42    1.140  1.027  0.383   92.70    1.413  1.274  0.386
Asset pricing, q = 5   100  BcEL    86.12    1.036  0.956  0.336   90.98    1.238  1.142  0.333
Asset pricing, q = 5   100  AEL     97.00    5.661  5.492  0.444   98.30    6.520  5.947  0.393
Asset pricing, q = 5   100  BcAEL   88.74    1.178  1.054  0.388   93.46    1.411  1.265  0.386
Asset pricing, q = 5   200  EL      79.02    0.538  0.531  0.134   84.94    0.641  0.633  0.133
Asset pricing, q = 5   200  BEL     87.38    0.711  0.662  0.290   93.02    0.864  0.806  0.294
Asset pricing, q = 5   200  BcEL    86.22    0.682  0.641  0.276   92.20    0.812  0.765  0.274
Asset pricing, q = 5   200  AEL     94.12    4.981  3.938  0.669   96.56    5.540  4.451  0.590
Asset pricing, q = 5   200  BcAEL   88.64    0.723  0.677  0.283   93.74    0.861  0.809  0.283
CP, coverage probability; AS, average size; MS, median size; CV, coefficient of variation.
[Figure 1: the data plotted on Weight and Blood cholesterol axes, with confidence regions based on Hotelling's T2, EL and BcAEL.]
APPENDIX
We provide sketch proofs of Theorems 1 and 2 for $W_n(\theta_0; a)$ instead of $W_n(\theta_0; aK(\theta_0))$. Because
$K(\theta_0) = 1 + O_p(n^{-1})$, the results remain valid for $W_n(\theta_0; aK(\theta_0))$.
Proof of Theorem 1. Our proof is built on existing results on Bartlett correction with just-identified
and over-identified estimating equations. In both cases and under conditions (C1)–(C3), the EL
ratio function $R_n(\theta_0)$ has a signed root decomposition
$$R_n(\theta_0) = n(R_1 + R_2 + R_3)^T (R_1 + R_2 + R_3) + O_p(n^{-3/2}),$$
with $R_1, R_2, R_3$ having certain properties. For instance, $R_j = O_p(n^{-j/2})$ for $j = 1, 2, 3$. The
exact expressions are not important here but can be found in Chen & Cui (2007). According to
Theorem 1 of Liu & Chen (2010), the AEL ratio function defined by (6) has
$$R_n(\theta_0; a) = n(R_1 + R_2 + R_{3a})^T (R_1 + R_2 + R_{3a}) + O_p(n^{-3/2}),$$
with $R_{3a} = R_3 - n^{-1} a R_1$.
Denote $Q_n = \sqrt{n}(R_1 + R_2 + R_{3a})$ and let $\kappa^{r,s,\ldots,t}(Q_n)$ denote the joint cumulant of the $r$th,
$s$th, ..., $t$th components of $Q_n$. Without loss of generality, suppose $\alpha^{rs} = I(r = s)$ at $\theta = \theta_0$.
According to Liu & Chen (2010), for the just-identified case,
$$\kappa^{r}(Q_n) = n^{-1/2}\mu^{r} + O(n^{-3/2}), \quad \kappa^{r,s}(Q_n) = I(r = s) + n^{-1}\Delta^{rs} + O(n^{-2}),$$
$$\kappa^{r,s,t}(Q_n) = O(n^{-3/2}), \quad \kappa^{r,s,t,u}(Q_n) = O(n^{-2}),$$
where $I(\cdot)$ is the indicator function, $\mu^{r} = -\frac{1}{6}\alpha^{rss}$, and
$\Delta^{rs} =$ [expression missing from the source].
Here, we have used the summation convention, according to which if an index occurs more than
once in an expression, summation over the index is understood.
Let $f_{Q_n}(z)$ and $\phi(z)$ be the density functions of $Q_n$ and the $p$-variate standard normal distribution, respectively. By a formal Edgeworth expansion,
$$f_{Q_n}(z) = \Big\{1 + \sum_{i=1}^{4} n^{-i/2}\pi_i(z) + o(n^{-2})\Big\}\phi(z),$$
with
$$\pi_1(z) = \mu^{r} z_r, \quad \pi_2(z) = \frac{1}{2}(\Delta^{rs} + \mu^{r}\mu^{s})\{z_r z_s - I(r = s)\},$$
and $\pi_3(z)$ and $\pi_4(z)$ are polynomials of order no more than four, the former odd
and the latter even.
The above expansion implies that
$$\Pr(Q_n^T Q_n \le x) = \int_{z^T z \le x}\Big\{1 + \sum_{i=1}^{4} n^{-i/2}\pi_i(z)\Big\}\phi(z)\,dz + o(n^{-2}).$$
Because $\pi_1(z)$ and $\pi_3(z)$ are odd functions, their integrals over the symmetric region are zero.
In addition, for each $i = 1, 2, \ldots, p$,
$$\int_{z^T z < x}(z_i^2 - 1)\phi(z)\,dz = -\frac{2x}{p} f_p(x),$$
where $f_p(x)$ is the density function of $\chi^2_p$. Hence,
$$\begin{aligned}
\Pr(Q_n^T Q_n \le x) &= \int_{z^T z < x}\phi(z)\,dz + n^{-1}\sum_{r=1}^{p}\frac{1}{2}(\Delta^{rr} + \mu^{r}\mu^{r})\int_{z^T z < x}(z_r^2 - 1)\phi(z)\,dz + O(n^{-2})\\
&= \Pr(\chi^2_p \le x) - n^{-1} f_p(x)\,\frac{x}{p}\sum_{r=1}^{p}(\Delta^{rr} + \mu^{r}\mu^{r}) + O(n^{-2})\\
&= \Pr(\chi^2_p \le x) - b_a\, x f_p(x)\, n^{-1} + O(n^{-2}),
\end{aligned} \tag{A.1}$$
where
$$b_a = \frac{1}{p}\sum_{r=1}^{p}(\Delta^{rr} + \mu^{r}\mu^{r}) = \frac{1}{p}\Big\{\frac{1}{2}\sum_{r,s}\alpha^{rrss} - \frac{1}{3}\sum_{r,s,t}\alpha^{rst}\alpha^{rst}\Big\} - 2a = b - 2a.$$
This $b$ is the Bartlett correction factor given in DiCiccio, Hall, & Romano (1991). We note in particular that
the remainder term in (A.1) is $O(n^{-2})$ rather than $O(n^{-3/2})$. See Barndorff-Nielsen & Hall (1988)
for an account of this phenomenon.
For the over-identified case, the above proof remains applicable except for the expression for
the Bartlett correction factor $b$, which is now given in Chen & Cui (2007). This completes the
proof of Theorem 1.
(A.3)
and
$$f_p\{x(1 + n^{-1} b_a)\} = f_p(x) + O(n^{-1}). \tag{A.4}$$
DiCiccio, T., Hall, P., & Romano, J. (1991). Empirical likelihood is Bartlett correctable. The Annals of
Statistics, 19, 1053–1061.
Emerson, S. C. & Owen, A. B. (2009). Calibration of the empirical likelihood method for a vector mean.
Electronic Journal of Statistics, 3, 1161–1192.
Grendar, M. & Judge, G. (2009). Empty set problem of maximum empirical likelihood methods. Electronic
Journal of Statistics, 3, 1542–1555.
Hall, P. & Horowitz, J. L. (1996). Bootstrap critical values for tests based on generalized-method-of-moments
estimators. Econometrica, 64, 891–916.
Hall, P. & Presnell, B. (1999). Intentionally biased bootstrap methods. Journal of the Royal Statistical Society,
Series B, 61, 143–158.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica,
50, 1029–1054.
Hotelling, H. (1931). The generalization of Student's ratio. The Annals of Mathematical Statistics, 2, 360–378.
Imbens, G. W. (2002). Generalized method of moments and empirical likelihood. Journal of Business and
Economic Statistics, 20, 493–506.
Kitamura, Y. & Stutzer, M. (1997). An information-theoretic alternative to generalized method of moments
estimation. Econometrica, 65, 861–874.
Larsen, R. J. & Marx, M. L. (2006). An Introduction to Mathematical Statistics and its Applications, 4th
ed., Pearson Prentice Hall, Upper Saddle River, NJ.
Li, X., Chen, J., Wu, Y., & Tu, D. (2011). Constructing nonparametric likelihood confidence regions with
high order precisions. Statistica Sinica, 21, 1767–1783.
Liang, K. Y. & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika,
73, 13–22.
Liu, Y. & Chen, J. (2010). Adjusted empirical likelihood with high-order precision. The Annals of Statistics,
38, 1341–1362.
Matsushita, Y. & Otsu, T. (2013). Second-order refinement of empirical likelihood for testing overidentifying
restrictions. Econometric Theory, 29, 324–353.
Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75,
237–249.
Owen, A. B. (1990). Empirical likelihood ratio confidence regions. The Annals of Statistics, 18, 90–120.
Owen, A. B. (2001). Empirical Likelihood, Chapman & Hall/CRC Press, New York.
Qin, J. & Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics, 22,
300–325.
Tsao, M. (2004). Bounds on coverage probabilities of the empirical likelihood ratio confidence regions. The
Annals of Statistics, 32, 1215–1221.
Tsao, M. & Wu, F. (2013). Empirical likelihood on the full parameter space. The Annals of Statistics, 41,
2176–2196.