
JAMES-STEIN ESTIMATOR

By Yuzo Maruyama

The University of Tokyo

We consider estimation of a multivariate normal mean vector under sum of squared error loss. We propose a new class of smooth estimators, parameterized by $\alpha$, dominating the James-Stein estimator. The estimator for $\alpha = 1$ corresponds to the generalized Bayes estimator with respect to the harmonic prior. When $\alpha$ goes to infinity, the estimator converges to the James-Stein positive-part estimator. Thus the class of our estimators is a bridge between the admissible estimator ($\alpha = 1$) and the inadmissible estimator ($\alpha = \infty$). Although the estimators have quasi-admissibility, which is a weaker optimality than admissibility, the problem of determining whether or not the estimator for $\alpha > 1$ is admissible is still open.

1. Introduction. Let $X$ have the $p$-variate normal distribution $N_p(\theta, I_p)$. We consider the problem of estimating the mean vector $\theta$ by $\delta(X)$ relative to the quadratic loss $\|\delta(X)-\theta\|^2$. Therefore every estimator $\delta$ is evaluated based on the risk function

\[ R(\theta, \delta) = E\bigl[\|\delta(X)-\theta\|^2\bigr] = \int_{\mathbb{R}^p} \|\delta(x)-\theta\|^2 \, \frac{\exp(-\|x-\theta\|^2/2)}{(2\pi)^{p/2}} \, dx. \]

The usual estimator $X$, with the constant risk $p$, is minimax for any dimension $p$. It is also admissible when $p = 1$ and $2$, as shown in [1] and [13], respectively. [13] showed, however, that when $p \geq 3$ there exists an estimator dominating $X$ among the class of estimators equivariant with respect to the orthogonal transformation group, which have the form $\delta_\phi(X) = \bigl(1 - \phi(\|X\|^2)/\|X\|^2\bigr)X$. A celebrated member of this class is

(1.1) \[ \delta_{JS}(X) = \bigl(1 - (p-2)/\|X\|^2\bigr)X, \]

which is called the James-Stein estimator. More generally, a large class of estimators better than $X$ has been proposed in the literature. The strongest tool for this is [14]'s identity, as follows.
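Before turning to the identity, the dominance can be illustrated numerically. The following sketch (ours; the dimension $p = 10$, the point $\theta = 0$ and the Monte Carlo sizes are arbitrary choices, not part of the paper) estimates the risks of $X$ and of the James-Stein estimator (1.1):

```python
import numpy as np

# Monte Carlo comparison of the usual estimator X and the James-Stein
# estimator at theta = 0 (p = 10, n_rep = 20000 are arbitrary choices).
rng = np.random.default_rng(0)
p, n_rep = 10, 20000
theta = np.zeros(p)
X = rng.standard_normal((n_rep, p)) + theta

def james_stein(x, p):
    # delta_JS(x) = (1 - (p - 2)/||x||^2) x, applied row by row
    w = np.sum(x**2, axis=1, keepdims=True)
    return (1.0 - (p - 2) / w) * x

risk_X = np.mean(np.sum((X - theta)**2, axis=1))
risk_JS = np.mean(np.sum((james_stein(X, p) - theta)**2, axis=1))
print(risk_X, risk_JS)
```

At $\theta = 0$ the exact risks are $p$ and $2$, so the simulated values should be close to $10$ and $2$.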

AMS 2000 subject classifications: Primary 62C15; secondary 62C10

Keywords and phrases: The James-Stein estimator, quasi-admissibility

Suppose that $Y \sim N(\mu, 1)$ and that $h$ is an absolutely continuous function such that $E[|h'(Y)|] < \infty$. Then

(1.2) \[ E\bigl[h(Y)(Y-\mu)\bigr] = E\bigl[h'(Y)\bigr]. \]
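The identity is easy to check numerically. The sketch below (ours; the choices $h(y) = \sin y$ and $\mu = 0.7$ are arbitrary) compares both sides of (1.2) by quadrature over a fine grid:

```python
import numpy as np

# Quadrature check of Stein's identity E[h(Y)(Y - mu)] = E[h'(Y)] for
# Y ~ N(mu, 1), with the arbitrary choices h(y) = sin(y) and mu = 0.7.
mu = 0.7
y = np.linspace(mu - 12.0, mu + 12.0, 200001)
dy = y[1] - y[0]
density = np.exp(-(y - mu)**2 / 2) / np.sqrt(2 * np.pi)

lhs = np.sum(np.sin(y) * (y - mu) * density) * dy   # E[h(Y)(Y - mu)]
rhs = np.sum(np.cos(y) * density) * dy              # E[h'(Y)]
print(lhs, rhs)
```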

By using the identity (1.2), the risk function of an estimator of the form $\delta_g(X) = X + g(X) = (X_1 + g_1(X), \ldots, X_p + g_p(X))$ is written as

\[ R(\theta, \delta_g) = E\bigl[\|\delta_g(X)-\theta\|^2\bigr] = E\bigl[\|X-\theta\|^2\bigr] + E\bigl[\|g(X)\|^2\bigr] + 2\sum_{i=1}^{p} E\bigl[(X_i-\theta_i)\,g_i(X)\bigr] = p + E\bigl[\|g(X)\|^2\bigr] + 2\sum_{i=1}^{p} E\Bigl[\frac{\partial}{\partial x_i} g_i(X)\Bigr], \]

where each $g_i(x)$ is assumed to be differentiable with $E|(\partial/\partial x_i)g_i(X)| < \infty$. Since the statistic

(1.3) \[ p + \|g(X)\|^2 + 2\sum_{i=1}^{p} \frac{\partial}{\partial x_i} g_i(X) \]

has expectation equal to $R(\theta, \delta_g)$, it is called the unbiased estimator of risk. Clearly, if

(1.4) \[ \|g(x)\|^2 + 2\sum_{i=1}^{p} \frac{\partial}{\partial x_i} g_i(x) \leq 0 \]

for all $x$, with strict inequality on a set of positive measure, then $\delta_g$ dominates $X$.

To make the structure of estimators improving on $X$ more comprehensible, we consider the class of orthogonally equivariant estimators $\delta_\phi$ of the form given in (1.1). Assigning $g(x) = -\phi(\|x\|^2)\,x/\|x\|^2$ in (1.4), the inequality (1.4) becomes

(1.5) \[ \phi(w)\bigl\{\phi(w) - 2(p-2)\bigr\} - 4w\,\phi'(w) \leq 0 \]

for $w = \|x\|^2$. The inequality (1.5) is, for example, satisfied if $\phi(w)$ is monotone nondecreasing and within $[0,\, 2(p-2)]$ for any $w \geq 0$. Since $\phi(w) \equiv c$ for $0 < c < 2(p-2)$ satisfies the inequality (1.5), every estimator of the form

(1.6) \[ \bigl(1 - c/\|X\|^2\bigr)X \]

dominates $X$. The James-Stein estimator is the best estimator among the class of estimators (1.6) because the risk function of (1.6) is given by

\[ p + c\bigl(c - 2(p-2)\bigr)\,E\bigl[\|X\|^{-2}\bigr] \]

and hence minimized by $c = p-2$.
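Since $c(c - 2(p-2))$ is a quadratic in $c$, the minimizing constant does not depend on $E[\|X\|^{-2}]$. As a small check (ours; $p = 8$ is arbitrary), one can evaluate the risk at $\theta = 0$, where $E[\|X\|^{-2}] = 1/(p-2)$ for the central chi-square distribution with $p$ degrees of freedom:

```python
import numpy as np

# Risk of the estimators (1.6) at theta = 0, where E[1/||X||^2] = 1/(p-2)
# for a central chi-square with p degrees of freedom (p = 8 is arbitrary).
p = 8
c_grid = np.linspace(0.0, 2.0 * (p - 2), 4001)
risk = p + c_grid * (c_grid - 2.0 * (p - 2)) / (p - 2)
c_best = c_grid[np.argmin(risk)]
print(c_best, risk.min())   # minimizer c = p - 2, minimum risk 2
```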

It is however noted that when $\|x\|^2 < p-2$, the James-Stein estimator yields an over-shrinkage and changes the sign of each component of $X$. The James-Stein positive-part estimator

\[ \delta_{JS}^{+}(X) = \max\bigl(0,\; 1 - (p-2)/\|X\|^2\bigr)X \]

eliminates this drawback and dominates the James-Stein estimator. We notice here that the technique for proving the inadmissibility of $\delta_{JS}$ does not come from the Stein identity or the unbiased estimator of risk given in (1.3). The risk difference between $\delta_{JS}$ and $\delta_\phi$ is given by

\[ R(\delta_{JS}, \theta) - R(\delta_\phi, \theta) = E\left[ -\frac{\bigl(\phi(\|X\|^2) - p + 2\bigr)^2}{\|X\|^2} + 4\,\phi'(\|X\|^2) \right], \]

so that $\delta_\phi$ improves on $\delta_{JS}$ whenever

(1.7) \[ -\frac{\bigl(\phi(w) - p + 2\bigr)^2}{w} + 4\,\phi'(w) \geq 0 \]

for any $w \geq 0$. But $\phi_{JS}^{+}(w) = \min(w,\, p-2)$, which makes the James-Stein positive-part estimator, does not satisfy the inequality (1.7) for every $w \geq 0$; that is, the Stein identity itself is not useful for finding estimators dominating $\delta_{JS}$. We call such optimality for the James-Stein estimator quasi-admissibility. We will explain the concept and give a sufficient condition for quasi-admissibility in Section 2.
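Numerically, the failure of the positive-part choice is visible near the origin: with $\phi(w) = \min(w, p-2)$, the quantity on the left-hand side of (1.7) tends to $-\infty$ as $w \to 0$. A sketch (ours; $p = 6$ is arbitrary):

```python
import numpy as np

# For phi(w) = min(w, p-2) (so phi'(w) = 1 for w < p-2 and 0 afterwards),
# evaluate the left-hand side of (1.7) on a grid; it is hugely negative
# near w = 0 and identically zero for w > p-2 (p = 6 is arbitrary).
p = 6
w = np.linspace(1e-3, 3.0 * (p - 2), 5000)
phi = np.minimum(w, p - 2)
dphi = np.where(w < p - 2, 1.0, 0.0)
lhs_17 = -(phi - (p - 2))**2 / w + 4.0 * dphi
print(lhs_17.min())
```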

In spite of such difficulty, some estimators which dominate the James-Stein estimator have been given by several authors. [11] and [6] considered the class of estimators of the form $\delta_{LK}(X) = \bigl(1 - \phi_{LK}(\|X\|^2)/\|X\|^2\bigr)X$ where

\[ \phi_{LK}(w) = p - 2 - \sum_{i=1}^{n} a_i w^{-b_i}, \]

with $a_i \geq 0$ for any $i$ and $0 < b_1 < b_2 < \cdots < b_n$. For example, when $n = 1$, they both showed that $\delta_{LK}(X)$ for $0 < b_1 < (p-2)/4$ and $a_1 = 2b_1\, 2^{b_1}\, \Gamma(p/2 - b_1 - 1)/\Gamma(p/2 - 2b_1 - 1)$ is superior to the James-Stein estimator. [10] gave two estimators which shrink toward the ball with center $0$ and radius $r$: $\delta_{KT}^{i}(X) = \bigl(1 - \phi_{KT}^{i}(\|X\|^2)/\|X\|^2\bigr)X$ for $i = 1, 2$, where

\[ \phi_{KT}^{1}(w) = \begin{cases} 0 & w \leq r^2 \\ p - 2 - \sum_{i=1}^{p-2} \bigl(r/w^{1/2}\bigr)^{i} & w > r^2 \end{cases} \]

and

\[ \phi_{KT}^{2}(w) = \begin{cases} 0 & w \leq \{(p-1)/(p-2)\}^2\, r^2 \\ p - 2 - r/(w^{1/2} - r) & w > \{(p-1)/(p-2)\}^2\, r^2. \end{cases} \]

They showed that when $r$ is sufficiently small, these two estimators dominate the James-Stein estimator. However, these estimators are not appealing since they are inadmissible. In our setting, the estimation of a multivariate normal mean, [3] showed that any admissible estimator should be generalized Bayes. Since the shrinkage factor $\bigl(1 - \phi_{LK}(w)/w\bigr)$ becomes negative for some $w$ and $\phi_{KT}^{i}$ for $i = 1, 2$ fail to be analytic, neither $\delta_{LK}$ nor $\delta_{KT}^{i}$ can be generalized Bayes or admissible.

In general, when we propose an estimator ($\delta^{*}$, say) dominating a certain inadmissible estimator $\delta$, it is extremely important to find it among admissible estimators. If not, a more difficult problem (finding an estimator improving on $\delta^{*}$) just arises. To the best of our knowledge, the sole admissible estimator dominating the James-Stein estimator is [8]'s estimator $\delta_K(X) = \bigl(1 - \phi_K(\|X\|^2)/\|X\|^2\bigr)X$ where

(1.8) \[ \phi_K(w) = p - 2 - \frac{2\exp(-w/2)}{\int_0^1 \lambda^{p/2-2} \exp(-\lambda w/2)\, d\lambda}. \]

The estimator is generalized Bayes with respect to the harmonic prior density $\|\theta\|^{2-p}$, which was originally suggested by [14]. The only fault is, however, that it does not improve upon $\delta_{JS}(X)$ at $\|\theta\| = 0$. (See Section 3 for the detail.) Shrinkage estimators like (1.1) make use of the vague prior information that $\|\theta\|$ is close to $0$. It goes without saying that we would like to get a significant improvement of risk when the prior information is accurate. Though $\delta_K(X)$ is an admissible generalized Bayes estimator and thus smooth, it has no improvement on $\delta_{JS}(X)$ at the origin $\|\theta\| = 0$. On the other hand, $\delta_{JS}^{+}(X)$ improves on $\delta_{JS}(X)$ significantly at the origin, but it is not analytic and is thus inadmissible by [3]'s complete class theorem. Therefore a more challenging open problem is to find admissible estimators dominating the James-Stein estimator especially at $\|\theta\| = 0$. In this paper, we will consider the class of estimators

(1.9) \[ \delta_\alpha(X) = \left(1 - \frac{\int_0^1 \lambda^{\alpha(p/2-1)} \exp(-\alpha\lambda\|X\|^2/2)\, d\lambda}{\int_0^1 \lambda^{\alpha(p/2-1)-1} \exp(-\alpha\lambda\|X\|^2/2)\, d\lambda}\right) X. \]

We show that $\delta_\alpha$ for $\alpha \geq 1$ dominates the James-Stein estimator and that it has a strict risk improvement at $\|\theta\| = 0$. Furthermore we see that $\delta_\alpha$ approaches $\delta_{JS}^{+}$ as $\alpha$ goes to $\infty$. Since $\delta_\alpha$ with $\alpha = 1$ corresponds to $\delta_K$, the class of $\delta_\alpha$ with $\alpha \geq 1$ is a bridge between $\delta_K$, which is admissible, and $\delta_{JS}^{+}$, which is inadmissible. Although we show that $\delta_\alpha$ with $\alpha > 1$ is quasi-admissible, a notion which is introduced in Section 2, we have no idea on its admissibility at this stage.

2. Quasi-admissibility. In this section, we introduce the concept of quasi-admissibility and give a sufficient condition for quasi-admissibility. We deal with a reasonable class of estimators which have the form

(2.1) \[ \delta_m = X + \nabla \log m(\|X\|^2), \]

where $\nabla$ is the differential operator $(\partial/\partial x_1, \ldots, \partial/\partial x_p)'$ and $m$ is a positive function. If $m(w) = w^{-c}$ for $c > 0$, (2.1) becomes the James-Stein type estimator (1.6). Any (generalized) Bayes estimator with respect to a spherically symmetric measure $\Pi$ should also have the form (2.1) because it is written as

\[ \frac{\int_{\mathbb{R}^p} \theta \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)}{\int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)} = x + \frac{\int_{\mathbb{R}^p} (\theta-x) \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)}{\int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)} = x + \frac{\nabla \int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)}{\int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)} = x + \nabla \log m_\Pi(\|x\|^2), \]

where $m_\Pi(\|x\|^2) = \int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta)$. [2] and [5] called an estimator of the form (2.1) pseudo-Bayes and quasi-Bayes, respectively. If, for given $m$, there exists a nonnegative measure $\Pi$ which satisfies

(2.2) \[ m(\|x\|^2) = \int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta), \]

the estimator of the form (2.1) is truly generalized Bayes. However, it is often difficult to determine whether or not $m$ has such an exact integral form.
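A standard case in which (2.2) does hold is the normal prior $\Pi = N_p(0, \tau^2 I_p)$, for which $m(\|x\|^2)$ is proportional to $\exp\{-\|x\|^2/(2(1+\tau^2))\}$ and (2.1) reduces to the familiar Bayes shrinkage $x\,\tau^2/(1+\tau^2)$. The following sketch (ours; $p = 5$, $\tau^2 = 3$ and the test point are arbitrary) verifies this by finite differences:

```python
import numpy as np

# For the normal prior Pi = N_p(0, tau2 * I), m(||x||^2) in (2.2) is
# proportional to exp(-||x||^2 / (2 (1 + tau2))), and (2.1) reproduces the
# Bayes shrinkage estimator x * tau2/(1 + tau2).  All numbers are arbitrary.
p, tau2 = 5, 3.0
x = np.array([1.0, -0.5, 2.0, 0.3, -1.2])

def log_m(xv):
    # log m(||x||^2) up to an additive constant
    return -np.dot(xv, xv) / (2.0 * (1.0 + tau2))

eps = 1e-6
grad = np.array([(log_m(x + eps * e) - log_m(x - eps * e)) / (2 * eps)
                 for e in np.eye(p)])
delta_m = x + grad
bayes = tau2 / (1.0 + tau2) * x
print(np.max(np.abs(delta_m - bayes)))
```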

Substituting $g(x) = \nabla\log m(\|x\|^2) = 2x\,m'(\|x\|^2)/m(\|x\|^2)$ and $g(x) = 2x\,m'(\|x\|^2)/m(\|x\|^2) + k(\|x\|^2)\,x$ in (1.3), corresponding to estimators $\delta_m$ and $\delta_{m,k}$ respectively, we have the unbiased estimators of risk

(2.3) \[ \hat{R}(\delta_m) = p - 4w\left(\frac{m'(w)}{m(w)}\right)^{2} + 4p\,\frac{m'(w)}{m(w)} + 8w\,\frac{m''(w)}{m(w)} \]

for $w = \|x\|^2$, and $\hat{R}(\delta_{m,k}) = \hat{R}(\delta_m) + \Delta(m, mk)$, where

(2.4) \[ \Delta(m, mk) = 4w\,\frac{m'(w)}{m(w)}\,k(w) + 2p\,k(w) + 4w\,k'(w) + w\,k^{2}(w). \]
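As a consistency check of (2.3) (our computation; $p = 7$ is arbitrary), the choice $m(w) = w^{-(p-2)/2}$ corresponding to the James-Stein estimator should collapse the formula to the classical unbiased risk estimate $p - (p-2)^2/w$:

```python
import numpy as np

# Plug m(w) = w^{-(p-2)/2} (the James-Stein choice) into (2.3); the result
# should equal the classical unbiased risk estimate p - (p-2)^2/w.
p = 7
w = np.linspace(0.5, 50.0, 1000)
c = (p - 2) / 2.0
m = w**(-c)
m1 = -c * w**(-c - 1)            # m'(w)
m2 = c * (c + 1) * w**(-c - 2)   # m''(w)
R_hat = p - 4 * w * (m1 / m)**2 + 4 * p * (m1 / m) + 8 * w * (m2 / m)
print(np.max(np.abs(R_hat - (p - (p - 2)**2 / w))))
```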

If there exists $k$ such that $\Delta(m, mk) \leq 0$ for any $w \geq 0$ with strict inequality for some $w$, then $\delta_m$ is inadmissible; that is, $R(\theta, \delta_{m,k}) \leq R(\theta, \delta_m)$ for all $\theta$ with strict inequality for some $\theta$. If there does not exist such $k$, $\delta_m$ is said to be quasi-admissible. Hence quasi-admissibility is a weaker optimality than admissibility. Now we state a sufficient condition for quasi-admissibility. The idea is originally from [4], but the paper is not so accessible. See also [12], where quasi-admissibility is called permissibility.

Theorem 2.1. If

\[ \int_0^1 \bigl\{w^{p/2}\, m(w)\bigr\}^{-1}\, dw = \infty \quad \text{as well as} \quad \int_1^\infty \bigl\{w^{p/2}\, m(w)\bigr\}^{-1}\, dw = \infty, \]

then the estimator $\delta_m$ given by (2.1) is quasi-admissible.

Proof. We have only to show that $\hat{R}(\delta_{m,k}) \leq \hat{R}(\delta_m)$, that is, $\Delta(m, mk) \leq 0$ for all $w \geq 0$, implies $k(w) \equiv 0$. Let $M(w) = w^{p/2} m(w)$ and $h(w) = M(w)k(w)$. Then we have

\[ \Delta(m, mk) = 4w\,\frac{h'(w)}{M(w)} + w\,\frac{h^{2}(w)}{M^{2}(w)} = \frac{4w\,h^{2}(w)}{M(w)} \left\{ \frac{d}{dw}\left(\frac{-1}{h(w)}\right) + \frac{1}{4M(w)} \right\}. \]

First we show that $h(w) \geq 0$ for all $w \geq 0$. Suppose to the contrary that $h(w_0) < 0$ for some $w_0$. Then $h(w) < 0$ for all $w \geq w_0$, since $h'(w) \leq -h^{2}(w)/\{4M(w)\}$ should be nonpositive. For all $w > w_0$, the inequality

(2.5) \[ \frac{d}{dw}\left(\frac{-1}{h(w)}\right) \leq -\frac{1}{4M(w)} \]

holds. Integrating both sides of (2.5) from $w_0$ to $w_1$ gives

\[ \frac{1}{h(w_0)} - \frac{1}{h(w_1)} \leq -\frac{1}{4}\int_{w_0}^{w_1} M^{-1}(t)\, dt. \]

Letting $w_1 \to \infty$, the right-hand side diverges to $-\infty$ by the assumption of the theorem; this provides a contradiction since the left-hand side is greater than $1/h(w_0)$. Thus we have $h(w) \geq 0$ for all $w$.

Similarly, using the divergence of $\int_0^1 M^{-1}(t)\, dt$, we can show that $h(w) \leq 0$ for all $w$. It follows that $h(w)$ is zero for all $w$, which implies that $k(w) \equiv 0$ for all $w$. This completes the proof.

Combining Theorem 2.1 and [3]'s sufficient condition for admissibility, we see that a quasi-admissible estimator of the form (2.1) is admissible if it is truly generalized Bayes, that is, if there exists a nonnegative measure $\Pi$ such that

(2.6) \[ m(\|x\|^2) = \int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta). \]

As noted above, however, it is often difficult to determine whether or not $m$ has such an integral form. Furthermore, even if we find that $m$ does not have an integral form like (2.6), that is, the estimator is inadmissible, it is generally very difficult to find an estimator dominating the inadmissible estimator. The function $m$ for the James-Stein estimator is $m(w) = w^{-(p-2)/2}$, which satisfies the assumptions of Theorem 2.1.

Corollary 2.1. The James-Stein estimator is quasi-admissible.

The James-Stein estimator is nevertheless inadmissible, being improved on by taking positive part. But it is not easy to find a large class of estimators dominating the James-Stein estimator. In Section 3, we introduce an elegant sufficient condition for domination over $\delta_{JS}$ proposed by [9].

3. A class of quasi-admissible estimators improving upon the James-Stein estimator. In this section, we introduce [9]'s sufficient condition for improving upon the James-Stein estimator and propose a class of smooth quasi-admissible estimators satisfying it.

[9] showed that if $\lim_{w\to\infty}\phi(w) = p-2$, then the difference of the risk functions of $\delta_{JS}$ and $\delta_\phi$ can be written as

(3.1) \[ R(\theta, \delta_{JS}) - R(\theta, \delta_\phi) = 2\int_0^\infty \phi'(w) \left\{ \phi(w) - (p-2) + \frac{2 f_p(w;\lambda)}{\int_0^w y^{-1} f_p(y;\lambda)\, dy} \right\} \int_0^w y^{-1} f_p(y;\lambda)\, dy\, dw, \]

where $f_p(\,\cdot\,;\lambda)$ denotes the density of the noncentral chi-square distribution with $p$ degrees of freedom and noncentrality parameter $\lambda = \|\theta\|^2$. Moreover, by the inequality

\[ \frac{f_p(w;\lambda)}{\int_0^w y^{-1} f_p(y;\lambda)\, dy} \geq \frac{f_p(w)}{\int_0^w y^{-1} f_p(y)\, dy}, \]

where $f_p(y) = f_p(y;0)$, which can be shown by the correlation inequality, we have

\[ R(\theta, \delta_{JS}) - R(\theta, \delta_\phi) \geq 2\int_0^\infty \phi'(w) \bigl\{\phi(w) - \phi_0(w)\bigr\} \int_0^w y^{-1} f_p(y;\lambda)\, dy\, dw, \]

where

\[ \phi_0(w) = p - 2 - \frac{2 f_p(w)}{\int_0^w y^{-1} f_p(y)\, dy}. \]

Since $\phi_0(w)$ coincides with $\phi_K(w)$ given in (1.8) and is bounded above by $p-2$, we have the following result.

Theorem 3.1 ([9]). If $\phi(w)$ is nondecreasing and within $[\phi_K(w),\, p-2]$ for any $w \geq 0$, then $\delta_\phi(X)$ of the form (1.1) dominates the James-Stein estimator.
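The condition of Theorem 3.1 can be inspected numerically for the positive-part choice $\phi_{JS}^{+}(w) = \min(w, p-2)$, which should lie between $\phi_K(w)$ and $p-2$. A sketch (ours; $p = 6$ and the grids are arbitrary, with $\phi_K$ evaluated by quadrature):

```python
import numpy as np

# Check numerically that phi_K(w) <= min(w, p-2) <= p-2 on a grid and that
# phi_K is increasing, so the positive-part choice meets the hypothesis of
# Theorem 3.1 (p = 6 and the grids are arbitrary choices).
p = 6
lam = np.linspace(1e-8, 1.0, 20001)
dlam = lam[1] - lam[0]

def phi_K(w):
    denom = np.sum(lam**(p / 2 - 2) * np.exp(-lam * w / 2)) * dlam
    return p - 2 - 2 * np.exp(-w / 2) / denom

w_grid = np.linspace(0.05, 30.0, 300)
phi_plus = np.minimum(w_grid, p - 2)
phi_k = np.array([phi_K(w) for w in w_grid])
print(np.max(phi_k - phi_plus))   # negative: phi_K lies below min(w, p-2)
```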

The assumption of the theorem above is satisfied by $\phi_K$ and $\phi_{JS}^{+}$. By (3.1), we see that the risk difference at $\|\theta\| = 0$ between $\delta_{JS}$ and $\delta_\phi$, where $\phi$ is nondecreasing with limit $p-2$, is given by

(3.2) \[ 2\int_0^\infty \phi'(w) \left\{ \phi(w) - (p-2) + \frac{2 f_p(w)}{\int_0^w y^{-1} f_p(y)\, dy} \right\} \int_0^w y^{-1} f_p(y)\, dy\, dw. \]

As mentioned in Section 1, $\delta_K(X)$ is an admissible generalized Bayes estimator and thus smooth. On the other hand, $\delta_{JS}^{+}(X)$ improves on $\delta_{JS}(X)$ significantly at the origin, but it is not analytic and is thus inadmissible by [3]'s complete class theorem. Therefore a more challenging problem is to find admissible estimators dominating the James-Stein estimator especially at $\|\theta\| = 0$.

In this paper, we propose the class of estimators

(3.3) \[ \delta_\alpha(X) = \bigl(1 - \phi_\alpha(\|X\|^2)/\|X\|^2\bigr)X, \]

where

(3.4) \[ \phi_\alpha(w) = w\, \frac{\int_0^1 \lambda^{\alpha(p/2-1)} \exp(-\alpha\lambda w/2)\, d\lambda}{\int_0^1 \lambda^{\alpha(p/2-1)-1} \exp(-\alpha\lambda w/2)\, d\lambda} = p - 2 - \frac{(2/\alpha)\exp(-\alpha w/2)}{\int_0^1 \lambda^{\alpha(p/2-1)-1} \exp(-\alpha\lambda w/2)\, d\lambda} = p - 2 - \frac{2}{\alpha \int_0^1 (1-\lambda)^{\alpha(p/2-1)-1} \exp(\alpha\lambda w/2)\, d\lambda}. \]
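A numerical sketch (ours; $p = 6$ and all tolerances are arbitrary choices) can probe the two endpoints of the class: evaluating $\phi_\alpha$ through the power series $\Psi(\alpha, w) = 2\sum_{i\geq 0} w^{i}\prod_{j=0}^{i}(p-2+2j/\alpha)^{-1}$ with $\phi_\alpha = p-2-2/\Psi(\alpha, w)$, the case $\alpha = 1$ should reproduce $\phi_K$ of (1.8), while a large $\alpha$ should be close to $\min(w, p-2)$:

```python
import numpy as np

# Evaluate phi_alpha via the series Psi(alpha, w) and check both endpoints:
# alpha = 1 reproduces phi_K of (1.8) (computed here by quadrature), and a
# large alpha is close to min(w, p-2).  p = 6 and tolerances are arbitrary.
p = 6

def psi(alpha, w, n_terms=2000):
    total, term = 0.0, 1.0
    for i in range(n_terms):
        term /= p - 2 + 2.0 * i / alpha  # term = w^i / prod_{j<=i}(p-2+2j/alpha)
        total += term
        term *= w
    return 2.0 * total

def phi_alpha(alpha, w):
    return p - 2 - 2.0 / psi(alpha, w)

lam = np.linspace(1e-8, 1.0, 20001)
dlam = lam[1] - lam[0]

def phi_K(w):
    denom = np.sum(lam**(p / 2 - 2) * np.exp(-lam * w / 2)) * dlam
    return p - 2 - 2 * np.exp(-w / 2) / denom

print(phi_alpha(1.0, 2.0), phi_K(2.0))               # should agree closely
print(phi_alpha(200.0, 2.0), phi_alpha(200.0, 6.0))  # close to 2 and to 4
```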

Theorem 3.2.
1. $\delta_\alpha(X)$ dominates $\delta_{JS}(X)$ for $\alpha \geq 1$.
2. The risk of $\delta_\alpha(X)$ for $\alpha > 1$ at $\|\theta\| = 0$ is strictly less than the risk of the James-Stein estimator at $\|\theta\| = 0$.
3. $\delta_\alpha(X)$ approaches the positive-part James-Stein estimator as $\alpha$ tends to infinity, that is,
\[ \lim_{\alpha\to\infty} \delta_\alpha(X) = \delta_{JS}^{+}(X). \]

Hence the class of $\delta_\alpha$ with $\alpha \geq 1$ is a bridge between $\delta_K$, which is admissible, and $\delta_{JS}^{+}$, which is inadmissible.

Proof. [Part 1] We shall verify that $\phi_\alpha(w)$ for $\alpha \geq 1$ satisfies the assumptions of Theorem 3.1. Applying the Taylor expansion to a part of (3.4), we have

\[ \alpha \int_0^1 (1-\lambda)^{\alpha(p/2-1)-1} \exp(\alpha\lambda w/2)\, d\lambda = 2\sum_{i=0}^{\infty} w^{i} \prod_{j=0}^{i} \bigl(p - 2 + 2j/\alpha\bigr)^{-1} = \Psi(\alpha, w) \quad \text{(say)}, \]

so that $\phi_\alpha(w) = p - 2 - 2/\Psi(\alpha, w)$. Since every coefficient of the series is positive, $\Psi(\alpha, w)$ is increasing in $w$ and diverges as $w \to \infty$; hence $\phi_\alpha(w)$ is nondecreasing and, for any $\alpha \geq 1$, it is clear that $\lim_{w\to\infty}\phi_\alpha(w) = p-2$. In order to show that $\phi_\alpha(w) \geq \phi_K(w) = \phi_1(w)$ for $\alpha \geq 1$, we have only to check that $\Psi(\alpha, w)$ is increasing in $\alpha$. It is easily verified because the coefficient of each term of $\Psi(\alpha, w)$ is increasing in $\alpha$. This proves part 1.

[Part 2] Since $\phi_\alpha$ for $\alpha > 1$ is strictly greater than $\phi_K$ and strictly increasing in $w$, the risk difference between $\delta_\alpha$ for $\alpha > 1$ and $\delta_{JS}$ at $\|\theta\| = 0$, which is given in (3.2), is strictly positive.

[Part 3] Since $p - 2 + 2j/\alpha$ decreases to $p-2$ as $\alpha \to \infty$, we have

\[ \lim_{\alpha\to\infty} \Psi(\alpha, w) = 2\sum_{i=0}^{\infty} (p-2)^{-1}\left(\frac{w}{p-2}\right)^{i}, \]

which equals $2/(p-2-w)$ if $w < p-2$ and diverges to infinity otherwise. Considering the two cases $w < (>)\, p-2$ separately, we obtain $\lim_{\alpha\to\infty}\phi_\alpha(w) = w$ if $w < p-2$ and $= p-2$ otherwise, that is, $\lim_{\alpha\to\infty}\phi_\alpha(w) = \phi_{JS}^{+}(w)$. This completes the proof.

The estimator $\delta_\alpha$ is expressed as $X + \nabla\log m_\alpha(\|X\|^2)$ where

\[ m_\alpha(w) = \left\{ \int_0^1 \lambda^{\alpha(p/2-1)-1} \exp\left(-\frac{\alpha\lambda w}{2}\right) d\lambda \right\}^{1/\alpha}. \]

Since $m_\alpha(w)$ behaves like a constant multiple of $w^{-(p-2)/2}$ as $w \to \infty$ and $0 < m_\alpha(0) < \infty$, we have the following result by Theorem 2.1.
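The representation through $m_\alpha$ can be double-checked numerically (our computation; $p = 6$, $\alpha = 3$ and $w_0 = 2.5$ are arbitrary): $-2w\,(\log m_\alpha)'(w)$, obtained by finite differences, should agree with $\phi_\alpha(w)$ evaluated directly as the ratio of integrals in (3.4):

```python
import numpy as np

# Finite-difference check that -2 w (log m_alpha)'(w) equals phi_alpha(w)
# written as the ratio of integrals in (3.4).  p = 6, alpha = 3, w0 = 2.5
# are arbitrary choices; the integrals are computed by quadrature.
p, alpha, w0 = 6, 3.0, 2.5
lam = np.linspace(1e-9, 1.0, 200001)
dlam = lam[1] - lam[0]

def log_m(w):
    integral = np.sum(lam**(alpha * (p / 2 - 1) - 1)
                      * np.exp(-alpha * lam * w / 2)) * dlam
    return np.log(integral) / alpha

eps = 1e-5
phi_from_m = -2 * w0 * (log_m(w0 + eps) - log_m(w0 - eps)) / (2 * eps)

num = np.sum(lam**(alpha * (p / 2 - 1)) * np.exp(-alpha * lam * w0 / 2)) * dlam
den = np.sum(lam**(alpha * (p / 2 - 1) - 1) * np.exp(-alpha * lam * w0 / 2)) * dlam
phi_direct = w0 * num / den
print(phi_from_m, phi_direct)
```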

Corollary 3.1. $\delta_\alpha(X)$ for $\alpha \geq 1$ is quasi-admissible.

It remains open whether or not $\delta_\alpha(X)$ for $\alpha > 1$ is admissible. Since $\delta_\alpha(X)$ with $\alpha > 1$ is quasi-admissible, it is admissible if it is generalized Bayes, that is, if there exists a measure $\Pi$ which satisfies

\[ \int_{\mathbb{R}^p} \exp(-\|x-\theta\|^2/2)\,\Pi(d\theta) = \left\{ \int_0^1 \lambda^{\alpha(p/2-1)-1} \exp\left(-\frac{\alpha\lambda\|x\|^2}{2}\right) d\lambda \right\}^{1/\alpha}. \]

On the other hand, even if we find that there is no such $\Pi$, which implies that $\delta_\alpha$ is inadmissible, it is very difficult to find an estimator dominating $\delta_\alpha$ for $\alpha > 1$.

References.

[1] Blyth, C. R. (1951). On minimax statistical decision procedures and their admissibility. Ann. Math. Statist. 22, 22-42.

[2] Bock, M. E. (1988). Shrinkage estimators: pseudo-Bayes rules for normal mean vectors. In Statistical Decision Theory and Related Topics IV, Vol. 1 (West Lafayette, Ind., 1986). Springer, New York, 281-297.

[3] Brown, L. D. (1971). Admissible estimators, recurrent diffusions, and insoluble boundary value problems. Ann. Math. Statist. 42, 855-903.

[4] Brown, L. D. (1988). The differential inequality of a statistical estimation problem. In Statistical Decision Theory and Related Topics IV, Vol. 1 (West Lafayette, Ind., 1986). Springer, New York, 299-324.

[5] Brown, L. D. and Zhang, L. H. (2006). Estimators for Gaussian models having a block-wise structure. Mimeo (available at http://www-stat.wharton.upenn.edu/~lbrown/teaching/Shrinkage/).

[6] Guo, Y. Y. and Pal, N. (1992). A sequence of improvements over the James-Stein estimator. J. Multivariate Anal. 42, 2, 302-317.

[7] James, W. and Stein, C. (1961). Estimation with quadratic loss. In Proc. 4th Berkeley Sympos. Math. Statist. and Prob., Vol. I. Univ. California Press, Berkeley, Calif., 361-379.

[8] Kubokawa, T. (1991). An approach to improving the James-Stein estimator. J. Multivariate Anal. 36, 1, 121-126.

[9] Kubokawa, T. (1994). A unified approach to improving equivariant estimators. Ann. Statist. 22, 1, 290-299.

[10] Kuriki, S. and Takemura, A. (2000). Shrinkage estimation towards a closed convex set with a smooth boundary. J. Multivariate Anal. 75, 1, 79-111.

[11] Li, T. F. and Kuo, W. H. (1982). Generalized James-Stein estimators. Comm. Statist. A-Theory Methods 11, 20, 2249-2257.

[12] Rukhin, A. L. (1995). Admissibility: survey of a concept in progress. International Statistical Review 63, 95-115.

[13] Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954-1955, Vol. I. University of California Press, Berkeley and Los Angeles, 197-206.

[14] Stein, C. (1974). Estimation of the mean of a multivariate normal distribution. In Proceedings of the Prague Symposium on Asymptotic Statistics (Charles Univ., Prague, 1973), Vol. II. Charles Univ., Prague, 345-381.

Center for Spatial Information Science,

The University of Tokyo

5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8568, Japan

E-mail: maruyama@csis.u-tokyo.ac.jp
